Fazal Majid's low-intensity blog

Sporadic pontification

Temboz 0.4 released

I have released version 0.4 of my web-based aggregator, Temboz. The new version focuses on performance: adding an index and rewriting some queries makes the two most common operations, viewing unread articles and the “all feeds” summary page, almost an order of magnitude faster. Upgraders will need to read the UPGRADE file to add the index to their existing database.
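As a rough illustration of why an index helps this kind of query, here is a minimal sketch using Python's sqlite3 module. The table, column and index names are hypothetical and are not Temboz's actual schema; the point is only that the query planner switches from a full table scan to an index search once a suitable index exists.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not Temboz's real tables.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        item_uid INTEGER PRIMARY KEY,
        feed_uid INTEGER NOT NULL,
        is_read  INTEGER NOT NULL DEFAULT 0,
        title    TEXT
    )""")
conn.executemany(
    "INSERT INTO items (feed_uid, is_read, title) VALUES (?, ?, ?)",
    [(i % 50, int(i % 10 != 0), "item %d" % i) for i in range(5000)])

# Unread article counts per feed, the sort of query a summary page needs.
QUERY = ("SELECT feed_uid, COUNT(*) FROM items "
         "WHERE is_read = 0 GROUP BY feed_uid")

def plan(connection, sql):
    """Return the planner's description of how it would run sql."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail.
    return " | ".join(str(row[-1]) for row in rows)

plan_before = plan(conn, QUERY)
conn.execute("CREATE INDEX items_unread_i ON items (is_read, feed_uid)")
plan_after = plan(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, so the sketch only prints it rather than relying on a particular format.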

Is this a sign RSS is finally going mainstream?

Dialogue from The Librarian, a TV movie on TNT. Sheltered “professional student” Flynn Carsen (played by Noah Wyle of “E.R.” fame) is kicked out of school and is interviewing for a job at a vaguely supernatural library:

Charlene (Jane Curtin, coldly): What makes you think you could be the Librarian?

Flynn Carsen (Noah Wyle): I know the Dewey decimal system, Library of Congress, research paper orthodoxy, web searching, I can set up a RSS feed…

Charlene (stifling a sigh): Everybody can do that. They’re librarians.

Pointless referrer spamming

Q: What happens when you cross a mobster with a cell phone company?
A: Someone who makes you an offer you can’t understand.

The HTTP protocol used by web browsers specifies an optional Referer: (sic) header that allows them to tell the server where the link to a page came from. This was originally intended as a courtesy, so webmasters could ask people with obsolete links to update their pages, but it is also a valuable source of information for webmasters who can find out which sites link to them, and in most cases what keywords were used on a search engine. Unfortunately, spammers have found another well to poison on the Internet.
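To make the legitimate use concrete, here is a minimal sketch of pulling the linking host and search keywords out of a Referer value. The function name `describe_referer` is made up for this example, and the assumption that the search terms live in a `q` query parameter holds for many search engines but by no means all.

```python
from urllib.parse import urlsplit, parse_qs

def describe_referer(referer):
    """Return (host, search_terms) for a Referer header value.

    search_terms is None unless the URL carries a `q` query parameter,
    which is an assumption about how the search engine builds its URLs.
    """
    parts = urlsplit(referer)
    query = parse_qs(parts.query)
    terms = query.get("q", [None])[0]
    return parts.hostname, terms

# The example.com and example.org hosts below are placeholders.
print(describe_referer("http://search.example.com/search?q=python+rwlock&hl=en"))
print(describe_referer("http://blog.example.org/links.html"))
```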

Over the past month, referrer spam on my site has graduated from nuisance to menace, and I am writing scripts that attempt to filter that dross automatically out of my web server log reports. In recent days, most of the URLs spammers are pushing on me point to servers with names that aren’t even registered in the DNS. This seems completely asinine, even for spammers: why bother spamming when there is no profit in it? I was beginning to wonder whether this was just a form of vandalism like graffiti, but the situation appears to be more devious than it looks at first glance.

Referrer spam is very hard to fight (although not quite as difficult as email spam). I am trying to combine a number of heuristics, including behavioral analysis (e.g. whether the purported browser is downloading my CSS files or not), WHOIS lookups, reverse lookups for the client IP address, and so on. Unfortunately, if any of these filtering methods become widespread, the spammers can easily apply countermeasures to make their requests look more legitimate. This looks like another long-haul arms race…
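The behavioral heuristic can be sketched in a few lines. This is only an illustration under stated assumptions: the log is taken to be already parsed into (client IP, path, referrer) tuples, the function name `suspicious_referers` is invented for the example, and the addresses and hosts are placeholders. A real filter would combine this with the other checks, since a genuine visitor with a cached stylesheet would be flagged too.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def suspicious_referers(hits, own_host):
    """Return referrer URLs sent only by clients that never fetched CSS.

    hits is an iterable of (client_ip, path, referer) tuples, a
    hypothetical pre-parsed form of the server log.
    """
    fetched_css = set()
    external = defaultdict(set)   # client_ip -> external referrers it sent
    for ip, path, referer in hits:
        if path.endswith(".css"):
            fetched_css.add(ip)
        # Ignore empty referrers and links from our own pages.
        if referer and urlsplit(referer).hostname != own_host:
            external[ip].add(referer)
    flagged = set()
    for ip, referers in external.items():
        if ip not in fetched_css:
            flagged |= referers
    return sorted(flagged)

hits = [
    ("192.0.2.1", "/blog/", "http://search.example.net/?q=rwlock"),
    ("192.0.2.1", "/style.css", "http://www.example.com/blog/"),
    ("198.51.100.7", "/blog/", "http://cheap-pills.example.org/"),
    ("198.51.100.7", "/blog/", "http://cheap-pills.example.org/"),
]
print(suspicious_referers(hits, own_host="www.example.com"))
```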

Pay for the razor, pay for the blades

King Gillette is famous for his invention of the disposable-blade razor, and the associated business model, “give away the razor, sell the blades”. This strategy was widely imitated, but it seems marketers have struck an even better one: why give away the razor when you can make the chumps pay for it?

There are a number of products, some high-tech and some not, where you pay handsomely for a device that is a doorstop without proprietary refills or service. Some examples:

  • In the US, most cell phones are either hard-wired to a specific service provider (CDMA) or SIM-locked (GSM). A consumers’ group is fighting in court to ban the practice, or at least limit its duration; it is either outlawed or strictly regulated in most other countries.

    Sure, the carrier is subsidizing the handset, but that is offset by extra profit margins in the contract. Once the contract’s minimum term is over, there is no justification whatsoever for maintaining the SIM lock. AT&T was one of the most egregious offenders, and it is not clear whether their policy will change after their takeover by Cingular.

    I suspect one of the big reasons for SIM lock is so carriers can charge extortionate international roaming charges, since without SIM lock, it would be cheaper to just pop in a prepaid SIM card in the country you are visiting. Actually, roaming charges are so overpriced that it is cheaper to just buy a new phone for the prepaid card and toss it away afterwards.

    There are real externality costs to society due to distortions in consumer behavior from carrier policies. Many people throw away their old cell phones when they change service or renew a contract, as the subsidy is only applicable towards a new phone purchase, never granted as a rebate to people opting to keep their older but perfectly serviceable phone. In California alone, 44,650 cell phones are discarded each day, usually ending up in landfill, at tremendous cost to the environment.

  • MP3.com founder Michael Robertson is suing Vonage for trying to extend the same despicable lock-in model to VoIP, with what he claims is deceptive advertising. Most commentators have rushed to Vonage’s defense — apparently, for many geeks the company can do no wrong, like Google. I have no such compunctions, as I have in the past received completely unsolicited spam from them, and thus as far as I am concerned, they fit in the “scum” category.

  • In a great illustration of the power of cognitive dissonance, TiVo is another company with rabid and uncritical fans. Originally, TiVo PVRs would remain somewhat functional even without the TiVo service. Sure, you would have to program shows manually, but that is no worse than most VCRs. Over successive software updates, TiVo have reduced their PVRs’ autonomy until they are now effectively useless without the service.

  • Inkjet printer manufacturers use all sorts of tricks to protect their racket, including putting in microchips designed to foil refilling or the use of third-party cartridges. Lexmark even tried to abuse the DMCA to prevent a competitor from selling reverse-engineered cartridge chips. All this so inkjet ink can remain the most expensive liquid, at significantly higher cost per milliliter than Chanel No. 5 or vintage Dom Perignon.

Since in most cases the utility of the machine without the overpriced refills or service is nil, the fair market price for it should be zero. The Vonage/Linksys situation is a special case, as the wireless router remains partially usable, albeit without VoIP features, if you switch providers. But marketers will keep trying to have it both ways until consumers push back by implementing a zero-tolerance policy, akin to the “broken-window” theory of policing. Do not agree to pay for a cell phone from a carrier that refuses to unlock it after a reasonable amount of time. Refuse to purchase digital devices that require service from a specific vendor to function.

A reader-writer lock for Python

Python offers a number of useful synchronization primitives in the threading and Queue modules. One that is missing, however, is a simple reader-writer lock (RWLock). An RWLock allows improved concurrency over a simple mutex, and is useful for objects that have high read-to-write ratios, like database caches.

Surprisingly, I haven’t been able to find any implementation of these semantics, so I rolled my own in a module, rwlock.py, which implements an RWLock class along with lock promotion/demotion. Hopefully it can be added to the standard library threading module. This code is hereby placed in the public domain.

"""Simple reader-writer locks in Python
Many readers can hold the lock XOR one and only one writer"""
import threading

version = """$Id: 04-1.html,v 1.3 2006/12/05 17:45:12 majid Exp $"""

class RWLock:
  """
A simple reader-writer lock. Several readers can hold the lock
simultaneously, XOR one writer. Write locks have priority over reads to
prevent write starvation.
"""
  def __init__(self):
    self.rwlock = 0
    self.writers_waiting = 0
    self.monitor = threading.Lock()
    self.readers_ok = threading.Condition(self.monitor)
    self.writers_ok = threading.Condition(self.monitor)
  def acquire_read(self):
    """Acquire a read lock. Several threads can hold this type of lock.
It is exclusive with write locks."""
    self.monitor.acquire()
    while self.rwlock < 0 or self.writers_waiting:
      self.readers_ok.wait()
    self.rwlock += 1
    self.monitor.release()
  def acquire_write(self):
    """Acquire a write lock. Only one thread can hold this lock, and
only when no read locks are also held."""
    self.monitor.acquire()
    while self.rwlock != 0:
      self.writers_waiting += 1
      self.writers_ok.wait()
      self.writers_waiting -= 1
    self.rwlock = -1
    self.monitor.release()
  def promote(self):
    """Promote an already-acquired read lock to a write lock
    WARNING: it is very easy to deadlock with this method"""
    self.monitor.acquire()
    self.rwlock -= 1
    while self.rwlock != 0:
      self.writers_waiting += 1
      self.writers_ok.wait()
      self.writers_waiting -= 1
    self.rwlock = -1
    self.monitor.release()
  def demote(self):
    """Demote an already-acquired write lock to a read lock"""
    self.monitor.acquire()
    self.rwlock = 1
    self.readers_ok.notify_all()
    self.monitor.release()
  def release(self):
    """Release a lock, whether read or write."""
    self.monitor.acquire()
    if self.rwlock < 0:
      self.rwlock = 0
    else:
      self.rwlock -= 1
    # Both conditions share self.monitor, so notify while still holding it
    if self.writers_waiting and self.rwlock == 0:
      # waiting writers have priority: wake exactly one of them
      self.writers_ok.notify()
    elif self.writers_waiting == 0:
      # no writer is waiting: let all blocked readers proceed
      self.readers_ok.notify_all()
    self.monitor.release()

if __name__ == '__main__':
  import time
  rwl = RWLock()
  class Reader(threading.Thread):
    def run(self):
      print(self, 'start')
      rwl.acquire_read()
      print(self, 'acquired')
      time.sleep(5)
      print(self, 'stop')
      rwl.release()
  class Writer(threading.Thread):
    def run(self):
      print(self, 'start')
      rwl.acquire_write()
      print(self, 'acquired')
      time.sleep(10)
      print(self, 'stop')
      rwl.release()
  class ReaderWriter(threading.Thread):
    def run(self):
      print(self, 'start')
      rwl.acquire_read()
      print(self, 'acquired')
      time.sleep(5)
      rwl.promote()
      print(self, 'promoted')
      time.sleep(5)
      print(self, 'stop')
      rwl.release()
  class WriterReader(threading.Thread):
    def run(self):
      print(self, 'start')
      rwl.acquire_write()
      print(self, 'acquired')
      time.sleep(10)
      rwl.demote()
      print(self, 'demoted')
      time.sleep(10)
      print(self, 'stop')
      rwl.release()
  Reader().start()
  time.sleep(1)
  Reader().start()
  time.sleep(1)
  ReaderWriter().start()
  time.sleep(1)
  WriterReader().start()
  time.sleep(1)
  Reader().start()
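
The test code above calls release() unconditionally, so an exception raised between acquiring and releasing would leave the lock held. One way to guard against that, sketched here, is a pair of context managers. The helpers `read_locked` and `write_locked` are hypothetical and not part of rwlock.py, and the compact `MiniRWLock` only stands in for the class above (same method names, no promotion or demotion) so that the example is self-contained.

```python
import threading
from contextlib import contextmanager

class MiniRWLock:
    """Compact stand-in for the RWLock above (no promote/demote)."""
    def __init__(self):
        self.cond = threading.Condition()
        self.readers = 0           # number of threads holding a read lock
        self.writing = False       # True while a writer holds the lock
        self.writers_waiting = 0   # readers defer to waiting writers
    def acquire_read(self):
        with self.cond:
            while self.writing or self.writers_waiting:
                self.cond.wait()
            self.readers += 1
    def acquire_write(self):
        with self.cond:
            self.writers_waiting += 1
            try:
                while self.writing or self.readers:
                    self.cond.wait()
            finally:
                self.writers_waiting -= 1
            self.writing = True
    def release(self):
        with self.cond:
            if self.writing:
                self.writing = False
            else:
                self.readers -= 1
            # Waiters re-check their own conditions after waking up.
            self.cond.notify_all()

@contextmanager
def read_locked(lock):
    """Hold a read lock for the duration of a with-block."""
    lock.acquire_read()
    try:
        yield
    finally:
        lock.release()

@contextmanager
def write_locked(lock):
    """Hold the write lock for the duration of a with-block."""
    lock.acquire_write()
    try:
        yield
    finally:
        lock.release()

# A read-mostly cache, the kind of object the lock is meant for.
cache = {}
lock = MiniRWLock()

def lookup(key):
    with read_locked(lock):
        return cache.get(key)

def store(key, value):
    with write_locked(lock):
        cache[key] = value

store("answer", 42)
print(lookup("answer"))
```

Any object exposing acquire_read(), acquire_write() and release() should work with these helpers, so they could in principle wrap the RWLock class as well.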