Fazal Majid's low-intensity blog

Sporadic pontification


Networked storage on the cheap

As hard drives get denser, the cost of raw storage is getting ridiculously cheap – well under a dollar per gigabyte as I write. The cost of managed storage, however, is an entirely different story.

Managed storage is the kind required for “enterprise applications”, i.e. when money is involved. It builds on raw storage by adding redundancy, the ability to hot-swap drives, and the ability to add capacity without disruption. At the higher end of the market, additional manageability features include fault tolerance, the ability to take “snapshots” of data for backup purposes, and the ability to mirror data remotely for disaster recovery.

Traditionally, managed storage has been more expensive than raw disk by a factor of at least two, sometimes an order of magnitude or more. When I started my company in 2000, for instance, we paid $300,000, almost half of our initial capital investment, for a pair of clustered Network Appliance F760 filers with a total disk capacity of 600GB or so ($500/GB, at a time when disk drives cost about $10/GB). The investment was well worth it: these machines have proven remarkably reliable, and the Netapps’ snapshot capability is vital for us, since it lets us take instantaneous snapshots of our Oracle databases, which we can then back up in a leisurely backup window without keeping Oracle in the performance-sapping backup mode the whole time.

Web serving workloads and the like can easily be distributed across farms of inexpensive rackmount x86 servers, an architecture pioneered by ISPs. Midrange servers (up to 4 processors), pretty much commodities nowadays, are adequate for all but the very highest transaction volume databases. Storage and databases are the backbone of any information system, however, and a CIO cannot afford to take any risks with them. That is why storage represents such a high proportion of hardware costs for most IT departments, and why specialists like EMC have the highest profit margins in the industry.

Most managed storage is networked, i.e. does not consist of hard drives directly attached to a server, but instead of disks attached to a specialized storage appliance connected to the server with a fast interconnect. There are two schools:

  • Network-Attached Storage (NAS), like our Netapps, which basically act as network file servers using common protocols like NFS (for UNIX) and SMB (for Windows). These are more often used for midrange applications and unstructured data, and connect over the inexpensive Ethernet (Gigabit Ethernet, in our case) networks every network administrator is familiar with. NAS appliances are available for home or small office use at prices of $500 and up.
  • Storage Area Networks (SAN) offer a block-level interface: they behave like virtual hard drives that serve fixed-size blocks of data, without any understanding of what is in them. They currently use Fibre Channel, a fast, low-latency interconnect that is unfortunately also terribly expensive (FC switches cost over ten times more than equivalent Gigabit Ethernet gear). The cost of setting up a SAN usually limits them to high-end, mainframe-class data centers, and exotic cluster filesystems or databases like Oracle RAC are needed if multiple servers are going to access the same data. The sketch after this list contrasts the two access models from the client’s side.
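
The practical difference between the two models is easiest to see from the client’s point of view. Here is a minimal Python sketch contrasting the two access patterns; the mount point and device path are hypothetical, and a real SAN client would of course go through the operating system’s block driver rather than reading a device node by hand.

```python
# File-level (NAS): the client names a file and lets the server worry about
# where the bytes actually live. /mnt/nas is a hypothetical NFS or SMB mount.
with open("/mnt/nas/reports/q3.csv", "rb") as f:
    header = f.read(1024)               # first kilobyte of a named file

# Block-level (SAN, iSCSI): the client sees a bare virtual disk and asks for
# numbered fixed-size blocks; it has no idea what filesystem (if any) lives
# on them. /dev/sdb is a hypothetical device node for the exported volume.
BLOCK_SIZE = 512
with open("/dev/sdb", "rb") as disk:
    disk.seek(1000000 * BLOCK_SIZE)     # jump straight to block 1,000,000
    block = disk.read(BLOCK_SIZE)       # raw bytes, no file semantics at all
```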

One logical way to lower the cost of SANs is to use inexpensive Ethernet connectivity. This was recently standardized as iSCSI, which is essentially SCSI running on top of TCP/IP. I recently became aware of Ximeta, a company that makes external drives that apparently implement iSCSI, at a price that is very close to that of raw disks (since iSCSI does not have to manage state for clients the way a more featured NAS does, Ximeta can shun expensive CPUs and RAM, and use a dedicated ASIC instead).

The Ximeta hardware is not a complete solution by itself: the driver software manages the metadata for the cluster of networked drives, such as the information that allows multiple drives to be concatenated to add capacity while preserving the illusion of a single virtual disk. The driver is also responsible for RAID, although Windows, Mac OS X and Linux all have volume managers capable of this. There are apparently some Windows-only provisions to allow multiple computers to share a drive, but I doubt they constitute a full-blown clustered filesystem. There are very few real-world cases in the target market where anything more than a cold standby is required, and it makes more sense to designate one machine to share a drive with the others on the network.
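
The concatenation itself is conceptually trivial: the driver only has to translate a logical block address on the virtual disk into a (physical drive, local block) pair. A toy version of that mapping, with made-up drive sizes and no relation to Ximeta’s actual metadata format, might look like this:

```python
# Toy model of drive concatenation (JBOD): a logical block address on the
# virtual disk maps to a (drive index, block within that drive) pair.
# Drive sizes, expressed in blocks, are made up for the example.
DRIVE_SIZES = [120000000, 160000000, 250000000]

def locate(logical_block):
    """Return (drive_index, local_block) for a block of the virtual disk."""
    for drive, size in enumerate(DRIVE_SIZES):
        if logical_block < size:
            return drive, logical_block
        logical_block -= size
    raise ValueError("block address beyond the end of the virtual disk")

print(locate(0))          # (0, 0): the first block lives on the first drive
print(locate(130000000))  # (1, 10000000): spills over onto the second drive
```

Adding another drive just appends an entry to the table, which is how capacity can grow without disturbing the existing data.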

I think this technology is very interesting and has the potential to finally make SANs affordable for small businesses, as well as for individuals (imagine extending the capacity of a TiVo by simply adding networked drives in a stack). Disk-to-disk backups are replacing sluggish and relatively low-capacity tape drives, and these devices are interesting for that purpose as well.

The classical music lover’s iPod

Sony’s Norio Ohga is a classically trained musician and conductor. In contrast, Steve Jobs is clearly not a classical music lover (and indeed is reportedly partially deaf). If he were a classical aficionado, the iPod would not be as poorly designed for classical music.

I have started backing up my extensive CD collection (99% classical) using the new Apple Lossless Encoder, and switched from my original 5GB iPod (which does not support ALE) to a new 15GB model, with half the upgrade paid for by my universal upgrade plan. I had actually started with straight uncompressed PCM audio, but while the old iPod nominally supported it, its hard drive or buffering algorithm had a hard time keeping up with the 1.5 Mbps data rate required and would often skip. I only used my old iPod on flights, where the ambient noise would drown out the low quality of MP3s, but the new one is a better device, especially when coupled with high-quality earphones from Etymotic Research, and I may use it more regularly.

The simplistic Artist/Album/Song schema is completely inadequate for classical music, where you really need Composer/Performer/Album/Opus/Movement. This can be kludged around by repurposing the Album field for the Opus, and fortunately recent versions of iTunes and the iPod software have a Composer field (which wasn’t there when the iPod was first released). The Gracenote online database CDDB is not normalized in any way, and rekeying the metadata is actually the most time-consuming part of the whole process (we have Sony and Philips to thank for this monumental oversight in the CD format).

Even then there are still flaws. If you have two different interpretations of the same piece by different performers, the iPod will interleave the tracks from both, so you have to add numbers to the Album field (used for the Opus) to distinguish between them. If you use the “keep organized” option, folders are named after the artist rather than the composer, which is inconvenient and illogical. It could be worse: earlier versions of the iPod would actually force pauses between tracks, which sabotaged dramatic transitions like the one in J. S. Bach’s Magnificat between the aria “Quia respexit” and the thundering chorus “Omnes Generationes”.
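
At least the rekeying can be scripted once the metadata has been cleaned up. Here is a hypothetical sketch using the third-party mutagen library to apply the Album-as-Opus kludge to an AAC or Apple Lossless file; the file name and tag values are placeholders, not my actual workflow.

```python
# Hypothetical example of the Album-as-Opus kludge, using the third-party
# mutagen library to rewrite the iTunes-style tags of an .m4a file.
from mutagen.mp4 import MP4

track = MP4("01 Quia respexit.m4a")            # placeholder file name
track["\xa9wrt"] = ["Johann Sebastian Bach"]   # Composer
track["\xa9ART"] = ["Ton Koopman"]             # performer goes in Artist
track["\xa9alb"] = ["1 - Magnificat, BWV 243"] # Opus hijacks the Album field,
                                               # numbered to keep different
                                               # recordings of the piece apart
track.save()
```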

In passing, I have to tip my hat to Apple for pulling off one of the greatest scams in consumer history since the Bell Telephone company made people accept time-based billing for telephone use. Compare the 99 cents you pay for an individual track on the iTunes Music Store with the $18.99 list price for a Compact Disc (in most classical albums, you want the whole album, not just some stray bits of goodness in a sea of force-bundled filler material).

Instead of a high-quality 16-bit 44.1kHz PCM audio stream (or better yet, one of the competing multichannel high-resolution formats, SACD or DVD-Audio, although the difference is very subtle), you are paying for low-quality AAC files (I am the first to admit that for most pop music, adding noise or distortion actually improves the signal-to-noise ratio). You also receive the dubious benefits of Digital Rights Management (i.e. your fair use rights are infringed to protect the record industry cartel and secure its acquiescence). No booklet, no durable storage medium, no possibility of resale.

Update (2004-06-06):

The iPod interface also seems to be poorly internationalized, unlike iTunes. It mangles the names of Antonín Dvořák or Bohuslav Martinů, but oddly enough not those of Camille Saint-Saëns or Béla Bartók.

Is the Nikon D70 NEF (RAW) format truly lossless?

Many digital photographers (including myself) prefer shooting in so-called RAW mode. In theory, the camera saves the data exactly as it is read off the sensor, in a proprietary format that can later be processed on a PC or Mac to extract every last drop of performance, dynamic range and detail from the captured image, something the embedded processor on board the camera is hard-pressed to do when it is trying to cook the raw data into a JPEG file in real time.

The debate rages between proponents of JPEG and RAW workflows. What it really reflects is two different approaches to photography, both equally valid.

For people who favor JPEG, the creative moment is when you press the shutter release, and they would rather be out shooting more images than slaving in a darkroom or in front of a computer doing post-processing. This was Henri Cartier-Bresson’s philosophy — he was notoriously ignorant of the details of photographic printing, preferring to rely on a trusted master printmaker. This group also includes professionals like wedding photographers or photojournalists for whom the productivity of a streamlined workflow is an economic necessity (even though the overhead of a RAW workflow diminishes with the right software, it is still there).

Advocates of RAW tend to be perfectionists, almost to the point of being image control freaks. In the age of film, they would spend long hours in the darkroom getting their prints just right. This was the approach of Ansel Adams, who used every trick in the book (and invented quite a few of them, like the Zone System) to obtain the creative results he wanted. In his later years, he reprinted many of his most famous photographs in ways that made them darker and filled with foreboding. For RAW aficionados, the RAW file is the negative, and the finished output file, which could well be a JPEG, is the equivalent of the print.

Implicit is the assumption that RAW files are pristine: no post-processing such as white balance or Bayer interpolation has been applied to them, as it has to JPEGs, and certainly no lossy compression. This is why the debate can get emotional when a controversy erupts, such as whether a specific camera’s RAW format is truly lossless.

The new Nikon D70’s predecessor, the D100, had the option of using uncompressed or compressed NEFs. Uncompressed NEFs were about 10MB in size, compressed NEF between 4.5MB and 6MB. In comparison, the Canon 10D lossless CRW format images are around 6MB to 6.5MB in size. In practice, compressed NEFs were not an option as they were simply too slow (the camera would lock up for 20 seconds or so while compressing).

The D70 offers compressed NEFs only, but mercifully the performance has been much improved. Ken Rockwell asserts that D70 compressed NEFs are lossless, while Thom Hogan claims:

Leaving off Uncompressed NEF is potentially significant–we’ve been limited in our ability to post process highlight detail, since some of it is destroyed in compression.

To find out which one is correct, I read the C language source code for Dave Coffin’s excellent reverse-engineered, open-source RAW converter, dcraw, which supports the D70. The camera has a 12-bit analog-to-digital converter (ADC) that digitizes the analog signal coming out of the Sony ICX413AQ CCD sensor. In theory a 12-bit sensor should yield up to 2^12 = 4096 possible values, but the RAW conversion reduces these 4096 values to 683 by applying a quantization curve. These 683 values are then encoded using a variable number of bits (1 to 10) with a tree structure similar to the lossless Huffman or Lempel-Ziv compression schemes used by programs like ZIP.

The decoding curve is embedded in the NEF file (and could thus be changed by a firmware upgrade without having to change NEF converters). I used a D70 NEF file made available by Uwe Steinmuller of Digital Outback Photo.

The quantization discards information by converting 12 bits’ worth of data into log2(683) ≈ 9.4 bits’ worth of resolution. The dynamic range is unchanged. This is a fairly common technique – digital telephony encodes 12 bits’ worth of dynamic range into 8 bits using the so-called A-law and mu-law codecs. I modified the program to output the data for the decoding curve (Excel-compatible CSV format), and plotted the curve (PDF) using linear and log-log scales, along with a quadratic regression fit (courtesy of R). The curve resembles a gamma correction curve: linear for values up to 215, then quadratic.
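
To get a feel for what this costs in the highlights, here is a small sketch that rebuilds a curve with the shape described above – linear up to 215, quadratic beyond, with the last of the 683 codes decoding to full scale – and measures the step size at both ends. The coefficient is my own back-of-the-envelope reconstruction, not a value read out of a NEF file.

```python
import math

# Reconstruct a decode curve of the shape described above: 683 codes covering
# the 12-bit range, identity up to the knee at 215, quadratic beyond it.
# The coefficient a is a rough fit, not a value extracted from a NEF file.
CODES, KNEE, FULL_SCALE = 683, 215, 4095
a = (FULL_SCALE - (CODES - 1)) / (CODES - 1 - KNEE) ** 2

def decode(code):
    return code if code <= KNEE else code + a * (code - KNEE) ** 2

print(math.log2(CODES))            # ~9.4 bits of effective resolution
print(decode(682) - decode(681))   # ~15 raw levels per code near clipping
print(decode(216) - decode(215))   # ~1 raw level per code in the shadows
```

In other words, in the brightest stops roughly fifteen neighboring raw values collapse onto a single code, while the shadows keep their full 12-bit granularity.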

In conclusion, Thom is right – there is some loss of data, mostly in the form of lowered resolution in the highlights.

Does it really matter? You could argue it does not, as most color spaces have gamma correction anyway, but highlights are precisely where digital sensors are weakest, and losing resolution there means less headroom for dynamic range compression in high-contrast scenes. Thom’s argument is that RAW mode may not be able to salvage clipped highlights, but truly lossless RAW could allow recovering detail from marginal highlights. I am not sure how practicable this would be as increasing contrast in the highlights will almost certainly yield noise and posterization. But then again, there are also emotional aspects to the lossless vs. lossy debate…

In any case, simply waving the problem away as “curve shaping”, as Rockwell does, is not a satisfactory answer. His argument that the cNEF compression gain is not all that high, just as with lossless ZIP compression, is risibly fallacious, and his patronizing tone out of place. Lossless compression does entail modest compression ratios, but the converse is definitely not true: if I replace a file with one half the size but all zeroes, I have a 2:1 compression ratio and 100% data loss. Canon does manage to get close to the same compression level using lossless compression, but Nikon’s compressed NEF format has the worst of both worlds – loss of data without the high compression ratios of JPEG.

Update (2004-05-12):

Franck Bugnet mentioned this technical article by noted astrophotographer Christian Buil. In addition to the quantization I found, it seems the D70 runs some kind of low-pass filter or median algorithm on the raw sensor data, at least for long exposures, and that this is also applied to the (not so) RAW format. Apparently this was done to hide the higher dark-current noise and hot pixels of Nikon’s Sony-sourced CCD sensor compared to the Canon CMOS sensors in the 10D and Digital Rebel/300D – a questionable practice if true. It is not clear whether this also applies to normal exposures. The article shows a work-around, but it is too cumbersome for normal use.

Update (2005-02-15):

Some readers asked whether the loss of data reflected a flaw in dcraw rather than actual loss of data in the NEF itself. I had anticipated that question but never gotten around to publishing the conclusions of my research. Somebody has to vindicate the excellence of Dave Coffin’s software, so here goes.

Dcraw reads the raw bits sequentially, and all bits read are processed; there is no wastage there. It is conceivable, if highly unlikely, that Nikon could keep the low-order bits elsewhere in the file. If that were the case, however, those bits would still have to take up space somewhere in the file, even with lossless compression.

In the NEF file I used as a test case, dcraw starts processing the raw data sequentially at an offset of 963,776 bytes from the beginning of the file, and reads 5.15MB of RAW data, i.e. all the way to the end of the 6.07MB NEF file. The 941K before the offset correspond to the EXIF headers and other metadata, the processing curve parameters and the embedded JPEG (which is usually around 700K on a D70). That leaves no room elsewhere in the file for the missing 2.5 bits times 6 million pixels (roughly 2MB) of low-order sensor data. Even if those bits were compressed with an LZW or equivalent algorithm the way the raw data is, and assuming a typical 50% compression ratio for nontrivial image data, that would still leave something like 1MB unaccounted for.
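
The arithmetic is simple enough to check in a few lines; the pixel count is rounded and the 2:1 compression ratio is the same assumption as above.

```python
import math

MB = 1024 * 1024

nef_size   = 6.07 * MB      # total size of the test NEF
raw_offset = 963776         # where the compressed sensor data starts
raw_size   = nef_size - raw_offset   # compressed sensor data read by dcraw

pixels        = 6000000              # ~6 megapixel sensor, rounded
missing_bits  = 12 - math.log2(683)  # ~2.6 bits of resolution lost per pixel
missing_bytes = pixels * missing_bits / 8

print(raw_offset / 1024)        # ~941 KB of headers, curve and embedded JPEG
print(raw_size / MB)            # ~5.15 MB actually read by dcraw
print(missing_bytes / MB)       # ~1.9 MB of low-order data with nowhere to go
print(missing_bytes / 2 / MB)   # ~1 MB even at a generous 2:1 compression
```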

Nikon simply could not have tucked the missing data away anywhere else in the file. The only possible conclusion is that dcraw does indeed extract whatever image data is available in the file.

Update (2005-04-17):

In another disturbing development in Nikon’s RAW formats saga, it seems they are encrypting white balance information in the D2X and D50 NEF format. This is clearly designed to shut out third-party decoders like Adobe Camera RAW or Phase One Capture One, and a decision that is completely unjustifiable on either technical or quality grounds. Needless to say, these shenanigans on Nikon’s part do not inspire respect.

Generally speaking, Nikon’s software is somewhat crude and inefficient (just for the record, Canon’s is far worse). For starters, it does not take advantage of multi-threading or the AltiVec/SSE3 optimizations in modern CPUs. Nikon Scan displays scanned previews at a glacial pace on my dual 2GHz PowerMac G5, and on a modern multi-tasking operating system there is no reason for the scanning hardware to pause interminably while the previous frame’s data is written to disk.

While Adobe’s promotion of the DNG format is partly self-serving, they do know a thing or two about image processing algorithms. Nikon’s software development kit (SDK) precludes Adobe from substituting its own algorithms for Nikon’s, and thus rules out Adobe Camera RAW’s advanced features like chromatic aberration or vignetting correction. Attempting to lock out alternative image-processing algorithms is more an admission of (justified) insecurity than anything else.

Another important consideration is the long-term accessibility of the RAW image data. Nikon will not support the D70 forever – Canon has already dropped support for the RAW files produced by the 2001-vintage D30 from its SDK. I have thousands of photos taken with a D30, and the existence of independently maintained decoders like Adobe Camera RAW, or better yet open-source ones like Dave Coffin’s, is vital for the long-term viability of those images.

Update (2005-06-23):

The quantization applied to NEF files could conceivably be an artifact of the ADC. Paradoxically, many ADCs digitize a signal by using their opposite circuit, a digital-to-analog converter (DAC). DACs are much easier to build, so such ADCs combine a precision voltage comparator, a DAC and a counter: the counter increases steadily until the corresponding analog voltage matches the signal being digitized.

The quantization curve on the D70 NEF is simple enough that it could be implemented in hardware, by incrementing by 1 up to 215, then incrementing by the value of a second counter afterwards. The resulting non-linear voltage ramp would iterate over at most 683 levels instead of a full 4096 before matching the input signal. That factor of roughly six speed-up (4096 / 683) means faster capture times, and the D70 was clearly designed for speed. If the D70’s ADC (quite possibly one custom-designed for Nikon) is not linear, the quantization of the signal levels would not in itself be lossy, since that would be exactly the data returned by the sensor and ADC combination; the effect observed by Christian Buil, however, would still mean the D70 NEF format is lossy.
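
A quick simulation makes the speed argument concrete. The ramp below follows the same reconstructed curve as in the earlier sketch; the hardware itself is pure speculation on my part, since Nikon has not documented its ADC.

```python
# Simulate a single-slope ADC whose ramp follows the non-linear curve above
# instead of stepping through all 4096 levels one by one. The curve is the
# same rough reconstruction as before; the hardware is speculative.
KNEE, TOP_CODE, FULL_SCALE = 215, 682, 4095
a = (FULL_SCALE - TOP_CODE) / (TOP_CODE - KNEE) ** 2

def ramp(step):
    """Analog threshold reached after `step` comparator ticks."""
    return step if step <= KNEE else step + a * (step - KNEE) ** 2

def convert(analog_level):
    """Count comparator ticks until the ramp crosses the input level."""
    step = 0
    while ramp(step) < analog_level:
        step += 1
    return step            # the tick count doubles as the output code

print(convert(FULL_SCALE))   # 682 ticks instead of 4095 for a full-scale pixel
print(4096 / 683)            # ~6: the worst-case speed-up of the curved ramp
```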

Attack of the London taxis

London-style taxis (also known as “Hackney carriages”) are becoming a common sight in San Francisco, which is apparently one of the first cities in the US to get them. It is amusing, really, when most observers in London expected them to disappear a few years ago. The antiquated look of the London taxi endears it to Londoners, but more importantly, these cabs are very roomy for passengers and easy to get in and out of, even when you are carrying an umbrella…

One (regular) taxi driver complained to me that the London taxis are under-powered and do not go fast enough for him to zip to the other side of the city to pick up a fare. Anyone who has seen taxicabs drive in this city knows this is a feature, not a bug, in the interest of public safety. Not that taxi drivers are worse than others – I have never been in another city where drivers run red lights as casually as in San Francisco, even though I have lived in Paris and Amsterdam.

Taxis, along with docks, are one of the few domains of everyday life where byzantine nineteenth-century work arrangements still prevail in defiance of the free market. Most cities arbitrarily limit the number of taxis that can ply the streets, a system that usually benefits taxi companies more than taxi drivers, who often end up in a position similar to sharecroppers. The quotas are seldom updated to reflect demand, thanks to lobbying by entrenched taxi companies, and cities like Paris or San Francisco often face severe taxi shortages. The French demographer Alfred Sauvy (PDF) related how ministers feared the wrath of striking taxi drivers and chickened out of raising the quotas.

In San Francisco, proposition K, passed in 1978, limits the number of taxi medallions to 1300. The measure was designed to let genuine taxi drivers, not companies, own the medallions, by requiring a nominal number of driving hours to retain the medallion. The lucky few who hold medallions lease them for $20,000-30,000 a year to taxi companies for when they are not driving themselves. Most actual taxi drivers do not have medallions and lease them for $100 a day or so from taxi companies (sharecroppers on plantations were not required to pay for the privilege of employment).

Of course, the people profiting from this cozy arrangement are never content – the permit holders want to drive less so they can enjoy the rent they are collecting from the coveted medallions. One attempted ploy was to reduce the driving hours requirement for disabled workers. Needless to say, had the measure been passed, overnight many permit holders would have found themselves mysteriously incapacitated. Taxi companies would like to grab medallions for themselves and cut off permit holders from the trough.

The right solution would be to abolish the medallion system altogether, or grant one to all working as opposed to rent-collecting drivers. But of course that is the one solution all vested interests are adamantly opposed to, as it would upset their apple cart. Given the abysmally dysfunctional state of San Francisco municipal politics, the situation is unlikely to improve. No amount of window-dressing with London style cabs is going to change that.

9 Beet Stretch