Fazal Majid's low-intensity blog

Sporadic pontification


The value of over-the-counter service

My primary computer is a dual 2GHz PowerMac G5 until I can upgrade it with a Nehalem Mac Pro, most likely around the end of the year or early next year. I bought it in 2004, along with a 23″ Apple Cinema HD (the old pinstripe plastic bezel kind with an ADC connector). Unfortunately, about a year ago the CCFL backlight on the monitor started turning pink from old age, making it unusable in a properly color-managed photographic workflow.

I used that as an excuse to splurge on a humongous (and agoraphobia-inducing) HP LP3065 30 inch LCD monitor after reading the glowing reviews. The two features that sold me were the enhanced color gamut (the only way to improve that would be to get a $6000 Samsung XL30, something I am not quite prepared to do), and the fact it has 3 built-in DVI ports, so it can easily be shared by multiple computers (assuming they support dual-link DVI, which unfortunately my basic spec Sun Ultra 40 M2 does not). The fact it was 25% less expensive than the Apple 30″ Cinema Display helped, of course.

About 6 months ago, I discovered there was a fine pink vertical line running across the entire height of the monitor, roughly 25 centimeters from the left. Since I primarily use that monitor for photos (the primary monitor for Mail, web browsing or terminals remains the Apple), at first I worried there was a defect with my camera. I managed to reproduce the problem with my MacBook Pro (it has dual-link DVI, unlike lesser laptops), and called HP support (the 3 year HP warranty was also an important consideration when I purchased).

My first support call in November 2007 went well, and the tech told me I would be contacted to arrange for an on-site exchange. This is a seriously heavy monitor and I did not relish the idea of lugging it back to FedEx, so getting premium support for a business-class monitor sounded an attractive proposition. Unfortunately, they never did call back, and as I had other pressing matters to attend to involving international travel, I just put it out of my mind (it is a very subtle flaw that is not even always visible).

I only got around to calling them back a few weeks ago. Unlike in November, I was given the run-around with various customer service reps in India until I was finally routed to a pleasant (and competent) tech in a suburb of Vancouver (the US dollar going in the direction it is, you have to wonder how much longer before HP outsources those call centers back to the US). The problem is not with Indian call centers: all but one of the CSRs were very polite (I suspect Indians learn more patience as they grow up than pampered Americans or Europeans would). The problem is poorly organized support processes and the asinine scripts they are required to go through if they want to keep their jobs. In any case, the Canadian rep managed to find the FRU number and also told me someone would call to schedule an appointment. Someone did call this time, to let me know the part was back-ordered and they would call me when it became available.

This morning, as I was heading for the shower, my intercom buzzed. It was a DHL delivery man with the replacement monitor. I had to open the door to him in my bath robe. Naturally, nobody at HP bothered to notify me and had I left earlier, I would have missed him altogether.

One of the great things about Apple products is that if you live near an Apple store, you can just stop by their pretentiously-named Genius bars and get support for free (though not free repairs for out-of-warranty products, obviously). I now have a fully working HP monitor again, so I suppose I can’t complain too loudly, but the Apple monitor with the sterling support looks like the true bargain in hindsight.

Backing up is hard to do (right)

You can never overstate the importance of backups. Over the last year I have put quite a bit of effort into making sure my data is backed up properly. The purpose of this article is not to describe backup best practices (that is a vast subject, there are other, better resources available on the web, and in any case there is no one-size-fits-all solution). I am just documenting my setup and the requirements that drove it, and possibly giving readers some ideas.

The first part in planning for backup is to do an inventory of the assets you are trying to protect. In my case, in order of priority:

  • 1.5GB of scans of important documents: birth certificates, diplomas, invoices, legal documents, bank statements, and so on. This data is very sensitive, and should be encrypted.
  • 150GB of digital photos and scans
  • My address book, which lives on my laptop
  • My source code repositories
  • My personal email, approximately .75GB
  • The contents of this website, about 5GB
  • 190GB of music (lossless rips of my CD collection)
  • My Temboz article database

Thus the total storage capacity required for a full backup is reaching the 400GB mark. This in itself precludes DVD-R or even tape backup (short of buying an expensive LTO-4 tape drive or an autoloader, that is).

The second step is to devise your threat model. In my case, by decreasing order of likelihood:

  1. Human error
  2. Hard drive failure
  3. Software failure (e.g. filesystem corruption)
  4. Silent data loss or corruption, e.g. a defective disk
  5. Theft
  6. Fire, earthquake, natural disaster, etc.

Third, some general principles I believe in:

  • Do not use proprietary backup formats. The best format is plain files on a filesystem identical in structure to the original.
  • Do not rely on offline media for backups. The watched pot does not boil over: online data is much less likely to go bad without my noticing until it is too late.
  • A backup plan needs to be effortless to be successful. Plugging in external drives when backups are needed, or rotating drives between home and office is something I have tried, but not stuck to.
  • Backups should be verified: they should generate positive feedback, so that the absence of feedback can alert me to problems.
  • For all types of data, there should be one and only one reference machine that holds the authoritative copy. Multi-master synchronization and replication is possible using tools like Unison, but is much harder to manage and increases the risk of human error.

With these preliminaries out of the way, here is my system:

  • My primary backups reside on my home server, a Sun Ultra 40 M2 workstation, running Solaris 10. This machine is very quiet, so I can keep it running in the room next to my bedroom without disturbing my sleep. It is also relatively power-efficient at 160W with seven hard drives.
  • One of the seven drives is the 160GB boot drive, and the other six are 750GB Seagate drives configured in a 3TB ZFS RAID-Z2 storage pool.
  • With large SATA drives, reconstruction after a drive failure is long and the risk of another drive failing due to the stress of rebuilding is not negligible. RAID-Z2 can tolerate two drives failing, unlike RAID 5 which can only tolerate a single drive failure. This level of data protection is higher than RAID 1 since RAID 1 won’t protect you if two drives that are the mirror of one another fail. You can get the same level of protection in RAID 6 or RAID-DP.
  • I have scripts to take ZFS snapshots daily, equivalent to the auto-snapshot service. The daily snapshots are kept for the current month, then I keep only monthly snapshots. Snapshots are the primary line of defense against human error.
  • Snapshot technology consumes only as much disk space as required to store the differences between the snapshot and current versions of a file, and is much more efficient than schemes like Apple’s Time Machine where a single byte change to a multi-gigabyte file like a Parallels virtual disk image will cause the entire file to be duplicated, wasting storage. Because snapshots are taken near instantly and cost almost nothing, they are an extremely powerful feature of a storage subsystem.
  • I back up from my various machines to the Sun via rsync over ssh. An incremental backup of my PowerMac G5, which has most of the 400GB in my backup set, takes less than 5 minutes over Gigabit Ethernet, despite the ssh encryption.
  • ZFS is probably the best filesystem, bar none, but it is not perfect, as demonstrated by the Joyent outage, and you still need another copy for backup in case of ZFS corruption.
  • Every night at 2AM a cron job on my old home server (2x400GB, ZFS RAID 0), which I now keep at work, pulls updates from the Sun using rsync over ssh (the company firewall won’t let me push updates to it from the Sun). Another cron job at 8AM kills any leftover rsync processes, e.g. if there are more data changes to transfer than fit in the 1-2 GB that can be transferred in 6 hours over my relatively pokey 320-512kbps DSL uplink (no thanks to AT&T’s benighted refusal to upgrade its tired infrastructure).
  • My cron jobs use verbose output which generates an email sent back to me. I could suppress those messages, but then I would lose the ability to detect errors.
  • A last line of defense is to back up my server at work to a D-Link DNS-323 NAS box using rsync over NFS. This cute little unit holds two Western Digital Green Power 1TB drives in RAID 1, which slide right in, no tools required. It consumes next to no power or desk space. Since it runs Linux and is easy to extend using fun-plug, I could conceivably run the cron and rsync from there. As a bonus, the built-in mt-daapd server streams my entire music collection to iTunes over the LAN so I can listen to any of my CDs at work.
  • It can take a few days for this data bucket brigade to catch up with a particularly intense photo shoot, but it will eventually and is never too far behind. This provides me with near continuous data protection and disaster recovery.
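The snapshot retention rule described above (dailies for the current month, then monthlies only) can be sketched in a few lines of Python. This is only an illustration, not the actual scripts: the function name `snapshots_to_keep` is hypothetical, and the choice of the first snapshot of each earlier month as "the monthly" is an assumption, since the post does not say which one is kept.

```python
from datetime import date


def snapshots_to_keep(snapshot_dates, today):
    """Return the set of snapshot dates a daily-then-monthly policy keeps.

    Assumed policy: keep every snapshot taken in the current month, and
    for each earlier month keep only its earliest snapshot.
    """
    keep = set()
    first_of_month = {}
    for d in sorted(snapshot_dates):
        if (d.year, d.month) == (today.year, today.month):
            keep.add(d)  # current month: keep all dailies
        else:
            # earlier months: remember only the first snapshot seen
            first_of_month.setdefault((d.year, d.month), d)
    keep.update(first_of_month.values())
    return keep


if __name__ == "__main__":
    taken = [date(2008, 2, 1), date(2008, 2, 20), date(2008, 3, 1), date(2008, 3, 2)]
    for d in sorted(snapshots_to_keep(taken, today=date(2008, 3, 2))):
        print(d.isoformat())
```

Anything not in the returned set would be a candidate for `zfs destroy`; the sketch deliberately stops short of calling any ZFS command.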

Update (2009-10-07):

I made some changes. My office backup server is now an inexpensive Shuttle KPC 4500 running OpenSolaris 2009.06 and a 1TB drive. It in turn backs up to the DNS-323, although I need to qualify the recommendation – like many embedded Linux devices, the DNS-323 has a distressing tendency to get wedged every now and then, requiring a reboot, and is not reliable enough as primary offsite backup in my book. OpenSolaris, of course, is rock-stable, and the hardware is not much more expensive (I paid $400 for the KPC).

My backups are now much faster since I upgraded to 20Mbps symmetric Metro Ethernet service from Webpass a month ago.
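As a rough sanity check on the difference, here is a back-of-the-envelope calculation of how much data fits through each link in the six-hour window between the 2AM and 8AM cron jobs. It is a sketch under simplifying assumptions: the link runs at its nominal rate for the whole window, 1 GB is taken as 10^9 bytes, and protocol overhead and rsync compression are ignored. The helper name `transferable_gb` is made up for the example.

```python
def transferable_gb(kbps, hours):
    """Gigabytes (10**9 bytes) that fit through a `kbps` kilobit/s link in `hours`."""
    bits = kbps * 1000 * hours * 3600
    return bits / 8 / 1e9


if __name__ == "__main__":
    links = [("DSL uplink, low end", 320),
             ("DSL uplink, high end", 512),
             ("20Mbps Metro Ethernet", 20_000)]
    for label, kbps in links:
        print(f"{label}: {transferable_gb(kbps, 6):.2f} GB in 6 hours")
```

Under these assumptions the DSL uplink moves roughly 0.9 to 1.4 GB per night, the same ballpark as the 1-2 GB quoted earlier, while the 20Mbps line could move about 54 GB in the same window.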

Update (2014-01-09):

Since I moved to a semi-suburban house two years ago and had to revert to AT&T’s abysmally slow DSL service, remote backups over rsync are no longer a viable option and I have to use sneakernet. My current setup is:

  • A Time Machine backup onto a 4TB internal drive inside my Mac
  • Hourly rsync backups onto a 2TB WD My Passport Studio. I actually have two of these and rotate them between home and office. They have a metal case (which helps heat dissipation and increases drive lifetime and reliability) as well as hardware AES encryption.

Push recruiting

As I was debugging why feedparser is mangling the GigaOM feed titles, I found this easter egg on the WordPress hosted site:

zephyr ~>telnet gigaom.com 80
Trying 72.232.101.40...
Connected to gigaom.com.
Escape character is '^]'.
GET /feed HTTP/1.0
Host: gigaom.com

HTTP/1.0 301 Moved Permanently
Vary: Cookie
X-hacker: If you're reading this, you should visit automattic.com/jobs and
apply to join the fun, mention this header.
Location: http://feeds.feedburner.com/ommalik
Content-type: text/html; charset=utf-8
Content-Length: 0
Date: Thu, 20 Mar 2008 23:36:17 GMT
Server: LiteSpeed
Connection: close

Connection closed by foreign host.

Knowing how to issue HTTP requests by hand is one of my litmus tests for a web developer, but I had never thought of using it in this creative way as a recruiting tool…
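For readers who would rather script the same exercise than type it into telnet, here is a minimal sketch in Python using only the standard `socket` module. The function names (`build_request`, `parse_response_head`, `fetch_head`) are invented for this example, and the site may well no longer answer the way it did in the transcript above.

```python
import socket


def build_request(host, path="/", method="GET"):
    """Return the raw bytes of a minimal HTTP/1.0 request."""
    lines = [f"{method} {path} HTTP/1.0", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response_head(raw):
    """Split a raw response into (status_code, headers dict).

    Header names are lower-cased. Lines starting with whitespace are
    treated as continuations of the previous header (obsolete folding).
    """
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    status_code = int(lines[0].split()[1])
    headers = {}
    last = None
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and last:
            headers[last] += " " + line.strip()
        else:
            name, _, value = line.partition(":")
            last = name.strip().lower()
            headers[last] = value.strip()
    return status_code, headers


def fetch_head(host, path="/", port=80, timeout=10):
    """Send the request over a plain TCP socket and parse what comes back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        data = b""
        while chunk := sock.recv(4096):
            data += chunk
    return parse_response_head(data.decode("iso-8859-1"))


if __name__ == "__main__":
    status, headers = fetch_head("gigaom.com", "/feed")
    print(status, headers.get("location"), headers.get("x-hacker"))
```

HTTP/1.0 is used, as in the transcript, so that the server closes the connection when it is done and the read loop terminates without having to interpret `Content-Length`.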

US banks lag behind in secure email adoption

My banks send me monthly reminders when a statement is ready, but I have to log onto their site to actually get it. This is quite annoying, I would much rather have them simply attach the statements to the notification emails, but I can understand their security concerns. The current system does encourage bad habits that can be exploited by phishers, however.

One of my colleagues informed me that in Japan, banks will actually send them by email using S/MIME public key encryption. I have an S/MIME certificate courtesy of the Thawte web of trust (in fact I am also a Thawte WOT notary) but no US bank that I know of supports this. Secure email adoption is so low in no small part due to the NSA’s successful campaign to make encryption inconvenient to obtain. All major email clients support it (Outlook, Apple Mail.app, Thunderbird, and so on), but webmail users don’t even have the option. This is just another illustration of how the US is lagging behind Asia and Europe in Internet adoption.

Macworld 2008 round-up

The MacBook Air was what I was waiting for (I pre-ordered the SSD version just before the online Apple Store buckled under the load). I have a MacBook Pro 15″, and because of its weight I end up leaving it at work and not carrying it with me at all times (the MacBook is hardly any lighter). Sure, the Air has drastically limited connectivity (the lack of Gigabit Ethernet is probably what I will regret most, even though I clocked my Airport Extreme at 90 true Mbps throughput). Other minuses include the glossy screen (instead of an anti-reflective one), the MacBook-like chiclet keyboard (rather than the much nicer MacBook Pro keyboard), and the sealed, non-user-replaceable battery.

I suspect people deriding it are people whose main machine is a laptop. My main machine is a tower desktop, and no laptop is ever going to compete in terms of capacity and expandability. The drive on the laptop is merely a cache for the desktop where the real data lives. The compromises the Air makes are acceptable ones in exchange for a machine that is light enough for me to carry all the time. I was considering getting an Asus Eee PC prior to the show, and the MacBook Air is a vastly more capable and polyvalent machine.

Apart from that, the show was a relatively quiet one with few truly noteworthy new products. Here are the main highlights:

  • Matias did not have the Tactilepro 2.0 keyboard on display. I love mine (a version 1 with the ALPS keyswitch) and would like to get a spare, but apparently they have parted ways with the manufacturer of the new Matias-designed keyswitches and are working on a 3.0 version for later this year.

  • Fujitsu were demonstrating an ultra-small, bus-powered document scanner, the S300M. Unfortunately, once again for reasons due to licensing of the bundled software, they could not release a single SKU that would work with both PCs and Macs.

  • The German company Project Wizards was demonstrating Merlin, a project management program similar to Microsoft Project. The scheduling and load-leveling algorithms look at least as capable as Project 2000, and they told me the next version will allow team members to report on task advancement by simply contacting a built-in web server. Looks like a promising product.

  • Samsung showed the CLP-300, which they bill as the world’s smallest color laser printer. Indeed it looks roughly the same size as my monochrome HP LaserJet 1320, and much smaller than my bulky HP 2605dn, which is quite an achievement. I am wary of Samsung lasers since buying the CLP-500 for Kefta a few years back. The print quality was fine, but it was ludicrously slow, taking something like 5 minutes per color page to print. The CLP-300 seems reasonably fast, faster than the 2605dn at any rate.

  • Samsung was also showing off the gorgeous XL30 30″ LED-backlit LCD monitor. LED backlighting is more environmentally friendly, does not shift colors as it ages, unlike a CCFL backlight, and gives a wider color gamut. Unfortunately, its price is a princely “between $6000 and $7000”.

  • Microsoft was showing off Office 2008, emphasizing ease of use and productivity rather than features for features’ sake for a change. They even set up a bloggers-only salon to curry favor, complete with Internet cafe and snacks.

  • I tried Nikon’s humongous AF-S VR Nikkor 200mm f/2G IF-ED lens. Very heavy but impressive piece of gear.

  • Canon was showing off the new Flash-based HD camcorders they introduced at CES. They are not that much smaller than the HDV ones. The HV30 replaces the excellent HV20, but the only real improvements are 1080p30 mode and an articulating LCD.