Fazal Majid's low-intensity blog

Sporadic pontification

Fazal

Parallelizing the command-line

Single-threaded processor performance has been stalled for a few years now. Intel and AMD have tried to compensate by multiplying cores, but the software world has not risen to the challenge, mostly because the problem is a genuinely hard one.

Shell scripts are still usually serial, and increasingly at odds with the multi-core future of computing. Let’s take a simple task as an example, converting a large collection of images from TIFF to JPEG format using a tool like ImageMagick. One approach would be to spawn a convert process per input file as follows:

#!/bin/sh
for file in *.tif; do
  convert "$file" "`echo "$file" | sed -e 's/\.tif$/.jpg/'`" &
done

This does not work well. If you have many TIFF files to convert (what would be the point of parallelizing otherwise?), you will fork off too many processes at once; they will contend for CPU and disk I/O bandwidth, causing massive congestion and degrading performance. What you want is only as many concurrent processes as there are cores in your system (possibly a few more, because a tool like convert is not 100% efficient at using CPU power). This way you can tap the full power of your system without overloading it.
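Before reaching for a dedicated tool, the throttling can be approximated in plain POSIX sh using the shell's own job control: start jobs in batches and wait for each batch to drain. This is only a sketch of the pattern, not the script used for the benchmark below; the `work` function is a stand-in for convert so it can be tried anywhere.

```shell
#!/bin/sh
# Crude throttle: run at most $CPUS background jobs, then wait for the batch.
CPUS=4
work() { : > "${1%.tif}.done"; }   # stand-in for: convert "$1" "${1%.tif}.jpg"
i=0
for file in a.tif b.tif c.tif d.tif e.tif; do
  work "$file" &
  i=$((i + 1))
  if [ "$i" -ge "$CPUS" ]; then
    wait    # the whole batch must finish before the next one starts
    i=0
  fi
done
wait        # catch the final, partial batch
```

The drawback of batching is that each batch only finishes when its slowest job does, so cores can sit idle near the end of every batch.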

The GNU xargs utility gives you that power with its -P flag. xargs is a UNIX utility that was designed to work around limits on the maximum length of a command line (ARG_MAX, historically just a few kilobytes). Instead of supplying arguments on the command line, you feed them to the standard input of xargs, which breaks them into manageable chunks and passes each chunk to the utility you specify.
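The chunking behavior is easy to see with the -n flag, which caps how many arguments each invocation receives; here five words become three echo commands:

```shell
# xargs splits its input into chunks of at most two arguments each.
printf '%s\n' one two three four five | xargs -n 2 echo
# prints:
# one two
# three four
# five
```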

The -P flag to GNU xargs specifies how many concurrent processes may run at once. Some other variants of xargs, such as OS X's BSD-derived xargs, also support -P, but Solaris' does not. xargs is very easy to script and can provide a significant boost to batch performance. The previous script can be rewritten to use 4 parallel processes:

#!/bin/sh
CPUS=4
ls *.tif | sed -e 's/\.tif$//' | gxargs -P "$CPUS" -n 1 -I x convert x.tif x.jpg
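One caveat: the ls | sed pipeline breaks on filenames containing whitespace. A filename-safe variant is sketched below; -print0 and -0 exist in GNU and BSD find/xargs (though not in Solaris' stock versions), and cp stands in for convert here so the pipeline can be tried on any machine.

```shell
#!/bin/sh
# NUL-separated filenames survive spaces; each sh -c invocation gets one
# file as $1 and derives the output name from it.
CPUS=4
mkdir -p demo
: > 'demo/a b.tif'
: > demo/c.tif
find demo -name '*.tif' -print0 |
  xargs -0 -P "$CPUS" -n 1 sh -c 'cp "$1" "${1%.tif}.jpg"' sh
```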

On my Sun Ultra 40 M2 (2x 1.8GHz AMD Opterons, single-core), I benchmarked this procedure against 920MB of TIFF files. As could be expected, going from 1 to 2 concurrent processes improved throughput dramatically, going from 2 to 3 yielded marginal improvements (convert is pretty good at utilizing CPU to the max). Going from 3 to 4 actually degraded performance, presumably due to the kernel overhead of managing the contention.

[Chart: conversion throughput vs. number of concurrent convert processes]

Another parallelizable utility is GNU make, via its -j flag. I parallelize as many of my build procedures as possible, though for many open-source packages the usual configure step remains serial (configure has no real concept of dependencies). Unfortunately, too many projects ship makefiles with missing dependencies, causing parallel makes to fail. In this day and age of Moore's law running out of steam as far as single-task performance is concerned, harnessing parallelism with gxargs -P or gmake -j is no longer a luxury but should be considered a necessity.
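Rather than hard-coding CPUS=4, the parallelism level can be derived from the machine itself. A sketch: the _NPROCESSORS_ONLN name is accepted by getconf on Solaris, Linux and OS X, with a fallback when it is unavailable.

```shell
#!/bin/sh
# Size -j (or -P) to the number of online CPUs, defaulting to 2.
CPUS=`getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2`
echo "would run: gmake -j $CPUS"
```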

The Albanian scenario

People are only now beginning to realize that the real estate bubble of the noughties was naught but a gigantic pyramid scheme. There is unexpected resistance to the idea of bailing out the investment bankers who did the most to get us into this mess (while paying themselves handsomely along the way), and one of the counter-proposals is to give money to insolvent mortgage holders, i.e. to reward the imprudent over those who followed the rules and did not lie about their income on a loan application. Economists call this moral hazard.

That said, the idea may have political wings. Investment bankers are not the only ones who like feeding at the public trough. When a substantial enough proportion of the population loses its shirt in a pyramid scheme, it expects to be compensated from the public purse, and sometimes the entire social order breaks down, as happened in Albania in 1997. Might this be the direction the US is headed?

Logorrhea

It’s conventional wisdom that politicians are self-absorbed windbags. Another piece of evidence: the longest words commonly cited in the English and French languages are antidisestablishmentarianism and anticonstitutionnellement respectively, both of which belong to the political realm.

Crissy Field

One of my happiest experiences in the Bay Area was the reopening of Crissy Field as a national park. They were handing out free kites. I flew mine for a couple hours of pure, carefree, unalloyed fun, then gave it to three kids who had arrived too late to get one.

Crissy Field is one of the windiest places in San Francisco, and ideal for flying kites. I am not sure who thought it would be a good place to build an airfield, though…

The importance of short iteration feedback cycles

I blog at best once or twice a month on my regular low-intensity blog, which runs my home-grown Mylos software, but am surprising myself by blogging on an almost daily schedule with this WordPress-based blog. Mylos is batch-based: you edit a post, run the script to regenerate the static pages, review, edit and iterate. It takes a minute to regenerate the entire site.

This is a similar effect to using an interpreted language like Python or PHP vs. a compiled language like C or Java. Even though I am more comfortable editing in Emacs (used by Mylos) than in a browser window, the short cycle between edit and preview in WordPress makes for a more satisfying experience and encourages me to blog more freely.

I suspect I will end up importing my Mylos weblog into WordPress, once I figure out how to address some niggling differences in functionality, such as the way images or attachments are handled, and how to use nginx as a caching reverse proxy in front of WordPress for performance reasons.
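For the nginx part, a minimal caching reverse-proxy configuration might look like the sketch below. The backend address, cache path and expiry times are placeholders, not a tested setup; WordPress is assumed to be listening on 127.0.0.1:8080.

```nginx
# Sketch: nginx caching proxy in front of WordPress.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=wp:10m max_size=100m;

server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_cache wp;
        proxy_cache_valid 200 5m;    # serve cached pages for 5 minutes
        proxy_set_header Host $host;
    }
}
```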