Fazal Majid's low-intensity blog

Sporadic pontification


Migrating to Hugo

I have been meaning to move away from WordPress to a static site generator for a very long time, due to:

  • The slowness of WP, since every page request makes multiple database calls due to the spaghetti code nature of WP and its plugin architecture. Caching can help somewhat, but it has brittle edge cases.
  • Its record of security holes. I mitigated this somewhat by isolating PHP as much as possible.
  • It is almost impossible to follow front-end optimization best practices like minimizing the number of CSS and JS files, because each WP plugin has its own.

My original plan was to go with Acrylamid, but about a year ago I started experimenting with Hugo. Hugo is blazing fast because it is implemented in Go rather than a slow language like Python or Ruby, and this is game-changing. Nonetheless, it took me over a year to migrate. This post is about the issues I encountered and the workflow I adopted to make it work.

WordPress content migration

There is a migration tool, but it is far from perfect despite the author’s best efforts, mostly because of the baroque nature of WordPress itself when combined with plugins and an old site that used several generations of image gallery technology.

Unfortunately, that required rewriting many posts, especially those with photos or embedded code.

Photo galleries

Hugo does not (yet) support image galleries natively. I started looking at the HugoPhotoSwipe project, but got frustrated by bugs in its home-grown YAML parser that broke round-trip editing, and made it very difficult to get galleries with text before and after the gallery proper. The Python-based smartcrop for thumbnails is also excruciatingly slow.

I wrote hugopix to address this. It uses a simpler one-way index file generation method, and the much faster Go smartcrop implementation by Artyom Pervukhin.

Broken asset references

Posts with photo galleries were particularly broken, due to WP’s insistence on replacing photos with links to image pages. I wrote a tool to help me find broken images and other assets, and organize them in a more rational way (e.g. not have PDFs or source code samples be put in static/images).

It also has a mode to identify unused assets, e.g. 1.5GB of images that no longer belong in the Hugo tree as their galleries are moving elsewhere.
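To illustrate the two modes described above, here is a minimal Python sketch. It is not the actual tool, just one way the idea could work. The function name `audit_assets`, the reference pattern, and the convention that assets are addressed by site-relative paths such as `/images/a.jpg` are all assumptions made for the example.

```python
import re

# Hypothetical pattern: HTML src/href attributes and Markdown link/image targets.
ASSET_REF = re.compile(r'''(?:src|href)=["']([^"']+)["']|\]\(([^)\s]+)\)''')
# Only site-relative paths that end in a file extension are treated as assets.
LOOKS_LIKE_FILE = re.compile(r'^/.*\.\w+$')

def audit_assets(pages, assets):
    """pages: {page name: page text}; assets: set of site-relative asset paths.

    Returns (broken, unused): broken maps each page to the references that
    have no matching asset, unused is the set of assets no page refers to.
    """
    referenced = set()
    broken = {}
    for name, text in pages.items():
        for match in ASSET_REF.finditer(text):
            ref = match.group(1) or match.group(2)
            if not LOOKS_LIKE_FILE.match(ref):
                continue  # external URLs and page links are out of scope here
            referenced.add(ref)
            if ref not in assets:
                broken.setdefault(name, []).append(ref)
    return broken, assets - referenced
```

In practice the `pages` dictionary would be filled by walking the content directory and `assets` by walking the static directory; both are passed in here so the logic is easy to test in isolation.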

Password-protected galleries

I used to have galleries of family events on my site, until an incident where some Dutch forum started linking to one of my cousin’s wedding photos and making fun of her. At that point I put in a pointed error message for that referrer and controlled access using WP’s password-protection feature. That said, private family photos do not belong on a public blog, and I have other dedicated password-protected galleries with Lightroom integration that make more sense for that use case, so I just removed them from the blog, shaving off 1.5GB of disk in the bargain.

Search

There are systems that can provide search without any server component, e.g. the JavaScript-based search in Sphinx. I looked at some of the options referenced by the Hugo documentation, like the Bleve-based hugoidx, but the poor documentation gave me pause, and I’d rather not run Node.js on my server as needed by hugo-lunr.

Having recently implemented full-text search in Temboz using SQLite’s FTS5 extension, I felt more comfortable building my own search server in Go. Because Hugo and fts5index share the same Go template language, integrating search seamlessly into the site’s navigation and page structure is easy.
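For readers unfamiliar with FTS5, the core of such an index is small. The sketch below uses Python's standard `sqlite3` module rather than Go, and it is not how fts5index itself is written. The table name `posts_fts`, its columns, and the helper names are hypothetical, and it assumes the SQLite library bundled with your Python was built with FTS5 enabled.

```python
import sqlite3

def build_index(posts):
    """posts: iterable of (url, title, body) tuples. Returns an in-memory DB."""
    db = sqlite3.connect(":memory:")
    # url is stored but not tokenized; title and body are full-text indexed.
    db.execute("CREATE VIRTUAL TABLE posts_fts USING fts5(url UNINDEXED, title, body)")
    db.executemany("INSERT INTO posts_fts (url, title, body) VALUES (?, ?, ?)", posts)
    db.commit()
    return db

def search(db, query, limit=10):
    """Return (url, title) pairs matching the FTS5 query, best matches first."""
    rows = db.execute(
        "SELECT url, title FROM posts_fts WHERE posts_fts MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

A search server then only needs to run the `MATCH` query and render the rows through the site's page template.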

Theme

There is no avoiding this: moving to a new blogging system requires writing a new theme if you do not want to go with a canned one. Fortunately, Hugo’s theme system is sane, unlike WordPress’, because it does not have to rely on callbacks and hooks as much as WP plugins do.

One pet peeve of mine is when sites change platform with new GUIDs or permalinks in the RSS feeds, causing a flood of old-new articles to appear in my feed reader. Since I believe in showing respect to my readers, I had to avoid this at all costs, and also put in place redirects as needed to avoid 404s for the few pages that did change permalinks (mostly image galleries).

Doing so required copying the embedded RSS template and changing:

<guid>{{ .Permalink }}</guid>

to:

<guid isPermaLink="false">{{ .Params.rss_guid | default .Permalink }}</guid>

The next step was to add rss_guid to the front matter of the last 10 articles in my legacy RSS feed.
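One way to sanity-check a change like this is to compare the GUIDs in the legacy feed against those in the regenerated one. The following is a minimal Python sketch, assuming both feeds are plain RSS 2.0 available as strings; the function names `feed_guids` and `changed_guids` are made up for the example.

```python
import xml.etree.ElementTree as ET

def feed_guids(rss_xml):
    """Return the <guid> text of every <item> in an RSS 2.0 document, in order."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("guid") for item in root.iter("item")]

def changed_guids(legacy_xml, new_xml):
    """Return legacy GUIDs that are missing from the new feed.

    Each of these would show up as a 'new' article in feed readers.
    """
    new = set(feed_guids(new_xml))
    return [guid for guid in feed_guids(legacy_xml) if guid not in new]
```

An empty list from `changed_guids` means every legacy item kept its GUID.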

How big can a panorama get?

I use the Kolor AutoPano Giga panorama-stitching software, recently acquired by GoPro, but I have yet to produce a gigapixel panorama like those they pioneered. This brings up an interesting question: given a camera and lens, what would the pixel size of the largest 360° stitched panorama be?

Wikipedia to the rescue: using the formula for the solid angle of a pyramid, the full panorama size of a camera with m megapixels on a sensor of a × b using a focal length of f would be:

m × π / arctan(ab / (2f × sqrt(4f² + a² + b²)))

For single-strip panoramas of height h (usually a or b), the formula would be:

m × π × h / (2f × arctan(ab / (2f × sqrt(4f² + a² + b²))))

(this applies only to rectilinear lenses, not fisheyes or other exotics).
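The two formulas translate directly into code. This Python sketch is a transcription of the formulas as stated above, not the JavaScript behind the calculator; the function names are invented for the example, and the sample values in the usage note (42.4 MP, a nominal 36 × 24 mm sensor, a 35 mm lens) are assumptions.

```python
import math

def full_panorama_mp(m, a, b, f):
    """Megapixels in a full spherical panorama.

    m: camera megapixels; a, b: sensor dimensions in mm;
    f: actual focal length in mm (rectilinear lens).
    """
    # A single frame covers a solid angle of 4 * arctan(x); the sphere is 4 * pi.
    x = a * b / (2 * f * math.sqrt(4 * f**2 + a**2 + b**2))
    return m * math.pi / math.atan(x)

def strip_panorama_mp(m, a, b, f, h):
    """Megapixels in a single-strip panorama of height h (usually a or b)."""
    return full_panorama_mp(m, a, b, f) * h / (2 * f)
```

With those sample values, `full_panorama_mp(42.4, 36, 24, 35)` comes out a bit under a gigapixel.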

Here is a little JavaScript calculator to apply the formula (defaults are for the Sony RX1RII, the highest resolution camera I own):

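[Interactive calculator: inputs are the sensor resolution (MP) and the focal length (mm, actual or 35mm equivalent); the results are given in MP.]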

The only way I can break through the gigapixel barrier with a prime lens is using my 24MP APS-C Fuji X-T2 with a 90mm lens.

Update (2020-01-21):

Now I could reach 171 gigapixels with my Nikon Z7 and the Nikkor 500mm f/5.6 PF.

Update (2021-01-30):

There was an error in the JavaScript that implements the calculator: it used 4f instead of 4f², and for telephoto focal lengths the difference is dramatic. Thanks to users ZS360 and GerladDXB at DPReview for pointing out my error.

Scanner group test

TL;DR Avoid scanners with Contact Image Sensors if you care at all about color fidelity.

Vermeer it is not

After my abortive trial of the Colortrac SmartLF Scan, I did a comparative test of scanning one of my daughter’s A3-sized drawings on a number of scanners I had handy.

| Scanner | Sensor | Scan |
| --- | --- | --- |
| Colortrac SmartLF Scan | CIS | ScanLF.jpg |
| Epson Perfection V500 Photo (manually stitched) | CCD | Epson_V500.jpeg |
| Epson Perfection V19 (manually stitched) | CIS | Epson_V19.jpg |
| Fujitsu ScanSnap S1500M (using a carrier sheet and the built-in stitching) | CCD | S1500M_carriersheet.jpg |
| Fujitsu ScanSnap SV600 | CCD | SV600.jpg |
| Fuji X-Pro2 with XF 35mm f/1.4 lens, mounted on a Kaiser RS2 XA copy stand with IKEA KVART 3-spot floor lamp (CCT 2800K, a mediocre 82 CRI as measured with my UPRtek CV600) | CMOS | X-Pro2.jpg |

I was shocked by the wide variance in the results, as was my wife. This is most obvious in the orange flower on the right.

Comparison

I scanned a swatch of the orange using a Nix Pro Color Sensor (it’s the orange square in the upper right corner of each scan in the comparison above). When viewed on my freshly calibrated NEC PA302W SpectraView II monitor, the Epson V500 scan is closest, followed by the ScanSnap SV600.

The two scanners using Contact Image Sensor (CIS) technology yielded dismal results. CIS are used in low-end scanners, and they have the benefit of low power usage, which is why the only USB bus-powered scanners available are all CIS models. Since CIS technology begat the CMOS sensors that have superseded CCDs in the vast majority of digital cameras today, I would not have expected such a gap in quality.

The digital camera scan was also quite disappointing. I blame the poor quality of the LEDs in the IKEA KVART three-headed lamp I used (pro tip: avoid IKEA LEDs like the plague, they are uniformly horrendous).

I was pleasantly surprised by the excellent performance of the S1500M document scanner. It is meant to be used for scanning sheaves of documents, not artwork, but Fujitsu did not skimp and used a CCD sensor element, and it shows.

Pro tip: a piece of anti-reflective Museum Glass or equivalent can help with curled originals on the ScanSnap SV600. I got mine from scraps at a framing shop. I can’t see a trace of reflections on the scan, unlike on the copy stand.

Update (2018-10-14):

Even more of a pro tip: a Japanese company named Bird Electron makes a series of accessories for the ScanSnap line, including a dust cover for the SV600 and the hilariously Engrish-named PZ-BP600 Book Repressor, essentially a sheet of 3mm anti-reflection coated acrylic with convenient carry handles. They are readily available on eBay from Japanese sellers.

Colortrac SmartLF Scan review

TL;DR summary

Pros:

  • Scans very large documents
  • Easy to use
  • Packs away in a convenient carrying case

Cons:

  • So-so color fidelity
  • Hard to feed artwork straight
  • Dust and debris can easily get on the platen, ruining scans
  • Relatively expensive for home use

Review

One thing you do not lack for when your child enters preschool is artwork. They generate prodigious amounts of it, with gusto, and they are often large format pieces on 16×24″ paper (roughly ISO A2). The question is, what do you do with the torrent?

I decided I would scan them, then file them in Ito-Ya Art Profolios, and possibly make annual photobooks for the grandparents. This brings up the logistical challenge of digitizing such large pieces. Most flatbed scanners are limited to 8.5×14″ (US Legal) format. Some like the Epson Expression 11000XL and 12000XL can scan 11×17″ (A3), as can the Fujitsu ScanSnap SV600 book scanner, but that is not fully adequate either. One option would be to fold the artwork up, scan portions, then stitch them together in AutoPano Giga or Photoshop, but that would be extremely cumbersome, especially when you have to do a couple per day. I do not have access to a color copier at my office, and most of these are only A3 anyway.

I purchased a Kaiser RS2 XA copy stand (cheaper to get it direct from Europe on eBay than from the usual suspects like B&H) and got a local framing shop to cut me a scrap of anti-reflective Museum Glass. This goes up to 16×20″ for the price of a midrange flatbed scanner, but it is tricky to set up lights so they don’t induce reflections (no AR coating is perfect), perfectly aligning the camera with the baseboard plane is difficult (I had to shim it using a cut-up credit card), and this still doesn’t solve the problem of the truly large 16×24″ artwork (stands able to handle larger formats are extremely expensive and very bulky).

I then started looking at large-format scanners like those made by Contex or Océ. They are used by architecture firms to scan blueprints and the like, but they cost $3000-5000 for entry-level models and come with onerous DRM-encumbered software that requires license dongles and more often than not will not run on a Mac. They are also extremely large and bulky, especially if you get the optional stands.

That is why I was pleasantly surprised to learn British company Colortrac makes a model called the SmartLF Scan! (I will henceforth omit the over-the-top exclamation mark). It is self-contained (can scan to internal memory or a USB stick, although it will also work with a computer over USB or Ethernet, Windows-only, unfortunately), available in 24″ or 36″ wide versions, is very compact compared to its peers, and is even supplied with a nifty custom-fitted wheeled hard case. The price of $2,000 ($2,500 for the 36″ version), while steep for home use, is well within the range of enthusiast photo equipment. I sold a few unused cameras to release funds for one.

Once unpacked, the scanner is surprisingly light. It is quite wide, obviously, to be able to ingest a 24″ wide document (see the CD jewel case in the photo above for scale). There is an LCD control panel and a serviceable keypad-based (not touch) UI. The power supply is of the obnoxious wall-wart type. I wish they used text rather than inscrutable icons in the UI: it is much more informative and usable to see a menu entry for 400dpi resolution rather than checkerboard icons with various pitches.

After selecting your settings (or saving them as defaults), you load paper by feeding it from the front, face up. It is quite hard to feed large-format paper straight, and this is compounded by the lack of guides. On the other hand it is hard to see how Colortrac could have fitted photocopier-style guide rails in such a compact design, and they would be likely to break.

The scanner is simplex, not duplex, unsurprisingly at that price point. The sensor is on top of the feed, which helps control dust and debris sticking to it, but when scanning painted artwork, there will inevitably be crumbs of paint that will detach and stick to the sensor platen. This manifests itself as long dark vertical lines spoiling subsequent scans, something I occasionally also see on my Fujitsu ScanSnap document scanner. Cleaning the Colortrac is way easier than on the ScanSnap, as unfolding rear legs and releasing front catches opens it wide, and a few passes with optical cleaning wipes (I use Zeiss’ single-use ones) will do the trick.

By the manufacturer’s own admission, the scanner is designed to scan technical drawings, not art. It uses a linear contact image sensor (CIS) like lower-end flatbed scanners and document scanners, unlike the higher-fidelity charge-coupled device (CCD) sensors used in higher-end graphic arts and photo scanners. The light source is a row of point light LEDs that casts relatively harsh shadows on the paper. They do make CCD scanners for graphic arts, but they start at $10,000… Contex makes an A2 flatbed CCD scanner, the HD iFlex, but it costs $6,700 (at Jet.com of all places); their iQ Quattro 2490 at $4,500 is the most viable step-up (it uses a CIS, but offers 16-bit color, AdobeRGB and beyond gamut, calibration and magnetic paper guides).

The scanner’s resolution is 600dpi. Scanning 16×24″ originals at that resolution yields a 138MP file that is nearly a gigabyte in size. The 400dpi setting yields a much more reasonable 200MB or so, and compressing them further using tiffcp with zip compression (not an option on the scanner) yields 130-140MB files.
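The pixel counts follow from simple arithmetic, sketched below in Python. The function names are invented for the example. The uncompressed size depends on the bit depth, which is not stated here, so `bytes_per_pixel` is an explicit assumption (3 for 24-bit RGB, 6 for 48-bit), and real TIFF files add some overhead on top.

```python
def scan_dimensions(width_in, height_in, dpi):
    """Return (width_px, height_px, megapixels) for an original scanned at dpi."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px / 1e6

def uncompressed_bytes(width_px, height_px, bytes_per_pixel=3):
    """Raw image size in bytes; bytes_per_pixel is an assumed bit depth / 8."""
    return width_px * height_px * bytes_per_pixel
```

For a 16×24″ original, `scan_dimensions(16, 24, 600)` gives 9600 × 14400 pixels, which is the 138MP quoted above, while 400dpi gives 6400 × 9600 pixels.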

Unfortunately, I ended up returning it. There was a 1cm scratch in the glass platen, which manifested itself as streaks. It takes quite a bit to scratch glass (I don’t think it was Lexan or similar), and I wasn’t scanning sandpaper, so it must have been a factory defect or a customer return. When I looked at the color fidelity of the scans, I was not inclined to order a replacement, and instead got a Fujitsu ScanSnap SV600 from Japan through an Amazon third-party reseller (25% savings over the US price, even if you usually forgo a US warranty on grey-market imports).

Avery 22807 template for InDesign

The Avery 22807 2-inch circular stickers are a good alternative to Moo, PSPrint et al. when you need a small quantity of stickers in a hurry. Unfortunately Avery has not seen fit to provide usable InDesign templates as they do with some of their other sticker SKUs, only Microsoft Word, which is, needless to say, inadequate. A search for “Avery 22807 Indesign template” yielded some, but they have issues with missing linked PDF files.

I reverse-engineered the Microsoft template to build one of my own, with dimensions (including the tricky almost-but-not-quite square grid spaced at 5/8″ horizontally but 7/12″ vertically) to simplify “Step and Repeat…”.

I have only tested this with my InDesign CS6, and I am not sure whether it will work with older versions.

Avery 22807 2-inch circular labels.indt