
Automatically splitting, cropping and rotating multiple photos from a combined scan

I recently offered to digitize all of the 4x6 inch family and childhood photo prints, which ended up being a harder task than I thought it would be due to some newbie mistakes.

Originally, I thought it would be a piece of cake: scan several photos at a time on a flatbed scanner, trigger the scans from my computer, and save the images directly to it with the macOS Image Capture app. I'd then write a quick script to detect the whitespace between photos and crop them out.

To maximize the number of print photos per scan, I arranged my scans like this:
Example of bad photo positioning on flatbed scanner

This turned out to be a terrible idea.

Scanners are not perfect, and both scanners I used in the process of digitization captured black bars near the edge of the scan, particularly on the side where the lid closes:

Example 1 of edge artifacts in scan

Example 2 of edge artifacts in scan

Example 3 of edge artifacts in scan

This is a classic example of something really easy for the human eye to detect, but something that is difficult to get computers to detect. Auto-split features in Photoshop didn't work, nor did open-source tooling like ImageMagick + Fred's multicrop.

Doing it properly

So, did you volunteer to digitize a bunch of photos as well? Don't be like me. Scan and arrange your photos like this and you should have no problems at all:

  1. Use Image Capture (or equivalent for your OS) to begin scanning photos
  2. Set format to PNG and DPI to 300 (you can do 600 if you'd like but it will be considerably slower and isn't useful unless you intend to make larger prints than the originals)
  3. Position photos non-overlapping in the center of the flatbed so that whitespace exists on all sides, like this:
    Example of good photo positioning on flatbed scanner
  4. After you've completed all your scans, install ImageMagick. It's typically available via Homebrew or your OS's package manager.
  5. Download Fred's multicrop and run it:

    cd /path/to/scans
    mkdir split
    for photo in *.png; do
        /path/to/multicrop "$photo" "split/${photo// /_}"
    done

    I noticed that the multicrop script has issues if the output file name contains spaces, so this invocation automatically replaces them with underscores.
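The space replacement in the loop above relies on bash's `${variable//pattern/replacement}` expansion. Here is a minimal sketch of that one piece in isolation, assuming bash; `sanitize_name` and the sample file name are hypothetical and are not part of multicrop:

```shell
#!/usr/bin/env bash
# sanitize_name is a hypothetical helper used only for illustration.
sanitize_name() {
    # ${1// /_} replaces every space in the first argument with an underscore
    printf '%s\n' "${1// /_}"
}

sanitize_name "Scan 12 summer 1994.png"   # prints Scan_12_summer_1994.png
```

Only the output name is rewritten, so the original scans keep their names on disk.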

I doubt this will be relevant for much longer since I'm likely part of the last generation that will need to do this, but hopefully this helps!

But wait, how did I fix it?

You might ask: after learning all of the above, surely I didn't have to rescan all the photos?

I was not thrilled about the prospect of cropping manually, but I was also not about to rescan some 2000 photos. With a bit of help from ImageMagick, I was able to get most of the pictures auto-cropped and rotated thanks to Fred's great script.

The photos that were joined by a black bar still needed to be split manually, but most of the scanned photos could still benefit from being auto-cropped and rotated.

I wrote a quick script to address the issue:

  1. Chop 6px off the left of the combined scans, which was roughly the width of the black artifacting
  2. Take each combined scan and add a 50px margin to the left and top to ensure each individual photo would have whitespace on all sides
  3. Run Fred's multicrop script as usual

Here's the script:

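# Print the base name of "$1" without its extension
getFilename() {
    local filename
    filename=$(basename "$1")
    printf '%s\n' "${filename%.*}"
}

# Print the extension of "$1" including the leading dot, or nothing if it has none
getExtension() {
    local filename
    filename=$(basename "$1")
    if [[ "$filename" = *.* ]]; then
        printf '%s\n' ".${filename##*.}"
    fi
}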

pad() {
    in="$(getFilename "$1")"
    ext="$(getExtension "$1")"

    # crop 6px from the left edge
    convert "${in}${ext}" -gravity West -chop 6x0 "tmp/${in}-cropped${ext}"

    # add 50px of white to the top and left
    # (SouthEast anchors the image to the bottom right, so the new space lands on the top and left)
    convert "tmp/${in}-cropped${ext}" -gravity SouthEast -background white -extent "$(identify -format '%[fx:w+50]x%[fx:h+50]' "tmp/${in}-cropped${ext}")" "tmp/${in}-extended${ext}"
}

split() {
    in="$(getFilename "$1")"
    ext="$(getExtension "$1")"
    ~/bin/multicrop "tmp/${in}-extended${ext}" "output/${in// /_}-split${ext}"
}

mkdir -p tmp output done
for combined in *.png;do
    pad "$combined"
    split "$combined"
    mv "$combined" done
done
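
If the file name handling in the script looks opaque, this sketch walks through the same parameter expansions on a single made-up path. It assumes bash; the path `scans/Family album 03.png` is hypothetical and no ImageMagick commands are run:

```shell
#!/usr/bin/env bash
path="scans/Family album 03.png"   # hypothetical example path

name=$(basename "$path")   # drops the directory: "Family album 03.png"
stem="${name%.*}"          # removes the shortest trailing ".*": "Family album 03"
ext=".${name##*.}"         # removes the longest leading "*.", then re-adds the dot: ".png"

# Intermediate and output names built the same way the script builds them
printf '%s\n' "tmp/${stem}-cropped${ext}"         # tmp/Family album 03-cropped.png
printf '%s\n' "output/${stem// /_}-split${ext}"   # output/Family_album_03-split.png
```

Spaces are kept in the intermediate files under tmp and only replaced in the name handed to multicrop as its output.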