I ended up using ImageMagick and the multicrop script to help me. The multicrop script requires that the unrotate script is installed.
After installing ImageMagick, the ~/.profile file needs to be updated. In my case that meant adding these lines at the end:
~/.profile
export MAGICK_HOME="/Applications/ImageMagick-6.9.3"
export PATH="$MAGICK_HOME/bin:$PATH"
export DYLD_LIBRARY_PATH="$MAGICK_HOME/lib/"
I copied the multicrop and unrotate scripts to the ImageMagick 'bin' directory (in my case it was /Applications/ImageMagick-6.9.3/bin).
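Before scanning anything it is worth confirming that the new PATH entry actually works. The snippet below is a minimal sketch of such a check; check_tools is a made-up helper name, and it assumes the ImageMagick binary is called convert (as it is in the 6.x releases) and that the two scripts were saved under the names multicrop and unrotate.

```shell
#!/bin/bash
# check_tools NAME...: report whether each named command is reachable via PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed command names; adjust if your install uses different ones.
check_tools convert multicrop unrotate || echo "Check ~/.profile and open a new terminal."
```

If anything is reported as missing, the usual culprits are a terminal that was opened before ~/.profile was edited, or scripts that are not marked executable.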
I configured the OS X Image Capture utility to scan in colour at 600 DPI with JPEG output. I chose 600 DPI because that's the highest resolution available on my scanner. In hindsight, TIFF would have been a better choice than JPEG because it can store the scans without lossy compression.
And so began the scanning... Actually, at the time of writing I am about halfway through, with about 25 albums scanned in total (some albums contain 200 photos).
In the early stages of scanning I experimented quite a bit with the multicrop settings and found that the ones below worked best in my case. I adjusted the background colour to match what my scanner was actually outputting (I sampled it in GIMP with the colour picker tool). The other arguments were a matter of trial and error. The command below takes a scanned image file named Scan.jpeg that contains multiple photos and attempts to extract, rotate and crop each individual photo within it. The photo files will have a prefix of 'Photo', e.g. Photo_0-000.jpeg, Photo_0-001.jpeg, etc.
Command
multicrop -b '#e7e7e5' -f 20 -u 2 -g 40 "Scan.jpeg" "Photo.jpeg"
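Since the flags are terse, here is the same command wrapped in a small function with the settings named. This is only a sketch: crop_scan and the variable names are invented for illustration, and the comments describe each flag as the multicrop usage notes appear to define it, so check them against the header of your copy of the script.

```shell
#!/bin/bash
# Settings from the command above, named for readability.
bg_colour='#e7e7e5'  # -b: background colour of the scanner bed, sampled from a scan
fuzz=20              # -f: tolerance (percent) when matching that background colour
unrotate=2           # -u: rotation fix-up method; 2 appears to select the unrotate script
grid=40              # -g: grid spacing (percent of image size) used to locate photos

# crop_scan INFILE OUTFILE: run multicrop on one scan with the settings above.
crop_scan() {
    multicrop -b "$bg_colour" -f "$fuzz" -u "$unrotate" -g "$grid" "$1" "$2"
}

# Usage (equivalent to the one-line command):
#   crop_scan "Scan.jpeg" "Photo.jpeg"
```

Keeping the values in variables makes it easier to try different settings on a troublesome scan without retyping the whole command.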
The results were quite good for the most part. Some photos were troublesome to extract, however: no matter what settings I tried, a handful just wouldn't extract correctly and I had to do those by hand.
I also noticed that a slight border is left around the extracted photos. This can be ignored, but it is also easy enough to crop out manually or with ImageMagick's -shave option.
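For the border, a minimal sketch of the -shave approach could look like this. trim_border is a made-up helper, and the 10 pixel margin is only a placeholder; measure the leftover border on your own photos first. Note that mogrify overwrites files in place, so try it on copies.

```shell
#!/bin/bash
# Margin in pixels to remove from every edge (placeholder value, not measured).
BORDER=${BORDER:-10}

# trim_border FILE...: shave BORDER pixels off all four sides of each file.
# mogrify rewrites the files in place, so work on copies until you trust the value.
trim_border() {
    mogrify -shave "${BORDER}x${BORDER}" "$@"
}

# Usage:
#   trim_border Photo_0-000.jpeg Photo_0-001.jpeg
```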
While the scripts are running they take up quite a bit of extra disk space...
Since I had so many scanned images to process there was no way I was going to do it by hand, so I wrote the script below to help me. The script looks for all files starting with Scan in the directory you run it from and processes two scanned images at a time using multicrop. Depending on the number of scans and the speed of your computer this can take a while. In my case 20 photos took around 10-15 minutes. Adjust the multicrop command inside the mcrop function as required.
scan2photo.sh
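#!/bin/bash
# Split on newlines only, so file names containing spaces stay intact.
IFS=$'\n'

# mcrop N: run multicrop on "Scan N.jpeg" in the background, if that file exists.
function mcrop() {
    oldFile="Scan $1.jpeg"
    newFile="Photo_$1.jpeg"
    if [ -f "$oldFile" ]; then
        echo "===> Processing $oldFile to $newFile"
        multicrop -b '#e9e9e9' -f 20 -u 2 -g 40 "$oldFile" "$newFile" &
    fi
}

# Treat an unnumbered Scan.jpeg as scan 0.
if [ -f Scan.jpeg ]; then
    mv Scan.jpeg "Scan 0.jpeg"
fi

# Walk the even-numbered scans; each pass starts scan N and scan N+1 together.
for i in $(ls Scan*.jpeg | sed -n '/.*[02468]\.jpeg/p'); do
    fileNum=$(echo "$i" | cut -d. -f 1 | cut -d" " -f 2)
    mcrop "$fileNum"
    mcrop $((fileNum + 1))
    wait    # let both background jobs finish before starting the next pair
done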
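The script pulls the scan number out of each file name with ls, sed and cut. The same step can be sketched with a plain glob and bash parameter expansion, which avoids parsing ls output. This is an alternative illustration rather than what the script above does; scan_number is a made-up helper, and it assumes names of the form "Scan N.jpeg".

```shell
#!/bin/bash
# scan_number FILE: print the number embedded in a name like "Scan 12.jpeg".
scan_number() {
    local name="${1%.jpeg}"   # drop the extension:           "Scan 12"
    echo "${name##* }"        # drop up to the last space:    "12"
}

# List the even-numbered scans, i.e. the ones that would start a pair.
for f in Scan\ *.jpeg; do
    [ -f "$f" ] || continue   # skip the literal pattern when nothing matches
    n=$(scan_number "$f")
    if [ $((n % 2)) -eq 0 ]; then
        echo "pair starts at: $f"
    fi
done
```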