recovering JPGs from a corrupted filesystem

29 Aug

as usual, we took tons of pictures during our last vacation, so eventually the memory card (an SD card) ran out of space. luckily, I had another SD card within reach, so we just swapped them and went on…
back at home, I started downloading the pictures. when I plugged in the second card, loads of error messages appeared, indicating a badly broken FAT32 filesystem. none of the newly taken pictures was there :-(

first, I tried to repair the filesystem using fsck.vfat (part of the dosfstools suite), without any success. so I finally just made a plain dump using dd, planning to work on the dump rather than on the real SD card.
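
the exact dd call isn't recorded here, but a minimal sketch of such a dump could look like the following. SRC is a placeholder: on a real system it would be the card's device node (an assumption, something like /dev/sdX), while here a scratch file stands in for it so the sketch runs without a card. conv=noerror,sync is a common choice for damaged media, not necessarily what was used for this card.

```shell
# stand-in for the card so the sketch runs anywhere; on a real system SRC
# would be the card's device node instead (assumption, e.g. /dev/sdX)
SRC=$(mktemp)
head -c 8192 /dev/urandom > "$SRC"

# conv=noerror,sync keeps reading after errors and pads unreadable blocks
# with zeros, so byte offsets in the dump still match the source
dd if="$SRC" of=image.dd bs=4096 conv=noerror,sync
```

all later steps only read image.dd, so the card itself can stay untouched.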

so I turned to Google and found Cédric Blancher’s wiki entry about recovering digital photos, which has lots of very useful information on what to do:

1. examine dd-image with hexedit

the start signature of a JPEG file is 0xFFD8, the corresponding end signature is 0xFFD9. so just looking for those bytes gives you the hex offsets of the start and end of a picture:

hexedit image.dd
[...]
0385C000   FF D8 FF E1  79 FE 45 78  69 66 00 00  49 49 2A 00  ....y.Exif..II*.
0385C010   08 00 00 00  0D 00 0F 01  02 00 0A 00  00 00 AA 00  ................
0385C020   00 00 10 01  02 00 09 00  00 00 B4 00  00 00 12 01  ................
0385C030   03 00 01 00  00 00 08 00  00 00 1A 01  05 00 01 00  ................
0385C040   00 00 BE 00  00 00 1B 01  05 00 01 00  00 00 C6 00  ................
0385C050   00 00 28 01  03 00 01 00  00 00 02 00  00 00 31 01  ..(...........1.
0385C060   02 00 0A 00  00 00 CE 00  00 00 32 01  02 00 14 00  ..........2.....
0385C070   00 00 D8 00  00 00 13 02  03 00 01 00  00 00 02 00  ................
0385C080   00 00 69 87  04 00 01 00  00 00 7C 02  00 00 A5 C4  ..i.......|.....
0385C090   07 00 D0 00  00 00 EC 00  00 00 D2 C6  07 00 40 00  ..............@.
0385C0A0   00 00 BC 01  00 00 D3 C6  07 00 80 00  00 00 FC 01  ................
0385C0B0   00 00 20 28  00 00 50 61  6E 61 73 6F  6E 69 63 00  .. (..Panasonic.
0385C0C0   44 4D 43 2D  46 5A 32 38  00 00 B4 00  00 00 01 00  DMC-FZ28........
[...]

so the beginning of the above file is at offset 0x0385C000. unfortunately, there are some problems…

first, the end marker proved to be unreliable. to solve this, I just had a look at the size of other pictures taken with this camera and discovered that the maximum file size is roughly 5.1 MB. so just dumping 5.5 MB from the start marker should do the job…
second, the SD card contained a lot more JPGs than just those taken with this camera. so I decided to have a look at the EXIF data, which follows the header right away and contains the name of the camera in plain text: “DMC-FZ28”. so just look for that string and then find the last occurrence of the start marker before it…
last, extracting the offsets manually is tedious and really only suitable for testing. to do such a test extraction, you just need the offset and the size of the chunk you want to cut out (both values converted to the block size used with dd). so here’s how to do that manually (note that bash can convert from hex to decimal by using $((...))):

BS=1024
dd if=image.dd of=$(date +%s).jpg bs=${BS} skip=$((0x0385C000/${BS})) count=$((5529600/${BS}))
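
to make the arithmetic in that dd call explicit, here is a small sketch. offset_to_blocks is a hypothetical helper (not part of dd or any tool mentioned here); it converts a byte offset into a block number and refuses offsets that are not a multiple of the block size, since the integer division would otherwise silently round down and the dump would start too early.

```shell
# hypothetical helper: byte offset (hex or decimal) -> dd block number
offset_to_blocks() {
    local offset bs
    offset=$(( $1 ))      # the shell converts 0x... to decimal here
    bs=${2:-1024}
    if [ $(( offset % bs )) -ne 0 ]; then
        echo "offset $1 is not a multiple of $bs" >&2
        return 1
    fi
    echo $(( offset / bs ))
}

offset_to_blocks 0x0385C000 1024    # skip value, prints 57712
echo $(( 5529600 / 1024 ))          # count value, prints 5400
```

so with BS=1024 the example above skips 57712 blocks and copies 5400 blocks.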

2. extract offsets with hexdump

since I didn’t want to do this for all pictures manually, I used hexdump and grep to go fishing for the offset values. as mentioned above, I looked for the string identifying the camera model. figure out how many additional lines you need to reach the start marker (in this case it’s 12, hence the grep -B 12):

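# -v stops hexdump from collapsing repeated lines into "*", which would throw off the -B line count
hexdump -v -C image.dd | grep -B 12 FZ28 > markers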

in my case the starting markers were always at the beginning of a line, so extracting the offsets from the above file just works like this:

# anchor on the offset column so that only markers at the start of a line match
grep '^[0-9a-f]*  ff d8 ' markers | cut -c 1-8 > offsets
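
as a cross-check for the hexdump approach, GNU grep can report byte offsets directly. the sketch below is a different technique from the one above, not a replacement for it. it assumes GNU grep and that the camera string always sits 0xC0 (192) bytes after the start marker, as in the hex dump shown earlier; that distance may differ for other cameras or files. a tiny demo file is built first so the sketch runs without the real image.

```shell
# demo "image": 2048 zero bytes, a JPEG start marker, padding, camera string
demo=$(mktemp)
{
    head -c 2048 /dev/zero
    printf '\377\330\377\341'
    head -c 188 /dev/zero
    printf 'DMC-FZ28'
} > "$demo"

# -a: treat binary as text, -b: print byte offset, -o: offset of the match
# itself, -F: fixed string. LC_ALL=C makes grep work on raw bytes.
demo_offsets=$(mktemp)
LC_ALL=C grep -aboF 'DMC-FZ28' "$demo" | cut -d: -f1 |
while read -r pos; do
    # assumption: the start marker is 192 bytes before the camera string
    printf '%08x\n' $(( pos - 192 ))
done > "$demo_offsets"

cat "$demo_offsets"
```

the output has the same 8-digit hex format as the offsets file, so the two lists can be compared with diff.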

3. extract JPEGs using dd and the gathered offsets

for better readability, let’s define a simple function to dump the JPEGs:

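dumpjpg() {
    # ${1} is the start offset in hex (e.g. 0x0385c000); it doubles as the file name
    local BS=1024
    local FNAME=${1}
    echo "dumping to ${FNAME}.jpg"
    # skip and count are given in blocks of ${BS} bytes; 5529600 bytes is roughly 5.5 MB
    dd if=image.dd of="${FNAME}.jpg" bs=${BS} skip=$((${1}/${BS})) count=$((5529600/${BS})) status=noxfer
}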

then, do the actual work:

for offset in $(cat offsets) ; do
    dumpjpg 0x${offset}
done
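
the loop dumps whatever sits at each offset, so a quick sanity check on the results can be useful. the sketch below uses a hypothetical helper, looks_like_jpeg, which only tests whether a file begins with the start signature FF D8. a dump would fail this check if, for example, its offset was not a multiple of the block size. it says nothing about whether the rest of the picture is intact.

```shell
# hypothetical helper: succeeds if the file starts with the bytes FF D8
looks_like_jpeg() {
    # read the first two bytes and turn them into a hex string such as "ffd8"
    sig=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "ffd8" ]
}

# check all dumps in the current directory (named after their hex offsets)
for f in 0x*.jpg; do
    [ -e "$f" ] || continue    # nothing matched the pattern
    looks_like_jpeg "$f" || echo "no JPEG signature: $f"
done
```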

I love Linux :)

Having succeeded in recovering approximately 100 files this way, I then discovered (on ubuntugeek.org) the existence of special tools that do exactly the same:

foremost -q -t jpg -i image.dd