Sun, 21 Feb 2010

File Set Diff

I wrote a utility to compare files from two directories.

A friend had a large directory of photos on their computer, and some of them were backed up to an external hard disk. We suspected that some photos had not been backed up, but which ones? Comparing file names was not enough, because some of the files had been renamed.

So I wrote a script to find all the files in a directory and calculate a SHA-1 hash of their contents. The script does the same for a second directory and compares the hashes. It then prints out the files that are in one directory but missing from the other. It can also detect duplicate files within a directory, since files with the same contents have the same SHA-1 hash, even if they have been moved or renamed. For practical purposes the hash identifies the contents of a file: two different files producing the same hash by accident is extremely unlikely.
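
To make the idea concrete, here is a minimal sketch in Python of the approach described above. It is not the script itself: the language and the function names (hash_file, index_directory, missing_from, duplicates) are my own assumptions for illustration, and the real utility may be organised quite differently.

```python
import hashlib
import os
from collections import defaultdict


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file's contents, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large photos are not loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_directory(root):
    """Map each content hash to the list of paths under root with that content."""
    index = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            index[hash_file(path)].append(path)
    return index


def missing_from(source, target):
    """Return paths in source whose contents appear nowhere in target."""
    return sorted(
        path
        for file_hash, paths in source.items()
        if file_hash not in target
        for path in paths
    )


def duplicates(index):
    """Return groups of paths that share identical contents."""
    return [sorted(paths) for paths in index.values() if len(paths) > 1]
```

With this sketch, one would build an index for each directory with index_directory, then call missing_from(photos, backup) to list the photos that have no copy on the backup disk, whatever they are named there. Keying the index on the hash rather than the file name is what makes renamed files match.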

The script can be obtained from the downloads page on this Web site.