I want to share a little script I wrote that checks whether two directories are identical, and does so rather efficiently. First, you need to understand what sha1sum does.
The sha1sum tool
As the name suggests, sha1sum computes the SHA-1 checksum of a file or a text string.
[jason@madcoda ~]# echo "Hello World" | sha1sum
[jason@madcoda ~]# echo "Hello World" > hello.txt
[jason@madcoda ~]# sha1sum hello.txt
Both commands print the same hash, because sha1sum hashes only the content. By comparing hashes, you can tell whether two files or strings are identical.
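To make the comparison concrete, here is a minimal sketch, assuming GNU sha1sum is available. The helper name same_file and the temporary file names are my own, purely for illustration.

```shell
# Hypothetical helper: succeeds when two files have the same SHA-1 hash.
same_file() {
    local a b
    # Reading from stdin keeps the file name out of the output;
    # cut keeps only the hash column.
    a=$(sha1sum < "$1" | cut -d ' ' -f 1)
    b=$(sha1sum < "$2" | cut -d ' ' -f 1)
    [ "$a" = "$b" ]
}

tmp=$(mktemp -d)
echo "Hello World" > "$tmp/a.txt"
echo "Hello World" > "$tmp/b.txt"
echo "Hello World!" > "$tmp/c.txt"

same_file "$tmp/a.txt" "$tmp/b.txt" && echo "a and b are identical"
same_file "$tmp/a.txt" "$tmp/c.txt" || echo "a and c differ"
```

Even a one-character change, like the extra "!" in c.txt, produces a completely different hash.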
Running sha1sum on a directory
The problem is that sha1sum only computes hashes for individual files; it cannot hash a directory as a whole. With the help of a few command-line tools, we can work around that.
1. To get a list of all files in a directory:
find /path/to/directory -type f
2. To compute the SHA-1 hash of every file (-print0 and xargs -0 keep file names containing spaces intact):
find /path/to/directory -type f -print0 | xargs -0 sha1sum
3. The above gives you one hash per file in the directory. The next step is to produce a single hash value for the entire directory, so that we can compare easily. One clever trick is to hash the output of the above again. Thus,
find /path/to/directory -type f -print0 | xargs -0 sha1sum | sha1sum
4. That’s pretty much it, but we can still improve on this by sorting the lines before hashing. find does not guarantee any particular order, so sorting ensures the output is the same on different systems.
find /path/to/directory -type f -print0 | xargs -0 sha1sum | sort | sha1sum
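One caveat with the pipeline above: sha1sum prints each file's path next to its hash, so the path itself ends up in the final checksum. Two directories with identical content but different locations would then hash differently. The sketch below, which assumes GNU find, xargs and sha1sum and uses made-up directory names, shows the effect and one possible way around it: run find from inside the directory so that the paths are relative.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/one/sub" "$tmp/two/sub"
echo "Hello World" > "$tmp/one/sub/hello.txt"
echo "Hello World" > "$tmp/two/sub/hello.txt"

# Full paths end up in the listing, so the two hashes differ.
full_one=$(find "$tmp/one" -type f -print0 | xargs -0 sha1sum | sort | sha1sum)
full_two=$(find "$tmp/two" -type f -print0 | xargs -0 sha1sum | sort | sha1sum)
[ "$full_one" = "$full_two" ] || echo "full-path hashes differ"

# Running find from inside each directory keeps the paths relative.
# The cd happens inside the command substitution, which is a subshell,
# so the caller's working directory is untouched.
rel_one=$(cd "$tmp/one" && find . -type f -print0 | xargs -0 sha1sum | sort | sha1sum)
rel_two=$(cd "$tmp/two" && find . -type f -print0 | xargs -0 sha1sum | sort | sha1sum)
[ "$rel_one" = "$rel_two" ] && echo "relative-path hashes match"
```

If you compare hashes produced on different machines, keep in mind that sort order depends on the locale; running the sort step as LC_ALL=C sort is one way to pin it down.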
My “custom” version
Finally, add a check that an argument was actually passed:
#!/bin/bash
# arguments check
if [ -z "$1" ]; then
    echo "Usage: sha1dir /path/to/directory"
    exit 1
fi
# run sha1sum on every file, using paths relative to the directory
cd "$1" || exit 1
find . -type f -print0 | xargs -0 sha1sum | sort | sha1sum
Save this as “sha1dir.sh”, then make it executable and copy it into a directory on your PATH:
chmod +x sha1dir.sh
cp -p sha1dir.sh /usr/local/bin/sha1dir
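With the script installed, checking whether two directories are identical comes down to comparing two outputs. Here is a rough sketch of how that could look. The wrapper name compare_dirs and the test directories are hypothetical, and the sha1dir function is only a stand-in so that the example runs on its own.

```shell
# Stand-in for the installed script so that this example runs on its own;
# remove this function to use the real sha1dir from your PATH.
sha1dir() {
    (cd "$1" && find . -type f -print0 | xargs -0 sha1sum | sort | sha1sum)
}

# Hypothetical wrapper: prints the verdict and returns 0 only on a match.
compare_dirs() {
    local first second
    first=$(sha1dir "$1") || return 2
    second=$(sha1dir "$2") || return 2
    if [ "$first" = "$second" ]; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}

tmp=$(mktemp -d)
mkdir "$tmp/original" "$tmp/copy"
echo "Hello World" > "$tmp/original/hello.txt"
cp -p "$tmp/original/hello.txt" "$tmp/copy/hello.txt"
compare_dirs "$tmp/original" "$tmp/copy"

# Change one file and compare again.
echo "Hello again" >> "$tmp/copy/hello.txt"
compare_dirs "$tmp/original" "$tmp/copy" || true
```

The first call prints "identical" and the second prints "different". Only file content and relative paths go into the hash, so timestamps and permissions do not affect the result.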
Please find the Gist here. Feel free to give me suggestions to improve this script. Leave a comment below!