Are password-protected Excel spreadsheets really password protected? As it turns out, not necessarily.
When you set a password on an .xls spreadsheet, Excel does not keep that password behind strong encryption. Instead, the file stores a hash of it, and anyone who has the file can read that hash. Didier Stevens, a security researcher, showed that because these hashes are only 16 bits long, you can cycle through candidate passwords and eventually run into a collision. A hash collision is when two distinct pieces of data produce the same hash value. Imagine if “cat” and “dog” both resolved to the same hash, “06d80eb0c50b49a509b49f2424e8c805” (the MD5 hash of “dog”). That would be peculiar, right?
Not only can you brute-force these passwords (trying a list of candidates until you strike gold), but you don’t even need to find the original password. A 16-bit hash has only 65,536 possible values, so you are bound to hit some other candidate that resolves to the same hash.
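To put rough numbers on how small a 16-bit hash space is, here is a minimal Python sketch. It assumes the hash spreads passwords evenly across all possible values, which is an idealization and not a measured property of Excel's hash; the function name chance_of_match is made up for this illustration.

```python
HASH_BITS = 16
HASH_SPACE = 2 ** HASH_BITS  # number of distinct values a 16-bit hash can take


def chance_of_match(guesses: int, space: int = HASH_SPACE) -> float:
    """Chance that at least one of `guesses` random candidates lands on
    one specific hash value, assuming an evenly distributed hash."""
    return 1 - (1 - 1 / space) ** guesses


# Print the odds for a few guess counts (idealized model, not a benchmark).
for n in (1_000, 65_536, 300_000):
    print(f"{n:>7} guesses -> {chance_of_match(n):.1%} chance of a match")
```

Under that assumption, a few hundred thousand guesses make a match close to certain, and that many guesses is a trivial amount of work for a modern computer.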
Take a look at this example:
mysupersecretpassword = easytocopyhash
somebruteforcephrase = easytocopyhash
easytocopyhash is what effectively counts as the password. Even though the user never set somebruteforcephrase as a password, the system treats it the same as mysupersecretpassword, because both produce the same password hash.
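The same idea can be sketched in Python. This is only an illustration: toy_hash16 is a stand-in that keeps the first two bytes of SHA-256, not the algorithm Excel actually uses, and the helper names (find_collision, check_password) are hypothetical.

```python
import hashlib
from itertools import count


def toy_hash16(password: str) -> int:
    """Stand-in 16-bit hash: the first two bytes of SHA-256.
    This is NOT Excel's real algorithm, only a hash of the same size."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")


def check_password(attempt: str, stored_hash: int) -> bool:
    """A verifier that only compares hashes, never the passwords themselves."""
    return toy_hash16(attempt) == stored_hash


def find_collision(stored_hash: int) -> str:
    """Try guess0, guess1, ... until one produces the stored hash."""
    for i in count():
        candidate = f"guess{i}"
        if toy_hash16(candidate) == stored_hash:
            return candidate


stored = toy_hash16("mysupersecretpassword")  # all the "file" keeps
impostor = find_collision(stored)             # found without knowing the password
print(impostor, check_password(impostor, stored))
```

The impostor string is accepted even though nobody ever chose it as a password, because the verifier only ever sees the 16-bit value.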
Staying secure against hash collisions is complex. However, the best way to keep your Excel file safe is to encrypt the file itself with a strong encryption algorithm. Instead of relying on a built-in hash that is only 16 bits long, you get protection that makes a lucky match far less likely. You can use software like Cryptomator or Rclone to accomplish this task. I recommend you do a bit of research before you start encrypting your systems. Legal disclaimer: I am not liable for any damage you do to your systems as a result of encrypting them after reading this post.
Lessons Learned
Even if you have the most secure password on planet Earth, if the system that handles your password doesn’t take the right steps to protect it, prying eyes will still be able to put the pieces back together. Encryption is tricky because as technology advances, we run into newly discovered problems that warrant new technology, and the cycle repeats itself. Here, the vulnerability is in how the password is hashed, not in the password you chose. It’s like giving your password to a spy that you thought you could trust, but then that spy gets confused and thinks your password is your friend’s password instead.
Resources:
https://crypto.stackexchange.com/questions/1170/best-way-to-reduce-chance-of-hash-collisions-multiple-hashes-or-larger-hash
https://isc.sans.edu/diary/16bit+Hash+Collisions+in+xls+Spreadsheets/31066/