I'd like to use a hash function for file integrity checking and am influenced by the choice of BLAKE2b as the default hash function in Sodium. My library gives me the option of choosing digests between 8 and 512 bits long. I'm guessing that either 256 or 512 bits would be fine but since I have the choice, what factors should I consider? Is there any reason not to choose 512?
2 Answers
Basically it's about two parameters, performance and security.
Security: RFC 7693 lists the collision security of each output size as follows:
 Algorithm     | Target | Collision | Hash | Hash ASN.1 |
    Identifier |  Arch  |  Security |  nn  | OID Suffix |
---------------+--------+-----------+------+------------+
 id-blake2b160 | 64-bit |   2**80   |  20  |   x.1.5    |
 id-blake2b256 | 64-bit |   2**128  |  32  |   x.1.8    |
 id-blake2b384 | 64-bit |   2**192  |  48  |   x.1.12   |
 id-blake2b512 | 64-bit |   2**256  |  64  |   x.1.16   |
---------------+--------+-----------+------+------------+
 id-blake2s128 | 32-bit |   2**64   |  16  |   x.2.4    |
 id-blake2s160 | 32-bit |   2**80   |  20  |   x.2.5    |
 id-blake2s224 | 32-bit |   2**112  |  28  |   x.2.7    |
 id-blake2s256 | 32-bit |   2**128  |  32  |   x.2.8    |
---------------+--------+-----------+------+------------+
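The "Collision Security" column follows the generic birthday bound: an n-bit digest gives roughly 2**(n/2) work to find a collision. A minimal sketch that reproduces the BLAKE2b rows from the digest length nn in bytes (the helper name `collision_security_bits` is made up for illustration):

```python
def collision_security_bits(digest_bytes: int) -> int:
    """Birthday-bound collision security, in bits, for a digest of `digest_bytes` bytes."""
    return (digest_bytes * 8) // 2


# nn values taken from the BLAKE2b rows of the RFC 7693 table.
for nn in (20, 32, 48, 64):
    bits = collision_security_bits(nn)
    print(f"BLAKE2b-{nn * 8}: collision security about 2**{bits}")
```

This only covers the generic bound; it says nothing about attacks that might exploit the internals of a particular function.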
Performance: you need to compare against other algorithms to find out how much this factor will affect performance on your device. Here you can find a benchmarking site covering almost every algorithm and CPU type.
Overall, larger sizes are generally recommended, and the only scenario where this might hurt performance is deployment on very constrained hardware (an FPGA, for example).
P.S.: you might want to check this regarding a BLAKE2 configuration issue.
BLAKE2 block size is constant no matter what the output size is. The final state is simply truncated to the desired output size. So, the output size doesn't affect performance.
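A quick way to check this is Python's standard `hashlib`, used here as a stand-in for whichever library you have. The sketch below shows that the block size is the same for 256-bit and 512-bit output. One caveat worth knowing: the requested digest length is also mixed into BLAKE2's parameter block, so a shorter digest is a different hash value, not a prefix of the longer one.

```python
import hashlib

data = b"example file contents"

h256 = hashlib.blake2b(data, digest_size=32)
h512 = hashlib.blake2b(data, digest_size=64)

# The compression function works on the same block size either way.
same_block_size = h256.block_size == h512.block_size

# The digest length is part of the parameter block, so the 256-bit
# digest is not simply the first 32 bytes of the 512-bit digest.
is_prefix = h512.digest().startswith(h256.digest())

print("same block size:", same_block_size)
print("256-bit digest is a prefix of 512-bit digest:", is_prefix)
```

In practice this means digests computed at one output size cannot be compared against digests stored at another, so pick the size once and keep it.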
Any output size of 160 bits or more provides practical collision resistance. 256 bits is a conservative choice that also adds a comfortable security margin should the function ever turn out to have a small bias.
Larger output sizes are useful; key derivation is a common use case. But they are not needed for collision resistance.
With libsodium, you can simply write crypto_generichash_BYTES to refer to the recommended output size.
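For the file integrity use case in the question, a minimal sketch using Python's standard `hashlib` rather than libsodium itself might look like the following. It assumes a 32-byte (256-bit) digest, which is the value of `crypto_generichash_BYTES`; the names `file_digest` and `verify_file` are hypothetical helpers, not part of any library.

```python
import hashlib
import hmac

# Assumption: 32 bytes (256 bits), matching libsodium's crypto_generichash_BYTES.
DIGEST_BYTES = 32


def file_digest(path: str, chunk_size: int = 1 << 16) -> str:
    """Return the hex BLAKE2b digest of a file, reading it in chunks."""
    h = hashlib.blake2b(digest_size=DIGEST_BYTES)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_file(path: str, expected_hex: str) -> bool:
    """Compare a file's digest against a previously recorded hex digest."""
    return hmac.compare_digest(file_digest(path), expected_hex.lower())
```

Reading in chunks keeps memory use flat for large files, and the result is the same as hashing the whole file in one call.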