IPFS decomposes large files into blocks, each of which is hashed to produce a Content Identifier. As of this writing, the default block size is 256 * 1024 bytes, hashed by SHA256 to generate a 256 bit hash. A block is far larger than its hash, so by the pigeonhole principle many distinct blocks must share the same hash. A full-size block holds 2,097,152 bits, so there are 2^2097152 possible blocks but only 2^256 possible hashes. If SHA256’s distribution of hash values is perfectly uniform, 2^2096896 blocks must share each hash.
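
Here is a quick sketch of that counting argument in Python, assuming every block is exactly the default size quoted above. The variable names are mine, not anything from IPFS. Dividing the block's bit length by the hash's bit length gives 8192, but that is only a ratio of lengths; the number of blocks per hash is 2 raised to the difference of the two lengths.

```python
import hashlib

BLOCK_SIZE_BYTES = 256 * 1024  # default block size quoted above
HASH_BITS = 256                # SHA256 digest length in bits

block_bits = BLOCK_SIZE_BYTES * 8

# Ratio of the two bit lengths. This is NOT a count of blocks.
length_ratio = block_bits // HASH_BITS

# There are 2**block_bits distinct full-size blocks and 2**HASH_BITS hashes,
# so a perfectly uniform hash maps 2**(block_bits - HASH_BITS) blocks to each hash.
preimage_exponent = block_bits - HASH_BITS

# Sanity check: hashing a full-size block (all zero bytes here) yields 32 bytes.
digest = hashlib.sha256(bytes(BLOCK_SIZE_BYTES)).digest()

print(f"bits per block:  {block_bits}")
print(f"length ratio:    {length_ratio}")
print(f"blocks per hash: 2**{preimage_exponent}")
print(f"digest bits:     {len(digest) * 8}")
```

The count is kept as an exponent because 2**2096896 written out in decimal runs to more than 600,000 digits.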

Is this a problem in practice? Probably not. While it’s theoretically possible for my dog’s adorable mug to share a hash with a blurry snapshot of some spoons, stumbling on even one colliding pair by brute force is expected to take on the order of 2^128 hash computations, which would keep a supercomputer busy for far longer than millennia. Humans taking pictures of pretty creatures do not generate new hashes at anywhere near that rate. As a failsafe, the multihash format records which hash function produced each hash, so newer hash functions can be adopted should SHA256 ever become obsolete.
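
To put a rough number on "probably not", here is a back-of-the-envelope estimate in Python. It uses the birthday bound, which says a collision in an n bit hash is expected after roughly 2^(n/2) attempts. The rate of 10^18 hashes per second is purely my assumption for illustration, not a measured figure for any real machine, and the function name is made up for this sketch.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
HASH_BITS = 256

# Hypothetical hashing rate, chosen only for illustration.
ASSUMED_HASHES_PER_SECOND = 1e18


def years_to_expected_collision(hash_bits, hashes_per_second):
    """Rough birthday-bound estimate of the time to find one collision."""
    # About 2**(hash_bits / 2) hashes are needed before a collision is likely.
    expected_hashes = 2 ** (hash_bits / 2)
    return expected_hashes / hashes_per_second / SECONDS_PER_YEAR


years = years_to_expected_collision(HASH_BITS, ASSUMED_HASHES_PER_SECOND)
print(f"about {years:.1e} years")
```

Under that assumed rate the answer lands on the order of 10^13 years, and it ignores the cost of storing every hash seen so far, so the real effort would be larger still.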

My dog is glad I’m not worried about hash collisions but thinks I should tidy up so he can play with his raccoon.