The Register about ZFS deduplication
Even The Register reports about deduplication in ZFS. I'm asking myself if the people at The Register read my blog, as I've talked about that a few days ago.

Fun aside: synchronous dedupe is the only sensible way to dedupe data, as you would otherwise need to provide storage for the undeduplicated data until the system gets around to deduplicating it. Depending on the frequency of the dedupe runs, this could be a vast amount of storage. On the other hand, synchronous dedupe depends on a fast mechanism for detecting duplicates. The checksumming feature of ZFS looks like a good way to do this, as it is capable of using various hashing algorithms. When the probability of a hash collision is lower than the probability of reading wrong data from disk, it should suffice to just compare checksums instead of comparing the complete blocks.
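To make the idea concrete, here is a minimal sketch of synchronous (inline) dedupe using hash lookups. This is not ZFS code, just an illustration under the assumption of a simple in-memory block store: each write is hashed, and if the hash is already known, only a reference count is bumped, so duplicate data never consumes extra storage.

```python
import hashlib

class InlineDedupStore:
    """Toy sketch of synchronous dedupe: duplicates are detected
    at write time via a hash lookup, so undeduped data never has
    to be stored first. Illustrative only, not how ZFS is built."""

    def __init__(self):
        self.blocks = {}    # hash -> block data (stands in for on-disk storage)
        self.refcount = {}  # hash -> number of logical references

    def write(self, block: bytes) -> str:
        # With a strong hash like SHA-256, a collision is less likely
        # than an undetected read error from the disk itself, so the
        # checksum alone can stand in for a full block comparison.
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.blocks:
            self.refcount[digest] += 1   # duplicate: just add a reference
        else:
            self.blocks[digest] = block  # new data: store it exactly once
            self.refcount[digest] = 1
        return digest

store = InlineDedupStore()
a = store.write(b"some block of data")
b = store.write(b"some block of data")  # duplicate, stored only once
```

After the second write, the store still holds a single physical block with a reference count of two, which is exactly the storage saving that an asynchronous dedupe run would only reclaim later.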