Secure Deletion with ZFS
A reader who wants to stay anonymous (hence the pseudonym he uses in his comments) asked an interesting question: How do you do secure deletion in ZFS? The standard mechanism is to overwrite the data with zeros, ones or a data pattern, because a normal delete only removes the metadata and not the data itself. This is a little bit hard with ZFS. Why? ZFS is a copy-on-write filesystem, so the zeros are written somewhere else: ZFS never overwrites active data in place. There are hacks to solve this problem, for example overwriting all sectors on the free list. Or you could implement code that overwrites the data directly as a kind of secure delete. But from my point of view this wouldn't really help.
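To make the copy-on-write problem concrete, here is a minimal sketch in Python. It is a toy model, not ZFS's real on-disk layout: the class `ToyCowStore`, its free list and the file name are all invented for illustration. It only shows why writing zeros "over" a file leaves the original block untouched.

```python
class ToyCowStore:
    """Toy copy-on-write block store (illustration only, not how ZFS is built)."""

    def __init__(self):
        self.blocks = []   # physical blocks; existing entries are never modified
        self.files = {}    # file name -> index of the block currently in use
        self.free = set()  # indices of blocks that are no longer referenced

    def write(self, name, data):
        # Copy-on-write: always allocate a new block, never touch the live one.
        self.blocks.append(bytes(data))
        new_index = len(self.blocks) - 1
        old_index = self.files.get(name)
        if old_index is not None:
            self.free.add(old_index)  # the old block is only marked as free
        self.files[name] = new_index

    def read(self, name):
        return self.blocks[self.files[name]]

    def delete(self, name):
        # A normal delete drops the metadata reference and nothing else.
        self.free.add(self.files.pop(name))


store = ToyCowStore()
store.write("secret.txt", b"launch codes")
store.write("secret.txt", b"\x00" * 12)  # the attempted "secure" overwrite
store.delete("secret.txt")

# Everything on the free list still holds whatever was written to it.
leftovers = [store.blocks[i] for i in sorted(store.free)]
print(leftovers)
```

In this toy model the file is gone from the metadata, yet the first block on the free list still contains the original bytes. That is the reason the hacks mentioned above have to go after the free list rather than the file.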
First, as the reader just wants to be annoying ( ;) ) with regard to a web seminar about security: Secure delete has nothing to do with security, it's about data protection. The topics are related (data protection is useless without security), but they address different areas. All of this assumes, however, that overwriting data is a good solution for secure deletion. From my perspective it isn't, for several reasons. First, you need direct control over the position where data is written. On rotating rust it's highly probable (but not guaranteed) that you hit the blocks holding the data you want to delete securely, although even there things like bad block relocation get in your way. It gets really, really difficult with all the wear-leveling in solid state disks. And keep in mind that a worn-out sector on a flash device isn't writable anymore, but it's still perfectly readable. Perhaps not with the flash memory controller in the way ... but you could read the flash directly. It's the same with defragmentation, storage virtualisation and so on. That isn't a problem of ZFS, that's a problem of all logical structures on a storage medium. It's perfectly possible to have copies of the data you want to delete at several positions on the disk without knowing about it: redundancies introduced by your storage system or by reorganisation of the data. As disks get more and more intelligent, you have less and less control over the position of your data.
In the end, the only sure way to securely delete data on your disk is not to overwrite just the data you have deleted with data patterns. You have to overwrite the complete free space after deleting, because every free block on your disk could possibly contain data that you assumed was securely deleted. Obviously this isn't practical: first, it takes time and eats away precious IOPS you need for more important work, and more importantly it's a sure way to brick your solid state disk in no time. Overwriting with zeros, ones and patterns is only feasible when you do it for the complete disk, and even then it's difficult, because your storage controller may have relocated blocks without your knowledge, so you don't reach all the data written to the disk. And additionally: Is it worth the effort when you can't guarantee that no data is left on the device? After all, it's called “secure deletion” and not “almost secure deletion” or “a kind of secure deletion”. In the end, all this secure deletion stuff is just a kind of highly sophisticated snake oil.
And there is an additional problem: Let's assume you do backups on a regular schedule, for example by creating zfs send/receive streams and moving them to a backup server, which moves them to tape in an autoloader. Now let's assume you want to securely delete a file that was part of these backups. Perhaps the tapes aren't even in the autoloader anymore. The next interesting questions are: What happens with data on all the components of the hybrid storage pool? Fragments could be on the L2ARC disks, they could be on the separate ZIL (sZIL). How do you securely delete deduplicated data? You can put it into one sentence: People put too much thought into magnetic remanence and too little thought into logical remanence. First you have to ask what you want to solve with secure deletion. Overwriting data with zeros isn't a feature by itself. Secure deletion isn't a feature in itself. It's the result of a need: the assurance that nobody can get insight into the data when the disk leaves the realm of your control. Leaving the realm of your control can mean:
- you have to swap a hard disk because it isn't functional anymore. And by the way: even when you are really paranoid and shred the disks into pieces of 1 mm by 1 mm, this isn't secure. One square millimeter contains a lot of data.
- the system is moved to another location by people you don't trust
- you must ensure that data is deleted after a certain span of time ... really deleted
- you have to work with data that has to be deleted after the work, keeping only the result on the disk
- you want to be sure that nobody can get insight into your deleted data when your notebook/hard disk/desktop is stolen.
So: How do you implement secure deletion? The answer is relatively easy. You should first ask what deletion is. Deletion is nothing more than returning resources to the pool of usable storage; the data is not erased when you delete a file. Secure deletion isn't about overwriting data with zeros. It's about denying access to the data, even to yourself, by making the data physically unusable. This can be done by any technical mechanism. So the answer to secure deletion is relatively simple: Use encryption, and when you want to delete the data, throw away the matching key. The nice thing: wherever you have replicated, backed up, restored or copied the data, it's deleted at the same moment you drop the key. Even on tapes not in drives, disks not in your system or disk images not mounted on your system. Without the key the data is just garbage. The data is gone. Once the key is physically deleted, the data isn't accessible anymore. And this is exactly the way secure deletion will be done with ZFS: by encryption. You will be able to define an encryption key per dataset, and when you want to delete a dataset securely, you just throw away the key. Remember that creating a dataset in ZFS is as easy as creating a directory. ZFS Crypto will be the solution for the secure delete challenge. And until it's integrated in ZFS you can use xlofi, as done for example with the Immutable Service Containers introduced by Glenn Brunette. He describes this concept in his article Encrypted Scratch Space in OpenSolaris 2009.06.
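The "throw away the key" idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the cipher below is a toy built from SHA-256 so the example needs nothing beyond the standard library, and it is not what ZFS Crypto or xlofi use. The names `keystream_xor`, `keystore` and the dataset name `scratch` are made up for this sketch.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key + nonce + counter) blocks.

    Illustration only. A real system would use a vetted cipher such as AES.
    The same call encrypts and decrypts, because XOR is its own inverse.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


# Hypothetical per-dataset key, held in exactly one place.
keystore = {"scratch": secrets.token_bytes(32)}
nonce = secrets.token_bytes(16)

plaintext = b"customer records that must disappear on request"
ciphertext = keystream_xor(keystore["scratch"], nonce, plaintext)

# The ciphertext ends up everywhere: pool, backup server, tape.
copies = {"pool": ciphertext, "backup": ciphertext, "tape": ciphertext}

# As long as the key exists, any copy can be decrypted.
recovered = keystream_xor(keystore["scratch"], nonce, copies["tape"])

# "Secure deletion": destroy the key and leave every copy where it is.
del keystore["scratch"]

# Without the key, decrypting with a guessed key yields garbage.
wrong_guess = keystream_xor(secrets.token_bytes(32), nonce, copies["tape"])
```

The point of the sketch is the single `del`: nothing touches the copies on the pool, the backup server or the tape, yet none of them can be turned back into the plaintext once the only key is gone. In practice the hard part moves to the key itself, which has to be stored in a place where you really can destroy it.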