Using ZFS to Fight Data Rot, the Silent Killer
Kevin McAleer, January 2009
Previously, I wrote an article for BigAdmin about why I chose the ZFS file system to ensure my data was safe: How I Used Solaris OS and ZFS to Solve My Mac OS X Storage Problem. One of the reasons I chose the ZFS file system as opposed to Apple HFS+, Linux ext3, or Microsoft Windows NTFS is that ZFS checksums all the data written to and read from it. This might seem unnecessary, a little obsessive, or even CPU-hungry, but it is essential for long-term data storage and for detecting data rot.
So what is data rot, why should I fear it, and most importantly, what can I do about it?
Quite simply, data rot is the result of tiny changes in the magnetic particles that make up the media in hard disks. The effect this has on your data is random but predictable: data loss. It might be the contents of a file that get corrupted, the file header that describes the contents of the file, or, worse, the file allocation table that describes the location of, or links to, the file. The file might be a system file or a data file; either way, it's eventually going to be bad news.
According to a recent study, Analyzing the Effects of Disk-Pointer Corruption (PDF), 0.66% of SATA disks and 0.06% of Fibre Channel disks developed corruption in 17 months of use. The same study describes how some kinds of corruption are worse than others and explains that most modern file systems are unable to deal effectively with this (excluding the ZFS file system, of course!).
So you're probably thinking "Doesn't
We've established what data rot is and how existing tools are not suited to detecting, correcting, or preventing it. Now, on to why you should care about this...
How important is your data? I mean, really? Think about it.
I personally have the following data stored on my computer: photos and videos of my daughter since birth, software downloads I've purchased (including Adobe Photoshop and Adobe Dreamweaver, which weren't cheap), my iTunes library (on which I must have spent a couple of hundred dollars, if not thousands), and various work projects. I'm not prepared to let anything happen to this data. So I've taken steps to avoid obvious problems:
I've also taken steps to design my storage solution correctly: I use several disks in a RAID configuration (RAID-Z with a hot spare) to ensure a single disk failure can't cause data loss.
Finally, I choose to use the ZFS file system because I know that it checksums every read from and write to the file system, which lets it verify that my data is as it was when it was written to disk. I run a "scrub" of the ZFS file system every week to check that no data has become corrupted by data rot, and this week, it detected over 20 instances of it. Thankfully, ZFS effortlessly replaced the corrupted data with good data held elsewhere on disk (thanks to RAID-Z) without any loss whatsoever.
Conclusion
To protect your data from data rot, choose the ZFS file system. Although I didn't lose data, the experience did drive me to write this article, because I wanted to make people aware of this issue.
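To make the checksum, scrub, and repair cycle described above more concrete, here is a minimal sketch in Python. It is a toy model, not how ZFS is implemented: the ToyMirror class, its method names, and the use of SHA-256 with two full copies are all assumptions made for illustration. Real ZFS uses its own checksum algorithms and, with RAID-Z, rebuilds data from parity rather than from a plain second copy.

```python
import hashlib


class ToyMirror:
    """Hypothetical two-copy block store with per-block checksums (not real ZFS)."""

    def __init__(self):
        self.copies = [{}, {}]  # two simulated disks: block id -> bytes
        self.checksums = {}     # block id -> SHA-256 digest recorded at write time

    def write(self, block_id, data):
        # Record the checksum once, then store the data on every simulated disk.
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()
        for disk in self.copies:
            disk[block_id] = data

    def _is_good(self, block_id, data):
        return hashlib.sha256(data).hexdigest() == self.checksums[block_id]

    def read(self, block_id):
        """Return a copy whose checksum matches, overwriting any bad copies."""
        good = next(
            (d[block_id] for d in self.copies if self._is_good(block_id, d[block_id])),
            None,
        )
        if good is None:
            raise IOError(f"block {block_id!r}: no copy matches its checksum")
        for disk in self.copies:
            if not self._is_good(block_id, disk[block_id]):
                disk[block_id] = good
        return good

    def scrub(self):
        """Verify every block on every disk; return the number of copies repaired."""
        repaired = 0
        for block_id in self.checksums:
            bad = [d for d in self.copies if not self._is_good(block_id, d[block_id])]
            if bad:
                self.read(block_id)  # repairs the bad copies from a good one
                repaired += len(bad)
        return repaired


store = ToyMirror()
store.write("photo", b"holiday picture bytes")

# Simulate data rot: flip a single bit on the first simulated disk.
rotted = bytearray(store.copies[0]["photo"])
rotted[0] ^= 0x01
store.copies[0]["photo"] = bytes(rotted)

print(store.scrub())  # prints 1: one corrupted copy found and repaired
```

The point of the sketch is that the checksum is recorded at write time and kept separately from the data, so a silent bit flip shows up as a mismatch on the next read or scrub, and redundancy is what turns detection into repair.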
About the Author
Kevin McAleer is the director of Advice Factory, offering advice and IT consultancy services to businesses in the UK. He is an Apple Mac fan and also an evangelist for Sun's ZFS technology.