To protect embedded device data when power interruptions strike, file system disk checking utilities are important. We outline how those tools work and which options you may want to consider for your designs.
Power interruptions can cause embedded device data to become corrupted. In such an event, data files may have been stored in an incomplete state, or metadata about the files could have been lost. This outcome can be devastating to an embedded design – causing critical data to be unusable, increasing costs, and simply resulting in poor end user experiences.
To help deal with that risk of data corruption, two disk checking tools exist: chkdsk for the FAT file system, and fsck for Linux file systems.
Preserving system integrity with chkdsk and fsck
Before embedded devices, file systems were designed to work in servers and on desktops. Power loss was an infrequent occurrence, so little consideration was given to protecting the data against it. Instead, frequent checks of the file system structures were important, and these were often handled at system startup by a program such as chkdsk (for FAT) or fsck (for Linux file systems). In each case, the OS could also request a run of these utilities when an inconsistency was detected, or after the power had been interrupted.
With embedded devices, the situation is different. In the intensive environments and demanding use cases that embedded devices operate in, power interruptions are far more common. As a result, the risk to system integrity is more substantial, and file systems operating on such devices must be actively developed with that risk in mind.
chkdsk
A power interruption in the middle of a write to a file can leave the recorded cluster information inconsistent. This results in so-called “lost clusters”: clusters that are marked as allocated but that no file references. The same sorts of problems can occur in the file system metadata, or even the file allocation table itself. And the job of chkdsk? Find and fix all of those cluster errors.
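To make the idea of a lost cluster concrete, here is a minimal sketch in Python. It is a toy model, not how chkdsk is implemented: the in-memory list `fat`, the markers `FREE` and `EOC`, and the function name `find_lost_clusters` are all assumptions made for illustration. The sketch follows each file's cluster chain from its directory entry, then reports any cluster that is marked as allocated but was never reached.

```python
# Toy FAT: index = cluster number, value = next cluster in the chain.
# FREE and EOC are illustrative markers; a real FAT uses reserved bit patterns.
FREE = 0
EOC = -1  # end of chain


def find_lost_clusters(fat, dir_start_clusters):
    """Return clusters marked allocated but unreachable from any directory entry."""
    reachable = set()
    for start in dir_start_clusters:
        cluster = start
        while cluster != EOC and cluster not in reachable:
            # Stop on an invalid or free cluster: the chain is broken.
            if cluster < 2 or cluster >= len(fat) or fat[cluster] == FREE:
                break
            reachable.add(cluster)
            cluster = fat[cluster]
    # Clusters 0 and 1 are reserved in FAT, so data clusters start at 2.
    allocated = {c for c in range(2, len(fat)) if fat[c] != FREE}
    return sorted(allocated - reachable)


# File A occupies clusters 2 -> 3. Clusters 4 -> 5 -> 6 were allocated for a
# second file, but power failed before its directory entry was written.
fat = [0, 0, 3, EOC, 5, 6, EOC, 0]
lost = find_lost_clusters(fat, dir_start_clusters=[2])
print(lost)  # [4, 5, 6]
```

In this model the three clusters still hold data, but nothing points to them, which is the situation a checker has to find and resolve.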
fsck
The method behind these tools is a check of the entire disk: reading each block to determine if it is allocated for use, then cross-checking it against the allocation list located elsewhere on the media. FAT file systems have little other protection, so chkdsk can only flag sections of the media without matching metadata by creating a CHK file for later user analysis. Many Linux file systems add a journal mechanism to detect which files are affected, and can often correct the damage without user intervention. This is the role of fsck.
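The journal mechanism mentioned above can be sketched as a simple replay loop. This is a simplified illustration of write-ahead journaling in general, not the on-disk format of any particular Linux file system; the record layout (`"begin"`, `"write"`, `"commit"` tuples) and the name `replay_journal` are assumptions. The key point is that only transactions with a commit record are applied, and anything interrupted by power loss is discarded.

```python
def replay_journal(disk, journal):
    """Apply committed transactions to `disk`; return ids of discarded ones."""
    pending = {}  # transaction id -> list of (block, data) not yet committed
    for record in journal:
        kind = record[0]
        if kind == "begin":
            pending[record[1]] = []
        elif kind == "write":
            _, txid, block, data = record
            pending.setdefault(txid, []).append((block, data))
        elif kind == "commit":
            # The commit record makes the whole transaction valid at once.
            for block, data in pending.pop(record[1], []):
                disk[block] = data
    # Transactions without a commit record were cut off and are ignored.
    return sorted(pending)


disk = {10: "inode-v1", 11: "bitmap-v1"}
journal = [
    ("begin", 1),
    ("write", 1, 10, "inode-v2"),
    ("write", 1, 11, "bitmap-v2"),
    ("commit", 1),
    ("begin", 2),
    ("write", 2, 10, "inode-v3"),  # power lost before the commit record
]
discarded = replay_journal(disk, journal)
print(disk)       # {10: 'inode-v2', 11: 'bitmap-v2'}
print(discarded)  # [2]
```

Because the incomplete transaction is never applied, the metadata blocks stay consistent with each other, which is why a journaled file system can often recover without a full scan of the media.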
For data integrity, there are better options
The chkdsk and fsck utilities are necessary because basic file systems are not atomic in nature – data and metadata are written separately. While these utilities are useful options for designs that rarely lose power, a more fail-safe alternative is to use file systems precision-engineered for atomic operations and for withstanding power failure.
Tuxera Reliance Nitro is a file system that treats updates as a single operation, and thus the file system is never in a state where it would need to be corrected. Dynamic Transaction Point™ technology allows the user to customize precisely how atomic their design is – protecting not just a block of data and metadata but the whole file. After all, half a JPG is pretty much useless, from a user perspective.
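The difference between an atomic update and separate data and metadata writes can be illustrated with a small toy model. This is not Reliance Nitro's implementation or API; the class `TransactionalStore` and its methods `write`, `transaction_point`, and `power_loss` are hypothetical names chosen for this sketch. It only shows the general principle: changes accumulate aside from the committed state, and a transaction point switches to the new state in one step.

```python
class TransactionalStore:
    """Toy model: state changes only become durable at a transaction point."""

    def __init__(self):
        self._committed = {}  # state as of the last transaction point
        self._working = {}    # changes made since then, not yet durable

    def write(self, name, data):
        self._working[name] = data

    def transaction_point(self):
        # Build the new state aside, then switch to it in a single step.
        new_state = dict(self._committed)
        new_state.update(self._working)
        self._committed = new_state
        self._working = {}

    def power_loss(self):
        # Simulated power failure: everything after the last point is gone.
        self._working = {}

    def read(self, name):
        if name in self._working:
            return self._working[name]
        return self._committed.get(name)


store = TransactionalStore()
store.write("photo.jpg", b"first half")
store.power_loss()
lost = store.read("photo.jpg")
print(lost)  # None

store.write("photo.jpg", b"complete image")
store.transaction_point()
store.write("photo.jpg", b"partial overwrite")
store.power_loss()
survivor = store.read("photo.jpg")
print(survivor)  # b'complete image'
```

In this model a reader never sees half a file after a power loss: it sees either nothing or the last complete version, so there is no inconsistent state left for a checker to repair.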
Put bluntly, the repairs that fsck and chkdsk can perform are completely unneeded with Reliance Nitro. At the device design level, this results in quicker boot times for a system that is entirely protected from power failure. A file system checker is of course provided, and is useful for detecting failures caused by media corruption.
Our exFAT implementation – which is part of our GravityCS by Tuxera suite of file systems – provides better power interruption handling than an open-source alternative. The volume check and repair tools built into our exFAT implementation ensure file system consistency, while fixing corrupted volumes and recovering lost files.
Intelligent write ordering in our file system reduces orphaned clusters and other problems that require chkdsk and user intervention to properly correct. Each write is also performed in an atomic fashion, so that only blocks of complete data are written to the media. The result is an exFAT implementation that surpasses the power fail-safety provided by both chkdsk and fsck.
Final thoughts
Modern embedded systems must often operate in demanding conditions where constant, sustained power to the device is by no means guaranteed. To ensure system integrity, avoid lost data, and prevent costly device problems, sudden power losses have to be taken into account at the file system level. Embedded designers have several options for that, and the fsck and chkdsk utilities can be helpful for detecting and fixing data errors caused by power losses.
A more fail-safe and ultimately cost-effective option? Select optimized file systems that ensure customizable atomic operations and power failure protection – file systems like Tuxera Reliance Nitro and GravityCS by Tuxera.
For the best system integrity, put to use our reliable and high-performing exFAT implementation.
Thom Denholm
Thom Denholm is a technical expert on flash media and file systems, with 35 years of experience in embedded designs. He has been a frequent speaker at conferences, including the Embedded World conference and Flash Memory Summit, for the last 15 years. His strength is translating tough technical topics into easily accessible concepts that designers can immediately implement to solve daily challenges. He also serves as the secretary for the board of directors of the SD Card Association.