Write amplification occurs when the amount of data physically written to the flash media is a multiple of what the application asked to write, which shortens device lifetime and hurts performance.
At this moment, countless flash memory devices across the world store precious data. Each of those devices runs a continuous cycle of writing, erasing, and rewriting, and that cycle is what makes storing all of that data possible.
However, this writing and erasing also carries a threat to data reliability and to the limited lifetime of the flash: write amplification. Where there is data storage, there has always been write amplification. Back when hard disk drives (HDDs) were the more widely preferred form of storage, it was little more than a nuisance. With NAND-based flash storage now dominant, and given how that technology works, write amplification is a real cause for concern. Before explaining what write amplification is, we will briefly touch on the underlying flash concepts that make it a problem.
How NAND-based flash handles writes
Data written to a NAND-based flash device goes through multiple layers. The data typically originates from applications, then moves through the file system to a block device layer. From there it moves downward and is finally written to the flash media.
Down at this level, there are two important concepts: the write page and the erase block. Each erase block is made up of many write pages. The smallest amount of data that can be written to NAND-based media is a write page. The capacity of a write page depends on the type of NAND being used (from 512 bytes on older parts to 16 KB on many recent ones), but the actual size here is not important. What is important is that all NAND-based media has a limited lifetime, measured in program and erase (P/E) cycles.
There is another limiting factor with NAND-based technology: before a block on a flash device can be written, it must first be erased. To make room for new data, processes such as garbage collection relocate any remaining valid data to a new location (we will cover these in more detail below).
Moreover, when an application writes a data file – even if it is smaller than the write page size – an entire page must be written. There are no partially written pages on NAND-based flash. So if the data file happens to be smaller than the write page capacity, the remaining area on the page is filled with bits of undefined data. This also means that if an application needs to expand the original file, a new NAND write page is created elsewhere with the expanded data.
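As a rough illustration of the full-page rule, the sketch below rounds a write up to whole pages and reports how much of what reaches the flash is padding. The 4096-byte page size and the function name `padded_write` are assumptions chosen for the example, not values taken from any particular device.

```python
PAGE_SIZE = 4096  # assumed write page size in bytes; real devices vary


def padded_write(data_bytes: int, page_size: int = PAGE_SIZE) -> tuple[int, int]:
    """Return (bytes written to flash, bytes of padding) for one write.

    NAND can only program whole pages, so the write is rounded up
    to the next multiple of the page size.
    """
    if data_bytes <= 0:
        return 0, 0
    pages = -(-data_bytes // page_size)  # integer ceiling division
    flash_bytes = pages * page_size
    return flash_bytes, flash_bytes - data_bytes


print(padded_write(100))   # (4096, 3996)
print(padded_write(4097))  # (8192, 4095)
```

In this model a 100-byte file still costs a full 4096-byte page, and a write that is one byte over a page boundary costs two pages.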
What is write amplification?
Write amplification is a phenomenon that occurs when writing data to a storage medium: due to several factors, the amount of data originally written is multiplied into a disproportionate amount of data on the storage device. It is primarily caused by how writing (programming) and erasing work on NAND flash.
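The multiplication is commonly expressed as a write amplification factor: the bytes physically written to the flash divided by the bytes the host asked to write. A minimal sketch follows; the function name is illustrative, and the two byte counts are assumed to come from whatever counters a given device or test setup exposes.

```python
def write_amplification_factor(host_bytes: int, flash_bytes: int) -> float:
    """Ratio of bytes written to the flash media to bytes requested by the host."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return flash_bytes / host_bytes


# The host writes one 4 KiB block; the device ends up programming 12 KiB.
print(write_amplification_factor(4096, 12288))  # 3.0
```

A factor of 1.0 would mean the flash wrote exactly what the host requested; every factor described in the following sections pushes the number higher.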
Many factors cause write amplification
Before we get into how write amplification affects the flash, it is important to understand what creates it. Each of the following factors is another part of the multiplication, and combined together the effect can become quite large.
File system operations
The first factor, and the one that a system designer has the most control over, is how the application uses the file system to write data. Writing in units smaller than the block size of the media is one contributor, because each small write still consumes a full write page.
All file systems track the data on the media with metadata. This is where file names, folders, creation dates and times, and block allocation lists live. A metadata update usually takes up only a fraction of a write page, but remember that NAND flash media can only write full pages. In addition, some file systems write copies of the metadata to a journal, and those copies are rounded up to full pages as well. Each metadata update (every file save, resize, and flush) will cause write amplification.
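To see how metadata and journaling add up, the sketch below counts the pages consumed by a single small file save. It is a simplified model under stated assumptions: a 4096-byte page, a 256-byte metadata update, and one journal copy of that metadata per save. Real file systems differ in all three, and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed write page size in bytes


def pages_for(n_bytes: int) -> int:
    """Whole pages needed to hold n_bytes (partial pages are rounded up)."""
    return -(-n_bytes // PAGE_SIZE) if n_bytes > 0 else 0


def save_cost(data_bytes: int, metadata_bytes: int = 256, journaled: bool = True):
    """Return (flash bytes written, amplification) for one file save.

    Assumes the data, the metadata update, and the optional journal copy
    of the metadata each land in their own page(s).
    """
    pages = pages_for(data_bytes) + pages_for(metadata_bytes)
    if journaled:
        pages += pages_for(metadata_bytes)  # journal copy of the metadata
    flash_bytes = pages * PAGE_SIZE
    return flash_bytes, flash_bytes / data_bytes


print(save_cost(512))                   # (12288, 24.0)
print(save_cost(512, journaled=False))  # (8192, 16.0)
```

Under these assumptions, saving 512 bytes costs three full pages, so the application's write is amplified 24 times before the device itself has done any housekeeping.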
Processes and updates
Most processes that provide system and data integrity also add to write amplification. Application-level updates can be a factor too. Database updates are a common example: even a simple database can be the largest contributor to the write amplification factor.
Garbage collection
Device-level processes also increase the write amplification factor, and cannot always be controlled by the system designer. One example is garbage collection. On flash media, one erase block contains many write pages. When garbage collection reclaims a block, each write page in it that still holds valid data must first be relocated to another erase block. Only then can the block be erased, and each of those relocation writes is a hidden increase to the write amplification factor.
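The cost of garbage collection can be sketched with a simplified model: to reclaim a block, the device copies out its valid pages, and only the remaining pages become free space for new host data. The steady-state formula used here (pages per block divided by freed pages) is an assumption of this model, not a description of any specific controller, and the names are illustrative.

```python
def gc_reclaim(block: list[bool]) -> tuple[int, int]:
    """Return (pages relocated, pages freed) when one erase block is reclaimed.

    `block` holds one flag per write page: True means the page still
    contains valid data and must be copied elsewhere before the erase.
    """
    relocated = sum(block)
    freed = len(block) - relocated
    return relocated, freed


def gc_waf(pages_per_block: int, valid_pages: int) -> float:
    """Amplification from garbage collection alone in this simplified model.

    Filling the freed pages with host data costs that many host page writes
    plus the relocation writes, i.e. one full block of page writes in total.
    """
    freed = pages_per_block - valid_pages
    if freed <= 0:
        raise ValueError("block has no reclaimable pages")
    return pages_per_block / freed


block = [True] * 48 + [False] * 16  # 64-page block, 48 pages still valid
print(gc_reclaim(block))            # (48, 16)
print(gc_waf(64, 48))               # 4.0
```

In this model a block that is three-quarters valid yields only 16 free pages at the cost of 48 relocation writes, which is why garbage collection gets more expensive as the media fills up.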
Wear leveling
Even wear leveling can increase write amplification. Static data, such as system files, is written once and then rarely changes, so the erase blocks holding it see almost no wear. To get the best life from the media, this data is relocated to another erase block when conditions are right. This allows the flash media to wear more evenly, at the cost of more hidden write amplification.
After all of the above factors are taken into consideration, what you end up with is a multiplied amount of data piled on top of the amount the application initially intended to write. This has consequences for the flash media.
Why is write amplification bad?
NAND flash media has an expected maximum number of program and erase (P/E) cycles. This maximum applies to each erase block of the flash media. In a perfect world with no write amplification, the raw maximum amount of data that can be written is the maximum number of P/E cycles multiplied by the number of erase blocks on the media and by the size of each block.
Erasing and relocating data ends up using more of the flash than is intended at the application level. This effect of multiplied data increases the amount of wear on the flash, decreasing its lifetime and negatively affecting its long-term reliability. In the worst case, with enough wear, the device can simply stop working reliably overall.
In short – multiplied writes mean a shorter flash lifetime.
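The lifetime arithmetic can be sketched as follows. The raw figure is the product described above; dividing it by the write amplification factor gives the amount of application data the media can absorb. The device parameters (3000 P/E cycles, 1024 erase blocks of 256 KiB) are made-up example values, and the model assumes perfectly even wear.

```python
def raw_endurance_bytes(pe_cycles: int, num_blocks: int, block_size: int) -> int:
    """Total bytes the media can absorb with no write amplification."""
    return pe_cycles * num_blocks * block_size


def host_endurance_bytes(pe_cycles: int, num_blocks: int, block_size: int,
                         waf: float) -> float:
    """Bytes the application can write before the rated P/E cycles are used up."""
    if waf <= 0:
        raise ValueError("waf must be positive")
    return raw_endurance_bytes(pe_cycles, num_blocks, block_size) / waf


GIB = 2 ** 30
raw = raw_endurance_bytes(3000, 1024, 256 * 1024)
host = host_endurance_bytes(3000, 1024, 256 * 1024, waf=4.0)
print(raw / GIB)   # 750.0
print(host / GIB)  # 187.5
```

With a write amplification factor of 4, the same media accepts only a quarter of the application data it could in the ideal case.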
In addition, write amplification influences device performance. Erase operations are typically slow and costly at the device level. If a use case leads to high write amplification, a flash device such as a solid-state drive (SSD) will perform worse, because its controller has to carry out many more writes during background operations (in addition to erasing the flash). In contrast, a device with low write amplification can sustain higher throughput and complete its writes sooner, simply because it has far less extra data to write.
Write amplification also contributes to so-called “field failures”
Write amplification is one of many factors affecting embedded device reliability and lifetime. Other hardware-related factors include power disruptions, data retention, and more. These issues can all lead to an embedded device failing in the field, resulting in serious consequences.
In our conversations with embedded original equipment manufacturers (OEMs), one thing is certain: addressing a problem in the field is costly. Anticipating and preventing field failures lets market leaders invest in innovation rather than in costly, resource-draining diagnosis, repair, and redesign. Furthermore, as the data storage needs of these devices have increased dramatically over the years, unreliable data storage can be a significant contributor to field failures.
Final thoughts
On flash media, data loss can be very costly. The standard process of writing and re-writing data eventually wears out the flash, through mechanisms like write amplification. This greatly decreases the lifetime and reliability of the device – risking a significant loss of potentially critical data.
However, avoiding issues like write amplification is achievable. Understanding what write amplification is, the factors that cause it, and its effect on the flash is an important step towards improved embedded device reliability.