Embedded data storage management is filled with technical terms that can make your head spin if you’re not familiar with them.

To make it easier for you to keep track of all these terms, our experts have compiled a glossary of essential embedded technology definitions. It’s not meant to be a finalized list, but a glossary that our experts will continue to contribute to over time. And as the embedded data storage industry continues to grow, we expect that to happen regularly. So check back often!

Atomic operation

For embedded designs, an atomic operation refers to the smallest change that can be attempted – it will either be fully done or not done at all.

For older media, atomic operations were at the byte level. If file systems didn’t provide additional protection, a write or update could be stopped halfway through a data set, leaving it corrupted – neither fully written nor untouched. For media based on NAND flash memory, atomic operations are at the size of a flash write block.

File systems still need to match any protection to the media operation in order to provide the most efficient solution. Some file systems require multiple files and updates to be written so as to ensure system integrity, and in those cases the perceived atomic operation size is much larger.
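
The all-or-nothing idea can also be applied at the application level. The following is a minimal sketch, assuming a POSIX-style file system where renaming a file within one directory is atomic; the helper name `atomic_write` is hypothetical, and whether the update survives a power loss still depends on the file system and media underneath.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` so readers see the old or the new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new content to a temporary file on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data toward the media before renaming
        os.replace(tmp_path, path)  # the rename is the single "all or nothing" step
    except BaseException:
        # On failure the original file is untouched; remove the leftover temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that this pattern writes the whole file again for every update, which is one way the perceived atomic operation size becomes much larger than a single media write.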

RELATED CONTENT:
Embedded file systems – trickier than you think

See also system integrity.

Compression

Data collected or generated by devices is stored on the media – when that data is compressed, that means it is stored in a way that consumes less space on the media. Some of that data is provided in a compressed state, such as MP3 audio or MP4 video and JPG images, while other data (such as text-based log files) is provided raw.

Compression can be valuable for raw data, and it can be done within the file system or by the media device driver. When data is compressed, you must be aware of any changes to the data, as modifying a data set can make its compressed form bigger or smaller, potentially causing issues like fragmentation. One good use case for compressed data is in read-only formats – in such cases the data is decompressed before use. The system designer must determine whether compression will result in any improvement for their specific use case.

Both compression and decompression are usually done by the CPU, and can affect CPU consumption.
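
One way to check whether compression helps a given data set is simply to measure it. The sketch below uses Python's standard `zlib` module; the sample log line is made up, and random bytes stand in for data that is already compressed (such as MP3 or JPG content).

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    if not data:
        return 1.0
    return len(zlib.compress(data, level)) / len(data)


# Hypothetical log-like text: highly repetitive, so it should compress well.
log_text = b"2024-01-01 00:00:00 sensor=42 status=OK\n" * 1000
# Random bytes approximate data that has already been compressed.
already_compressed = os.urandom(len(log_text))

log_ratio = compression_ratio(log_text)
random_ratio = compression_ratio(already_compressed)
```

In this sketch the repetitive log text shrinks dramatically while the random data does not shrink at all, which mirrors the difference between raw text logs and media files that arrive compressed.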

See also CPU consumption, fragmentation.

CPU consumption

The amount of work handled by the CPU of a device. The CPU handles a lot of things – in some designs, that means everything from the application software visible to the users, to timed tasks and communication, sensor data gathering and analysis, and storage of data to the media. Because of that broad usage, understanding just how much time the processor spends on managing storage will help predict how available those CPU cycles are to the other tasks.

See also power consumption.

Database storage

A database collects structured and unstructured data into one or more large files. Access to the data is provided by a structured element, often with multiple indexes to the data. This allows sorting and access via various methods, including query languages.

Most database storage also adds a data resilience element, allowing atomic storage to the media. An entire element is either stored or not stored, updated or not updated. The Transaction Point model of data resilience used by Tuxera Reliance Nitro and Tuxera Reliance Edge is conceptually based on database storage. Also, Tuxera Reliance Sense is our database-like file system for structured data.

See also structured data, unstructured data.

Data integrity

When the power is lost or the system crashes, data not yet committed to the media is usually lost. A measure of how much is lost can be considered the data integrity. This problem can exist for both the file system and the media – both can have uncommitted data in cache or buffers.

Control over data-at-risk is managed with mount settings, flush and fsync commands, and the design of the software and hardware. In many designs, this control can be directly opposed to the raw performance of the solution. The ability to control that balance with runtime settings can be very useful. Tuxera Reliance Assure is a file system that can help developers achieve control over data-at-risk, thanks to deterministic file system operations.
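
As a rough illustration of the flush and fsync commands mentioned above, here is a minimal sketch using Python's standard library. The function name `append_durably` and the file name are hypothetical, and the media itself may still hold data in its own cache after `os.fsync` returns.

```python
import os
import tempfile


def append_durably(path: str, record: bytes) -> None:
    """Append a record and return only after asking the OS to commit it to the media."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()             # move data from the application buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write its cached data to the media


# Usage: append two records to a hypothetical log file in a temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "events.log")
append_durably(log_path, b"boot\n")
append_durably(log_path, b"sensor ready\n")
```

Calling fsync after every record keeps the data-at-risk small but costs performance, which is the trade-off described above.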

See also system integrity.

Data retention

How long the data will last on the device. When the device is active, software will correct occasional bit errors. When inactive, data retention is how long the device can hold the data without refreshing that data. There are several factors involved in data retention – such as time, environmental conditions, and the ability to correct errors on NAND flash media.

Data retention decreases when the device nears end of life. The best way to understand that decline is to work with your vendor, who can give important information on how long data will be retained in a working system, how long without power, and the maximum erase count of the media.

Tackling data retention and understanding issues with embedded flash is best accomplished with cost-effective and use case-specific flash memory testing.

RELATED CONTENT:
NAND vs NOR flash memory: an embedded developer’s guide to choosing

See also secure erase.

Device health

NAND flash media is rated for an expected maximum lifetime as a number of erases for each erase block. The software or firmware that manages access to the NAND based media has the capability of tracking this information at some level, and most provide an interface to access a rating based on this information. This is known as Health Information or Health Status.

On eMMC, for example, the device health report was added in the 5.0 revision of the specification. Device health is reported through the extended CSD register at a fairly low resolution, rounding device usage to the nearest 10%. Reach out to the vendor for more information, as some provide a finer resolution through alternate channels.
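
To show what a 10% resolution looks like in practice, here is a small sketch that decodes a life time estimate byte from the extended CSD register. The value encoding (0x01 through 0x0A for 10% steps, 0x0B for exceeded) reflects our reading of the DEVICE_LIFE_TIME_EST_TYP_A/B fields and is an assumption to verify against the JEDEC specification and your vendor's datasheet; the function name is hypothetical.

```python
def decode_life_time_estimate(value: int) -> str:
    """Translate a life time estimate byte into a human-readable range.

    Assumed encoding: 0x00 undefined, 0x01..0x0A in 10% steps, 0x0B exceeded.
    """
    if value == 0x00:
        return "not defined"
    if 0x01 <= value <= 0x0A:
        low = (value - 1) * 10
        return f"{low}%-{low + 10}% of device lifetime used"
    if value == 0x0B:
        return "exceeded maximum estimated device lifetime"
    return "reserved"


# Usage with a made-up register value.
report = decode_life_time_estimate(0x03)
```

A device reporting 0x03 under this encoding would be somewhere between 20% and 30% of its rated lifetime, which is why a finer resolution from the vendor can be valuable.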

It’s worth mentioning that accurately determining the lifetime of a flash device can be tricky. We go through why that’s the case in our blog post, “Automotive flash – what’s the real lifetime?”

Dynamic wear leveling

All wear leveling software or firmware must track how often a given media block has been written and erased. Designs which perform dynamic wear leveling will often perform the next write operation to the least-used block in the available pool – a significant improvement over no wear leveling. Tuxera FlashFX Tera features dynamic wear leveling technology.
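
The selection step can be sketched in a few lines. This is an illustration only, not how any particular flash manager is implemented; the function and variable names are hypothetical, and a real design would also handle bad blocks and garbage collection.

```python
def pick_block_for_write(erase_counts: dict[int, int], free_blocks: set[int]) -> int:
    """Return the free block with the lowest erase count."""
    if not free_blocks:
        raise RuntimeError("no free blocks available")
    # Ties are broken by block number so the choice is deterministic.
    return min(free_blocks, key=lambda block: (erase_counts[block], block))


# Usage: block 1 has the lowest count overall but is not free, so block 3 is chosen.
erase_counts = {0: 12, 1: 3, 2: 7, 3: 3}
chosen = pick_block_for_write(erase_counts, free_blocks={0, 2, 3})
```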

See also static wear leveling.

eMMC (or e∙MMC)

Short for “embedded MultiMediaCard,” this refers to a package consisting of both flash memory and a flash memory controller integrated on the same silicon die. The eMMC solution consists of at least three components – the multimedia card (MMC) interface, the flash memory, and the flash memory controller – and is offered in an industry-standard BGA package. This standard also specifies the firmware functionality, though the exact capabilities vary from vendor to vendor.

Drivers are often needed by the operating system and file system to take advantage of the eMMC block device. Some of these drivers provide access to vendor specific operations. A file system can then take advantage of these operations to provide more performance and/or functionality.

The eMMC standard is currently maintained by JEDEC.

See also device health.

Encryption

Data stored on the media without some form of encryption can be read by external means – a potential security risk if the media contains sensitive data. The encryption software or hardware operates on the data to provide readable content to the application and file system.

Encryption can be performed at the media level, at the file system level, or within the application itself. Linux examples of media and file system level encryption are dm-crypt and fscrypt, respectively. Support for fscrypt has been made a requirement for current and future versions of Android.

For encryption done in software, see also CPU consumption.

exFAT file system

An industry-standard file system introduced in 2006 as part of Windows CE 6.0, to extend FAT and remove limitations on file size and media capacity.

In August 2019, Microsoft published the exFAT specification for the first time, but designs which implement this require a license from Microsoft.

Since 2009, Tuxera has provided both the required license and an improved design of exFAT in a hassle-free bundle: Microsoft exFAT by Tuxera. This implementation features performance and interoperability advantages over other solutions – like increased structural integrity and background disk checks, for speedier mounting.

See also FAT, SD card, structural integrity.

Fail-safety, power fail-safety in embedded systems

For software, fail-safety encapsulates the designs and mechanisms to ensure both data integrity and system integrity. In general, a fail-safe design will respond to a failure in a specific way that will cause no harm to anything relying on this design – usually people or other equipment. Failure isn’t impossible, but any unsafe consequences are mitigated. Designs which require continuous availability can’t be made fail-safe.

RELATED CONTENT:
Learn how embedded storage experts tackle device failure in our whitepaper, “Understanding system integrity and how testing can help in preventing field failures.”

See also data integrity, functional safety, system integrity.

File Allocation Table (FAT) file system

A widely used type of file system originally derived from use on CP/M in the last half of the twentieth century. It has been extended to allow longer file names, larger file sizes, and larger media capacities. For early implementations, the size of the FAT entry (in bits) dictated the naming – from 12-bit FAT12 through 32-bit FAT32. The maximum size for a FAT32 volume on Windows was 32 gigabytes, with a maximum file size of 4 gigabytes. The exFAT file format was eventually designed as a way to extend the earlier FAT16 and FAT32 formats.

Tuxera offers a fail-safe implementation of FAT together with other industry standard file systems – see GravityCS by Tuxera suite.

RELATED CONTENT:
What the FAT – understanding FAT file system types

See also exFAT.

Flash memory

A type of non-volatile storage that is used in most embedded devices today. The non-rotating media is safer from physical impacts, but has a limited lifetime. Two commonly used types of non-volatile flash memory are NAND and NOR.

RELATED CONTENT:
Bit by bit: comparing NAND flash memory types for your embedded design

See also SLC flash and MLC flash.

Fragmentation, file system fragmentation

An impact to files when the data is not stored in a contiguous region on the media. This was a serious problem with rotating media, leading to a physical head seek for each separate part of the data. While flash memory doesn’t have any moving parts, fragmentation can still cause latencies at the file system level and hinder throughput, reducing performance.

Fragmentation can also cause overhead, with additional metadata required for the individual segments of a data set. Over time, this can also lead to reductions in available storage capacity.

RELATED CONTENT:
Read our whitepaper, “The impacts of file system fragmentation on automotive storage performance.”

Functional safety in embedded systems

Relates to all the requirements that define the environment where the embedded software developed will have to perform, and what level of risk is acceptable. One set of requirements is that, for the product at hand:

  • All the design requirements are satisfied.
  • All risks or hazards are identified through a systematic analysis approach.
  • The respective severity for each risk or hazard is identified and defined.
  • Proper mitigation techniques are identified and implemented.

As a result of those actions, the severity of the risks and hazards is reduced. All the risk mitigation actions are verified, validated, and properly documented, and any residual risk is reported.

RELATED CONTENT:
Read our whitepaper, “Challenges of tomorrow’s data storage integrity in automotive and IOT projects.”

Inodes

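An inode (index node) is the data structure that Linux and other Unix-like file systems use to describe a file or directory. It holds metadata such as ownership, permissions, size, and timestamps, along with pointers to the blocks that contain the data. On the media, the data itself is stored in blocks, with specific block sizes depending on the physical requirements of the media.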

Input/output operations per second (IOPS)

This is a way of measuring the performance of a system: the number of read or write operations it completes each second. Each operation must run to completion, and uses media throughput to do so, so the result depends on the size of the operations. Another form of measurement is throughput, which is usually shown in MB/s.
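
The two measurements are linked by the size of each operation. As a small worked sketch (the function name and the sample figures are made up for illustration):

```python
def throughput_mb_per_s(iops: float, io_size_bytes: int) -> float:
    """Convert an IOPS figure into megabytes per second for a given operation size."""
    return iops * io_size_bytes / 1_000_000


# Many small operations and few large ones can move similar amounts of data.
small_io = throughput_mb_per_s(10_000, 4096)      # 4 KiB operations
large_io = throughput_mb_per_s(320, 128 * 1024)   # 128 KiB operations
```

Here 10,000 operations of 4 KiB each and 320 operations of 128 KiB each both work out to roughly 41 megabytes per second, so an IOPS figure is only meaningful alongside the operation size used to measure it.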

Journaling file system

As one of the methods to improve system integrity, file systems that use this technique track metadata changes in an additional reserved location called the journal. When recovering from an unexpected interruption, a journaling file system examines the journal to decide which files on the media are valid and which are not. Then the fsck tool is able to recover lost space and correct any other errors.

Many Linux file systems use journaling, including ext3, ext4, and XFS. Tuxera Reliance Nitro also uses journaling.

See also system integrity.

Latency

The measurable delay between an action and the response or result. For an embedded design, the raw throughput minus the file system latency minus any media firmware or software latency results in the perceived performance for the system.

Latency can be kept to a minimum by reducing overhead, optimizing write patterns, and even minimizing data moves in memory where possible.

Log-structured file system

With this type of file system, the entire media is treated as one large circular buffer. The position of the last write can be determined at mount time, allowing an incomplete last write to be easily discarded. Log-structured file systems are sometimes used on NAND-based media because they make fewer in-place writes and inherently provide dynamic wear leveling. Some new designs in SSDs will likely work better with log-structured file systems.

Some examples of log-structured file systems on Linux include the flash file systems JFFS2, UBIFS, and YAFFS.

See also dynamic wear leveling.

Managed flash

NAND flash media requires software to handle the wear leveling, bad block management, and error correction. When this software is provided in silicon it is referred to as firmware; it is usually provided by the media vendor and typically cannot be customized. Types of managed flash memory include eMMC, UFS, and SSDs. This category also covers removable media such as SD and CompactFlash (CF). A high-performing and fail-safe flash manager like Tuxera FlashFX® Tera can help provide versatility to flash media management.

See also raw flash and SD card.

MLC NAND flash

Originally NAND flash was designed to store a single bit per cell (SLC). One advancement has been to store multiple bits per cell, also known as multi-level cell (MLC). These additional bits require more measurement thresholds, and therefore bit errors are more common on this type of NAND. MLC usually has a lower maximum erase count, and therefore a shorter lifetime.

Technically MLC refers to any situation with more than a single bit per cell. Marketing in the industry has used the term TLC to refer to three bits per cell, and QLC for four bits per cell.

MLC NAND flash imposes additional rules on the software that works with the media. Linux flash media drivers can’t match these requirements, and the flash file systems available on Linux are not guaranteed beyond SLC NAND flash.

RELATED CONTENT:
Learn more about MLC, SLC, and other NAND flash types in our whitepaper, “How to avoid end of life from NAND correctable errors.”

See also SLC NAND flash and pseudo-SLC NAND flash.

Media Transfer Protocol (MTP)

One of two primary standards used for file transfers between the host and a USB connected device. With this protocol, data is transferred at the file level – the device storage media is hidden from the host. While the command set is more limited, the host does not need to use a driver to access the media – the device file system is sufficient.

Most Android phones use MTP, and thus Windows hosts are not required to provide support for the ext4 file system commonly in use. By the same measure, other file systems can also be used in the embedded design without requiring a host driver. This allows the system designer the opportunity to select the best file system for their use without concern for a user-level requirement.

RELATED CONTENT:
Comparing protocols for USB devices – which one’s more significant?

See also USB mass storage.

NVM Express (NVMe)

This non-volatile memory interface provides an open logical device specification to connect with NAND media-based devices. The interface was designed to capitalize on low latency and internal parallelism of solid-state devices.

PCI Express (PCIe)

A high-speed serial computer expansion bus standard, designed to replace PCI and PCI-X. Improvements include higher maximum bus throughput, smaller physical footprint, and a more detailed error detection and reporting mechanism.

Platform interoperability

Modern automotive and embedded systems often encompass more than a single task, whether using a hypervisor or multiple OS options. Each of these stores data, and if the data is to be shared between environments, a common format is required. There are only a few “standard” formats, and none designed for data integrity.

Tuxera offers file systems that are operable on multiple RTOS with the same on-media format.

Power consumption

Embedded devices can be limited in power use – especially designs for use in the field or in space. This measurement reflects the amount of power in use by the device while performing standard operations. The storage software can affect this in a number of ways, including the power used to write or erase NAND flash, the processor usage (or CPU consumption), the RAM requirements, and even the amount of time the device must stay “awake” to complete a given write operation.

See also CPU consumption.

Pseudo-SLC

Media which stores multiple bits per cell (MLC) can sometimes be used as single bits per cell (SLC) by the firmware. This design at least halves the storage capacity – a cell that could hold two or more bits holds only one – but provides a higher maximum erase count, which extends lifetime.

See also MLC NAND flash.

Raw flash

NAND flash media requires software to handle wear leveling, bad block management, and error correction. When this software is external to the device, the media is referred to as raw flash media. Some of the advantages of using external software include the ability to fine tune the design and access to the source code.

With Tuxera FlashFX Tera, the same software solution can be used with media from multiple vendors. This provides a common environment for the developers and a broader range of part choices, giving flexibility to the bill of materials.

See also flash memory.

Removable media

In earlier days, removable media referred to floppy disks. With modern designs, this is now SD cards and USB sticks. With the wide variety of desktop environments, the need for a common file system on these devices has grown. The capacity of most devices has also grown, far beyond the capabilities of FAT.

Microsoft designed exFAT for those environments; it is the standard file system for larger SD media and is increasingly common on USB media as well. In addition to FAT, NTFS could also be used on these devices. GravityCS by Tuxera offers a suite of industry-standard file systems for designs which need to provide access to removable media.

See also SD card and exFAT.

Ring buffer

Structured data elements can be stored in an imagined circular structure. The first element in the data set is considered to directly follow the last element in the set. A single data pointer is used to indicate the “current” location in this buffer.

When logging data to a ring buffer, the oldest data is overwritten at the current location, so the newest data sits immediately before that location. Because structured data elements have a fixed size, the number of elements the buffer can hold is predictable; multiplying that count by the interval at which each element is logged gives the span of time the buffer covers.
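
A minimal in-memory sketch of this behavior follows. The class and method names are hypothetical, and the sketch is not based on any Tuxera implementation; a storage-backed ring buffer would write to fixed-size slots on the media instead of a Python list.

```python
class RingBuffer:
    """Fixed-capacity buffer that overwrites the oldest element when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._next = 0   # the "current" location: where the next write lands
        self._count = 0  # number of valid elements stored so far

    def append(self, item) -> None:
        self._slots[self._next] = item  # replaces the oldest element once full
        self._next = (self._next + 1) % len(self._slots)
        self._count = min(self._count + 1, len(self._slots))

    def oldest_to_newest(self) -> list:
        start = (self._next - self._count) % len(self._slots)
        return [self._slots[(start + i) % len(self._slots)]
                for i in range(self._count)]


# Usage: a three-slot buffer keeps only the three most recent of five samples.
buf = RingBuffer(3)
for sample in range(5):
    buf.append(sample)
recent = buf.oldest_to_newest()
```

With one element logged per second, a buffer of 3,600 slots would always hold the most recent hour of data.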

This technique is used in data logging and dash cameras. The oldest logs or video segments are overwritten by the newest, providing several hours of continuous coverage.

A “modified ring buffer” technique is also used, where certain elements are protected, often by marking them read-only. In an example dash camera, these could be elements which indicate a sudden vehicular stop.

Tuxera Reliance Sense employs ring buffer technology.

See also structured data.

Single Board Computer (SBC)

This acronym refers to a design which places the processor (CPU), memory, and storage on a single circuit board. A popular example of an SBC is the Raspberry Pi, widely used among enthusiasts.

See also SOC/SOM.

SD card

Secure Digital is a proprietary non-volatile memory format developed by the SD Association for use in portable devices. This term refers to a number of physical design specifications, from standard SD and miniSD to the more modern microSD and newer Nano memory. The format has also evolved to include the newer SDIO, SDHC, and SDXC formats.

Read more about the types of SD cards on the market – and how to format them – in our blog post, “A quick guide to SD card speed and capacity for video recording.”

Tuxera serves on the board of directors for the SD Association, working closely with other member companies to define the software and hardware standards for the future of SD media.

Secure erase

For older magnetic media, data was securely erased by first overwriting the location and then erasing it. Some environments specified multiple overwrites, to completely obliterate any chance of reading the earlier data at that location.

It is not possible to write to a block of NAND media a second time without erasing it first. When a file system modifies or overwrites a block, the NAND media driver writes the new data to another block instead, marking the former block for eventual erase. This, like wear leveling, is invisible to the file system and application in the device.

To securely erase NAND flash, the block erase must be done immediately. This can require special commands, and the impact of doing so will affect system latency, performance, and even media lifetime. File systems need to be aware of this also, because anytime a file is modified or deleted, the secure erase option must be used.

Learn more about secure erase in our blog post, “The nuts and bolts of secure erase.”

SLC NAND flash

As a Single Level Cell, this type of flash media stores one bit per NAND cell. This term is used to differentiate from MLC NAND flash, which stores more than one bit per cell. In general, SLC NAND provides the most erases and longest media lifetime, though it is usually more expensive.

See also MLC NAND flash and pseudo-SLC NAND flash.

SOC/SOM

System on Chip, or System on Module, describes a processor and RAM package that is more condensed than a single board computer (SBC). This type of design also allows vendors to provide “carrier” boards with several types of I/O connections that are independent of the actual processor package.

See also SBC.

Static wear leveling

All wear leveling software or firmware must track how often a given media block has been written and erased. Designs which perform static wear leveling also track which blocks have been erased considerably fewer times than the rest, using a difference threshold. When the difference grows too large, data is moved from the less frequently erased block to a more frequently erased one, which frees the lightly worn block for new writes. This averages the usage on a part even more than dynamic wear leveling, providing the best possible product lifetime.
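
The threshold check can be sketched as follows. This is a simplified illustration under our own assumptions, not a description of any specific flash manager: the function name, the idea of a separate set of "static" blocks, and the sample erase counts are all hypothetical.

```python
from typing import Optional


def find_static_relocation(
    erase_counts: dict[int, int],
    static_blocks: set[int],
    free_blocks: set[int],
    threshold: int,
) -> Optional[tuple[int, int]]:
    """Return (source, destination) for a relocation, or None if wear is balanced."""
    if not static_blocks or not free_blocks:
        return None
    # The block holding rarely changed data with the fewest erases.
    coldest = min(static_blocks, key=lambda b: (erase_counts[b], b))
    # The free block that has already been erased the most.
    most_worn = max(free_blocks, key=lambda b: (erase_counts[b], -b))
    if erase_counts[most_worn] - erase_counts[coldest] > threshold:
        return coldest, most_worn
    return None


# Usage: block 0 holds static data and has barely been erased.
erase_counts = {0: 2, 1: 950, 2: 900, 3: 40}
move = find_static_relocation(erase_counts, {0, 3}, {1, 2}, threshold=500)
```

In this example the data in block 0 would be moved to block 1, so that the lightly worn block 0 can join the pool for future writes.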

See also dynamic wear leveling.

Structured data

Structured data is a format for organizing information that conforms to a predefined schema, making it easily searchable and retrievable by computer systems. Examples of structured data include log files, raw image or sound capture, and elements (or nodes) of a tree structure.

Data that has a regular structure is easier on both applications and file systems, especially if the size of a structure matches the physical characteristics of the media. Memory buffers can be of a regular size to match the data structure. When memory is deallocated (or returned to a pool), a buffer of the same size can be reallocated for next use. This prevents fragmentation.

One application advantage for structured data is the ability to iterate over a large set of data elements. Seeking to the 15th element is a matter of multiplying the size of an element by 14, the number of elements that precede it, to reach the proper location.
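
A short sketch of that offset calculation, using Python's standard `struct` module. The record layout (a 32-bit timestamp followed by a 32-bit float reading) is a hypothetical example; note that the 15th element sits at zero-based index 14.

```python
import io
import struct

# Hypothetical fixed-size record: uint32 timestamp, float32 reading (8 bytes total).
RECORD = struct.Struct("<If")


def read_record(stream, index: int) -> tuple:
    """Read the record at a zero-based index by computing its offset directly."""
    stream.seek(index * RECORD.size)
    return RECORD.unpack(stream.read(RECORD.size))


# Usage: build 20 records in memory, then jump straight to the 15th one.
stream = io.BytesIO(b"".join(RECORD.pack(1000 + i, i * 0.5) for i in range(20)))
fifteenth = read_record(stream, 14)
```

No earlier records need to be read to reach the one requested, which is the advantage over variable-size elements.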

Tuxera Reliance Sense is a specialized embedded file system designed for handling structured data.

See also fragmentation, unstructured data.

System integrity

Devices which feature high system integrity are designed to start up reliably after unexpected power loss or a system crash. A file system check will clean up any fragments of data, leaving the system in a functional state. Some of these checks run in the background – for example, fsck on ext4 or Tuxera Reliance Velocity.

RELATED CONTENT:
Under the hood: system integrity and disk checks

See also data integrity.

Telematics

In automotive environments, telematic devices deliver data such as position and onboard diagnostics, fuel consumption, or driver behavior. The goal is often to improve fleet management and efficiency, and to protect the safety and wellbeing of fleet drivers. These devices require rich data from sensors in and outside the car.

Edge storage needs to be provided inside the car because it’s not feasible to transmit all data to the cloud for further processing. These storage solutions need to withstand heavy read workloads, countless write-and-erase cycles every day, and get the maximum life out of the storage hardware. The storage also needs to be power fail-safe to ensure the necessary data is preserved in the event of failure.

See Tuxera’s automotive solutions for file systems, flash management, and networking software designed for telematics use cases.

Transactional file system

A file system design that provides both system integrity and data integrity. One way this is achieved is by not overwriting data on the media, preserving a “known good state.” After unexpected power loss or system failure, this type of file system mounts quickly – it only has to determine which is the proper media state. No file system check or other cleanup is required.

Control over data-at-risk is managed with mount settings, flush and fsync commands, and the design of the software and hardware. In many designs, this control can be directly opposed to the raw performance of the solution. For this reason, file systems like Tuxera Reliance Edge and Tuxera Reliance Assure provide runtime access to this control also, through an API and provided system library.

See also data integrity.

Trim/discard

A trim or discard command (also known as unmap in SCSI) allows the file system to notify the media driver that certain blocks of data are no longer in use. This allows the media driver to flag the data to be removed with a later erase, and also disregard these blocks during wear leveling, garbage collection, and related operations. Properly using trim or discard commands results in both consistent high performance and the longest NAND-based media lifetime possible.

Universal Serial Bus (USB)

An industry standard for cables and connectors, covering both data communication and power connection. There have been four generations of USB standard so far, numbered USB 1.0 through USB4. Physically, there are also a number of different hardware connectors, with varying data rates.

Two standard protocols for data communication are USB mass storage and Media Transfer Protocol.

USB mass storage (UMS)

One of two primary standards used for file transfers between the host and a USB connected device. With this protocol, the device storage media is presented to the host as an external drive; this requires the host to use a matching file system driver for this storage. This is inherently how USB flash drives function.

See also Media Transfer Protocol (MTP) and exFAT.

Universal Flash Storage (UFS)

A specification for digital cameras, mobile phones, and consumer electronic devices. This was originally positioned as a replacement for eMMC and SD specifications, with goals of higher data transfer speed and increased reliability for flash storage.

See also eMMC.

Unstructured data

Unstructured data refers to information that does not have a predefined data model or format, often including text, images, and other forms of data not easily categorized in traditional databases.

Data elements of irregular sizes tend to match real-world use cases. These are simple to store sequentially, but when elements are removed, the holes left behind are often too large or too small to fit future data elements. This can mean either holes in the sequential memory that go unused or elements which are fragmented over multiple locations.

Data elements which are unstructured but similar in size can be padded to the largest size, creating structured data at the cost of larger memory allocations. This is related to write amplification.

Seeking data elements in an unstructured data set is done by reading each of the elements sequentially until the proper element is found. This can be improved by adding a structured element with “pointers” to the unstructured elements. File systems use this technique to improve access speed, using a File Allocation Table (FAT) or a tree-based structure.
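
The idea of a structured element with “pointers” can be sketched as an index of offsets and lengths kept alongside the variable-size elements. The function names and sample elements below are hypothetical, and real file systems use far more elaborate structures.

```python
def build_index(elements: list[bytes]) -> tuple[bytes, list[tuple[int, int]]]:
    """Pack variable-length elements back to back and return (blob, index)."""
    blob = bytearray()
    index = []
    for element in elements:
        index.append((len(blob), len(element)))  # (offset, length) of this element
        blob += element
    return bytes(blob), index


def read_element(blob: bytes, index: list[tuple[int, int]], n: int) -> bytes:
    """Fetch element n directly through the index, without scanning earlier elements."""
    offset, length = index[n]
    return blob[offset:offset + length]


# Usage: three elements of different sizes, with direct access to the last one.
blob, index = build_index([b"short", b"a much longer element", b"mid-size"])
third = read_element(blob, index, 2)
```

Each index entry has the same size, so the index itself is structured data even though the elements it points to are not.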

Database storage also uses a structured element to accelerate access to unstructured data.

See also database storage, FAT, fragmentation, structured data, write amplification.