Characterizing Solid State Drives

The Effects of Industrial Temperature on Embedded Storage

For many applications, it is imperative that embedded systems developers understand what effects extended high temperatures have on SSDs. To assist the decision process and provide reference points, OEMs can use several helpful calculations to determine the SSD endurance and data retention characteristics of different types of NAND flash media.



Supporting ever-increasing amounts of data and higher functionality is an ongoing demand on embedded systems.  Exploding capacity requirements of tens or hundreds of gigabytes put real pressure on storage budgets.  With the spotlight on development budgets, many embedded systems developers are realizing that NAND costs are nothing compared to what other SSD components can total.

The storage mainstay for these mission-critical applications has been higher-end, industrial SLC-based SSDs, but OEMs may be forced to consider less expensive alternatives that still give them the capacity required by their system design.  That alternative is MLC-based SSDs.  This is not a straightforward decision, because OEMs will need to make trade-offs in the form of eventual requalification or overall endurance in their quest to develop a reasonably priced product.  Industrial operating temperature (I-temp) applications make these issues more challenging because higher operating temperatures can intensify endurance and data retention problems.

Framework for a Decision-Making Process

To find the best SSD for an application, designers must have a working knowledge of reliability factors and their relationships.  To help in the process, the Joint Electron Device Engineering Council (JEDEC) has defined two application classes:  client (personal) computing and enterprise (multi-user) applications.  JEDEC has still not developed a standard for embedded computing, primarily because of the diverse nature of these applications.  Even so, the existing class definitions provide a good framework for the decision-making process.

Application class definitions include data pattern or workload, an acceptable error rate, operating and data retention temperatures, and a standard period of time the data must be retained in the power-off state (Figure 1).  JEDEC enterprise application class definitions are used by many SSD suppliers in their product datasheets to address endurance.  It is important to note that endurance and data retention can change dramatically based on workload and operating temperature.  While these datasheet values are helpful in comparing SSDs, they should not be considered a conclusive specification. Since most embedded workloads and temperatures differ greatly from the JEDEC enterprise definition, endurance and data retention will also vary.

Figure 1
JEDEC Application Classes

In this article, the calculations hold workload, uncorrectable bit error rate (UBER), and functional failure requirements constant so that the effects of endurance, high temperature and NAND configuration on power-off data retention can be specifically highlighted.

Temperature Effects on Endurance and Data Retention

Extended temperatures aggravate endurance and data retention problems because temperature changes the energy of the electrons in the flash device, which in turn changes the rate of leakage current. The rate of leakage current translates to differences in data retention within the flash cell itself. For endurance, extended temperatures also change the rate at which the flash cell charges. For instance, at low temperatures the flash cell charges less fully than it does at higher temperatures. This results in variations in endurance at different temperatures.  Important to the discussion is an understanding of the basic elements of SSD operation, such as NAND program/erase (P/E) cycles, drive writes (DW) and write amplification (WA), which are defined in Table 1.

Table 1
The basic elements of SSD operation

To accurately evaluate the effect of higher temperatures on endurance and retention, we must look at the number of drive writes as opposed to capacity. Because a drive write expresses the data written in units of the drive's own capacity, it gives a common comparison point alongside the number of days of data retention required.

The following example illustrates this point.  The example uses a varied industrial workload with an SSD based on 1ynm MLC NAND flash rated at 3,000 P/E cycles at 40°C (104°F).  The SSD offers one year of data retention, and the workload results in a WA of 4, which gives a total DW of 750 as shown in the equation below.
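
Written out with the values stated above, the calculation takes the following form (the symbol names are chosen here for illustration and are not the article's own notation):

```latex
\[
\mathrm{DW}_{\text{total}}
  = \frac{\text{P/E cycles}}{\mathrm{WA}}
  = \frac{3000}{4}
  = 750
\]
```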

If the requirement is for a five-year deployment, the DW per day can be calculated as:
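
Using the 750 total drive writes from the example and assuming a 365-day year (the day count is an assumption made here), the result is roughly 0.4 drive writes per day:

```latex
\[
\mathrm{DW}_{\text{per day}}
  = \frac{\mathrm{DW}_{\text{total}}}{5 \times 365}
  = \frac{750}{1825}
  \approx 0.41
\]
```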

This example shows that choosing the proper SSD capacity requires developers to understand thoroughly how much data must be written and over what period of time.
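
The sizing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, a year is taken as 365 days, and the 20 GB of host data written per day is an invented workload, not a figure from the example.

```python
def total_drive_writes(pe_cycles: float, write_amplification: float) -> float:
    """Full drive writes the host can perform before the rated P/E cycles are used up."""
    return pe_cycles / write_amplification


def drive_writes_per_day(total_dw: float, years: float, days_per_year: int = 365) -> float:
    """Spread the total drive writes evenly over the deployment period."""
    return total_dw / (years * days_per_year)


def min_capacity_gb(host_gb_per_day: float, dw_per_day: float) -> float:
    """Smallest capacity that keeps the daily host writes within the DW-per-day budget."""
    return host_gb_per_day / dw_per_day


# Values from the example: 3,000 P/E cycles, WA of 4, five-year deployment.
total_dw = total_drive_writes(3000, 4)
dwpd = drive_writes_per_day(total_dw, 5)

# Hypothetical workload (assumption): the host writes 20 GB per day.
capacity_gb = min_capacity_gb(20, dwpd)

print(f"total drive writes:   {total_dw:.0f}")
print(f"drive writes per day: {dwpd:.2f}")
print(f"minimum capacity:     {capacity_gb:.1f} GB")
```

A heavier daily workload or a longer deployment raises the minimum capacity in direct proportion, which is why both quantities need to be known before a drive is selected.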

While most embedded systems strive to have maximum data retention, developers are restricted by the amount provided in current NAND flash functionality. Typically it is three months for SLC and less for MLC depending on how manufacturers optimize the program time to the NAND flash device.

Why is data retention important? Poor data retention primarily translates to loss of important user data while the power is off.  Applications that have redundancy or backup are less exposed than applications without this functionality, such as single-board embedded systems.

Power-on and power-off are the two types of data retention.  Power-on data retention for most SSDs is virtually unlimited.  This is because newer, high-end SSDs implement patrol read and patrol scrub algorithms where the SSD firmware periodically reads all LBAs and repairs or refreshes when needed.  Power-off data retention is really the focus for industrial temperature applications and is a crucial consideration for systems that may be sitting on a shelf prior to deployment or ones that have been decommissioned.

In the structure of a NAND flash cell, the data value is determined by the number of bits per cell and the voltage level read by the SSD controller.  The voltage level is established by the number of electrons on the transistor floating gate.  Over time, electrons on the floating gate can leak through the oxide layer back to the substrate.  The more electrons leak, the more the voltage changes and the higher the chance of a bit error.  If there are more bit errors than the SSD controller can correct, then uncorrectable errors or system errors can occur.  The number of bit errors that can be corrected depends on the controller hardware design.

A stronger oxide layer equates to better data retention.  The oxide layer is used to isolate the floating gates. Oxide strength is determined by two factors: endurance and temperature. A strong oxide layer ensures more reliable flash operation; the trade-off, however, is higher power consumption. The greater the number of program/erase cycles, the weaker the oxide layer becomes.

Temperature also affects the oxide layer.  When programming, electrons get injected from the substrate onto the floating gate.  The colder the temperature, the more difficult it is to program; the hotter, the easier it is.  Colder temperatures make it more difficult for electrons to leak back into the substrate, and hotter temperatures enable more leakage to occur. Ideally, it is best to program at higher temperatures and store at lower temperatures, which is reflected in the JEDEC application classes.

To analyze the effects of drive writes and temperature on power-off data retention, SLC, MLC and pseudo-SLC drives of the same capacity will be compared at various DW points and temperatures.

Power-off Data Retention

Many OEMs are concerned with initial or early-stage power-off data retention because a device could sit on the shelf at a manufacturing facility for weeks or even months after configuration and testing before being deployed.  They worry that firmware, operating systems and other configuration data could be lost before the system is deployed (Figure 2).  Figure 3 demonstrates the results of ongoing operation at industrial temperatures.

Figure 2
Data retention characteristics for the three different SSD configurations at drive writes DW ≤ 25. The chart shows that storing MLC SSDs at high temperatures for long periods is risky, but it is unlikely that storage facilities would have 85°C (185°F) temperatures for more than two months.

Figure 3
The MLC power-off data retention at 25°C (77°F) and 40°C (104°F) per the number of drive writes.

The data in Figure 4 shows the remarkable effect temperature has on data retention for a given workload.  For the same 750 full drive writes (0.4 drive writes per day for five years), SSDs operated and stored at 85°C (185°F) will have only two days of data retention.  Drives at 40°C (104°F) will have one year, and drives at a room temperature of 25°C (77°F) will deliver almost eight years of data retention.
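
The article does not name the model behind these numbers. One common way to reason about temperature-dependent charge loss is the Arrhenius equation, and the sketch below assumes it. The activation energy of 1.1 eV is an assumption, chosen here because it reproduces the quoted figures closely; the function names are hypothetical. The one-year retention at 40°C is taken from the example as the reference point.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K
ACTIVATION_ENERGY_EV = 1.1        # assumed value, not stated in the article


def acceleration_factor(t_ref_c: float, t_target_c: float,
                        ea_ev: float = ACTIVATION_ENERGY_EV) -> float:
    """Arrhenius factor: how much faster charge loss runs at t_target_c than at t_ref_c."""
    t_ref_k = t_ref_c + 273.15
    t_target_k = t_target_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_ref_k - 1.0 / t_target_k))


def scaled_retention_days(ref_days: float, t_ref_c: float, t_target_c: float) -> float:
    """Scale a known retention time at t_ref_c to another storage temperature."""
    return ref_days / acceleration_factor(t_ref_c, t_target_c)


# Reference point from the example: one year of retention at 40 C after 750 drive writes.
REFERENCE_DAYS = 365.0
days_at_85c = scaled_retention_days(REFERENCE_DAYS, 40.0, 85.0)
years_at_25c = scaled_retention_days(REFERENCE_DAYS, 40.0, 25.0) / 365.0

print(f"retention at 85 C: {days_at_85c:.1f} days")
print(f"retention at 25 C: {years_at_25c:.1f} years")
```

Under these assumptions the scaled values land close to the two days and almost eight years quoted above. Real devices should be checked against the supplier's own retention data.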

Figure 4
The MLC power-off data retention at 25°C (77°F) and 40°C (104°F) per the number of drive writes.

Examples of applications with growing needs for storage capacity include edge routers or fleet tracking systems that also require maximum data retention and endurance. These types of systems continuously operate at extended temperatures and are exposed to harsh environments so the reliability of I-temp storage is critical. 

SLC-based SSDs, which carry a price premium over MLC, may not be a viable option for budget-constrained designs.  If the storage budget dictates employing MLC-based SSDs, then workload, operating temperature and data retention must be thoroughly evaluated, with a keen focus on retention requirements in the power-off state.  For applications with long deployments at higher temperatures, developers are wise to size the SSD for the retention time needed if the system loses power.

Virtium Technology
Rancho Santa Margarita, CA
(949) 888-2444