RTOSs: Compact, Powerful and Fast

Addressing the Testing Challenges of Safety-Critical Embedded Systems with Ultra-Light Instrumentation

Traditional approaches to testing and verifying today’s complex multicore systems to meet safety-critical standards can affect memory and runtime performance to the point of creating even more uncertainty. A new approach promises to provide verification with minimal effect on system behavior.



Every embedded development environment, from 8-bit microcontrollers to 64-bit multicore CPUs, has a variety of debug and code trace capabilities, either built in or commonly used, ranging from printf() debugging to hardware-assisted trace solutions. In early development, these tools are essential for gaining insight into the behavior of the code. However, when a system must meet safety-critical industry standards, these debug and code trace capabilities are simply not sufficient. Their intrusive nature adversely affects the performance and behavior of the software and of the system as a whole.

Satisfying safety-critical industry standards (e.g., DO-178C for aerospace, IEC 62304 for medical devices, IEC 61508 for industrial systems, and ISO 26262 for automotive) requires that code be executed and tested to completion on the final system hardware. The overhead of using debug and code trace capabilities for system level testing and for assessing test completion through coverage analysis is particularly onerous.

Traditional debug and trace solutions typically require the software build process to include debug symbols, which has a significant impact on the software memory footprint and the run-time performance. The complexity added by multicore applications makes debugging far more difficult. In addition, debug and trace mechanisms are not always available for multicore systems, and when they are, their overhead becomes so intrusive that it affects the synchronization mechanisms required to handle the resource contention inherent in multicore systems (Figure 1).

Figure 1
Traditional coverage collection focuses on monitoring each line of code, which requires substantial overhead just to facilitate the data logging of the debug traces.

When it comes to cost-effectively proving that embedded software meets the requirements of safety-critical standards, the state of the art requires collecting test and structural coverage data by instrumenting the code under test. With increasingly tight development schedules and industry’s drive to reduce the size, weight and power (SWaP) of embedded systems, the memory and run-time overhead of traditional instrumentation and data collection techniques has become a barrier to both productivity and application performance. These techniques are simply not viable with the multicore systems increasingly used to meet SWaP requirements. However, new instrumentation techniques are now available that help to eliminate these challenges, enabling the cost-effective verification of multicore systems to safety-critical standards.

Meeting the verification requirements for safety-critical industry standards requires that all testing be performed at the system level on the final system hardware, and test completeness must be proven through the use of structural coverage analysis. Both of these objectives are addressed by the new instrumentation techniques.

What Makes Safety-Critical Code Harder to Test?

Before looking at the multicore challenge, we need to look at the typical dynamics of safety-critical development. First of all, the hardware arrives quite late in the project timeline. Clearly, software development cannot wait until the hardware shows up to start testing, particularly since the best way to produce a high-quality system is to test early and test often.

In order to keep the project on track, particularly with the enormous increase in the quantity and complexity of the code, the traditional test harness “stubs out” the missing hardware interfaces and/or enables test automation. While this enables software development to proceed, safety-critical software must be tested at the system level. And this is when the challenges begin. Target systems never have the same memory and run-time performance as the development platforms. The instrumentation of the code necessary to prove structural coverage tends to bog systems down, particularly if onboard I/O is needed to export test results. 

Since instrumentation is often the ultimate challenge, let’s drill down to where the limitations lie. The traditional instrumentation used for performing structural coverage analysis is a combination of precompilation and run-time processes. Most companies place probe points on every line of code. Each instrumentation point becomes a function call into a data logging library, which at run-time records the points of code that are executed.
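As a rough illustration of the traditional scheme, the C sketch below shows a small function with a probe on every line. The logging library (cov_log_probe, cov_log_count), the buffer size and the probe numbering are hypothetical names invented for this example; they are not taken from any particular tool.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical data-logging library: every name here is illustrative. */
#define COV_LOG_CAPACITY 64u

static uint16_t cov_log[COV_LOG_CAPACITY]; /* probe IDs in execution order */
static size_t   cov_log_len;

/* Called once per executed probe: a function call plus a buffer write. */
void cov_log_probe(uint16_t probe_id)
{
    if (cov_log_len < COV_LOG_CAPACITY) {
        cov_log[cov_log_len++] = probe_id;
    }
}

/* Number of probe hits recorded so far. */
size_t cov_log_count(void)
{
    return cov_log_len;
}

/* Code under test after instrumentation: one probe per source line. */
int clamp_to_limit(int value, int limit)
{
    cov_log_probe(1); int result = value;
    cov_log_probe(2); if (value > limit) {
    cov_log_probe(3);     result = limit;
                      }
    cov_log_probe(4); return result;
}
```

Calling clamp_to_limit(5, 10) records three probe IDs and calling clamp_to_limit(15, 10) records four, so even this tiny function pays for a function call and a buffer write on nearly every line it executes.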

While this approach seems logical, the impact on both the size of the compiled executable and the run-time performance is significant. The size of the compiled executable is affected not only by each of the instrumentation points in code, but also by the inclusion of the instrumentation data logging library. These extra components can have a significant memory impact on both the firmware image in non-volatile memory and the run-time image in RAM. At run-time, each of the instrumentation data logging function calls incurs a significant performance overhead that can easily compromise the run-time performance of the system.

Because of the impact on system performance, it is not normally possible to instrument an embedded system in its entirety due to limited target resources. This limitation is typically addressed by verifying and analyzing the code one component at a time and then aggregating the results. In the safety-critical world, this step-by-step process is made more challenging by the fact that the processors and tool chains used are typically not the latest and so have even greater resource limitations. Traditional processes therefore require that system-level testing be repeated for each component, significantly extending the overall verification schedule.

Testing on multicore systems further complicates these challenges. When multiple processes run on different cores, collecting structural coverage data efficiently can be hampered by concurrency, reliability, and robustness roadblocks. For example, the typical approach to creating thread-safe instrumentation data-logging functions relies on the use of synchronization objects such as mutexes. This ensures that only one process can execute the data logging code at a time; the first process to execute the function “grabs” the mutex, blocking any other process attempting to use the same function from executing. Once the first process has completed the data logging, it “releases” the mutex and the next process in the execution list then “grabs” it and continues its processing. It’s not hard to see that this hand-off method could adversely affect the run-time performance of the system (Figure 2).
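The mutex hand-off described above might look like the following sketch. It assumes POSIX threads as the synchronization API and uses hypothetical names (cov_log_probe_locked, cov_log_count_locked, log_buf); a real target would use whatever mutex primitive its RTOS provides.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared log; names and sizes are illustrative only. */
#define LOG_CAPACITY 1024u

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static uint16_t log_buf[LOG_CAPACITY];
static size_t   log_len;

/* Thread-safe probe: every caller must take the same lock, so probes
 * running on different cores are serialized through this one function. */
void cov_log_probe_locked(uint16_t probe_id)
{
    pthread_mutex_lock(&log_lock);    /* "grab" the mutex, or block */
    if (log_len < LOG_CAPACITY) {
        log_buf[log_len++] = probe_id;
    }
    pthread_mutex_unlock(&log_lock);  /* "release" it for the next caller */
}

/* Number of probe hits recorded so far, read under the same lock. */
size_t cov_log_count_locked(void)
{
    pthread_mutex_lock(&log_lock);
    size_t n = log_len;
    pthread_mutex_unlock(&log_lock);
    return n;
}
```

Every probe now costs a lock and an unlock on top of the logging itself, and any core that reaches a probe while another holds the lock has to wait, which changes the very timing the test is supposed to observe.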

Figure 2
Safety-critical libraries typically control data into and out of the application, which restricts the use of typical IO functionality and memory allocation. Because of this, instrumentation to collect code coverage must be independent of this functionality.

So, How Can Multicore Testing Happen?

First of all, we need to rethink instrumentation. Recall that traditional instrumentation combines precompilation and run-time processes, with probe points inserted on every line of code. There is a better way!

To make things more efficient, static analysis of the code under test can be used to determine the best locations to place instrumentation points. This ultra-light instrumentation, coupled with a new form of highly optimized test harness framework, significantly reduces the memory footprint required to perform system-level testing and coverage analysis. With this approach, it is now possible to use test automation and hardware stubbing on target systems with well under 1 KB of RAM/ROM! It relies on highly optimized data collection that integrates all platform test results and coverage dependencies into one data structure, generated by static analysis, that takes concurrency constraints into account as part of its structure. Finally, developers can instrument applications on resource-constrained platforms.

To prevent concurrency issues at run-time, this approach eliminates calls into the operating system or to other library functions that manage memory or deadlocks. Thus, on resource-limited target platforms, it enables the test environment to mirror the speed and functionality of the final application execution. Rather than having to piece together multiple component-level tests, system-level testing can be accomplished in fewer passes, possibly in a single one, enormously cutting the time needed for testing. Structural coverage analysis can be captured at the individual core or aggregated to provide a multicore system-level view.

The test results and coverage data are tracked in a scoreboard-like matrix of bits, whereby each bit translates to a particular control-flow decision point in code. This means that storing results or “checking off” each instrumentation point involves setting a value in a position of the matrix that corresponds to the location in code that was just executed.
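A minimal sketch of such a bit scoreboard in C follows. The number of decision points, the array layout and all function names (cov_mark, cov_is_marked, cov_count) are assumptions made for illustration, and this version still computes the word index and bit mask at run time on every call.

```c
#include <stdint.h>

/* Hypothetical sizes and names, chosen only for illustration. */
#define COV_POINTS 100u                       /* decision points in the code */
#define COV_WORDS  ((COV_POINTS + 31u) / 32u) /* one bit per point, rounded up */

static uint32_t cov_bits[COV_WORDS];          /* 4 words = 16 bytes for 100 points */

/* "Check off" a decision point: set the bit at position id.
 * The caller must pass id < COV_POINTS. */
void cov_mark(uint32_t id)
{
    cov_bits[id / 32u] |= (uint32_t)1u << (id % 32u);
}

/* Report whether a decision point has been reached (1) or not (0). */
int cov_is_marked(uint32_t id)
{
    return (int)((cov_bits[id / 32u] >> (id % 32u)) & 1u);
}

/* Count the decision points reached so far. */
uint32_t cov_count(void)
{
    uint32_t n = 0;
    for (uint32_t id = 0; id < COV_POINTS; ++id) {
        n += (uint32_t)cov_is_marked(id);
    }
    return n;
}

/* Code under test: one probe per branch outcome, not per line. */
int clamp_to_limit(int value, int limit)
{
    if (value > limit) {
        cov_mark(0);   /* true branch taken */
        return limit;
    }
    cov_mark(1);       /* false branch taken */
    return value;
}
```

After one call that takes each branch, cov_count() reports two covered points, and the whole scoreboard for 100 decision points occupies 16 bytes.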

The challenge here is performance, especially for the coverage instrumentation. A straightforward “index into a matrix” operation involves calculation of target addresses each time, which may sound trivial, but it adds up. Multicore systems just make matters worse, not only because you might expect such new programs to be bigger, but because now you have the possibility of collisions writing to the data structures.

To address this, a unique precompilation process can perform all of the address calculations for each instrumentation point ahead of runtime and store the results in a data structure set up so that multiple cores can read and write to it, consistently and simultaneously as necessary. This minimizes runtime overhead while at the same time reducing the address collisions that can occur in multicore applications (Figure 3).
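One way to picture this is the C sketch below. It is not the tool's actual implementation: it assumes a bit matrix with one row per core, so that each core writes only its own row and no lock is needed, and it assumes the rows are read only after the test run has finished. All names (COV_MARK, cov_rows, cov_merged_word) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical layout: one row of bits per core. All names and sizes
 * are illustrative assumptions. */
#define COV_CORES  2u
#define COV_POINTS 64u
#define COV_WORDS  ((COV_POINTS + 31u) / 32u)

static uint32_t cov_rows[COV_CORES][COV_WORDS];

/* With literal arguments, the word index and the bit mask are constant
 * expressions, so the compiler can fold them ahead of run time and the
 * probe reduces to an OR into a known location. */
#define COV_MARK(core, id) \
    (cov_rows[(core)][(id) / 32u] |= (uint32_t)1u << ((id) % 32u))

/* System-level view of one word: a point counts as covered if any
 * core reached it. */
uint32_t cov_merged_word(uint32_t word)
{
    uint32_t merged = 0;
    for (uint32_t core = 0; core < COV_CORES; ++core) {
        merged |= cov_rows[core][word];
    }
    return merged;
}

/* Stand-ins for code running on two different cores. */
void task_on_core0(void) { COV_MARK(0u, 3u);  }
void task_on_core1(void) { COV_MARK(1u, 40u); }
```

Reading cov_rows[core] gives the per-core coverage, while cov_merged_word() gives the aggregated view. A single shared bitmap updated with atomic OR operations would be an alternative design, at the cost of depending on the target's atomic instructions.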

Figure 3
If you compare the speed of traditional single-core instrumentation with that of multicore instrumentation, you quickly see why system verification has not been possible. However, most of this overhead goes away when you use structured bitmap techniques. Here, concurrency management can be done directly at the instrumentation layer, and the instrumentation is optimized to one to three instructions per probe point, versus the dozens of instructions and the waits of the traditional approach.

Multicore Testing Breakthroughs

For the first time, the technology exists that can enable multicore systems to achieve compliance. What’s made that possible? Two new “bests” in verification technology. First, the structure can now be set to fully use every bit. To minimize memory footprint, one bit per decision point makes the instrumentation as light as possible.

Second, the inline structure manipulation is resolved at compile time and results in between one and three instructions per probe point, depending on the processor and memory-addressing scheme. This approach is many times lighter than traditional approaches, which can require 10 to 20 instructions per probe point.

Together, these approaches have been validated by users to produce an overall overhead of one to ten percent in terms of executable size and execution time, marking a significant reduction in overhead from other mechanisms.

By minimizing the memory and performance overheads of both system test frameworks and code coverage instrumentation, developers can not only instrument applications on resource-constrained platforms such as multicore platforms, but they are also able to run tests once and capture data for the entire application. This helps to reduce or eliminate any test duplication that is required to achieve the verification objectives for meeting safety-critical standards, increasing productivity.

Furthermore, by explicitly addressing the vagaries of instrumenting, verifying, and measuring the coverage of code executed in a multicore environment, this approach helps projects meet the tight development schedules that come with industry’s drive to reduce SWaP.

LDRA Technology
Atlanta, GA
(855) 855-5372