Optimizing Energy Use

Power Debugging the Software: Optimizing the Power Consumption of an Embedded System

In an active system, the power consumption depends not only on the hardware design, but also on how it is used. And that is controlled by the software.


Power debugging is a methodology that shows software developers how the software implementation in an embedded system affects system-level power consumption. Coupling source code to power consumption makes it possible to test and tune the software for power. Recent technology has integrated measurement of the system’s power consumption into embedded software development tools, giving developers insight into how power consumption can be minimized in software. That is what we call power debugging.

The technology for power debugging is based on the ability to sample the power consumption and correlate each sample with the program’s instruction sequence, and hence with the source code. One difficulty is achieving high sampling precision. Ideally you would sample the power consumption at the system clock frequency, but capacitances in the power system reduce the reliability of such measurements. Put simply, the measured power consumption is blurred relative to what the CPU and peripherals actually consume.

In practice this is not a problem. From the software developer’s perspective it is more interesting to correlate the power consumption with the source code and with events in the program execution than with individual instructions, so the resolution needed is much lower than one sample per instruction.

Power is measured by the debug probe. For example, the IAR Embedded Workbench uses the IAR J-Link Ultra. This measures the voltage drop across a small resistor in series with the supply power to the device (Figure 1). The voltage drop is measured by a differential amplifier and then sampled by an AD converter.
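The measurement chain described above comes down to Ohm's law. The following is a minimal sketch in C of how a raw AD converter code could be turned into a supply current. The structure, the function name and all numeric parameters are assumptions made for illustration; they are not taken from the J-Link Ultra or any other real probe.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical measurement front end. None of these fields or values
   come from a real probe data sheet. */
typedef struct {
    double   shunt_ohms;  /* series resistor in the supply line */
    double   amp_gain;    /* gain of the differential amplifier */
    double   adc_vref;    /* ADC reference voltage in volts */
    uint32_t adc_max;     /* full-scale ADC code, e.g. 4095 for 12 bits */
} power_frontend_t;

/* Convert a raw ADC code to supply current in milliamperes. */
double adc_code_to_milliamps(const power_frontend_t *fe, uint32_t code)
{
    assert(fe->shunt_ohms > 0.0 && fe->amp_gain > 0.0 && fe->adc_max > 0);

    /* Voltage seen by the ADC, i.e. after the amplifier. */
    double v_adc = fe->adc_vref * (double)code / (double)fe->adc_max;

    /* Undo the amplifier gain to get the drop across the resistor. */
    double v_shunt = v_adc / fe->amp_gain;

    /* Ohm's law, converted from amperes to milliamperes. */
    return 1000.0 * v_shunt / fe->shunt_ohms;
}
```

With an assumed 1 ohm resistor, a gain of 10 and a 3 V reference, a full-scale reading corresponds to 300 mA.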

Figure 1
In order to correlate the PC and the power samples, the time offset between the PC sample clock and the ADC clock must be considered. The debug probe calculates this offset and makes it part of the data packets sent to the debugger.

The key to accurate power debugging is a good correlation between the instruction trace and the power samples. The best correlation is achieved when a complete instruction trace is available. The drawback is that full trace is not available on all devices and, where it is, it often requires a special debug probe.

A less accurate alternative that still gives good correlation is to use the PC (program counter) sampling facility found in some modern on-chip debug architectures. It samples the PC periodically and gives each sample a time stamp. The debug probe samples the power consumption of the device using an AD converter. Because both the power values and the PC samples are time stamped, the debugger can present power data on the same time axis as graphs such as the interrupt log and variable plots, and can correlate power data to source code (Figure 1).
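One way to picture the correlation step is as a lookup by time stamp. The sketch below, in C, finds the most recent PC sample for a given power sample after applying the clock offset reported by the probe. The type and function names are hypothetical, and a real debugger may well use a different matching rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One time-stamped program counter sample (hypothetical layout). */
typedef struct {
    uint64_t t_us;  /* time stamp on the PC sample clock, microseconds */
    uint32_t pc;    /* sampled program counter value */
} pc_sample_t;

/* Return the index of the last PC sample taken at or before the power
   sample, or -1 if there is none. The array must be sorted by time.
   offset_us shifts the power time stamp from the ADC clock onto the
   PC sample clock. */
long find_pc_for_power(const pc_sample_t *pcs, size_t n,
                       uint64_t power_t_us, int64_t offset_us)
{
    assert(pcs != NULL || n == 0);

    int64_t t = (int64_t)power_t_us + offset_us;
    long lo = 0, hi = (long)n - 1, best = -1;

    /* Binary search for the rightmost sample with t_us <= t. */
    while (lo <= hi) {
        long mid = lo + (hi - lo) / 2;
        if ((int64_t)pcs[mid].t_us <= t) {
            best = mid;
            lo = mid + 1;
        } else {
            hi = mid - 1;
        }
    }
    return best;
}
```

The PC value at the returned index can then be mapped to a source line through the debug information, which is what makes a double-click on a power sample land in the code.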

Presenting the Power Debug Information

As stated before, power debugging is based on the ability to sample the power consumption and correlate each sample with the source code. To illustrate this, let’s look at how IAR Embedded Workbench presents the data. The power samples can be displayed in different formats. The Power Log window is a log of all collected power samples. This window is useful for finding peaks in the power sampling, and since the power samples are correlated with the executed code, you can double-click a value in the Power Log window to get to the corresponding code. The precision depends on the power sampling frequency, but there is a good chance that you will find the code sequence that caused the peak (Figure 2). Another way of viewing the power samples is the Timeline window. Here the power samples are displayed on a time scale together with the call stack and up to four application variables that you can select (Figure 3).

Figure 2
The Power Log window shows a log of all measured power values. The time values are measured relative to program start. Double-clicking a line in the log takes you to the source code corresponding to the program counter.

Figure 3
The Timeline window combines quantities on a common time scale. Top graph: two application values sampled by the DWT unit. Second graph: all interrupt activity in the system. Third graph: the call stack. Bottom graph: power samples in milliamperes; double-clicking a power value takes you to the corresponding code.

In embedded systems, peripheral devices often account for much of the power consumption, and software controls how they are used, regardless of whether they are integrated into the microcontroller or not. The Timeline window is a convenient way of viewing the power consumption in relation to function calls and, if you plot variables that reflect the status of a peripheral, in relation to the activities that increase the power consumption on the board. The goal is of course to see whether the code can be optimized in the power domain. The Timeline window is also correlated to both the Power Log window and the source code windows, so you are just a double-click away from the source code that corresponds to the values you see on the time line.

Power Profiling

In a task-oriented system it is usually more interesting to see how a particular function affects power consumption than to follow how it changes statement by statement. The function profiler helps you find the functions where most time is spent during execution for a given stimulus. This exposes the regions of the application where optimizing for power consumption is most worthwhile.

On a device with the ability to sample the PC, the debugger can provide statistical profiling. The profiler finds the function that correlates to the sampled PC value and builds an internal database of how often the functions are executed to generate function profiling information. The profiling information for each function in an application will be displayed in a debugger while the application is running. With power profiling we combine the function profiling with the power sampling to measure the power consumption per function and display that in the Function Profiler window (Figure 4).

Figure 4
The profiling window lists all functions in the application together with statistical data from the PC sampling and the power sampling. Average, min and max power values are provided for each function.

The Function Profiler window lists the number of samples per function together with the average, maximum and minimum power values. Once again we have a convenient way of finding peaks and abnormal behavior when it comes to power consumption in an embedded system. The system can appear to be fully functional and behave as expected in tests while consuming much more power than it should, and power profiling makes that visible.
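The bookkeeping behind such a per-function view can be sketched in a few lines of C. The example assumes that each function is known by an address range and that every PC sample arrives paired with a current reading. The names and the table layout are illustrative only and do not describe how any particular debugger stores its data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-function statistics (hypothetical layout). */
typedef struct {
    const char *name;
    uint32_t    start;    /* first address of the function */
    uint32_t    end;      /* one past the last address */
    uint32_t    samples;  /* number of PC samples that hit the function */
    double      sum_ma;   /* running sum, used for the average */
    double      min_ma;
    double      max_ma;
} func_profile_t;

/* Attribute one (PC, current) pair to the function containing the PC.
   Returns 1 if a function matched, 0 otherwise. */
int profile_add_sample(func_profile_t *funcs, size_t n,
                       uint32_t pc, double ma)
{
    assert(funcs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        func_profile_t *f = &funcs[i];
        if (pc >= f->start && pc < f->end) {
            if (f->samples == 0 || ma < f->min_ma) f->min_ma = ma;
            if (f->samples == 0 || ma > f->max_ma) f->max_ma = ma;
            f->sum_ma += ma;
            f->samples++;
            return 1;
        }
    }
    return 0;
}

/* Average current for a function, or 0 if it was never sampled. */
double profile_average_ma(const func_profile_t *f)
{
    return f->samples ? f->sum_ma / (double)f->samples : 0.0;
}
```

Because the profile is statistical, a function that runs briefly may receive few samples, so its average should be read with the sample count in mind.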

Optimizing Code for Power

In general optimizing for power is very similar to optimizing for speed. The faster a task is executed, the more time can be spent in a low-power mode. So by maximizing the idle time we are reducing the power consumption.

Some examples can help illustrate the difficulty in identifying how a system unnecessarily consumes energy and where the system can be optimized for power. Typically it is not explicit flaws in the source code that are exposed, but rather opportunities to tune how the hardware is utilized. Sometimes, however, it may involve what might be described as pure bugs.

Power debugging can be used to diagnose the effects of different low-power modes. Many embedded applications spend most of their time waiting for something to happen: receiving data on a serial port, watching an I/O pin change state, or waiting for a time delay to expire. If the processor keeps running at full speed while it is idle, battery life is consumed while very little is accomplished. In many applications the microprocessor is active for only a small fraction of the total time, and placing it in a low-power mode during the idle time can extend battery life by orders of magnitude.
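A rough calculation shows where the orders of magnitude come from. The sketch below estimates battery life from the fraction of time spent active. The function name and all figures in the usage note are assumptions chosen for illustration, not measurements.

```c
#include <assert.h>

/* Estimate battery life in hours from a simple two-state model:
   the system is either active or in a low-power mode. */
double battery_life_hours(double capacity_mah, double active_ma,
                          double sleep_ma, double active_fraction)
{
    assert(active_fraction >= 0.0 && active_fraction <= 1.0);

    /* Time-weighted average current. */
    double avg_ma = active_fraction * active_ma
                  + (1.0 - active_fraction) * sleep_ma;
    assert(avg_ma > 0.0);

    return capacity_mah / avg_ma;
}
```

For an assumed 200 mAh battery, 20 mA active current and 2 µA sleep current, staying active all the time gives 10 hours, while being active 1% of the time gives roughly 990 hours.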

A good approach is to have a task-oriented design and to use an RTOS. In a task-oriented design, a task can be defined with the lowest priority so that it only runs when no other task needs to run. This idle task is the perfect place to implement power management. In practice, every time the idle task is activated, it puts the processor into a low-power mode. Many microprocessors and other silicon devices have a number of different low-power modes, in which different parts of the processor can be turned off when they are not needed. For example, the oscillator can be turned off or switched to a lower frequency, peripheral units and timers can be turned off, and the CPU can stop executing instructions. The different low-power modes have different power consumption depending on which peripherals are left on.
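An idle task along these lines could look like the following C sketch. The mode names, the selection policy and the port hook are all hypothetical; a real RTOS provides its own idle hook, and the instruction that actually enters the low-power mode is device specific.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Two illustrative low-power modes; real devices often have more. */
typedef enum {
    LPM_SLEEP,       /* CPU stopped, clocks and peripherals running */
    LPM_DEEP_SLEEP   /* CPU stopped, oscillator and peripherals off */
} low_power_mode_t;

/* Hypothetical policy: only enter deep sleep when nothing still
   depends on the clocks. */
low_power_mode_t choose_low_power_mode(bool uart_busy, bool timer_needed)
{
    return (uart_busy || timer_needed) ? LPM_SLEEP : LPM_DEEP_SLEEP;
}

/* Hook supplied by the port. On a Cortex-M device it would typically
   configure the sleep depth and then execute a wait-for-interrupt. */
typedef void (*enter_mode_fn)(low_power_mode_t mode);

/* One pass of the idle task: pick a mode and enter it. Execution
   resumes here when an interrupt wakes the processor. */
void idle_task_step(bool uart_busy, bool timer_needed, enter_mode_fn enter)
{
    assert(enter != NULL);
    enter(choose_low_power_mode(uart_busy, timer_needed));
}
```

Keeping the mode selection separate from the hardware hook makes the policy easy to change while comparing the resulting power figures in the profiler.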

A power debugging tool is very useful when experimenting with different low-power modes. The Function Profiler can be used to compare the power measurements for the task or function that puts the system into a low-power mode as different modes are tried. Both the mean value and the percentage of the total power consumption are useful in the comparison.

CPU frequency definitely affects power consumption as well. Power consumption in a CMOS MCU is theoretically given by the formula:

P = f x U^2 x k

where f is the clock frequency, U is the supply voltage and k is a constant.

Power debugging allows the developer to verify the power consumption as a function of the clock frequency. A system that spends very little time in sleep mode at 50 MHz is expected to spend about 50% of the time in sleep mode when running the same workload at 100 MHz. The power data in the debugger allows the developer to verify the expected behavior and, if a nonlinear dependency on the clock frequency exists, to choose the operating frequency that gives the lowest power consumption.
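The expected behavior can be written down as a small model to compare against the measured data. The C sketch below combines the formula above with a fixed amount of work per period. The function name, the parameters and the sleep power term are assumptions introduced for the example.

```c
#include <assert.h>

/* Expected average power in watts for a periodic workload, using the
   linear model P = f * U^2 * k while active and a constant power
   while asleep. */
double expected_avg_power(double f_hz, double u_volts, double k,
                          double cycles_per_period, double period_s,
                          double p_sleep_w)
{
    assert(f_hz > 0.0 && period_s > 0.0);

    /* The same work takes less time at a higher clock frequency. */
    double active_s = cycles_per_period / f_hz;
    assert(active_s <= period_s);

    double duty     = active_s / period_s;
    double p_active = f_hz * u_volts * u_volts * k;

    return duty * p_active + (1.0 - duty) * p_sleep_w;
}
```

In this idealized model, doubling the frequency doubles the active power and halves the active time, so the active contribution to the average stays the same. A measured average that departs from the model points to the kind of nonlinear dependency that makes one operating frequency better than another.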

An example involving interrupts illustrates another situation where it is difficult to see that a system consumes unnecessary energy. Figure 5 shows the power consumption of an event-driven system. At t0 the system is in an inactive mode and the current is I0. At t1 the system is activated and the current rises to I1, which is the system’s current consumption in active mode with one peripheral device in use. At t2 the execution is suspended by an interrupt that is handled with higher priority. Peripheral devices that were already active are not turned off, although the thread with higher priority is not using them. Instead, the new thread activates more peripheral devices, which raises the current to I2 between t2 and t3, when control is handed back to the thread with lower priority.

Figure 5
Power consumption in an event-driven system. The shaded area shows wasted power due to poorly scheduling the activation of two peripherals.

The functionality of the system could be excellent, and it may already be optimized in terms of execution speed and code size. But in the power domain more optimization can be done. In Figure 5, the shaded area represents the energy that could have been saved if the peripherals that are not used between t2 and t3 had been turned off, or if the priorities of the two threads had been changed.

Using power debugging would have made it easy to discover the extraordinary increase in power consumption that occurs when the interrupt hits and to identify it as abnormal. A closer examination of the Timeline window could have revealed that unused peripheral devices were left active and consumed power for longer than necessary. Naturally, there would have to be a review of whether it is worth spending extra clock cycles to turn peripherals on and off in a situation like this one.
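That review can start from a simple energy comparison. The C sketch below estimates the energy wasted by an unused peripheral between t2 and t3 and compares it with the cost of the extra cycles needed to switch it off and on again. Both function names and all numbers in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Energy in microjoules wasted by a peripheral that stays on while
   unused. Units: mA * V = mW, and mW * ms = uJ. */
double wasted_energy_uj(double periph_ma, double supply_v,
                        double t2_ms, double t3_ms)
{
    assert(t3_ms >= t2_ms);
    return periph_ma * supply_v * (t3_ms - t2_ms);
}

/* Compare the saving with the cost of the switching overhead, during
   which the CPU runs at its active current. */
bool gating_pays_off(double wasted_uj, double active_ma,
                     double supply_v, double switch_overhead_ms)
{
    double cost_uj = active_ma * supply_v * switch_overhead_ms;
    return wasted_uj > cost_uj;
}
```

With an assumed 5 mA peripheral left on for 2 ms at 3 V, 30 µJ is wasted. If switching costs 0.01 ms at 20 mA, the overhead is 0.6 µJ and turning the peripheral off is clearly worthwhile.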

Mixing analog and digital circuits on the same board has its own challenges. Board layout and routing become important for keeping analog noise low enough to sample low-level analog signals accurately. A good mixed-signal design requires careful hardware considerations and skills. Software design can also affect the quality of the analog measurements. Performing a lot of I/O activity while sampling analog signals causes many digital lines to toggle at the same time, which is a likely source of extra noise in the AD converter (Figure 6).

Figure 6
An example of where the acquisition of a low-level analog signal is disturbed by a power spike caused by the switching of a high power stepper motor. Power debugging can help find better sampling points that do not interfere with the stepper motor switching.

Power debugging helps investigate interference from digital and power supply lines into the analog parts. Interrupt activity can easily be displayed in the Timeline window along with power data. Studying the power graph just before the AD converter interrupts can reveal power spikes close to AD conversions. Such spikes could be the source of noise and should be investigated. All data presented in the Timeline window is correlated to the executed code. Simply double-clicking a suspicious power sample brings up the corresponding C source code.
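The same check can be automated on exported sample data. The following C sketch counts power samples above a threshold in a window just before an AD conversion. The data layout, the function name and the threshold are assumptions for the example and are not part of any tool's export format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One time-stamped power sample (hypothetical layout). */
typedef struct {
    uint64_t t_us;  /* time stamp in microseconds */
    double   ma;    /* measured current in milliamperes */
} power_sample_t;

/* Count samples above threshold_ma that fall within window_us before
   (and up to) the AD conversion at adc_t_us. */
size_t spikes_before_adc(const power_sample_t *s, size_t n,
                         uint64_t adc_t_us, uint64_t window_us,
                         double threshold_ma)
{
    assert(s != NULL || n == 0);

    /* Clamp the window start so it cannot wrap below zero. */
    uint64_t start = adc_t_us > window_us ? adc_t_us - window_us : 0;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (s[i].t_us >= start && s[i].t_us <= adc_t_us &&
            s[i].ma > threshold_ma) {
            count++;
        }
    }
    return count;
}
```

A nonzero count for many conversions suggests moving the sampling point away from the activity that causes the spikes.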

IAR Systems
Foster City, CA.
(650) 287-4250.