If You Can’t Take the Heat... Get Out of the Stack

In spite of the new ultra-low-power silicon on the market, it’s tempting to bite off more than you can cool. If you ‘duo’ (do), you can blow your diet threefold—first, you can kill your expensive CPU; second, you add the cost of the 500W power plant required to boot it; and finally, you lower reliability with the CPU cooler and system fans (including the fans to cool the power supply). It’s even worse if your system requires a high-end graphics card that also requires cooling. And don’t forget to budget for the inefficient / unconverted power of the supplies themselves—ten watts or more—since that heat also must be removed. Job security for your thermal engineer. 
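
As a rough illustration of the supply-loss budget mentioned above, the sketch below estimates how much heat a power supply adds to the enclosure. The function name, the 90 W load, and the 90 percent efficiency figure are assumptions chosen for the example, not values from any datasheet; substitute the numbers for your own supply.

```python
def supply_waste_heat_w(load_w, efficiency):
    """Return the watts a power supply turns into heat while delivering load_w.

    load_w     -- power delivered to the electronics, in watts (assumed value)
    efficiency -- conversion efficiency as a fraction, e.g. 0.90 (assumed value)
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    input_w = load_w / efficiency   # power drawn from the source
    return input_w - load_w         # the unconverted remainder becomes heat


# Hypothetical example: a 90 W load on a 90%-efficient supply
# wastes roughly 10 W, which the thermal design must also remove.
print(round(supply_waste_heat_w(90.0, 0.90), 1))
```

The waste heat is added to the CPU and graphics dissipation when sizing the cooling solution, so a less efficient supply raises the total even if the processor is unchanged.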

Full-size ATX motherboards spread out the hot CPU and graphics cards longitudinally, and Rambo fans provide forced-air cooling to the tune of hundreds of linear feet per minute. If the fans fail, the system overheats and shuts down, with no damage other than user inconvenience and a service call. The situation is complicated when re-purposing this desktop technology for the small form factor world. Move from an indoor to an outdoor environment, such as a vehicle, and the ambient temperature baseline can jump up dramatically. Low-voltage (LV) and ultra-low-voltage (ULV) processor SKUs suddenly become well worth the price premium. But first download the processor datasheet to see whether the Tj max—the on-die transistor junction temperature maximum specification—is 100°C or only 90°C.

Industrial and military systems are often specified up to 85°C. Forced-air cooling is typically not allowed. When fan bearings go, blue screens and smoke screens can cause collateral damage. Alternatives to heat sinks and fans are needed. The lower the temperature to which the electronics are subjected, the better the reliability / MTBF will be. 

PC/104 stacks have been popular in tight spaces such as military, vehicle and aircraft applications. Unlike the motherboard example above, where the heat is spread out, the heat generated by the CPU, graphics and power supply is combined into the same vertical stacking chimney. Without adequate cooling, the best-case scenario is reduced operating lifetime (shortened MTBF). The worst-case scenario is thermal runaway, where heat generated exceeds heat removed, and the temperature keeps rising… a ‘smoke stack’ reminiscent of Three Mile Island.

Heat pipes can solve this dilemma. With heat pipes, the heat is neither created nor destroyed, just moved to a location far away from sensitive electronics where it can be safely absorbed and dissipated, such as the system enclosure or a large heat sink. However, these solutions are expensive for this low-volume, high-mix market because they are custom-fitted to each unique board/stack/enclosure. And the need to increase the board-to-board stack height from 0.6” to 0.9” to make room for the heat pipe and aluminum block adds more cost and might mean that the proposed next-generation board set won’t fit in the space allotted. 

Enter computer-on-modules (COMs) to the rescue. Rather than insisting on stacking up and down, COMs make an important simplifying assumption, which has a side benefit for thermal designers—they only stack downward. As a result of the strict top side height limit, heat spreaders can be installed on the top surface to bring the same low-cost thermal relief to the x86 community that analog designers have used for many decades. 

The ultimate thermal solution consists of directly coupling multicore processor and chipset heat to the flat metal heat spreader plate, which is then attached to the outside cabinet or system enclosure. Recall that thermal rise equals the power dissipation in watts times the ‘thermal resistance’ in degrees C per watt. Aluminum and copper have low thermal resistance, making them ideal for a heat spreader. The weakest link is usually the thermal grease or gap filler pad in the case of a heat spreader plate, or the heat sink itself in a low-airflow environment. Expect a 10 to 15°C rise from ambient to the processor junction, depending upon the processor’s thermal design power (TDP) rating and the resistance of the thermal solution. This means that the processor die could still be running at 90°C if the air outside the thermal solution is 75°C. Be especially careful of Atom SBCs that claim to support 85°C, because only a few Atom processor models have a junction temperature rating high enough to operate correctly at +85°C ambient.
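
The thermal-rise rule above can be sketched in a few lines. The function names, the 10 W power figure, and the per-stage resistance values are hypothetical, picked only so that the total works out to the 15°C rise used in the text; real values come from the processor datasheet and the gap pad and heat spreader vendors.

```python
def junction_temp_c(ambient_c, power_w, stage_resistances_c_per_w):
    """Estimate die temperature for heat flowing through stages in series.

    Thermal rise = power (W) x total thermal resistance (deg C per W),
    where series resistances simply add.
    """
    total_resistance = sum(stage_resistances_c_per_w)
    return ambient_c + power_w * total_resistance


def max_ambient_c(tj_max_c, power_w, stage_resistances_c_per_w):
    """Highest ambient temperature that keeps the die at or below Tj max."""
    return tj_max_c - power_w * sum(stage_resistances_c_per_w)


# Hypothetical path: die to case, gap filler pad, spreader to enclosure.
stages = [0.25, 0.75, 0.5]   # deg C per watt, assumed values (total 1.5)
power = 10.0                 # watts dissipated, assumed value

print(junction_temp_c(75.0, power, stages))   # die temperature at 75 C ambient
print(max_ambient_c(90.0, power, stages))     # ambient limit for a 90 C Tj max part
```

With these assumed numbers, a part rated for a 90°C Tj max has no margin left at 75°C ambient, which is why a claimed 85°C ambient rating deserves a check against the datasheet junction limit.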

More efficient processors and chipsets coming down the road will re-invent this market. Until then, thermal analysis remains a critical part of your design.