Advances in Embedded Processing Architecture
New Embedded Generation Fuses x86 with Parallel Processing Engine
The G Series Fusion architecture from AMD pushes the envelope for processing power in a single device. The x86 instruction set is integrated with a parallel core that can deliver high-end graphics and video as well as general compute-intensive parallel processing.
TOM WILLIAMS, EDITOR-IN-CHIEF
What do embedded processor customers want? Customers want infinite performance and zero power consumption, all for the price of a penny or less. And semiconductor manufacturers are scrambling to accommodate them. We’re not there yet, but the drive is unrelentingly in that direction. Perhaps paradoxically, embedded customers want something else as well: product lifetimes on the order of ten years. Once they have designed this close-to-ultimate processor into their products, they want to stay with it, and with its soon-to-be no-longer-ultimate performance/power/cost characteristics, for a long time.
The end-of-life conflict comes from the fact that the PC market, the main world market for processors, keeps moving up the performance curve every few months, abandoning the older processors as it goes. So stretching out the end of life (EOL) requires a commitment to the embedded market on the part of the manufacturer that runs somewhat counter to the demands of the much larger PC market. Often this has consisted of stretching out the EOL of processors that were made for the PC market but are also popular for embedded applications.
Advanced Micro Devices, which has been serving the embedded market as well as the PC and notebook markets for a good many years, has recently come out with a new high-performance x86-based architecture that integrates an advanced graphics processing unit (GPU) into a single device. The integration enables the GPU to act together with the CPU as a single computing device as well as to play the role of a graphics engine. AMD has dubbed this combination the accelerated processing unit, or APU (Figure 1). The AMD Embedded G Series platform is aimed at the embedded market and comes in a 413-pin ball grid array (BGA) package.
Figure 1: The accelerated processing unit (APU) is the result of the integration of a multicore x86 CPU with a parallel graphics processor that is also capable of single instruction multiple data (SIMD) parallel processing for general computing tasks.
A minimum of five years of product life is guaranteed. AMD says that its manufacturing arrangements with foundries such as TSMC in Taiwan make it economical to continue producing what will eventually become a somewhat older processor for a longer time.
AMD is calling its architecture the Fusion family of APUs because it considers the integration of CPU and GPU to constitute a truly heterogeneous processing device rather than a CPU with a tightly integrated peripheral. The APU consists of a multicore x86 architecture CPU coupled via a high-speed bus with a parallel GPU architecture that is capable not only of high-end graphics processing but also of high-speed parallel data processing. The GPU consists of video and display elements along with parallel SIMD arrays.
The way the GPU has been integrated has led to its description as a “discrete class GPU.” Rather than being an internal CPU element, it is connected via the high-speed bus and has direct access to memory, just as a discrete GPU would. It is thus both integrated and discrete, hence “discrete class.” This kind of integration also enables it to perform high-speed parallel processing tasks other than graphics, such as array processing applications. It can also combine such parallel tasks with graphics and display tasks, with appropriate performance tradeoffs.
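To make the SIMD idea concrete, the following Python sketch models an array operation carried out one lane-wide group at a time, with every lane applying the same multiply-add to its own element. It is only an illustration of the execution model: the function name simd_scale_add and the lane width of 8 are assumptions chosen for the example, not figures for the G Series GPU.

```python
# Illustrative model of SIMD execution: one instruction, many data lanes.
# LANE_WIDTH is a made-up value, not a figure for the G Series GPU.

LANE_WIDTH = 8  # hypothetical number of lanes processed per step


def simd_scale_add(a, x, y, lane_width=LANE_WIDTH):
    """Compute a*x[i] + y[i] for all i, one lane-wide group per step.

    Returns the result list and the number of steps taken.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    result = []
    steps = 0
    for start in range(0, len(x), lane_width):
        # Every lane in this group executes the same multiply-add on its
        # own element; a real SIMD unit would do the whole group at once.
        xs = x[start:start + lane_width]
        ys = y[start:start + lane_width]
        result.extend(a * xi + yi for xi, yi in zip(xs, ys))
        steps += 1
    return result, steps


if __name__ == "__main__":
    n = 20
    x = [float(i) for i in range(n)]
    y = [1.0] * n
    out, steps = simd_scale_add(2.0, x, y)
    print(out[:4], "SIMD steps:", steps, "scalar steps:", n)
```

In this model, 20 elements take three lane-wide steps instead of 20 scalar ones, which is the kind of gain that makes array processing a natural fit for such hardware.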
In terms of the combination of parallel computing engines with an x86 processor, the AMD APU is reminiscent of the Compute Unified Device Architecture (CUDA), developed by Nvidia for general-purpose computing on its graphics processors, which have been combined on boards with x86 processors. In the case of AMD, of course, this idea has been carried to the integration of two such elements on a single die.
This is important because the pairing of GPU compute engines such as CUDA with x86 processors has given rise to a number of software solutions to exploit its potential. One of these is a framework called OpenCL, which is designed for writing programs that execute across heterogeneous platforms such as CPUs and GPUs. AMD supports OpenCL, which is managed by a non-profit consortium called the Khronos Group. OpenCL is included in the G Series development kit, which also includes OpenCL-compatible compilers.
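The OpenCL programming model can be pictured as a small kernel function that is launched once for every index in a range of work-items. The sketch below imitates that model in plain Python and makes no OpenCL calls at all; run_ndrange and vector_add_kernel are hypothetical names invented for the illustration. In actual OpenCL, the kernel would be written in OpenCL C and would obtain its index with get_global_id(0).

```python
# Pure-Python stand-in for the OpenCL execution model; no OpenCL calls are made.
# run_ndrange and vector_add_kernel are hypothetical names for illustration.


def vector_add_kernel(global_id, a, b, out):
    """Kernel body: each work-item handles exactly one element."""
    out[global_id] = a[global_id] + b[global_id]


def run_ndrange(kernel, global_size, *buffers):
    """Invoke the kernel once per work-item index.

    A real runtime would spread these work-items across the CPU cores
    or the GPU's SIMD arrays; here they simply run one after another.
    """
    for global_id in range(global_size):
        kernel(global_id, *buffers)


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [10, 20, 30, 40]
    out = [0] * len(a)
    run_ndrange(vector_add_kernel, len(a), a, b, out)
    print(out)
```

Because the kernel only ever touches the element named by its own index, the work-items are independent of one another, which is what allows a runtime to schedule them on whichever device is available.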
It should come as no surprise that the G Series is equipped with the latest integrated display interfaces on-chip. These include HDMI, dual-link or dual single-link DVI, dual DisplayPort, LVDS and analog VGA. In addition, there is a x4 PCI Express port directly off the chip and an integrated DDR3 memory controller.
The GPU core architecture has direct access to memory via a ring bus memory controller and provides full support for DirectX 11, including full-speed 32-bit floating-point per-component operations. Motion video acceleration is provided through dedicated hardware for H.264, VC-1 and MPEG-2 decode.
Additional I/O is provided by an I/O hub called the Hudson, which connects to the G Series processor via a unified media interface (UMI). The Hudson hub provides a SATA interface, LPC and SPI as well as high-definition audio, USB 2.0 and another four x1 PCIe lanes. AMD has at present no partners building specialized versions of the I/O hub for such vertical markets as automotive, nor is it clear if there is information available to configure FPGAs with specialized I/O to communicate over the UMI (Figure 2).
Figure 2: The integrated high-end video and graphics are available via interfaces directly from the processor die. Other I/O is carried by the I/O hub attached via the unified media interface.
AMD also offers a development board with either the T56N 1.6 GHz dual-core or the T40N 1.0 GHz dual-core processor and the A55E controller hub (Hudson). The board provides a variety of display interfaces.
The availability of high-performance embedded processors with high-end 3D, high-resolution graphics is significantly extending the reach of what has traditionally been considered the realm of embedded systems. Where the past was dominated by industrial controls and headless devices, we are now witnessing the emergence of systems with compelling visual interfaces that are still dedicated in function yet are spreading into what may be considered the “consumer” market. At the same time, what was once considered the consumer electronics market is no longer as well defined as it once was.
There was a time when consumer electronics meant things like televisions, stereo and audio equipment for the home, and even personal computers. Now we are seeing devices such as a restaurant or bar table whose top is an interactive touch-sensitive video display. One such system not only lets customers play games but also presents the menu so that they can order food and drinks. If someone sets a glass down on the table, the table senses that the object is a glass. It is even able to determine whether the glass is getting empty and can ask the customer if he or she would like to order a refill. Designs like this, along with the growing market for interactive digital signage, casino gaming machines of all sorts, and point-of-sale and kiosk systems, are all target applications of this enhanced class of embedded processor. Things that formerly were inanimate objects, like bar tables, are becoming intelligent interactive devices thanks to the computing and graphics power that is now being packed into this generation of small, low-power and low-cost processing engines.
Advanced Micro Devices