TECHNOLOGY IN SYSTEMS
Meeting the Challenge of Developing with FPGAs
Programming ASP-Type Devices: New Approaches for a New Paradigm
The application services platform (ASP) represents a new path for development of embedded systems. As this class of devices proliferates in the market, new approaches to development tools and methodologies will have to evolve with them.
GREG BROWN, XILINX
An application services platform or ASP, a name coined by RTC editor-in-chief Tom Williams, is a new class of IC that combines a CPU, a standard set of configurable peripherals and a programmable fabric—all on a single device. Because they combine logic programming with software programming, these new devices offer design teams the maximum flexibility, enabling users to rapidly develop unique functionality for whatever application they are targeting. At the same time, the combination of hardware and software on one device raises many questions regarding the usage model. Luckily, the advent of the ASP does not mean that the software programming model, processes and tools have to change. When designed to use industry standards, ASPs can provide a programming model identical to that of fixed devices. Let’s examine a practical ASP flow that leverages industry standards and standard software flows.
To set the stage, Figure 1 displays the block diagram for the Xilinx Zynq-7000 EPP, an ASP device built around a dual-core ARM processor and an architecture whose functionality can be expanded by making use of the device’s programmable logic resources. The device is implemented in 28-nanometer silicon process technology to hit the targeted power/performance/cost points. Unlike previous processor-equipped FPGAs that required designers to program the FPGA logic blocks before they could even activate the onboard processor, the new device uses a processor-first approach in which the processor boots first at startup and runs the show. This particular ASP has an application processor that provides a dual-core ARM Cortex-A9 MPCore with NEON SIMD and floating-point engines. Each core has separate 32 Kbyte instruction and data caches, and the two cores share a unified 512 Kbyte Level 2 cache. The CPU cores can function in asymmetric (AMP) or symmetric (SMP) multiprocessing mode, or even as a single CPU with the other processor gated off for power savings. An integrated DRAM controller supports DDR3, DDR2 or LPDDR2, while a set of flash controllers handles boot and configuration storage. The device also includes several peripherals. The processing system has its own set of I/Os with dedicated pins for DRAM and a muxed I/O for the peripherals and flash devices. Unlike with an ASSP, design teams can use the ASP’s programmable logic I/O to fully pin out the peripheral set if that is what the end application requires.
Figure 1: Xilinx Zynq-7000 Extensible Processing Platform Block Diagram.
The processing system—and indeed the entire device—is built around the ARM AMBA interface standard with a high-speed, crossbar-switch-style interconnect. Xilinx memory-mapped the registers, especially those for the integrated peripherals, and linked the entire processing system to the programmable logic using the AXI set of interfaces, which support control, data and memory. Additionally, the device has an accelerator coherency port (ACP) to enable design teams to add cache-coherent accelerators to their system if required.
To be a viable solution, an ASP must address the following programming challenges:
- Programming model
- Configuration of the processing system
- Configuration of the programmable logic
- Libraries to access unique hardware and system-level functions
- Board support package (BSP) development
- Operating system (OS) support
- Software development tools
Programming and Configuration
First and foremost, ASPs must have an easy-to-use and robust programming model. In the present case, the programming model is based on ARM’s well-established CPU instruction set and on memory-mapped access to hardware. Thus, the memory-mapped calls made to resources in the processing system are identical to those made to the application-specific extended functionality that is added to the programmable logic. This is very important because for software to effectively use the extensions, the extensions must use a known and consistent set of interfaces. This also helps design teams create sets of functions they can implement in software, in hardware or both. They can even create a library of software functions and hardware IP functions that share a common programming interface. Because these devices and their functions are built to common standards, third-party vendors can also create general-purpose and application-specific software functions and hardware IP, benefiting the entire marketplace.
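The uniform programming model can be illustrated with a small simulation. The sketch below is not device code: it models a memory-mapped bus in Python and shows that a fixed peripheral and a custom block in programmable logic are reached through the same two calls. The class names, block names and addresses are all hypothetical and chosen only for illustration.

```python
"""Simulated memory-mapped bus: fixed and custom blocks share one access model."""


class RegisterBlock:
    """A block of 32-bit registers occupying a contiguous address window."""

    def __init__(self, name, base, size):
        self.name = name
        self.base = base
        self.size = size          # window size in bytes
        self.regs = {}            # offset -> value; unwritten registers read as 0

    def contains(self, addr):
        return self.base <= addr < self.base + self.size


class Bus:
    """Routes word-aligned reads and writes to whichever block owns the address."""

    def __init__(self):
        self.blocks = []

    def attach(self, block):
        # Reject address windows that overlap an already attached block.
        for other in self.blocks:
            if block.base < other.base + other.size and other.base < block.base + block.size:
                raise ValueError(f"{block.name} overlaps {other.name}")
        self.blocks.append(block)

    def _find(self, addr):
        if addr % 4:
            raise ValueError(f"unaligned access at {addr:#x}")
        for block in self.blocks:
            if block.contains(addr):
                return block
        raise LookupError(f"no block mapped at {addr:#x}")

    def write32(self, addr, value):
        block = self._find(addr)
        block.regs[addr - block.base] = value & 0xFFFFFFFF

    def read32(self, addr):
        block = self._find(addr)
        return block.regs.get(addr - block.base, 0)


# Hypothetical addresses, chosen only for illustration.
UART_BASE = 0x4000_0000        # stands in for a fixed peripheral
FILTER_BASE = 0x8000_0000      # stands in for a custom block in programmable logic

bus = Bus()
bus.attach(RegisterBlock("uart", UART_BASE, 0x1000))
bus.attach(RegisterBlock("fir_filter", FILTER_BASE, 0x1000))

# The same two calls serve both blocks; the caller cannot tell them apart.
bus.write32(UART_BASE + 0x04, 115200)
bus.write32(FILTER_BASE + 0x00, 0x1_0000_00FF)   # truncated to 32 bits on write
```

In a real design the equivalent of `read32` and `write32` would be ordinary loads and stores issued by the CPU, which is why a driver written for a hardened peripheral and one written for soft IP can look alike.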
To be a viable ASP, the device must have highly flexible CPU cores. Asymmetrical multiprocessing should support the use of different runtime environments on the different cores, either unsupervised or supervised, using hypervisors. SMP support, especially using OSs, provides an overall boost in processing performance and can remove much of the complexity of multicore design for application development. An ASP should also support single-core usage for applications that do not need or cannot take advantage of multiple CPUs. By supporting divergent use of the CPUs, the spectrum of single core and multiple core (AMP, SMP) can be delivered on a single platform, enabling application-specific use of the processing architecture.
ASPs should also allow users to configure the functionality of the processing system itself. These devices will let designers set the type of DDR they want the device to use, along with the peripherals and their associated muxed I/O, as well as the width of the interfaces between the memory controller and the programmable logic. Users can also access a smaller subset of these settings at run time to change, for example, arbitration or priority on the memory controller to adapt to specific traffic patterns. To simplify making all these settings, ASP vendors must provide a configuration tool that outputs configuration documentation for use in hardware and software development. These tools must also create human-readable data files that downstream tools can use directly, that developers can edit by hand, or that can drive script-based build environments. Figure 2 shows a simplified flow diagram to illustrate this concept.
Figure 2: Conceptual Design Flow Diagram.
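To make the idea of a human-readable, script-friendly configuration file concrete, here is a minimal sketch of how a build script might load and sanity-check such a file. The JSON layout, field names and pin numbers are assumptions made for this example and do not reflect the output format of any vendor tool; only the three DDR types come from the text above.

```python
import json

# Hypothetical configuration file contents, as a tool might emit them.
CONFIG_TEXT = """
{
  "ddr": {"type": "DDR3", "width_bits": 32},
  "peripherals": [
    {"name": "uart0", "mio_pins": [10, 11]},
    {"name": "spi0",  "mio_pins": [12, 13, 14, 15]},
    {"name": "sd0",   "mio_pins": [16, 17, 18, 19, 20, 21]}
  ]
}
"""

SUPPORTED_DDR = {"DDR3", "DDR2", "LPDDR2"}   # memory types named in the article


def validate(config):
    """Return a list of problems found; an empty list means the config is usable."""
    problems = []
    ddr_type = config["ddr"]["type"]
    if ddr_type not in SUPPORTED_DDR:
        problems.append(f"unsupported DDR type: {ddr_type}")

    owner = {}  # pin number -> peripheral that claimed it first
    for periph in config["peripherals"]:
        for pin in periph["mio_pins"]:
            if pin in owner:
                problems.append(
                    f"pin {pin} claimed by both {owner[pin]} and {periph['name']}"
                )
            else:
                owner[pin] = periph["name"]
    return problems


config = json.loads(CONFIG_TEXT)
problems = validate(config)

# A hand edit that introduces a muxed I/O conflict is caught the same way.
edited = json.loads(CONFIG_TEXT)
edited["peripherals"][1]["mio_pins"] = [11, 12]
edited_problems = validate(edited)
```

Because the file is plain text, the same check can run inside an IDE plug-in, a makefile rule or a continuous-integration job without any change.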
Above and beyond being able to program the ASP’s processing system, an ASP device, by definition, allows users to offload functions to programmable logic. To make the system accessible to a wider variety of users (software developers as well as hardware engineers), configuration of the programmable logic should be under the control of the processing system. Ideally, the processing system should not require the programmable logic to be configured at boot time. Rather, it should be a choice the design team makes based on the needs of their targeted application. This enables the processing system to load a configuration based on parameters determined after boot. An ASP should also support dynamic partial reconfiguration, a technology analogous to dynamically loading and unloading software modules. The processing system can reconfigure a portion of the programmable logic to load in, for example, a new set of algorithmic parameters or even an entire new function needed at a particular time. Further, the configuration information for the programmable logic should support encryption and authentication to protect the IP. This is important not only for custom IP, but for IP purchased from third parties.
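The analogy between partial reconfiguration and loading or unloading software modules can be sketched as a small manager object. The following is a simulation under stated assumptions: `FabricManager`, its methods and the region names are hypothetical, and the HMAC check stands in for whatever authentication scheme a real device uses. Encryption is omitted for brevity.

```python
import hashlib
import hmac


class FabricManager:
    """Tracks which function is loaded in each reconfigurable region (simulated)."""

    def __init__(self, regions, key):
        self._key = key
        self._loaded = {region: None for region in regions}

    def _authentic(self, image, tag):
        expected = hmac.new(self._key, image, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)

    def load(self, region, name, image, tag):
        """Load an image into one region, leaving other regions untouched."""
        if region not in self._loaded:
            raise KeyError(f"unknown region: {region}")
        if not self._authentic(image, tag):
            raise PermissionError(f"authentication failed for {name}")
        self._loaded[region] = name   # a real device would write the fabric here

    def unload(self, region):
        self._loaded[region] = None

    def loaded(self, region):
        return self._loaded[region]


def sign(key, image):
    """Produce the tag a build system would ship alongside the image."""
    return hmac.new(key, image, hashlib.sha256).digest()


KEY = b"demo-key-not-for-production"
fir_image = b"\x01\x02\x03 fir filter bits"
fft_image = b"\x04\x05\x06 fft bits"

fabric = FabricManager(["region_a", "region_b"], KEY)
fabric.load("region_a", "fir", fir_image, sign(KEY, fir_image))
fabric.load("region_b", "fft", fft_image, sign(KEY, fft_image))

# Swap out the function in region_a at run time; region_b is unaffected.
fabric.unload("region_a")
try:
    # A modified image with a stale tag must be refused.
    fabric.load("region_a", "tampered", fir_image + b"!", sign(KEY, fir_image))
    rejected = False
except PermissionError:
    rejected = True
```

Keeping the authentication check inside `load` means an application cannot place unverified IP into the fabric by accident, which matters for purchased IP as much as for in-house designs.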
Basic System Management
ASPs should feature a multistage boot function that allows users to tailor boot-up for their specific applications. Users should also be able to select a boot device via mode pins or another mechanism. A boot ROM should support secure and nonsecure modes, including decryption and authentication. After the boot ROM, user-defined boot should make it possible to load additional boot images from not only the flash devices but any appropriate peripheral, such as Ethernet, USB or SD. Configuration of the programmable logic should be a parallel process that can be started during this stage as well. Fallback to a known-good image for both the processing system boot and the programmable logic image provides a recovery mechanism in case an image is corrupted, for example, during a remote field update. For an ASP, such operations involve not only software updates but hardware updates as well.
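The fallback behavior described above can be sketched in a few lines. This example assumes a very simple image format, a 4-byte CRC-32 followed by the payload, which is an invention for illustration; real boot ROMs define their own headers and typically use stronger checks when secure boot is enabled.

```python
import zlib


def make_image(payload):
    """Prefix a payload with its CRC-32 so corruption can be detected later."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def is_valid(image):
    """Check the stored CRC-32 against one computed over the payload."""
    if len(image) < 4:
        return False
    stored = int.from_bytes(image[:4], "big")
    return stored == zlib.crc32(image[4:])


def select_image(candidates):
    """Return the name and payload of the first valid image, in priority order."""
    for name, image in candidates:
        if is_valid(image):
            return name, image[4:]
    raise RuntimeError("no valid boot image found")


golden = make_image(b"known-good firmware")
update = make_image(b"field update firmware")

# Simulate an interrupted remote update by truncating the new image.
corrupted = update[:-3]

chosen_name, chosen_payload = select_image(
    [("update", corrupted), ("golden", golden)]
)
```

The same selection logic applies to the programmable logic image, so a failed hardware update can fall back just as a failed software update does.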
For implementations that require it, the capability to combine the processing system boot image and the programmable logic image is very important. A tool that merges these images into a single flash image eases flash programming and reduces the number of data files required in product data management.
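A merge tool of this kind can be quite small. The sketch below packs a software image and a programmable logic image into one blob with a simple entry table, then reads one of them back. The container layout, the `ASPI` signature and the entry names are hypothetical and are not the format used by any vendor's boot image tool.

```python
import struct

MAGIC = b"ASPI"                      # hypothetical container signature
HEADER = struct.Struct(">4sI")       # magic, number of entries
ENTRY = struct.Struct(">8sII")       # name (null padded), offset, length


def merge(images):
    """Pack named images into one blob: header, entry table, then payloads."""
    table_size = HEADER.size + ENTRY.size * len(images)
    entries, payloads, offset = [], [], table_size
    for name, data in images:
        entries.append(ENTRY.pack(name.encode("ascii"), offset, len(data)))
        payloads.append(data)
        offset += len(data)
    return HEADER.pack(MAGIC, len(images)) + b"".join(entries) + b"".join(payloads)


def extract(blob, wanted):
    """Find one image in a merged blob by name."""
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a merged image")
    for i in range(count):
        raw_name, offset, length = ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
        if raw_name.rstrip(b"\0").decode("ascii") == wanted:
            return blob[offset:offset + length]
    raise KeyError(wanted)


software = b"boot loader and application"
bitstream = b"\xaa\x55" * 8          # placeholder bytes, not a real bitstream

flash_image = merge([("sw", software), ("pl", bitstream)])
```

With a single file to track, the flash programmer, the release archive and the field-update mechanism all handle one artifact instead of two.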
Any unique hardware or system-level functions should have an application programming interface. APIs simplify programming by providing a defined set of software functions that access available hardware and system resources. An example here is the dynamic configuration of the programmable logic. A set of API functions should exist to make such capability readily available to the software application to use.
Similarly, ASPs should include features that allow users to manage their device’s or even system’s power usage. Further, they should be able to accomplish this by making tradeoffs in the processing system and the programmable logic. This can cover capabilities such as shutting down the programmable logic, slowing down the clocks or putting the processing system itself in sleep mode and waking on a LAN or CAN signal.
ASP devices help narrow the board support package (BSP) development gap between fixed devices and completely programmable devices, such as soft processors in FPGAs. With fixed devices, OS vendors can provide a BSP that supports all the device’s features. Users can tailor these features for a specific implementation using that device. With a completely soft solution, vendors provide fixed BSPs only for reference, since almost every implementation uses a different set of peripherals and other capabilities. Thus a vast majority of these require custom BSPs. The design industry has tried to develop dynamic BSP generation technologies to help with this task, but the use of vendor-proprietary ISAs has limited the adoption of this technology.
With ASP devices, vendors can provide fixed BSPs that support the entire processing system. Since the programming model is consistent between the processing system and programmable logic, BSP development for the custom portion can leverage device driver libraries for a wider set of interface standardized IP. Design groups, device vendors or third-party IP vendors can develop libraries of soft IP with drivers that users can add, in turn, to a BSP.
ASPs must support industry-standard OSs. In fact, the ARM architectures enjoy what is arguably the widest support among RTOS and OS vendors as well as suppliers of open-source operating systems such as Linux. Therefore, the availability of kernel ports is not usually an issue. Rather, the work generally centers around the availability of a BSP. As previously discussed, the ASP devices can be supported much like fixed devices with the libraries available to support the extensions into the programmable logic.
As complexity has increased and more functionality is being integrated into devices, more and more development time is often spent in the debug cycles. Therefore, a viable ASP must include a robust debug infrastructure. This infrastructure needs to provide more control and visibility into the internal workings of the devices. ARM provides such an infrastructure using the CoreSight debug and trace IP. On-chip trace using a few kilobytes of memory is important, especially for field failure analysis, where the trace pins for off-chip collection are not available. Trace pins should be available as an option so that users can recover valuable pin resources once the product is in production. For multicore designs, CoreSight enables visibility into program execution and is OS-aware. The CoreSight standard enables tool support for JTAG and trace so that tools from multiple vendors can support a wide variety of ARM-based devices from multiple vendors.
An embedded logic-analyzer capability that can trigger and capture events and data provides visibility into the programmable logic. Signals between the CoreSight IP and the programmable logic should exist, enabling cross-triggering between the two domains. Breakpoints set in the software debugger should trigger the capture of the embedded logic analyzer, while trigger points set in the logic analyzer can stop the debugger. This on-chip cross-trigger keeps the debugger and logic analyzer in closer synchronization, and the data capture is much easier to correlate. Users can also create custom debug IP and add it to the programmable logic to increase control, visibility and capture.
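The cross-triggering relationship can be modeled as a small publish and subscribe arrangement. This is a conceptual sketch only: the class names and the cycle counts are hypothetical, and it does not represent the CoreSight programming interface.

```python
class CrossTrigger:
    """Broadcasts a trigger from one domain to every other subscribed domain."""

    def __init__(self):
        self._handlers = {}   # domain name -> callback

    def subscribe(self, domain, handler):
        self._handlers[domain] = handler

    def fire(self, source, cycle):
        for domain, handler in self._handlers.items():
            if domain != source:       # do not echo back to the originator
                handler(cycle)


class Debugger:
    """Software-side view: halts the CPU on a breakpoint or an incoming trigger."""

    def __init__(self, xtrig):
        self.halted_at = None
        self._xtrig = xtrig
        xtrig.subscribe("cpu", self._on_trigger)

    def hit_breakpoint(self, cycle):
        self.halted_at = cycle
        self._xtrig.fire("cpu", cycle)

    def _on_trigger(self, cycle):
        self.halted_at = cycle


class LogicAnalyzer:
    """Fabric-side view: captures on its own trigger or an incoming one."""

    def __init__(self, xtrig):
        self.captured_at = None
        self._xtrig = xtrig
        xtrig.subscribe("fabric", self._on_trigger)

    def hit_trigger(self, cycle):
        self.captured_at = cycle
        self._xtrig.fire("fabric", cycle)

    def _on_trigger(self, cycle):
        self.captured_at = cycle


xtrig = CrossTrigger()
debugger = Debugger(xtrig)
analyzer = LogicAnalyzer(xtrig)

# A software breakpoint causes a capture in the fabric at the same cycle.
debugger.hit_breakpoint(1200)
capture_after_breakpoint = analyzer.captured_at

# A hardware trigger halts the CPU at the same cycle.
analyzer.hit_trigger(3400)
halt_after_trigger = debugger.halted_at
```

Because both sides record the same cycle, the software state and the captured waveform line up without manual alignment.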
The software development tools for ASPs must build on what software developers already use, from command-line tools to integrated development environments. Since OS support for the ARM architectures is widely available, tool support is also widely available from the OS vendors and others. Open-source options such as the GNU toolchain are well supported and available.
The critical component for the software tools is having the information on how the processing system is configured, the memory map, the functions residing in the programmable logic and the boot parameters. The configuration tool should provide this information for the software development tool in a human-readable and editable format to fit into the specific build environment the software development team is using.
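One way such information could reach a C build is as a generated header. The sketch below turns a memory map into `#define` lines; the block names, addresses and the include guard are hypothetical, and a real flow would read them from the configuration tool's output rather than from a list in the script.

```python
MEMORY_MAP = [
    # (name, base address, size in bytes); all values hypothetical
    ("UART0", 0x4000_0000, 0x1000),
    ("FIR_FILTER", 0x8000_0000, 0x1000),
    ("DMA_ENGINE", 0x8000_1000, 0x2000),
]


def render_header(entries, guard="ASP_MEMORY_MAP_H"):
    """Render a C header with one BASE and one SIZE macro per mapped block."""
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    for name, base, size in sorted(entries, key=lambda e: e[1]):
        lines.append(f"#define {name}_BASE 0x{base:08X}u")
        lines.append(f"#define {name}_SIZE 0x{size:08X}u")
    lines += ["", f"#endif /* {guard} */", ""]
    return "\n".join(lines)


header_text = render_header(MEMORY_MAP)
```

Regenerating the header whenever the hardware configuration changes keeps software addresses in step with the programmable logic without hand editing.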
The ASP devices that are now coming onto the marketplace present an exciting platform to address a variety of product development and deployment challenges. These devices are hybrids in that they provide a fixed but extensible development capability. But when properly designed to use available standards, they can leverage what already exists while simultaneously opening up new application spaces and opportunities. The entire platform, especially programming, must be well thought through so that a broad range of users can readily adapt the technology to build a variety of powerful and innovative products. A comprehensive platform must be the goal, as illustrated in Figure 3. This was the approach taken for the newly announced Xilinx Zynq-7000 Extensible Processing Platform.
Figure 3: More Than Just an ASP Device, a Comprehensive Platform.
San Jose, CA.