Switched Fabrics and Publish-Subscribe Middleware Combine for a Robust Communications Architecture
The evolution of the VMEbus and other parallel bus backplane standards to support emerging, high-speed, serial switched fabric interconnects makes possible architectural freedoms that embedded system designers haven’t had in the past.
EMMANUEL ERIKSSON, DY 4 SYSTEMS, DAWN VME PRODUCTS
Designers of complex, distributed systems stand to gain the most from the emergence of switched fabrics. Table 1 shows some of the common requirements that these designers face and how those requirements are typically met with existing bus backplane technologies. The complexity of the software needed to implement current solutions to these common requirements increases system development and post-deployment maintenance costs. In some cases, the required complexity results in projects that fail to complete development.
Switched fabrics such as StarFabric, PCI Express Advanced Switching (AS) and Serial RapidIO give new freedoms to designers to implement their systems. In the case of StarFabric, the solutions are available today. For others, solutions are just around the corner. On VME, the new VITA 41 spec calls for the fabric interconnect to be implemented on the P0 connector. This leaves the backplane backward compatible with earlier specifications.
As the “switched fabric wars” rage and the switched fabric trade associations race to establish the physical and electrical specifications for their technologies, niceties such as the development of abstracted, full-featured communications protocols are left to others to define, and some would argue that this is for the best. An exception to this is InfiniBand, which defines a very rich API. One software solution for leveraging switched fabrics to move messages or large amounts of data with ultra-low latencies in closely coupled, multiprocessor systems was described in an earlier issue (see “Software Solutions for Interprocessor Communications,” p. 56, RTC, August 2003).
However, for applications requiring a more loosely coupled architecture supporting dynamic loading, hot failover and other advanced features, we suggest that Publish-Subscribe protocols—already popular with users of Ethernet interconnects—are an ideal choice. Switched fabric features such as Scalability, Quality of Service and High Availability fit very well with the way publish-subscribe operates.
There are three common communication models: point-to-point, client-server and more recently, publish-subscribe. Point-to-point is like a phone call. You know the address of the remote node, establish the connection, and then communicate. A phone call and a TCP/IP socket connection are examples of point-to-point or connection-oriented communications.
The client-server model was created to help scale the point-to-point model. Multiple client nodes can establish connection to a known address where a server waits to establish connections with each client. Clients can then make requests of the server and get replies. Overlaid on the concept of software objects, client-server underlies remote method invocations. This model is well established and is the basis of Microsoft’s DCOM and the Object Management Group’s (OMG) Common Object Request Broker Architecture (CORBA) standards.
For developers building real-time distributed applications, the client-server model has some distinct disadvantages: 1) The server represents a bottleneck and a potential single point of failure; 2) The request-reply semantics require two messages to get the data for each client, which increases bandwidth load and transaction latency; and, 3) It is often based around a remote method invocation or “object-centric” design that is not suitable for many distributed real-time applications that simply need to communicate data and not objects. Shoehorning object-centric communication models into “data-centric” systems frequently leads to unnecessarily complex system designs and significantly degraded performance.
Publish-subscribe excels at real-time data distribution. Publish-subscribe is characterized by a set of data producers and data consumers. Where client-server has a request-reply form, publish-subscribe is more a “push” model. That is, after the publishers and subscribers have identified themselves on the network, the data is pushed onto the network by the publishers. Subscribers can then pull the data off the network anonymously—no requests or polling are required.
Another advantage is anonymous communications—publishers and subscribers don’t need to know each other’s physical address. This is in direct contrast to the connection-oriented communications models. The middleware keeps track of which subscribers want which data from which publishers. This makes complex data distribution patterns quite simple to program. This anonymity also makes it simpler to set up redundant publishers for fault-tolerant systems. It’s also straightforward for nodes to come into and leave the network and for applications to be moved from node to node as required in load-balanced systems.
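The anonymity described above can be illustrated with a minimal, in-process sketch. The `TopicBus` class and the topic name are hypothetical stand-ins for real middleware, not the DDS API; the point is only that the publisher addresses a topic, never a subscriber.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class TopicBus:
    """Toy stand-in for publish-subscribe middleware (hypothetical, in-process only)."""

    def __init__(self) -> None:
        # The bus, not the publisher, tracks who wants which topic.
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, sample: Any) -> int:
        """Push one sample to every current subscriber; return how many received it."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(sample)
        return len(callbacks)


bus = TopicBus()
display_log: List[float] = []
recorder_log: List[float] = []

# Two consumers register interest; the publisher code below never names them.
bus.subscribe("sensor/temperature", display_log.append)
bus.subscribe("sensor/temperature", recorder_log.append)

delivered = bus.publish("sensor/temperature", 21.5)
print(delivered, display_log, recorder_log)
```

Adding a third subscriber requires one more `subscribe` call and no change to the publishing line, which is the property that makes redundant publishers and load balancing easier to arrange.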
The OMG (which manages the CORBA standard) recognized the need for publish-subscribe communications. In June 2003, the OMG adopted the new Data Distribution Service for Real-Time Systems (DDS) standard. Now there is a publish-subscribe standard for developers to use that is tailored specifically for real-time distributed systems.
As concerns communication network topologies, Ethernet uses the carrier sense multiple access/collision detect (CSMA/CD) algorithm for arbitrating transport access. This algorithm’s non-deterministic method of handling contentions or “collisions” is well-known. The TCP/IP protocol addresses the resulting delivery problems by layering a reliable transport protocol on top. However, TCP/IP is also problematic in real-time systems. Its reliability algorithm introduces non-deterministic delays of its own. Also, it is a connection-oriented protocol that doesn’t scale well and is hard to use when you need the flexibility that connectionless protocols provide.
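The source of CSMA/CD's non-determinism is its randomized retry. A short sketch of the truncated binary exponential backoff used by classic shared-medium Ethernet shows how the worst-case wait grows with each collision. The function names are illustrative, and the slot time assumes 10 Mbit/s Ethernet (512 bit times).

```python
import random

SLOT_TIME_US = 51.2   # 512 bit times at 10 Mbit/s (assumed link speed)
BACKOFF_LIMIT = 10    # the exponent stops growing after 10 collisions


def max_backoff_slots(collision_count: int) -> int:
    """Largest number of slot times a station may wait after the n-th collision."""
    return 2 ** min(collision_count, BACKOFF_LIMIT) - 1


def backoff_delay_us(collision_count: int, rng: random.Random) -> float:
    """Pick a uniformly random wait in [0, max_backoff_slots] slot times."""
    return rng.randint(0, max_backoff_slots(collision_count)) * SLOT_TIME_US


rng = random.Random()
for n in (1, 2, 3, 10):
    worst_case_us = max_backoff_slots(n) * SLOT_TIME_US
    print(f"collision {n}: drew {backoff_delay_us(n, rng):.1f} us, worst case {worst_case_us:.1f} us")
```

Because each station draws its wait at random, two runs of the same traffic pattern can produce different delivery times, which is exactly what a hard real-time designer cannot budget for.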
This tradeoff between reliability on the one hand and determinism and scalability on the other is simply not an option for many real-time distributed system designers. One solution is to implement a replacement Ethernet transport for use with real-time publish-subscribe middleware. Another is to leverage the capabilities of switched fabrics.
Switched Fabrics and Publish-Subscribe
Switched fabrics have been designed from the ground up to provide scalable, reliable and high-availability communications. Buses scale poorly. They are restricted by physical size and bandwidth. Using networking technologies such as Ethernet helps but introduces its own limitations, such as the need to trade reliability for determinism. Switched fabrics, on the other hand, are highly scalable both in the number of nodes and in the bandwidth between nodes, without the determinism and reliability constraints. For example, StarFabric can support thousands of nodes with interconnecting links supporting 2.5 Gbits/s in each direction. With support for transmission distances of 10+ meters over standard Cat 5e cable, StarFabric also allows designers to physically scale systems to room size.
Systems designed using publish-subscribe protocols are naturally scalable. With anonymous messaging, designers can change the number of subscribers to published data without affecting the publishing application code by simply duplicating the subscribing code on the added nodes. Publish-subscribe also simplifies bandwidth upgrades like those needed to improve, say, a control loop’s resolution in an industrial automation system to support a faster sensor. The designer simply adds the new sensor and increases the sensor’s publishing rate. The controller node receiving the publications will be notified of new data at the faster rate.
Publish-subscribe is inherently multicasting because it can efficiently publish data to any node that may potentially be subscribed to the data. Unlike IP, which relies on the stack to perform the multicast function, switched fabrics implement multicasting in the switches. The result is that protocol stack overhead is minimized.
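A rough count of the messages that end nodes must send puts the multicast benefit, and the earlier request-reply comparison, in numbers. This is a back-of-the-envelope sketch: the function names and the node and update counts are assumptions, and discovery and acknowledgment traffic are ignored.

```python
def request_reply_messages(clients: int, updates: int) -> int:
    """Each client sends a request and receives a reply for every update."""
    return 2 * clients * updates


def publish_messages(subscribers: int, updates: int, switch_multicast: bool) -> int:
    """Messages the publishing node sends; with switch multicast the fabric replicates."""
    sends_per_update = 1 if switch_multicast else subscribers
    return sends_per_update * updates


clients, updates = 8, 1000  # illustrative values only

polling = request_reply_messages(clients, updates)                        # 16000
unicast_push = publish_messages(clients, updates, switch_multicast=False)  # 8000
multicast_push = publish_messages(clients, updates, switch_multicast=True)  # 1000

print(polling, unicast_push, multicast_push)
```

The replication work does not disappear with switch-level multicast; it moves from the publisher's protocol stack into the switch hardware.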
Quality of Service (QoS) features are essentially lacking in Ethernet protocols. Switched fabrics, however, offer rich QoS features that help designers develop reliable, hard real-time systems.
For example, the credit-based flow control mechanisms used by StarFabric and PCI Express AS permit bandwidth-reserved isochronous transactions across the fabric. Isochronous transactions occur at a fixed periodic interval and fixed latency. The result is guaranteed messaging with deterministic behavior. For hard real-time applications with strict latency requirements, isochronous messaging support combined with the matching periodic publish-subscribe messages makes for a communications architecture that is both robust and easy to program.
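The general idea behind credit-based flow control can be sketched in a few lines. This is a simplified model under stated assumptions, not the StarFabric or PCI Express AS link protocol: the `CreditedLink` class is hypothetical, one credit represents one free receive buffer, and credit return is instantaneous.

```python
from collections import deque


class CreditedLink:
    """Simplified model: the sender may transmit only while it holds credits."""

    def __init__(self, receiver_buffers: int) -> None:
        self.credits = receiver_buffers  # one credit per free receive buffer
        self.rx_queue = deque()

    def try_send(self, packet: str) -> bool:
        """Consume a credit and deliver the packet, or refuse if none remain."""
        if self.credits == 0:
            return False  # sender holds the packet; nothing is dropped in the fabric
        self.credits -= 1
        self.rx_queue.append(packet)
        return True

    def receiver_consume(self) -> str:
        """Receiver frees a buffer, which returns one credit to the sender."""
        packet = self.rx_queue.popleft()
        self.credits += 1
        return packet


link = CreditedLink(receiver_buffers=2)
first_attempts = [link.try_send(p) for p in ("a", "b", "c")]
consumed = link.receiver_consume()
retry = link.try_send("c")
print(first_attempts, consumed, retry)
```

Because a packet is never sent without a buffer waiting for it, the receiver cannot overflow, which is one ingredient a fabric needs before it can promise reserved bandwidth.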
Just as switched fabrics offer deterministic latencies at the transport level, DDS publish-subscribe middleware makes determinism possible at the application level. For example, the DDS specification allows application developers to specify a Latency_Budget QoS policy. The middleware can use this Latency_Budget policy to better manage how it aggregates data for sending from multiple applications running on one node to multiple applications on another node. In this manner, publishers can ensure the middleware expedites their data versus the data of other publishers on the node.
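One way middleware could use a latency budget when aggregating data is sketched below. This is a hypothetical policy, not how any particular DDS implementation behaves: the `LatencyBudgetBatcher` class is invented for illustration, time is passed in explicitly as integer milliseconds, and a batch is flushed as soon as the tightest budget among the pending samples expires.

```python
from typing import List, Tuple


class LatencyBudgetBatcher:
    """Hold samples for aggregation, but never past the tightest latency budget."""

    def __init__(self) -> None:
        self._pending: List[Tuple[int, str]] = []  # (flush-by time in ms, sample)

    def write(self, sample: str, now_ms: int, latency_budget_ms: int) -> None:
        self._pending.append((now_ms + latency_budget_ms, sample))

    def poll(self, now_ms: int) -> List[str]:
        """Return one aggregated batch if any pending sample's budget has run out."""
        if not self._pending:
            return []
        earliest_flush = min(flush_by for flush_by, _ in self._pending)
        if now_ms < earliest_flush:
            return []  # still within every budget, keep aggregating
        batch = [sample for _, sample in self._pending]
        self._pending.clear()
        return batch


batcher = LatencyBudgetBatcher()
batcher.write("status", now_ms=0, latency_budget_ms=50)  # tolerant publisher
batcher.write("alarm", now_ms=1, latency_budget_ms=2)    # urgent publisher

too_early = batcher.poll(now_ms=2)
flushed = batcher.poll(now_ms=3)
print(too_early, flushed)
```

In this sketch the urgent publisher's small budget forces the flush, and the tolerant publisher's sample rides along in the same batch at no extra cost.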
The ability to replace processor blades in a powered and running system is becoming a common requirement. Switched fabric specifications provide for physical layer hot plug capability. DDS publish-subscribe also provides “virtual” hot plug capability at the application level that complements switched fabrics’ support. Because publish-subscribe messaging is anonymous, the unannounced removal of data-reading nodes from the network will not cause the errors that would occur under a connection-based client-server model.
Switched fabrics provide rich error management features designed to support High-Availability requirements. For instance, failures in PCI Express AS fabric paths are reported to a Fabric Manager (FM) node that identifies the failed paths and reroutes traffic to avoid the failure.
At the application level, DDS-compliant middleware users can set a Deadline QoS policy on their subscribers. If the publisher doesn’t publish a new update to the subscriber within the specified Deadline time duration, the subscribing application is notified that new data was not available. This could indicate a failed application on the publishing node allowing appropriate application-specific error recovery to take place.
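The Deadline behavior described above amounts to a watchdog on the arrival time of the last sample. A minimal sketch follows, assuming integer millisecond timestamps supplied by the caller; the `DeadlineMonitor` class is hypothetical and is not the DDS API.

```python
from typing import List


class DeadlineMonitor:
    """Flag a missed deadline when no new sample arrives within deadline_ms."""

    def __init__(self, deadline_ms: int, start_ms: int = 0) -> None:
        self.deadline_ms = deadline_ms
        self._last_sample_ms = start_ms
        self.missed_at: List[int] = []  # times at which a miss was detected

    def on_sample(self, now_ms: int) -> None:
        """Called whenever the subscriber receives an update."""
        self._last_sample_ms = now_ms

    def check(self, now_ms: int) -> bool:
        """Return True, and record the miss, if the publisher has gone quiet too long."""
        if now_ms - self._last_sample_ms > self.deadline_ms:
            self.missed_at.append(now_ms)
            return True
        return False


monitor = DeadlineMonitor(deadline_ms=100)
monitor.on_sample(now_ms=0)
monitor.on_sample(now_ms=90)

on_time = monitor.check(now_ms=150)  # 60 ms since the last sample
late = monitor.check(now_ms=200)     # 110 ms since the last sample
print(on_time, late, monitor.missed_at)
```

The application decides what a miss means; typical responses would be to raise an alarm or to fall back to a redundant data source.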
Support for redundant fabric paths is also a key feature of switched fabrics. For example, StarFabric’s support for a distributed switch topology results in each node having multiple, redundant paths to other nodes in the fabric (Figure 1). Combined with the error management features, this gives the designer simple-to-implement physical interconnect redundancy.
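How error reporting and redundant paths work together can be shown with a small path search. The four-node topology, the node names and the `find_path` helper are assumptions made for illustration; a real Fabric Manager works with the fabric's own routing tables rather than a breadth-first search in software.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Sequence, Tuple

Link = Tuple[str, str]


def find_path(links: Sequence[Link], src: str, dst: str,
              failed: FrozenSet[Link] = frozenset()) -> Optional[List[str]]:
    """Breadth-first search for a shortest path that avoids failed links."""
    neighbors: Dict[str, List[str]] = {}
    for a, b in links:
        if (a, b) in failed or (b, a) in failed:
            continue  # skip links reported as failed
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    previous: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:  # walk back to the source
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in sorted(neighbors.get(node, [])):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None  # no surviving path


# Two end nodes (A, B) joined through two switches (S1, S2): two disjoint paths.
links = [("A", "S1"), ("A", "S2"), ("S1", "B"), ("S2", "B")]

normal = find_path(links, "A", "B")
rerouted = find_path(links, "A", "B", failed=frozenset({("S1", "B")}))
print(normal, rerouted)
```

With both switch-to-B links marked as failed the function returns `None`, which corresponds to the case where physical redundancy is exhausted and recovery must happen at a higher level.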
At the application level, DDS provides for redundant publishers. Redundant applications can be created that publish the exact same data onto the fabric, but with different “strengths”. Subscribers to the data topic will receive data from the higher “strength” publisher (the higher strength publication “masks” the lower strength publication). As shown in Figure 2, if the higher strength publisher fails or is removed from the network, the middleware automatically switches the subscribers to the lower strength or backup publisher without skipping a beat. This provides for “hot failover” redundancy.
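The masking rule can be expressed compactly. In the sketch below the `ExclusiveTopic` class, the publisher names and the strength values are hypothetical. The DDS specification expresses this behavior through its ownership and ownership-strength QoS policies, and a real implementation would detect publisher failure itself rather than rely on an explicit `remove` call.

```python
from typing import Dict, Optional


class ExclusiveTopic:
    """Deliver samples only from the strongest publisher currently alive."""

    def __init__(self) -> None:
        self._alive: Dict[str, int] = {}  # publisher name -> strength

    def register(self, publisher: str, strength: int) -> None:
        self._alive[publisher] = strength

    def remove(self, publisher: str) -> None:
        """Model a publisher failing or being pulled from the network."""
        self._alive.pop(publisher, None)

    def owner(self) -> Optional[str]:
        if not self._alive:
            return None
        return max(self._alive, key=self._alive.get)

    def deliver(self, publisher: str, sample: float) -> Optional[float]:
        """Return the sample if it comes from the owner; otherwise it is masked."""
        return sample if publisher == self.owner() else None


topic = ExclusiveTopic()
topic.register("primary", strength=10)
topic.register("backup", strength=5)

masked = topic.deliver("backup", 1.0)          # backup is masked by primary
from_primary = topic.deliver("primary", 1.0)

topic.remove("primary")                        # primary fails
after_failover = topic.deliver("backup", 1.1)  # backup now owns the topic
print(masked, from_primary, after_failover)
```

The subscriber code is identical before and after the failover; only the source of the samples changes.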
Designers of distributed, embedded computing systems with one or more of the following requirements should consider the advantages of using publish-subscribe atop a switched fabric interconnect:
- Deterministic messaging
- High-availability requirements such as hot swap
- Load balancing—either dynamically in the deployed system or simply as part of an iterative software development cycle
- Support for scaling up the number of processor cards in the future
By employing publish-subscribe the designer can extend the features of switched fabric interconnects to the application layer thus providing not just physical but software redundancy and determinism as well. As long as the physical distribution of the interconnect is limited to about 10 meters, copper media-based, high-speed switched fabric interconnects can meet these needs. For more widely distributed systems, a mixture of both a switched fabric and Ethernet could be used—the former for the more physically co-located, hard real-time processor cards and the latter for remote, soft real-time subsystems. The publish-subscribe middleware provides a common API and communications model over both networks. The combination of switched fabrics and publish-subscribe middleware provides a robust, real-time communications platform that greatly simplifies developing scalable, fault-tolerant, field-maintainable distributed systems.
Kanata, Ontario, Canada.