Data-Oriented Architecture: Loosely Coupling Systems into “Systems of Systems”
Real-time systems today interact with other systems, both hard and soft real time as well as enterprise. Easily integrating them requires a fresh look at data-oriented interfaces.
RAJIVE JOSHI, REAL-TIME INNOVATIONS INC.
Embedded systems are no longer self-contained boxes of electronics and firmware. Many of them now have to interact with the rest of the world. They have to cooperate with each other, and in turn provide information to upstream systems so that other embedded systems, human operators and even the general public can see what is going on.
You can see this effect in a new generation of traffic-control systems. Data from traffic sensors does not just control how traffic lights and active road signs operate in real time. The same information is used to provide an overview of what is happening across a city to operators in a control center and fed to public information kiosks around the city to let people know what their journey will be like. It may even be streamed to drivers in their cars. This is the structure of a new class of “system of systems” and it requires a fresh design perspective to address the realities of integrating components from multiple independent providers.
Service-Oriented Architecture (SOA) has emerged as an effective paradigm in the enterprise computing space for addressing integration of software components. How do we apply the SOA guiding principles of reuse, granularity, modularity, composability, componentization, interoperability and standards compliance to the design of embedded and real-time systems to achieve similar business benefits? What are the fundamental underlying principles that we can use to integrate embedded systems with enterprise systems, while optimizing end-to-end real-time performance? Can this be done without magnifying the ongoing operational and administrative costs?
We address these concerns by introducing the framework of “Data-Oriented Architecture,” which can be viewed as a SOA for real-time development and end-to-end integration of disparate independently developed software components. The data-oriented architecture framework results in “loosely coupled” software components with data-oriented interfaces that can be seamlessly integrated using a high-performance standards-based communication middleware infrastructure, such as the Data Distribution Service (DDS).
A Real-Time System-of-Systems Scenario
The air traffic control example of Figure 1 involves a variety of disparate systems that must seamlessly operate as a whole. On the “edge” is a real-time avionics system inside the aircraft, which may communicate with a control tower. The data flowing in this system is typically at high rates and is time-critical. Violating timing constraints could result in the failure of the aircraft or jeopardize life or safety.
The control tower is yet another independent real-time system, monitoring various aircraft in the region, coordinating their traffic flow and generating alarms to highlight unusual conditions. The data flowing in this system is time-sensitive for proper local and wide-area system operation, although it may be a bit more tolerant of occasional delays.
In our simplified example, the control tower communicates with the airport enterprise information system. The enterprise information system keeps track of historical information, flight status and so on and may communicate with multiple control towers and other enterprise information systems. The enterprise system is not in the time-critical path and therefore can be much more tolerant of delays on arrival of data. This enterprise system is responsible for synthesizing a composite “dashboard” view, such as passenger information, flight arrival and departure status.
Key Integration Challenges
Such a system of systems must effectively deal with various issues. The information crosses trust boundaries, where each system is controlled and managed independently, and involves social, political and business considerations. The quantitative and qualitative differences in the data exchange, performance and real-time requirements across the disparate systems must be dealt with. For example, an edge system often carries time-critical data at high rates, some of which must eventually trickle into an enterprise system. Also, the architecture involves different technology stacks, design models and component life cycles. Many of these systems have different components evolving at different rates and being upgraded independently.
By their very nature, the systems under consideration are loosely coupled: only minimal assumptions can be made about the interface between two interacting systems. The integration should be robust to independent changes on either side of an interface. Ideally, changes on one side should not force changes on the other side. This implies that the interface should contain only the invariants that describe the interaction between the two systems. Because behavior is implemented by each independent system, the interface between them must not include any system-specific state or behavior. Therefore, the essential invariant is the information exchange between the two systems (Figure 2).
An information exchange can be described in terms of the information exchange “data model,” which involves the roles of data “producer” and “consumer” participating in the information exchange. Thus, when dealing with loosely coupled systems, a system’s interface can be described in terms of the data model and the role (producer or consumer) the system plays in the information exchange.
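As a minimal sketch of this idea, an interface can be written down as nothing more than a data model plus a role. The class names, the FlightStatus model and its fields below are hypothetical, chosen only to echo the air traffic control scenario; they are not part of DDS or of any product API.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PRODUCER = "producer"
    CONSUMER = "consumer"


@dataclass(frozen=True)
class DataModel:
    """Named, typed description of the information being exchanged."""
    name: str
    fields: tuple  # ((field_name, type_name), ...)


@dataclass(frozen=True)
class Interface:
    """A system's interface: the data model plus the role it plays."""
    data_model: DataModel
    role: Role


def can_exchange(a: Interface, b: Interface) -> bool:
    """Two interfaces connect when they share a data model and have opposite roles."""
    return a.data_model == b.data_model and a.role != b.role


# Hypothetical data model for the air traffic control scenario.
FLIGHT_STATUS = DataModel(
    name="FlightStatus",
    fields=(("flight_id", "string"), ("altitude_m", "float"), ("eta_s", "int")),
)

tower = Interface(FLIGHT_STATUS, Role.PRODUCER)
airport_it = Interface(FLIGHT_STATUS, Role.CONSUMER)
print(can_exchange(tower, airport_it))
```

Note that nothing in the interface refers to how the control tower or the airport system is implemented; either side can be replaced as long as the data model and role are preserved.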
The systems on either side of an interface may differ in the qualitative aspects of their behavior, including differences in data volumes, rates, real-time constraints and so on. We use the term “impedance mismatch” as shorthand for all the non-functional differences in the information exchange between two systems. Dealing with the impedance mismatch will involve considerations such as the quality of service that each side of the exchange expects and the architecture the overall system of systems adopts.
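One way to picture how an impedance mismatch might be handled is sketched below, assuming a simple rule in which a producer must offer at least the quality of service a consumer requests, plus a down-sampling step for a slower consumer. The Qos fields, function names and numeric values are illustrative assumptions, only loosely modeled on the request/offer style of matching; they are not a real middleware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Qos:
    reliable: bool      # True: delivery is guaranteed; False: best effort
    deadline_s: float   # maximum time allowed between successive updates


def compatible(offered: Qos, requested: Qos) -> bool:
    """The producer must offer at least what the consumer requests."""
    reliability_ok = offered.reliable or not requested.reliable
    deadline_ok = offered.deadline_s <= requested.deadline_s
    return reliability_ok and deadline_ok


def downsample(samples, min_separation_s):
    """Keep only (timestamp, value) samples at least min_separation_s apart."""
    kept, last_t = [], None
    for t, value in samples:
        if last_t is None or t - last_t >= min_separation_s:
            kept.append((t, value))
            last_t = t
    return kept


# Hypothetical settings: a fast, reliable edge system and a tolerant enterprise system.
avionics = Qos(reliable=True, deadline_s=0.01)
enterprise = Qos(reliable=False, deadline_s=5.0)

print(compatible(offered=avionics, requested=enterprise))
print(downsample([(0.0, "a"), (0.5, "b"), (1.0, "c"), (1.2, "d"), (2.0, "e")], 1.0))
```

The design choice here is that the mismatch is resolved in the exchange itself: the fast producer is not slowed down, and the slow consumer is not flooded.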
The independently managed systems can appear and disappear asynchronously, as they are started, shut down, rebooted, upgraded or reconfigured. The environment can change dynamically, causing systems to react differently. In general, it is neither possible nor practical to have a centralized administrator or coordinator of the various systems, especially at the granularity of asynchronous changes that may occur dynamically. Thus, each system must detect and react to dynamic changes as they occur. Ideally, the information exchange infrastructure would be self-aware in the sense of being able to detect and inform the systems when changes occur in their connectivity with other systems.
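The following sketch illustrates one possible shape for such self-awareness without a central coordinator: each system keeps its own local view of its peers, based on periodic announcements and a lease timeout, and is notified when a peer appears or disappears. The PeerTracker class, the lease value and the peer names are assumptions made for illustration; real middleware discovery protocols are considerably more involved.

```python
class PeerTracker:
    """Local view of which peers are alive, kept by each system independently."""

    def __init__(self, lease_s, on_appear, on_disappear):
        self.lease_s = lease_s
        self.on_appear = on_appear
        self.on_disappear = on_disappear
        self._last_seen = {}

    def heard_from(self, peer, now):
        """Record an announcement (heartbeat) received from a peer."""
        if peer not in self._last_seen:
            self.on_appear(peer)
        self._last_seen[peer] = now

    def check(self, now):
        """Drop peers whose lease has expired and notify the application."""
        for peer, seen in list(self._last_seen.items()):
            if now - seen > self.lease_s:
                del self._last_seen[peer]
                self.on_disappear(peer)

    def alive(self):
        return sorted(self._last_seen)


events = []
tracker = PeerTracker(
    lease_s=3.0,
    on_appear=lambda p: events.append(("appeared", p)),
    on_disappear=lambda p: events.append(("disappeared", p)),
)
tracker.heard_from("control_tower", now=0.0)
tracker.heard_from("airport_it", now=1.0)
tracker.heard_from("airport_it", now=4.0)
tracker.check(now=5.0)  # control_tower was last heard at 0.0, so its lease has expired
print(events, tracker.alive())
```

Time is passed in explicitly so that the sketch stays deterministic; a real system would read a clock and run the check periodically.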
System design in general may be viewed as a collection of interconnected components. Depending on the context and granularity of scale, a component may be, for example, a system of systems, an entire system within a system of systems, or simply an application in a system.
How do we create an unbreakable system software architecture that can accommodate disparate components maintained by independent parties?
An appropriate design model for building loosely coupled systems can be found in the principles of “data-oriented programming” or DOP as described by Eugene Kuznetsov of DataPower Technology (now part of IBM). It is based on the observation that the data model is the only invariant (if any) in a loosely coupled system, and should be exposed as a first-class citizen. In other words, data is primary and the operations on the data are secondary.
Data-oriented programming is complementary to object-oriented programming and provides a solid foundation for constructing loosely coupled systems. It can be seen as the theoretical basis for much of the recent work on service-oriented architecture (SOA) and Web services.
For a very simple example of what data-oriented programming entails, let us consider the task of registering a sale in a point-of-sale system. The participants involved in this task are: a customer, a store and the item sold. From a data-oriented viewpoint, we would define customer, store and item as publicly exposed data, and formally describe their structure as public metadata. We might define a message called register_sale that operates on the customer, store and item data to accomplish the task.
The consumer (or provider) of this message would have all the information necessary to execute the producer’s (or requestor’s) request, and its implementation is no longer tied to the behavior of the customer, store and item objects. If the definition of customer, store or item changes, the associated metadata is updated to inform the consumer of those changes, so that the data processing in the application logic can be adjusted accordingly. Thus, this approach is robust to the changes in the data structure, as well as the behavior of the participants.
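The point-of-sale example can be sketched roughly as follows, assuming the public metadata is a simple mapping of field names to types and the register_sale message is a plain dictionary. The schema layout, field names and sample values are hypothetical and stand in for whatever metadata format a real system would use.

```python
# Public metadata: field names and types for each kind of data (hypothetical).
SCHEMA_V1 = {
    "customer": {"id": str, "name": str},
    "store": {"id": str, "city": str},
    "item": {"sku": str, "price": float},
}

# A later revision adds a field to "customer"; only the metadata changes.
SCHEMA_V2 = {**SCHEMA_V1, "customer": {**SCHEMA_V1["customer"], "loyalty_tier": str}}


def validate(message, schema):
    """Check that the message carries every field described by the metadata."""
    for part, fields in schema.items():
        for field, ftype in fields.items():
            if not isinstance(message[part].get(field), ftype):
                raise ValueError(f"{part}.{field} missing or not {ftype.__name__}")


def register_sale(message, schema):
    """Consumer side: acts on the data alone, guided by the metadata."""
    validate(message, schema)
    return {
        "customer_id": message["customer"]["id"],
        "store_id": message["store"]["id"],
        "sku": message["item"]["sku"],
        "amount": message["item"]["price"],
    }


sale = {
    "customer": {"id": "C-17", "name": "Ada", "loyalty_tier": "gold"},
    "store": {"id": "S-3", "city": "Springfield"},
    "item": {"sku": "SKU-42", "price": 19.99},
}
receipt = register_sale(sale, SCHEMA_V2)
print(receipt)
```

In this sketch the consumer never calls methods on customer, store or item objects. A consumer still working from SCHEMA_V1 would process the same message unchanged, simply ignoring the field it does not know about.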
It is important to note that a design model by itself does not result in well-designed systems. Like object-oriented programming, data-oriented programming can also be misused and abused. Data-oriented programming provides the principles and guidance for building loosely coupled systems. However, design is fundamentally a human activity; models and tools can only facilitate the process.
Data-Oriented Integration Architecture
Large-scale distributed systems of systems are often a mishmash of different architectural styles. They are systems created by independent parties, often using different middleware technologies, with misaligned interfaces. A naïve approach to integrating such systems requires a custom point-to-point integration for each pair of systems, which for N systems is on the order of N*N integrations. This approach does not scale, yet it is often the outcome in practice.
A better approach is to apply data-oriented programming: explicitly formalize the data and metadata produced and consumed by each component or system, then use a “data bus” to connect them. This results in a generic Data-Oriented Architecture framework (Figure 3).
Such a data bus can accommodate a wide variety of architectural styles, and reduces the integration problem from an O(N*N) problem to an O(N) problem, because each system integrates once with the bus rather than separately with every other system. The popular architectural styles can be seen as specializations of the generic data-oriented architecture, by appropriate assignment of roles to the various components.
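The scaling argument can be made concrete with a small sketch. It assumes one custom integration per pair of systems in the point-to-point case and one attachment per system in the bus case; the DataBus class is a toy in-process stand-in written for illustration, not a real middleware API, and the topic name and sample are made up.

```python
from collections import defaultdict


def point_to_point_links(n):
    """Custom integrations needed when every pair of systems is wired directly."""
    return n * (n - 1) // 2


def bus_links(n):
    """Connections needed when each system attaches once to a shared data bus."""
    return n


class DataBus:
    """Toy in-process bus: producers and consumers meet only through a topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, sample):
        for callback in self._subscribers[topic]:
            callback(sample)


bus = DataBus()
dashboard, archive = [], []
bus.subscribe("FlightStatus", dashboard.append)
bus.subscribe("FlightStatus", archive.append)
bus.publish("FlightStatus", {"flight_id": "RT101", "eta_s": 540})

for n in (5, 20, 100):
    print(n, point_to_point_links(n), bus_links(n))
```

For 100 systems the pairwise count is 4,950 integrations against 100 bus attachments. The producer in the sketch also has no knowledge of how many consumers exist, which is what lets new consumers be added without touching it.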
An example of middleware infrastructure able to provide a constantly available real-time distributed data bus is the RTI Data Distribution Service, which complies with the Object Management Group’s DDS open standard specification. The DDS middleware infrastructure takes on important responsibilities and provides the underlying infrastructure to realize a loosely coupled real-time service-oriented architecture (SOA).
The data-oriented programming and design philosophy maps naturally to the data modeling and introspection capabilities provided by the DDS middleware. The resulting data-oriented architecture framework provides key capabilities that ease the construction of systems of systems and distributed systems in general. It addresses the challenges of (a) dynamic real-time adaptation and (b) scalability and performance. It can also facilitate integration by directly supporting the data-oriented design approach for loosely coupled systems, and thus aid in addressing (c) incremental and independent development and (d) impedance mismatch issues across systems of systems.
Leading DDS implementations provide a low-latency, high-throughput messaging and data caching infrastructure that uses direct peer-to-peer communication to optimize end-to-end performance. They do not require any servers or daemons to be running, so the system has no single point of failure and no central load bottleneck.
Rapid Development Using Integrated Application Platforms
Many application components can be rapidly developed by using off-the-shelf application platforms that have been a priori integrated with the communications data bus infrastructure. Applications are organized around an agreed-upon underlying semantic data model, which is automatically mapped to the natural representation used by the application platform, in accordance with the principles of data-oriented design (Figure 4).
Useful application platform components include event processing engines to interpret and transform streaming real-time data in meaningful ways; databases for storing, retrieving and manipulating historical and reference data; enterprise service buses for integration with existing infrastructure; application servers for providing and using Web services; workflow engines for orchestrating processes; and reusable business-critical legacy components.
Leading real-time middleware vendors have already begun providing a complete end-to-end real-time application development platform, comprising components that integrate the real-time communications infrastructure with popular application platform technologies such as (1) event processing engines, to make meaning out of continuously flowing real-time data, and (2) databases, to automatically map tables stored in memory or on disk to real-time data sources and/or sinks. The availability of such integrated application components can dramatically lower the risk in the integration phase and substantially speed up the system-of-systems development effort.
The next generation of distributed systems will be loosely coupled systems that support incremental and independent development and are tolerant of interface changes; can systematically deal with impedance mismatches; work well in dynamically changing real-time situations; and can scale in complexity while delivering the required real-time performance.
Popular architectural styles, including data flow architecture, event-driven architecture, data caching architecture and client-server architecture can be regarded as special cases of a generic “data-oriented” architecture, by the appropriate assignment of roles and choice of quality of service in the interfaces between components. Data-oriented application architecture coupled with an appropriate standards-based communications middleware such as DDS can cut down the complexity of the integration problem, while preserving loose coupling and ensuring scalability.
An end-to-end and real-time application development platform that integrates useful application development technologies with a standards-based communications infrastructure such as DDS can further boost productivity and lower the cost of integration by reducing the overall risk and complexity of working with disparate systems.
Santa Clara, CA.