SOLUTIONS ENGINEERING
Data Acquisition
Maximizing Resources in Data Acquisition Systems with a Network Data Model
Several data management solutions are available for a data acquisition device, and the right choice depends on the device's needs. Selecting the appropriate data model can significantly affect cost, quality, end-user experience and ultimately customer success.
JOHN PAI, BIRDSTEP TECHNOLOGY
Data acquisition devices have come a long way, but since the first devices were created, one thing has remained the same: they have limited resources. In such embedded devices high performance is paramount, and small data acquisition platforms need to handle the data throughput at high rates, manage data relationships and process the information with minimal processing power so that more is available for the actual acquisition application.
A common misconception on hearing the term “database” is that it is synonymous with enterprise-wide relational databases or with slow, inefficient personal databases. Managing data in embedded devices is a different matter from managing data on PCs and servers. Enterprise databases are not designed for embedded devices, and adapting them results in large memory footprints, often entirely too large to fit into the target device. Attempting to adapt open source implementations presents its own special set of headaches. The embedded database is a distinct type of database, dramatically different from the well-known enterprise databases.
Embedded databases for data acquisition devices come in three main types: flat-file, relational and network models. Which one is best for a particular job depends on factors such as the type and amount of data to be processed, as well as how frequently it will be used.
Flat-File and Relational Models
Essentially, the flat-file model is a set of strings in one or more files that can be parsed to get at the information they store. It is adequate for storing simple lists and data values, but becomes unwieldy when modeling more complex data. One of the main issues with using flat files for even a semi-active database is the tendency toward corruption. Generally, a flat-file database has no locking mechanism to keep data from being corrupted when multiple threads try to write to the database simultaneously. Even when the device is designed for multiple threads, two or more threads can cause a “race condition” that a locking mechanism would prevent. A race condition may force the device to stall indefinitely, which is especially detrimental in an embedded device that normally cannot be reset easily. In a data acquisition device where reliability of data is also a priority, flat-file databases may not be the best choice.
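As a rough illustration of the locking point, the sketch below appends readings to a flat file from several threads and serializes the writes with a lock. The `FlatFileLog` class, its methods and the `sensor_id,value` line format are assumptions made for this example; they are not taken from any particular product, and the lock shown only protects threads within a single process.

```python
import os
import tempfile
import threading


class FlatFileLog:
    """Hypothetical flat-file store: one 'sensor_id,value' record per line."""

    def __init__(self, path):
        self.path = path
        self._lock = threading.Lock()  # serializes access within this process

    def append(self, sensor_id, value):
        # Holding the lock for the whole open/write/close sequence keeps
        # records from different threads from interleaving.
        with self._lock:
            with open(self.path, "a", encoding="ascii") as f:
                f.write(f"{sensor_id},{value}\n")

    def read_all(self):
        # Parse every line back into a (sensor_id, value) tuple of strings.
        with self._lock:
            with open(self.path, "r", encoding="ascii") as f:
                return [tuple(line.rstrip("\n").split(",")) for line in f]


def run_demo(n_threads=4, n_records=50):
    """Write from several threads, then return the parsed records."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        log = FlatFileLog(path)

        def worker(sensor_id):
            for i in range(n_records):
                log.append(sensor_id, i)

        threads = [
            threading.Thread(target=worker, args=(s,)) for s in range(n_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return log.read_all()
    finally:
        os.remove(path)
```

Calling `run_demo()` should return one well-formed record per `append` call. The application has to supply this discipline itself, because the flat file offers none of its own.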
The relational model stores data in tables composed of columns and rows. When data from more than one table is needed, a join operation relates the rows through a column that is duplicated in each table (Figure 1). While the relational model is flexible, performance is limited by the need to create new tables to hold the results of relational operations and to store redundant columns. Even an efficient design carries several sources of overhead: duplicated data, the work of maintaining database integrity, and the foreign keys needed to maintain relationships. This overhead increases file size and adds extra I/O to basic database operations, which is especially expensive in resource-constrained devices.
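The join described above can be sketched with Python's built-in `sqlite3` module, used here only as a convenient relational engine. The `sensor` and `reading` tables, their columns and the sample values are invented for illustration. Note that `sensor_id` is stored in both tables, which is the duplicated column the join depends on.

```python
import sqlite3


def readings_by_sensor():
    """Join two hypothetical tables on their shared sensor_id column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sensor (
            sensor_id INTEGER PRIMARY KEY,
            name      TEXT
        );
        CREATE TABLE reading (
            reading_id INTEGER PRIMARY KEY,
            sensor_id  INTEGER REFERENCES sensor(sensor_id),  -- foreign key
            value      REAL
        );
        INSERT INTO sensor  VALUES (1, 'temp'), (2, 'pressure');
        INSERT INTO reading VALUES (1, 1, 21.5), (2, 1, 21.7), (3, 2, 101.3);
        """
    )
    # The join matches rows by comparing the key value stored in both tables.
    rows = conn.execute(
        """
        SELECT s.name, r.value
        FROM sensor AS s
        JOIN reading AS r ON r.sensor_id = s.sensor_id
        ORDER BY r.reading_id
        """
    ).fetchall()
    conn.close()
    return rows
```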

Most developers are familiar with the relational database model, as implemented in products from Oracle, Informix, Sybase and others. An alternative data model architecture is more appropriate for resource-constrained devices such as data acquisition devices.
The Network Model
The network model was conceived as a flexible way of representing objects and their relationships. It predates the relational model and can be viewed as a superset of it. This implies that anything expressed in the relational model, even SQL support, can also be expressed in the network model. The main advantage is the way relationships are modeled.
A primary distinction from the relational data model is that the network model allows designers to describe relationships between records using “sets,” in which pointers relate objects directly and allow navigation between them (Figure 2). Compared with the relational model, the network model is faster, more reliable, uses disk space more efficiently and is better at expressing complex database designs.
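To make the idea of a set concrete, here is a minimal in-memory sketch in which an owner record points directly at its member records, so navigation follows pointers rather than matching key values. The `Sensor` and `Reading` classes and the `connect` and `members` methods are hypothetical names chosen for this example and do not represent any vendor's API.

```python
class Reading:
    """Member record: holds pointers instead of a foreign-key column."""

    def __init__(self, value):
        self.value = value
        self.owner = None  # pointer back to the owning Sensor
        self.next = None   # pointer to the next member of the set


class Sensor:
    """Owner record: heads a linked chain of member records (the set)."""

    def __init__(self, name):
        self.name = name
        self.first = None  # first member of the set
        self.last = None   # last member, kept so connect() is O(1)

    def connect(self, reading):
        # Link the new member at the end of the owner's chain.
        reading.owner = self
        if self.last is None:
            self.first = reading
        else:
            self.last.next = reading
        self.last = reading

    def members(self):
        # Navigate the set by following pointers; no key lookup is involved.
        node = self.first
        while node is not None:
            yield node
            node = node.next
```

For example, after connecting two readings to a sensor, iterating `sensor.members()` yields them in insertion order, and each reading reaches its sensor through `reading.owner` without a lookup.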

Discuss
Several points in this article are incorrect. The main reason for database corruption is the poor implementation of ACID properties. It isn’t locking or concurrency controls. In addition, a network model database consumes overhead via pointers instead of through primary and foreign keys. With a properly designed DBMS, the amount of overhead will be very similar between the network and relational model. Also, it is incorrect to state that relational databases require greater IO. In reality, the IO for a relational database will be similar, or may actually be less, than with a network model database if the user desires to scan through sorted records (a common practice). One thing that isn’t mentioned is the likelihood of corruption due to lost pointers. This is why all hierarchical and network model database companies had to add utilities to fix corrupted databases. These issues are not tolerable in embedded systems. In general, it seems like this article uses overly simple examples to show the network model unrealistically favorably. One has to ask: If the network model is so superior, why are there not more network model database systems? If they are so much faster and store data in less space, why aren’t the major enterprise database systems also network model? Wouldn’t those attributes be appealing to the enterprise, too? The reality is that the industry decided, long ago, that the network model’s disadvantages far outweigh its advantages, and no new network model database system has been written in 20 years.
Juergen, I am still not sure which points are incorrect. It is true that databases may become corrupted for various reasons, but the article did not name locking or concurrency controls as the main culprit of corruption. The use of pointers actually saves space and IO compared to the PK/FK relationship. Many times, you will find duplication in the PK and FK relationship, and that is why, relative to the network model, the primary key and foreign key relationship will cost an embedded system an overhead of AT LEAST 30%. When the relationships become more complicated, the relational model grows unpredictably less efficient. This is not acceptable to many embedded systems, as many Raima customers such as Boeing, Fujitsu, 3Com, Accenture, Alcatel-Lucent and many, many others have discovered. Even the best-designed relational databases on the market do indeed require more IO to perform the same tasks. Comparatively, Raima has demonstrated that network model databases require less IO than relational model counterparts like SQLite, Solid and various others. Depending on the competitor, Raima databases use at least 15% less IO. It is true that a database can become corrupted due to lost pointers, but this is just as likely as corruption in relational databases. Therefore, network model databases include APIs and utilities to address this issue. To Raima customers, the API/utility makes this a non-issue. Indeed, the network model does perform much better and requires fewer resources. In the past, engineers have preferred the relational model over the network model because the relational model is closer to SQL than to database-specific APIs. Those advantages are no longer monopolized by relational databases. Raima databases have implemented SQL APIs and ODBC. Now, engineers have the advantage of choosing a database with the portability of ODBC and the power and efficiency of a network model database.