How Polyhedra’s active query mechanism improves the performance of real-time applications
In a big real-time system (big enough that the software is split into a number of applications, running semi-independently) it can be quite a problem to ensure that one application knows when another application has altered the shared data. For example, in a telecoms system a line card has to know when somebody has changed its configuration – but it is far too expensive to keep polling a database just to see if something has changed. Polling is a poor approach for three reasons: most polls will find that nothing has changed, so CPU time and network bandwidth are wasted; a change is not noticed until the next poll, so responsiveness is limited by the polling rate; and the polling load grows with the number of interested components, so the approach does not scale.
There are other solutions, such as an explicit notification mechanism where something that makes a change sends a message to all interested parties - however, this increases the amount of work that has to be done by the component making the change, and puts a lot of responsibility on the author of the component. The Polyhedra solution is to allow "active queries", where the database server itself takes responsibility for notifying interested parties about relevant changes. This simplifies the work both of the component updating the database and of the component interested in the change, and improves the scalability of the system.
The following article goes into more detail about the background and the capabilities of our active queries. (A copy of this article in PDF format is appended below, for ease of printing and offline viewing; it includes a few illustrations, and also has sample C code illustrating how a client application can perform active queries. There is also a 3-part webcast that covers the same topic.)
All but the simplest real-time applications are split into a number of software components to make things manageable and to allow different people or teams to work on the application at the same time. Each component will be responsible for a separate job, and will communicate with other components via defined interfaces. The data-handling needs of each component will differ, but in general some data will have to be shared between components, and some data will need to be carefully preserved to guard against partial or complete system failure; there may or may not be an overlap between these categories. (Thus, some shared data might be transitory by nature, whereas configuration information will need to be both shared and preserved.)
Traditionally, one component might act as the ‘data manager’ for the application, mediating access to the data and ensuring the critical stuff is preserved; where speed was important, this data manager component would be developed specifically for the application – but this can be complex, and it can be simpler to use a database management system designed for real-time work. In fact, the main reason for adopting commercial database technology is the same as that for adopting any COTS (commercial off-the-shelf) technology: cost. This shows up in many aspects, including reduced risk, reduced development time, reduced maintenance costs, and fewer specialist skills needed in the development team. As with all COTS technology, it is essential that the chosen product is appropriate to one’s needs, otherwise costs will actually increase rather than decrease. In order to keep the footprint down and performance up, a real-time COTS DBMS will be more light-weight than a full, general-purpose DBMS, but will still offer benefits such as transactions, data isolation, data persistence, and industry-standard APIs (such as SQL, ODBC and JDBC), as well as hot-standby capabilities to guard against partial system failure and the ability to adapt the database structure on the fly to cope with changing needs.
Consider first a SCADA system, controlling (say) a factory or electrical distribution network. Sensors are feeding back readings to the central system, and operators can see what is happening via ‘mimic diagrams’. The question arises, how are the mimic diagrams kept up to date? Does the responsibility lie with the components that are obtaining the readings from the sensors, or the data management component?
Next, consider a base station in a telecoms system. The line cards will need to obtain their configuration from somewhere; rather than each card having storage capabilities, it is easier to keep the information on cards in the control plane – preferably in a fault-tolerant fashion. External workstations would be able to change the information, but will have no network connection to the line cards and thus would have no direct way of informing them about updates. In a similar fashion, line cards need a way to report their status so that administrative workstations can monitor it.
A final example: in a financial system, the database could hold information about the most recent share prices of stocks, shares or other ‘financial instruments’, allowing traders to monitor their favourite shares in real time.
In all of the above scenarios, there is a general need for a low-cost mechanism that allows components to determine when the data in which they are interested is changed ‘under their feet’ by other components of the system.
The problem with tailored push mechanisms – where the originator/detector of an event tells concerned parties about changes that are relevant to the recipient (without telling them about irrelevant changes) – is that it can take significant effort to adapt them to changing requirements, both during the development phase and beyond. A generic publish & subscribe mechanism avoids this problem, but may not offer sufficiently fine granularity or may not allow precisely the right information to be supplied. Thus, the listener might just get a notification that a value associated with a datum point has changed, and may have to find out for itself whether it is a point of interest, and if so what the current value is. Even if the publish and subscribe mechanism avoids these problems, it still complicates the job of the component publishing the information, as it is responsible both for putting the information into the database (for long-term storage, say, or to be available to components that are not running or that do not need immediate notification) and for initiating the notification process to subscribers.
Some database systems allow code to be attached to a database, to be triggered on an event. Normally, the trigger code will be written by the database administrator using PL/SQL or a DBMS-specific trigger language such as Polyhedra’s CL language, and the role of the triggers is either to perform some integrity checks (aborting the transaction if the conditions are not met) or to perform some ‘knock-on’ actions. The integrity checks can be as simple as ensuring particular attributes of the changed records are within a range directly or indirectly defined by other fields in the record, or can be much more complex. Thus, in a telecoms application, an attempt to create a record for a new connection might be rejected if either of the endpoints is flagged as being in a call – if, however, the new connection is permissible, the trigger on the connection table can set the flags indicating that the two endpoints are in a call. (A separate trigger would clear these flags when the connection record is deleted.)
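In Polyhedra itself this logic would be written in CL and attached to the relevant tables; as a language-neutral sketch of the same logic, the following C fragment (with hypothetical Endpoint and Connection structures standing in for database records) shows the admission check and the knock-on actions described above:

```c
#include <stdbool.h>

/* Hypothetical in-memory records standing in for database rows. */
typedef struct { int id; bool in_call; } Endpoint;
typedef struct { Endpoint *a; Endpoint *b; } Connection;

/* Mirrors the insert trigger described above: reject the new
   connection if either endpoint is already in a call, otherwise
   set the flags marking both endpoints as busy.  Returns true
   if the "transaction" is accepted. */
bool connection_insert(Connection *c) {
    if (c->a->in_call || c->b->in_call)
        return false;               /* abort the transaction */
    c->a->in_call = true;           /* knock-on action */
    c->b->in_call = true;
    return true;
}

/* Mirrors the separate delete trigger: clear the busy flags. */
void connection_delete(Connection *c) {
    c->a->in_call = false;
    c->b->in_call = false;
}
```

The key property is that the check and the knock-on updates happen together, inside the transaction, so no client can observe an endpoint that is half in a call.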
The uses described above require the trigger code to be run within the transaction that activated it (either as the triggering event is performed, or with other triggered code at the end of the transaction), and to obey the ACID properties. In particular, the triggered code should have no side-effects that cannot be rolled back, should the transaction fail. For this reason, one should treat with great caution data management systems that allow code written in C or C++ to be called within transactions, as atomicity cannot be enforced by the system. However, it is OK to run such code post-transactionally, as the server can cancel the triggers prior to execution if the transaction is rolled back.
Where a data management system allows post-transactional triggers, such triggers can be used for knock-on actions that take too long to be done in the original transaction – but they can also be used for event reporting: either the database administrator can add C or C++ code to hook in to a generic publish and subscribe mechanism (as described in the previous subsection), or the data management system can have its own inbuilt notification mechanism.
Reporting events in this way has the advantage that the client applications generating the updates only have to tell the database; their job is simplified, and, as a central component is taking responsibility for the job of notification, overall system correctness is improved. However, as with the use of a generic notification system described in the previous subsection, it is difficult to ensure the granularity of the notifications is sufficiently fine; in addition, those told about the events will probably need to query the database and work out for themselves what has changed. There is also a maintenance problem, as those who are designing client applications will have to ensure (probably by liaison with the database design team, if they are the only ones that can add triggers) that suitable events are generated for their needs – and these events are updated or deleted if and when their needs change.
In some systems, there will also be a problem with security. Most database systems provide some sort of user-based controls that can be used to determine (down to the level of individual columns of a table) who can view and update the contents of the database. In embedded systems, such controls are often used in a role-based manner, restricting what can be done by individual components and thus limiting the damage should a component go wrong, but they may also be used to stop people discovering operational information that they are not entitled to know about. If user-written code can register to be notified about particular types of events, then people may be able to make inferences about data that they could not directly query the database about.
A delta gives the client enough information to bring it up to date. Thus, it will report added rows, deleted rows and changes to individual rows; for changed rows, it does not send the whole row contents, just the cell values that changed. Consequently, efficient use is made of network bandwidth when client and server are on separate machines, and the job of the client application is greatly simplified. Active queries are also designed to have minimal impact on performance:
- the presence of an active query does not affect the performance of other queries, nor the performance of transactions that do not alter the table(s) monitored by the active query;
- active queries on single objects (identified by their primary key) have no effect on the performance of transactions that do not modify the monitored objects; and
- the effect on the performance of a transaction that does alter data being monitored by an active query is very low, as the work of updating the client is done post-transactionally in a separate thread.
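To make the delta idea concrete, here is a minimal sketch in C of how a client might maintain its local copy of a result set from a stream of deltas. The Row and DeltaEntry structures, the two-column layout and the apply_delta function are all illustrative assumptions, not the Polyhedra client library’s actual data structures (which do this work for you):

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROWS 16
#define NCOLS 2

/* A hypothetical client-side copy of one query's result set. */
typedef struct { int key; int col[NCOLS]; bool present; } Row;

/* A delta carries only what changed: whole rows when added,
   just the key when deleted, and single cells when changed. */
typedef enum { ROW_ADDED, ROW_DELETED, CELL_CHANGED } DeltaKind;
typedef struct {
    DeltaKind kind;
    int key;
    int col_index;      /* CELL_CHANGED: which cell changed */
    int value[NCOLS];   /* ROW_ADDED: full row; CELL_CHANGED: value[0] */
} DeltaEntry;

Row cache[MAX_ROWS];
size_t cache_rows = 0;

Row *find_row(int key) {
    for (size_t i = 0; i < cache_rows; i++)
        if (cache[i].present && cache[i].key == key) return &cache[i];
    return (Row *)0;
}

/* Bring the local copy up to date: the client never has to re-run
   the query, it just folds each delta into what it already holds. */
void apply_delta(const DeltaEntry *d, size_t n) {
    for (size_t i = 0; i < n; i++) {
        Row *r = find_row(d[i].key);
        switch (d[i].kind) {
        case ROW_ADDED:
            if (!r && cache_rows < MAX_ROWS) {
                r = &cache[cache_rows++];
                r->key = d[i].key;
                r->present = true;
                for (int c = 0; c < NCOLS; c++) r->col[c] = d[i].value[c];
            }
            break;
        case ROW_DELETED:
            if (r) r->present = false;
            break;
        case CELL_CHANGED:
            if (r) r->col[d[i].col_index] = d[i].value[0];
            break;
        }
    }
}
```

Note that a CELL_CHANGED entry touches one cell of one row, which is why changed rows cost so little bandwidth.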
If a client is running slowly and cannot keep up with the stream of deltas, the server will start to ‘merge’ deltas. Thus, clients will not be told about every change, but will be kept as up to date as possible without slowing down the system. This means that the system remains responsive even in peak load conditions, and copes well with crashed client applications.
In fact, clients can take advantage of the delta-merging mechanism to further reduce the load on the system. At the time the query is launched, the client can specify a ‘minimum delta interval’, in thousandths of a second. Whenever the server sends a delta over an active query with an interval defined, it will ensure the next delta is not sent until the interval has expired, with all intervening changes merged. A graphical client can choose, say, an interval of a fifth of a second – long enough to allow people to read the values without worrying about flicker, yet short enough to appear real-time to those watching the displays. By choosing an appropriate minimum delta interval, and ensuring only the data required is retrieved (e.g., avoiding ‘select *’ and only asking for the columns and rows of relevance to the client), efficient, scalable and responsive systems can be produced.
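The merging and minimum-interval behaviour can be sketched as follows; the PendingDelta structure, its fixed-size layout and the function names are illustrative assumptions about what the server keeps per client, not Polyhedra internals:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_KEYS 32

/* Hypothetical per-client state: pending (merged) changes for one
   active query, plus the timing data for the minimum delta interval. */
typedef struct {
    int key[MAX_KEYS];
    int value[MAX_KEYS];
    size_t n;              /* number of pending entries */
    long last_send_ms;     /* when the previous delta went out */
    long min_interval_ms;  /* client-requested minimum delta interval */
} PendingDelta;

/* Merge a new change in: a later change to the same row simply
   overwrites the earlier one, so a slow client is never flooded. */
void record_change(PendingDelta *p, int key, int value) {
    for (size_t i = 0; i < p->n; i++)
        if (p->key[i] == key) { p->value[i] = value; return; }
    if (p->n < MAX_KEYS) {
        p->key[p->n] = key;
        p->value[p->n] = value;
        p->n++;
    }
}

/* Called when the server is ready to send: the delta goes out only
   if there is something pending and the interval has expired. */
bool maybe_send(PendingDelta *p, long now_ms) {
    if (p->n == 0) return false;
    if (now_ms - p->last_send_ms < p->min_interval_ms) return false;
    /* ...transmit the p->n merged entries to the client here... */
    p->n = 0;
    p->last_send_ms = now_ms;
    return true;
}
```

Between sends the pending delta only grows to the size of the changed working set, however many individual updates occur – which is what keeps peak-load behaviour bounded.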
Whenever a change request is received by the server, the first thing it does is check whether it satisfies the optimistic concurrency constraints: basically, the transaction is only accepted for processing if no other client has made a conflicting change since the last delta was sent. Clearly, the new transaction cannot change a record if something else has just deleted it, but the transaction is also rejected if something else has changed a record attribute that the client wants to alter. Optimistic concurrency is efficient when transactions are expected to succeed, which is usually the case in embedded applications.
Updates through active queries can be grouped into larger transactions. When the client starts such a transaction, the server is automatically told (by the client library code) to block sending deltas on the active queries involved in the transaction. When the client later commits or cancels the transaction, it will – if necessary – receive a merged delta to bring it up to date, but in the case of a successful transaction it does not have to be told about the changes it made!
While this particular example may be of limited use(!), the mechanism allows more sophisticated and meaningful actions. For example, suppose that some of the data values are being fed from external devices according to some polling schedule, and the polling schedule for a particular point is specified in its database record. The query monitoring mechanism is sufficiently powerful that it is possible to detect when an active query is looking at a particular point, and trigger code can automatically increase the poll frequency for that point.
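In Polyhedra this would again be CL trigger code; as a sketch of the idea only, with a hypothetical PointPolling structure and function names, the logic amounts to reference-counting the watchers on a point and choosing the poll rate accordingly:

```c
/* Hypothetical per-point polling state: trigger code bumps the
   watcher count when an active query starts looking at the point,
   and drops it when that query is closed. */
typedef struct {
    int watchers;
    long normal_interval_ms;
    long fast_interval_ms;
} PointPolling;

void query_opened(PointPolling *p) { p->watchers++; }
void query_closed(PointPolling *p) { if (p->watchers > 0) p->watchers--; }

/* The polling scheduler reads a watched point at the faster rate. */
long poll_interval(const PointPolling *p) {
    return p->watchers > 0 ? p->fast_interval_ms : p->normal_interval_ms;
}
```

The effect is that a point is polled frequently only while somebody is actually watching it, so the extra device traffic is paid for exactly when it is useful.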
In summary, Active Queries bring an additional and powerful capability to database developers, allowing faster program development and greater system responsiveness. By avoiding polling they make the system more stable and scalable, and their bidirectional nature provides an easy way for a client to update the database. The uses for Active Queries are many and varied!