How-to Guides‎ > ‎

Control memory use


How to keep the memory usage of Polyhedra IMDB down

Polyhedra IMDB is an in-memory relational database system designed for use in embedded systems, and has many features designed to improve the integrity and availability of your data. These include the adoption of a client-server architecture, and the ability to set up servers in a master-standby configuration to protect against a single point of failure. (To protect against more a more disastrous failure Polyhedra imdb allows the generation of snapshot files, which is complemented by a dynamic journal logging mechanism to ensure data loss is minimised.) Memory is always at a premium, though, so we are often asked if there are ways to ensure this is kept under control. Here are some suggestions!
  • Reduce the stack size
To improve performance (especially on multi-core machines) and 'fairness', the Polyhedra server uses a number of separate threads, to handle things such as client connections, connections to the standby server, journal logging, etc. For simplicity, they all use a default stack size. On Linux and Unix platforms, this is the stack set by the command line shell; if you are using bash or one of its equivalents, this can be controlled by the -s option to the ulimit command. So, to fire off rtrdb (the Polyhedra server executable) in the background with a stack size of 50K, you could use a command such as

( ulimit –s 50; rtrdb db ) &

The default value set by bash is often 8192, for 8MB of stack, so if the server is running with 6 threads you save about 47MB by this technique. Also, by default Polyhedra will start a separate thread for each client connection (up to a limit of 1024), so if there are a lots of simultaneous client connections the savings from limiting the stack size can be quite large.

Which brings us to our next technique…
  • Limit the number of connection threads
Polyhedra can be instructed to use a pool of connection threads for performing the work of client applications, rather than setting up a separate thread for each client connection. This can be done by use of the thread_pool_limit resource; a value of, say, 5 should suffice in most cases without significantly affecting performance even on machines with a large number of cores. (See the Polyhedra user guide for information on defining resources, and the rtrdb reference manual for a definition of the resource setting that are relevant to the Polyhedra server. There is also a list of the resources understood by the various Polyhedra components on the Polyhedra developer site: see  the crib-sheet on resource-names.)
  • Reduce the message queue size
To help cope with the fact that a client application might be running slowly, the Polyhedra server is allowed to build up a queue of outgoing messages. The default queue size is large (1000 messages) and you can save space by reducing this to, say, 50, by means of the msg_queue_size resource setting.
  • Set a memory size limit
The memory_limit resource restrict the amount of heap space the database server should grab. In general, the server will fail transactions that cause the amount of RAM used to exceed this value. Note that this limits the total amount of space used, not just the space used for holding data. It is also possible to limit the amount of memory used on behalf of individual sessions, by means of the session_memory_limit resource; this has the effect of limiting the amount of work that can be done in a single query or transaction.
  • Use the Polyhedra32 server rather than Polyhedra64
If you are not wanting to hold very large databases, then using Polyhedra32 rather than Polyhedra64 saves memory, as all the internal pointers in the process are smaller. In particular, it reduces the memory overhead per record. If you are wanting to compile and link 64-bit applications, Enea can supply 64-bit versions of the client libraries, and these can work the same regardless of whether the database server is 32-bit or 64-bit.

  • Use Polyhedra Flash DBMS
Polyhedra Flash DBMS is built from the same code base as Polyhedra IMDB, so contains the same SQL engine and uses the same client-server protocol. (See the comparison chart for a summary of the differences between the main Polyhedra products.) The main difference is that it is specifically designed for memory-constrained systems, where limiting the RAM footprint is more important than the higher speed possible by keeping everything in memory. It was also designed for use with flash-based file systems (where you don't have to worry about head seek time or rotational delays, but updates can be expensive), and so will give better performance than traditional disk-based database systems when used on such media. To help you balance performance against memory use, you can specify the size of the RAM cache to be used.

In addition to the above, there are a number of ways of reducing memory use by exercising care when designing the database schema:
  • Use table inheritance
Polyhedra allows a table to be ‘derived from’ another; the derived table inherits all the attributes of the base table, and can add extra attributes. If your schema has tables whose primary key is a foreign key reference to another table, and your queries will often do a join that uses this foreign key, you should consider whether the referencing table should be redefined as a derived table. There are two main advantages: the queries and updates are faster and simpler to express, and there is a reduction in the record overhead.
  • Use the Historian
Polyhedra is primarily an in-memory database, but it has a special module called the historian whose job is to capture, store and retrieve time-stamped samples. The historian stores the time-series data in files rather than in RAM, and thus can allow terabytes of data to be stored on systems that have only got tens of megabytes of RAM.
  • Use appropriately-sized data types
The database can handle signed integers of varying length (1-byte, 2-byte, 4-byte and 8-byte), and both 32-bit and 64-bit floating point values; choose the type appropriate to your data.
  • Use VARCHAR(n) rather than CHAR(n)
At present, for historical reasons Polyhedra treats CHAR(n) and CHAR as equivalent to LARGE VARCHAR, an unbounded length string. If you know that the string will always be 12 bytes, it is actually more efficient to use VARCHAR(12) as the space is reserved in-record.
  • Consider using blobs
If there are certain composite structures that are passed around as a whole, and where you will never need to query or update individual fields via SQL, it can be more efficient to store the structures in  BINARY LARGE OBJECT attributes rather than break them down into a number of separate column attributes. It can be more efficient both in space and time when passing such values into and out of the database, though the space used when the data is ‘at rest’ is likely to be the same. (There is also the problem that Polyhedra cannot help with endianism issues within the structures, so if some client applications are on machines using different hardware architectures then care will be needed in interpreting the contents of the data structures.)
  • Don’t keep data you don’t need
Pretty obvious really – but if there is data that will never be queried, why bother putting it into the database in the first place?
  • Don’t normalise too far
Normalisation means that a given item of information is only stored in the database in one place; if it can be inferred by the results of an appropriate query, then it should not be stored explicitly. The problem is, that ensuring everything is fully normalised often means information is broken down into too many different tables; the record count increases, and the total space increases (a) because of record overheads and (b) the necessity to add more indexes to allow the information to be retrieved more efficiently. In practice, then, normalise less, and add CL code to achieve the data integrity that normalisation was intended to achieve.
  • Tune the size of the server executable
This is something you do late in your application development: you can relink the server executable to remove some modules that you do not use. Thus, if you are not using, say, the historian module or the CL trigger language, you can build a version of the rtrdb which leaves them out. This reduces the size of the executable, and thus reduces both the amount of disk space used,  and also the amount of RAM used to hold the working copy of the code when the process is running. In practice, the savings from this are a few megabytes at most, and it only worth doing this on very small systems where the memory is particularly tight.

The above lists are not meant to be complete – but used together, these techniques can help ensure that the Polyhedra database makes sensible use of memory, in a way that is appropriate to your system. For further advice, please contact your local Enea sales office (or the Polyhedra helpdesk, if you are already a customer).