How Enea® Element can be configured to manage a fault-tolerant pair of Polyhedra database servers.
Polyhedra can be configured in a high-availability mode, where a master server is shadowed by a standby server that is ready to take over at a moment’s notice. There is an arbitration interface that allows external code to monitor the servers, and to tell each of them whether it should be master or hot standby. Where Polyhedra is being run in a system that is using Enea’s Element application development framework, it is possible to use Element to act as the arbitrator for Polyhedra, allowing a consistent approach for controlling and monitoring the HA features of the system. This technical note shows the main features of a sample Element component that can perform this integration.
Update: we now have a one-minute video showing this shim in action!
Introduction
An overview of Polyhedra®
Polyhedra is a family of relational database management systems designed for use in embedded applications. It has two main flavours: Polyhedra IMDB and Polyhedra Flash DBMS, with the former being available for 32-bit platforms (Polyhedra32) and 64-bit platforms (Polyhedra64). The difference between these products is that Polyhedra IMDB is in-memory for speed, and has journalling and fault-tolerant mechanisms to ensure data persistence and system availability, whereas Polyhedra Flash DBMS trades performance against RAM footprint by using a tuneable in-memory cache in front of a file-based database (assumed to be on a flash-based file system). In 2012 a new product was released: Polyhedra Lite, a free version (subject to license conditions) of Polyhedra32 IMDB, but omitting some functionality such as support of fault-tolerant configurations.
Polyhedra servers can operate stand-alone, with each instance handling a separate database, and in addition both Polyhedra IMDB and Polyhedra Flash DBMS can operate in master-standby mode, where one server is acting as a hot standby of another server and has a read-only copy of the database. Polyhedra is fully transactional, and satisfies the Atomic, Consistent and Isolated properties needed for ACID compliance. Polyhedra Flash DBMS transactions are Durable, but in Polyhedra IMDB durability can be balanced against performance: critical changes to the data are preserved by streaming journal records to a log file, and client applications can choose whether the success of a transaction is to be reported immediately or when the log file has been flushed.
Polyhedra’s arbitration mechanisms
When Polyhedra is to be used in an HA configuration, the servers are started up under the control of an external arbiter, whose role is to tell the server whether it should be master or standby. The original mechanism used a third database server containing special records: the fault-tolerant servers that were providing the main database service would quietly connect to the control server and monitor their control records (using Polyhedra’s active query mechanism) to find out what state each should be in.
An overview of Element
The Element product is a middleware software layer situated between the operating system (OS) and application layer providing infrastructure services that address common application needs beyond those supplied by the OS. Application developers can use Element’s powerful application development framework to simplify the design and implementation of complex, highly available applications in high performance, distributed, heterogeneous systems.
Element consists of a suite of core services which serve as a foundation for other aggregate services. The entire complement of services is available to applications through a collection of service-specific Application Programming Interfaces (APIs).
Element’s AMF module
The Availability Management Framework (AMF) provides a high availability (HA) framework that is integrated with the Element Service Suite. The AMF is a standards-based implementation, defined by a specification from the Service Availability Forum (SA Forum) called the SAI-AIS-AMF.
An Element shim for Polyhedra
Before we start, it is important to emphasise that Polyhedra is not dependent on Element, nor vice versa – and that fault-tolerant Polyhedra configurations are quite possible without using Element! Having said that, on systems that are using Element (and in particular the HA and AMF sub-systems) it makes sense to use Element to configure, control and monitor the HA Polyhedra service alongside the other services it controls.
The complete example shim also contains some example code that can be used (from the Element command shell or its web interface) to list the tables in the database or to shut it down, and some prototype code to handle a field upgrade. The complete shim is about 2000 lines long (including comments and lots of calls to the Element logger), so we won’t go through it line by line in this technical note. Instead, we shall concentrate on the salient details, so that this note both acts as a guide to those who want to adopt and adapt the example shim for use in their own Element Application that uses Polyhedra in an HA fashion, and also illustrates some of the issues involved in controlling third-party HA-aware software from Element.
The core of the shim: telling Polyhedra the mode
This is relatively easy, as Polyhedra can be instructed to use LINX for its control messages – but one needs to do a slight patch to Element to ensure it is not confused by messages coming from Polyhedra. The alteration is in the file
/* two signal numbers are defined by Polyhedra:
 * 0x504F4C59 client-server communications ('POLY')
 * 0x504F4C41 server-arbitrator comms ('POLA') */
#define POLYHEDRA_ARB_SIG 0x504F4C41
#define POLYHEDRA_COMMS_SIG 0x504F4C59
OSBOOLEAN
ElemMsgIsWrongEndian(union SIGNAL **sig) {
#if 0
((unsigned char *)(*sig))[size + 1] ^ ElemEndianGetInfoByte())
#else
and thus avoids problems when a Polyhedra
#endif
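The two signal numbers are simply four ASCII characters packed into a 32-bit value, which makes them easy to recognise in a filter. The following is a minimal sketch of such a test, not the Element patch itself: the helper names poly_is_polyhedra_sig() and poly_sig_tag() are hypothetical, and only the two signal numbers come from this note.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Signal numbers quoted in this note. */
#define POLYHEDRA_ARB_SIG   0x504F4C41u  /* 'POLA' */
#define POLYHEDRA_COMMS_SIG 0x504F4C59u  /* 'POLY' */

/* Hypothetical helper: true if sigNo is one of the two Polyhedra signals,
 * so that a caller can skip any Element-specific checks for it. */
static bool poly_is_polyhedra_sig(uint32_t sigNo)
{
    return sigNo == POLYHEDRA_ARB_SIG || sigNo == POLYHEDRA_COMMS_SIG;
}

/* Hypothetical helper: unpack a signal number into its four-character tag,
 * most significant byte first, and NUL-terminate it. */
static void poly_sig_tag(uint32_t sigNo, char tag[5])
{
    assert(tag != NULL);
    tag[0] = (char)((sigNo >> 24) & 0xFFu);
    tag[1] = (char)((sigNo >> 16) & 0xFFu);
    tag[2] = (char)((sigNo >> 8) & 0xFFu);
    tag[3] = (char)(sigNo & 0xFFu);
    tag[4] = '\0';
}
```

Decoding the tag in this way can be handy in log messages, since 'POLA' and 'POLY' are easier to spot than the raw hexadecimal values.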
(For the shim we are describing here, we don’t actually have to filter out the POLYHEDRA_COMMS_SIG – but it is essential if developing an application which wants to wait for either an Element signal or a Polyhedra client-server communication, and it could be needed in a more powerful shim that provided database inspection tools, so we include it here to cover future needs.)
Once this patch is done, the Polyhedra shim can follow the standard Element pattern and have a control loop that just waits for signals, and rely on registered call-back functions to take the appropriate action:
for (;;)
(In the above and all subsequent extracts, Element API functions are shown in red, and pEnv will point at the information structure used by the shim to keep track of the operational information. H_ASSERT_M is a macro in the Element library whose job is to kill the process and generate a report if the condition is not satisfied; for simplicity, invocations of this macro have usually been removed from later extracts – but they are present in the full example shim, along with various calls to the Element log functions.)
Before entering this loop, you have to register a handler for the arbitration request messages coming from the local Polyhedra database server:
status = ElemSigReg(pEnv->pEsdRegistry,
POLYHEDRA_ARB_SIG,
H_ASSERT_M(status == ELEM_STATUS_OK);
This tells Element that when a signal of the right type is received it should invoke the polyShimHandleArbSig() callback handler, with one of its arguments pointing at pEnv (though it will have to be cast to the right type within the callback function). The Polyhedra server will expect a message in response indicating the mode in which it should operate, so a suitable definition of the signal structures, the polyShimHandleArbSig() function and its support function could be:
#define POLY_ARB_SIGNAL_TYPE_STATUS 'J'
#define POLY_ARB_SIGNAL_TYPE_CONTROL 'A'
#define POLY_ARB_SIGNAL_MODE_UNKNOWN 0
#define POLY_ARB_SIGNAL_MODE_MASTER 1
#define POLY_ARB_SIGNAL_MODE_STANDBY 2
/* Set the arbiter's reporting interval to once per minute for now, Also, have a faster interval when we need */
#define POLY_ARB_STAT_INTERVAL (1000*1000 * 60)
#define POLY_ARB_STAT_INITIAL_INTERVAL ( 500*1000)
#define POLY_ARB_MAX_SERVICE_NAME 255
typedef struct POLY_ARB_DB_STATUS_SIG {
typedef struct POLY_ARB_DB_CONTROL_SIG
} POLY_ARB_DB_CONTROL_SIG;
PRIVATE_T void
polyShimHandleArbSig(void *refPtr, union SIGNAL **pSig)
}
(The reply is sent from a separate function, polyShimTellModeToRtrdb(), to allow the functionality to be reused in another part of the shim; more of that later.)
PRIVATE_T void
polyShimTellModeToRtrdb(POLYHEDRA_ENV_ST *pEnv,
POLY_ARB_STAT_INTERVAL : POLY_ARB_STAT_INITIAL_INTERVAL;
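The decision made when building the reply can be sketched in isolation. This is a minimal illustration under stated assumptions rather than the shim's actual logic: the type sketch_ha_state_t stands in for the AMF's SaAmfHAStateT, the names sketch_reply_t and sketch_choose_reply() are hypothetical, and the rule that the slower reporting interval is used once the mode is known is a guess based on the ternary expression in the extract above. Only the mode and interval constants are taken from this note.

```c
#include <assert.h>
#include <stddef.h>

/* Constants quoted in this note. */
#define POLY_ARB_SIGNAL_MODE_UNKNOWN 0
#define POLY_ARB_SIGNAL_MODE_MASTER  1
#define POLY_ARB_SIGNAL_MODE_STANDBY 2
#define POLY_ARB_STAT_INTERVAL         (1000*1000 * 60)
#define POLY_ARB_STAT_INITIAL_INTERVAL ( 500*1000)

/* Stand-in for SaAmfHAStateT (assumption, not the SA Forum definition). */
typedef enum {
    SKETCH_HA_NONE,
    SKETCH_HA_ACTIVE,
    SKETCH_HA_STANDBY
} sketch_ha_state_t;

/* Hypothetical summary of what goes into the control message. */
typedef struct {
    int  mode;      /* one of POLY_ARB_SIGNAL_MODE_* */
    long interval;  /* requested status-report interval */
} sketch_reply_t;

/* Choose the mode from the HA state; a standby can only be told its mode
 * once the master's journal service name is known. */
static sketch_reply_t sketch_choose_reply(sketch_ha_state_t ha,
                                          const char *otherJournalService)
{
    sketch_reply_t reply;

    assert(ha == SKETCH_HA_NONE || ha == SKETCH_HA_ACTIVE ||
           ha == SKETCH_HA_STANDBY);

    reply.mode = POLY_ARB_SIGNAL_MODE_UNKNOWN;
    if (ha == SKETCH_HA_ACTIVE) {
        reply.mode = POLY_ARB_SIGNAL_MODE_MASTER;
    } else if (ha == SKETCH_HA_STANDBY &&
               otherJournalService != NULL &&
               otherJournalService[0] != '\0') {
        reply.mode = POLY_ARB_SIGNAL_MODE_STANDBY;
    }

    /* Assumed rule: poll quickly until the mode has been settled. */
    reply.interval = (reply.mode != POLY_ARB_SIGNAL_MODE_UNKNOWN)
                         ? POLY_ARB_STAT_INTERVAL
                         : POLY_ARB_STAT_INITIAL_INTERVAL;
    return reply;
}
```

Keeping the decision in a pure function like this makes it straightforward to call from both the arbitration signal handler and the AMF call-back.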
The polyShimHandleArbSig() function checks the signal is of the expected type, remembers certain information in the shim’s main control structure for later use, and calls the polyShimTellModeToRtrdb() function to send the message back. The polyShimTellModeToRtrdb() function looks inside the control structure (at pEnv‑>amfHaState and pEnv‑>otherJournalService) to decide what information to put in the message that is to be sent back to the Polyhedra server. This leaves open the questions: where do we get the information that needs to be sent to the server… and what starts off the server, anyway?
Finding out the mode
To determine the mode that Element wants the local Polyhedra server to take, it is necessary (a) to tell Element that Polyhedra is an HA service that supports master-standby configurations, and (b) for the shim to register with Element to be told the initial mode (and again whenever the required mode changes). The first can be done using the AMF configuration wizard (described in the Element AMF guide that is distributed in Element release kits); alternatively, you can adapt an AMF ‘script file’ such as the following:
add app Polyhedra
set EXTRA_ARGS "-app Polyhedra"
add service Redundant 2N -cbs
add service Non_Redundant NONE
set EXTRA_ARGS "-app Polyhedra -service Redundant -red 2N -scope 1"
add component polyShim COMPONENT_FAILOVER
In essence, this tells the Element AMF that Polyhedra can be run in a master + hot standby configuration by starting off the polyShim application on two control blades, and if the master instance of polyShim fails then the other one is to be promoted. To find out the mode assigned to it, the shim has to make its control object global[1]…
static POLYHEDRA_ENV_ST *pGlobalEnv;
… and then (in addition to the other setup needed of all Element components) register itself and some call-back functions with the AMF:
ELEM_MAIN(polyShim)
NULL
malloc(sizeof(POLYHEDRA_ENV_ST));
...
&pEnv->amfCompName, NULL);
...
Two of the call-backs are present simply because the AMF state model requires them for fault-tolerant 2N components, but in our example we shall provide minimal implementations (and do all the work in the remaining call-back function):
PRIVATE_T void
polyShimCSIRemoveCb(SaInvocationT invocation,
PRIVATE_T void
polyShimComponentTerminateCb(SaInvocationT invocation,
Now let us look at the other call-back function referenced in the SaAmfCallbacksT structure, polyShimCSISetCb() . The AMF knows (by its position in the structure) that it should call this function to say whether the component should be active or standby, and also call it again whenever the required mode changes. The call-back function (or something it triggers) has to acknowledge to the AMF that it has received the message and that appropriate actions have been initiated. In our case, these other actions are both to propagate the status information to the Polyhedra server, and also to ensure that the standby server can find out where the master server is; this allows the standby to connect to the master’s ‘journal port’ to get an up-to-date copy of the database, and to receive information about subsequent changes. Thus:
PRIVATE_T void
polyShimCSISetCb(SaInvocationT invocation,
                 const SaNameT *compName,
                 SaAmfHAStateT haState,
                 SaAmfCSIDescriptorT csiDescriptor)
{
    POLYHEDRA_ENV_ST *pEnv = pGlobalEnv;
    ELEM_OBJ_HANDLE_T myJournalServiceObj = 0;
    /* remember the previous state */
    SaAmfHAStateT prevHaState = pEnv->amfHaState;
    /* Respond to AMF to accept assignment,
    saAmfResponse(pEnv->amfHandle, invocation, SA_AIS_OK);
    if (haState == SA_AMF_HA_ACTIVE) {
    /* if the server had previously told us it was in standby mode, promote it.
    if ((pEnv->rtrdbPidKnown == TRUE) &&
    {
        union SIGNAL *pElemSig;
        pElemSig = ElemMsgAllocBuf(sizeof (POLY_ARB_DB_CONTROL_SIG),
        polyShimTellModeToRtrdb(pEnv, &pElemSig);
    }
"linx/%s/polyJ", slotLabel);
ELEM_NSS_SCOPE_CLUSTER, myJournalServiceObj); /* Flag as ready. */
standby mode - so we should unpublish */
ElemNssSubscribeOpt(pEnv->pEsdRegistry,
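The journal service name published by the master follows the pattern "linx/<slot label>/polyJ", and the standby later copies the name it receives into its control structure. The following sketch shows one bounded way to build and copy such a name. It is an illustration only: the helper names sketch_journal_name() and sketch_copy_name() are hypothetical, the assumption that the destination buffer holds POLY_ARB_MAX_SERVICE_NAME characters plus a terminator is not confirmed by the extracts, and no Element functions are involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define POLY_ARB_MAX_SERVICE_NAME 255   /* value quoted in this note */

/* Hypothetical helper: format the journal service name for a slot label.
 * Returns 0 on success, or -1 if the name would not fit in buf. */
static int sketch_journal_name(char *buf, size_t bufLen, const char *slotLabel)
{
    int n;

    assert(buf != NULL && slotLabel != NULL);
    n = snprintf(buf, bufLen, "linx/%s/polyJ", slotLabel);
    return (n >= 0 && (size_t)n < bufLen) ? 0 : -1;
}

/* Hypothetical helper: copy a received service name into a fixed buffer,
 * always leaving it NUL-terminated (strncpy alone does not guarantee this). */
static void sketch_copy_name(char dst[POLY_ARB_MAX_SERVICE_NAME + 1],
                             const char *src)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, POLY_ARB_MAX_SERVICE_NAME);
    dst[POLY_ARB_MAX_SERVICE_NAME] = '\0';
}
```

Checking the return value of snprintf() catches an over-long slot label at the point where the name is built, rather than leaving the standby to discover a truncated name when it tries to connect.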
The polyShimHandleOptAdded() function registered above will be called by Element when information has been published about the journal port; the function stores this in pEnv->otherJournalService so that it can be passed to the standby server in the arbitration messages generated by polyShimTellModeToRtrdb():
PRIVATE_T void
polyShimHandleOptAdded(void *userParam, PROCESS pid, char *path, ELEM_OBJ_HANDLE_T attrObj)
{
    POLYHEDRA_ENV_ST *pEnv = (POLYHEDRA_ENV_ST *)userParam;
    char *pOtherJournalService;
    ElemObjGetStringRef(attrObj,
    strncpy(pEnv->otherJournalService, pOtherJournalService);
    pEnv->isReady = TRUE;
    ElemObjFree(attrObj);
}
Starting the Polyhedra server
The simplest way to do this is for the shim to make use of the POSIX
You may have noticed the line pEnv->isReady = TRUE; in the call-back function. This flag signals to the initial control loop that we know either that the Polyhedra server is to start up as master (which would normally only occur when doing a cold boot of the system), or that it is to start as standby and we know where the master is running. A second flag, pEnv->isInited, is set when we know the working directory; the command to ask for this is:
ElemRelRegisterReleasePath(pEnv->relHandle,
… and the corresponding call-back function simply records the value and sets the flag:
polyShimReleasePathCb(void *pUser, char *pRelPath)
(This records not just the directory containing the Element release, but also the subdirectory that should contain the executable; TARGET_BIN is set by the make file and will depend on the hardware architecture on which the software will be running.)
So, the initial control loop tests both the flags described above and exits when both are satisfied and we are ready to proceed to the fork() stage:
while (!(pEnv->isInited) || !(pEnv->isReady))
kidPid = fork();
if (kidPid == 0)
this one or it will hang around and cause BAD things
containing the poly.cfg file, as file and
pEnv->modName, NULL );
printf ("execl call failed.\n");
exit (10);
} else
(The above code snippet incorporates the main control loop given earlier in this technical note.) There are a few other details that have to be filled in: for example, the shim should catch POSIX signals so it can detect when the child process has failed, at which time it should restore the original signal handling and re-raise the signal against itself; Element will then detect the failure, promote the standby (if running) and restart the shim. However, what has been given above should be enough to give a feel for what needs to be done in a basic shim.
Other functionality
Having got the overall framework of the shim in place, there are a number of ways this can be enhanced. For example, the shim could publish when the overall service becomes available, so that client applications can register to be told that connections are possible. The shim can also provide Element commands that inspect or update the database, generate snapshots, or perform a controlled shut-down. And of course the shim can make use of Element’s logging functions to record what it is doing; this can not only help in debugging the shim itself, but can also be of benefit in monitoring the behaviour of the overall system. Some of these features are illustrated in the complete example shim.
Summary
This technical note has shown the core elements of an Element shim that can be used to monitor and control Polyhedra servers that are to be run in a fault-tolerant configuration. It illustrates that standard Element facilities can be used to control third-party HA-aware software, and also that Polyhedra makes it easy for system administrators to integrate it into their HA platform. Integration of Element and Polyhedra is particularly easy as Polyhedra allows LINX to be used for HA control messages! A complete copy of the source code of a working shim is available to Enea customers in machine readable form, along with supporting information such as make files and suitable configuration files.
[1] When registering a callback function with Element, you can supply a pointer whose value will be passed over when the callback function is invoked. However, as mentioned in the introductory section, the Element AMF functions are based on the standard API published by the SA Forum, and they don’t provide this useful facility.
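The difference between the two call-back styles described in note [1] can be sketched in a few lines. This is a generic illustration, not Element or SA Forum code: all of the names (sketch_env_t, sketch_on_event_ctx(), sketch_on_event_global(), pSketchGlobalEnv) are hypothetical, and the integer event value merely stands in for whatever state the real call-back receives.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical control structure, standing in for the shim's environment. */
typedef struct {
    int haState;
} sketch_env_t;

/* Style 1: the framework hands back a user pointer supplied at registration,
 * so the handler can find its state without any global variable. */
static void sketch_on_event_ctx(void *userParam, int event)
{
    sketch_env_t *pEnv = (sketch_env_t *)userParam;

    assert(pEnv != NULL);
    pEnv->haState = event;
}

/* Style 2: the call-back signature has no user pointer, so the handler has
 * to reach its state through a file-scope pointer set up beforehand. */
static sketch_env_t *pSketchGlobalEnv;

static void sketch_on_event_global(int event)
{
    if (pSketchGlobalEnv != NULL) {
        pSketchGlobalEnv->haState = event;
    }
}
```

The second style works, but it limits the process to one instance of the control structure, which is acceptable for a shim that manages a single local database server.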