Glossary

action
A specific task the HAM will perform under certain associated conditions. Examples of actions include executing an external process, restarting a process that has died, and sending a signal or pulse notification.
availability
The ability of a system to provide its intended service without interruption for extended periods of time.
clustering
A method of distributing processing among several computers in order to reduce the number of SPOFs. QNX native networking offers transparent network-wide processing, which facilitates building clustered HA applications.
condition
An event that will trigger certain actions for the HAM to perform. Examples of conditions include the death of an entity and a missed heartbeat.
entity
A process that the HAM will monitor. Entities can explicitly ask to be monitored (i.e. as self-attached entities), or they may be monitored without ever realizing it.
five nines
The celebrated availability metric that refers to a system's ability to remain up and running 99.999% of the time per year.
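To make the figure concrete, the short calculation below converts an availability percentage into a yearly downtime budget. It is an illustrative sketch only: the function name is hypothetical, and it assumes a 365-day year.

```python
# Minutes in a 365-day year (assumption: leap years are ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability):
    """Return the downtime allowed per year for a given availability (0..1)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(0.99999)
print(f"{five_nines:.2f} minutes/year")  # prints "5.26 minutes/year"
```

Under that assumption, five nines leaves a little over five minutes of total downtime per year.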
Guardian
The HAM's “clone”, a stand-in process that the HAM creates to ensure uninterrupted HA management within the QNX environment.
HAM
High Availability Manager.
heartbeat
A “wellness” or “liveness” notification sent at specific intervals by a client to the HAM.
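The idea can be sketched with a toy monitor that records when each client last reported in and flags any client that has been silent for too long. This is not the HAM API; the class, its method names, and the `max_missed` tolerance are hypothetical and exist only to illustrate the concept.

```python
import time


class HeartbeatMonitor:
    """Toy monitor (hypothetical, not the HAM): tracks client heartbeats."""

    def __init__(self, interval_s, max_missed=1, clock=time.monotonic):
        self.interval_s = interval_s  # expected time between heartbeats
        self.max_missed = max_missed  # intervals tolerated before flagging
        self.clock = clock            # injectable so the logic can be tested
        self.last_seen = {}           # client name -> time of last heartbeat

    def heartbeat(self, client):
        """Record a liveness notification from a client."""
        self.last_seen[client] = self.clock()

    def missed(self):
        """Return the sorted names of clients whose heartbeat is overdue."""
        now = self.clock()
        limit = self.interval_s * self.max_missed
        return sorted(c for c, t in self.last_seen.items() if now - t > limit)
```

A caller would invoke `heartbeat()` on behalf of each client at the agreed interval and poll `missed()` to decide whether a missed-heartbeat condition has occurred.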
hot swap
The ability to remove or insert a component in a live system.
MMU
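Memory Management Unit. Hardware found on many CPUs that translates virtual addresses to physical ones and raises a fault to the OS if a process tries to access memory it isn't permitted to use, such as memory that's been allocated to another process.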
MTTF
Mean Time To Failure. This is the average length of time that the system will remain in service before failing. You want this to be as long as possible.
MTTR
Mean Time To Repair. This is the average amount of time it takes for the system to resume operation after any component fails or is upgraded. You want this to be as short as possible.
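MTTF and MTTR are commonly combined into a single availability figure using the standard steady-state relation availability = MTTF / (MTTF + MTTR). That formula is not stated in this glossary; it is brought in here as an assumption, and the function name and sample numbers below are purely illustrative.

```python
def steady_state_availability(mttf_hours, mttr_hours):
    """Estimate availability as MTTF / (MTTF + MTTR).

    Assumes the common steady-state model; both inputs use the same unit.
    """
    if mttf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTTF and MTTR must be non-negative")
    total = mttf_hours + mttr_hours
    if total == 0:
        raise ValueError("MTTF and MTTR cannot both be zero")
    return mttf_hours / total


# Hypothetical system that fails about once every 10,000 hours.
slow_repair = steady_state_availability(10_000, 1.0)  # one-hour repair
fast_repair = steady_state_availability(10_000, 0.1)  # six-minute repair
print(f"{slow_repair:.6f} {fast_repair:.6f}")
```

With these sample numbers, cutting the repair time from one hour to six minutes is what moves the system from roughly four nines to five nines, which is why a short MTTR matters as much as a long MTTF.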
Neutrino
Name of the QNX microkernel.
SPOF
Single point of failure. Any particular “weak link” in a system would be considered a SPOF, because its demise would put the entire system at risk.
watchdog
A trusted piece of hardware whose main purpose is to trigger code that will check the sanity of the system. There are software watchdogs as well; the HAM may be considered a “smart watchdog.”