Build resilience into mission-critical CCTV systems using server failover design.

20 January 2023

Resilience is designed into CCTV systems to ensure the continued security, process or operational control of a site, in case any part of a CCTV system fails for any reason.

Many larger CCTV systems, especially those deployed in critical and national infrastructure projects, are mission-critical, with an expected 24/7 operation. An immediate and appropriate response to component failure must be built in, including when that failure is caused deliberately by malicious attack, terrorism or hacking. Larger CCTV systems, with higher quantities of video cameras, power supplies, storage devices, servers and network devices, also have more parts and therefore a greater risk of failure.

In this article, Nick Bowden, Managing Director of Security Buying Group, supplying and supporting the Digifort VMS (video management software) and neural analytics platform in the UK, explains how different levels of server failover design can provide cost-effective resilience. This can help mitigate failure and ensure data is protected and site operation maintained.

Failover and downtime

VMS-based CCTV systems, such as Digifort, use servers for recording and control. These provide significantly higher levels of failure protection than CCTV systems using NVRs and DVRs. The hard drives used for storage are arranged in RAID5 format, where a damaged drive can be replaced in the live server, or ‘hot swapped’, and its contents rebuilt without loss of video recordings. Two solid state drives (SSDs) run the operating system. These are RAID1 (mirrored), so if one SSD fails, the other takes over and keeps the system running. This level of redundancy is ideal for many CCTV systems. For mission-critical applications, however, server failover provides the higher level of protection they demand.
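As a rough illustration of the storage trade-off, RAID5 gives up the equivalent of one drive's capacity to parity, while RAID1 holds the same data on both drives. The sketch below works through that arithmetic. The function names, drive counts and drive sizes are illustrative assumptions, not figures from any Digifort specification.

```python
def raid5_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable capacity of a RAID5 array: one drive's worth is used for parity."""
    if drive_count < 3:
        raise ValueError("RAID5 needs at least three drives")
    return (drive_count - 1) * drive_size_tb


def raid1_usable_tb(drive_size_tb: float) -> float:
    """Usable capacity of a two-drive RAID1 mirror: the size of one drive."""
    return drive_size_tb


# Hypothetical example: six 8 TB storage drives and two 0.5 TB SSDs.
storage_tb = raid5_usable_tb(6, 8)
system_tb = raid1_usable_tb(0.5)
```

Capacity calculators from drive and server vendors typically also account for formatting overhead, which this sketch ignores.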

The principles of failover are straightforward. If any live server in a CCTV system fails, recording and control will stop. To continue operation, the CCTV system must recognise that the failure has occurred and automatically trigger a standby server, device or system to take over. Speed is of the essence, as quick failover leads to less server downtime and data loss. In all the following examples, it is assumed the network infrastructure also has resilience, offering multiple routes between network switches and CCTV system devices, should the primary route fail.
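The detect-then-trigger principle can be sketched with a simple heartbeat check: each live server reports in periodically, and a server that stays silent for longer than a timeout is treated as failed. This is a minimal sketch under stated assumptions, not a description of how Digifort implements failover. The names `ServerStatus`, `find_failed_servers` and `plan_failover`, and the five-second timeout, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerStatus:
    name: str
    last_heartbeat: float  # timestamp (seconds) of the last heartbeat received


def find_failed_servers(servers, now: float, timeout_s: float = 5.0):
    """Return the names of servers whose last heartbeat is older than timeout_s."""
    return [s.name for s in servers if now - s.last_heartbeat > timeout_s]


def plan_failover(servers, standby_name: str, now: float,
                  timeout_s: float = 5.0) -> Optional[dict]:
    """Decide which failed server, if any, the standby should take over from."""
    failed = find_failed_servers(servers, now, timeout_s)
    if not failed:
        return None  # all live servers are healthy
    # A single standby can only cover one failed server at a time.
    return {"failed": failed[0], "takeover": standby_name}


# Hypothetical example: rec-2 has been silent for eight seconds.
servers = [ServerStatus("rec-1", 100.0), ServerStatus("rec-2", 92.0)]
plan = plan_failover(servers, "standby-1", now=100.0)
```

A shorter timeout reduces the gap in recording but raises the chance of a false alarm on a briefly congested network, so the value is a tuning decision for each site.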

The highest level of server failover is ‘mirrored’, where video recording automatically switches from a live server to a failover server of similar specification. The failover server is often on permanent ‘hot standby’ to allow the fastest possible response. It is an expensive option, as it requires duplication of servers, but it provides high levels of resilience. If the live and failover servers are in different physical locations, this level of failover could withstand a terrorist attack.

A cost-saving alternative is ‘spread servers’, where any one of a number of live servers can fail over to a single failover server. For example, in a 5:1 ratio, any one of five live servers could fail over to just one server. Spread server architecture is cheaper than mirrored, as the level of duplication is less, but it still offers many of the benefits. The logic of this design is that more than one live server is unlikely to fail at once. If that did happen, data would be lost, so this is a ‘risk to cost’ choice that must be acceptable to the site.
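The ‘risk to cost’ reasoning can be made concrete with a simple probability estimate. The sketch below assumes each live server fails independently with the same probability during a given period, which is a strong simplification: a deliberate attack or a shared power fault could take out several servers together. The function name and the 1% figure are illustrative assumptions, not measured failure rates.

```python
def prob_multiple_failures(server_count: int, p_fail: float) -> float:
    """Probability that two or more of server_count servers fail in the same
    period, assuming independent failures with probability p_fail each."""
    p_none = (1 - p_fail) ** server_count
    p_exactly_one = server_count * p_fail * (1 - p_fail) ** (server_count - 1)
    return 1 - p_none - p_exactly_one


# Hypothetical 5:1 design, with a 1% chance of each server failing in the period.
risk = prob_multiple_failures(5, 0.01)
print(f"Chance the single failover server is not enough: {risk:.4%}")
```

Under these assumptions the chance of a second, uncovered failure is small compared with the chance of any single failure, which is the case the one failover server is there to handle.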

‘Spread cameras’ is another option, where only critical cameras are failed over to the spare. These can be from more than one live server, spreading risk and allowing security managers to prioritise camera importance.
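One way to picture ‘spread cameras’ is as a priority list: the most critical cameras across all live servers are assigned to the spare, up to the number it can handle. The sketch below is illustrative only. The `Camera` fields, the priority scale (1 is most critical) and the capacity limit are assumptions, not Digifort settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Camera:
    name: str
    server: str    # live server that normally records this camera
    priority: int  # assumed scale: 1 = most critical


def build_failover_list(cameras, spare_capacity: int):
    """Pick the most critical cameras, across all servers, up to spare_capacity."""
    ranked = sorted(cameras, key=lambda c: c.priority)
    return ranked[:spare_capacity]


def cameras_taken_over(failover_list, failed_server: str):
    """Names of protected cameras the spare records when failed_server goes down."""
    return [c.name for c in failover_list if c.server == failed_server]


# Hypothetical site: four cameras on two servers, spare can record three.
cameras = [
    Camera("gate", "rec-1", 1),
    Camera("car-park", "rec-1", 3),
    Camera("control-room", "rec-2", 1),
    Camera("corridor", "rec-2", 2),
]
protected = build_failover_list(cameras, spare_capacity=3)
```

In this example the car park camera is left unprotected, which is the kind of trade-off a security manager would make deliberately when ranking cameras.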

Power failover

A sudden power outage can damage servers in a CCTV system. Loss of power to servers is easily addressed using uninterruptible power supplies (UPS). These not only protect the servers against damage from mains power spikes, but also allow the servers and other network devices to shut down cleanly if power is lost altogether.

Dual redundant PSUs in servers can also be specified for critical applications, so that if the primary PSU fails, a spare takes over. Switching is quick, and alerts can be configured by email or SMS to notify the system administrator that a problem has occurred and a repair is required.

Digifort allows CCTV systems to be deployed with different levels of resilience, proportional to their critical nature, using failover server architecture. This ensures security, process and operational control of a site are maintained and camera, power supply, storage, server and network failures are mitigated, even when the threat is malicious attack.