This is part 6 in a series focusing on critical aspects of SQL Server encryption. When a SQL Server customer deploys Transparent Data Encryption (TDE) or Cell Level Encryption (CLE) and protects encryption keys on an encryption key management solution, it is important that the key manager implement reliable business continuity support. Key managers are a part of the critical infrastructure for your applications and should be resilient in the face of common business continuity challenges such as data center damage or destruction (fire, hurricanes, flood, earthquake, etc.), network failures, and hardware failures. Let’s review some aspects of key management resilience.
Key Management Hardware Resilience
Key management systems come in many form factors including network attached hardware security modules (HSMs), virtual machines for VMware and Hyper-V, cloud instances for Microsoft Azure, Amazon Web Services (AWS), IBM SoftLayer, Google Compute Engine, and other cloud platforms, and as multi-tenant key management solutions such as AWS Key Management Service (KMS) and Azure Key Vault.
When a key manager is deployed as a hardware solution it should implement a number of hardware resiliency features including:
- RAID protected hard drives
- Hot swappable hard drives
- Redundant power supplies
- Independent Network Interfaces (NICs)
- Audible alarms
To the greatest extent possible a key management hardware system should be able to protect you from common hardware failure issues.
Key Substitution or Corruption
Key management systems store encryption keys in different types of data stores on non-volatile storage which is subject to key corruption through attack or hardware failure, or subject to key substitution through attack. Key management systems should use common integrity techniques such as hash-based message authentication code (HMAC) or similar technologies to detect this type of failure. Encryption keys should not be returned to a user or application in the event integrity checks fail, and all integrity check failures should be reported in audit and system logs. Additionally the integrity of the key database and application should be checked when the key manager initially starts processing. Early detection and quarantine of bad encryption keys helps prevent data corruption and gives the security administrator the ability to restore proper operation of the key manager.
Real-time Key Mirroring and Access Policy Mirroring
Because key management systems are a part of an organization’s critical infrastructure, they should implement real-time mirroring of encryption keys and access policies to one or more secondary key servers. The real-time nature of key mirroring is important to prevent the loss of an encryption key after it is provisioned but before it has been copied to a secondary system.
Real-time mirroring should also be able to recover from temporary network outages. If keys cannot be mirrored because the connection between the primary and secondary servers is interrupted, the key mirroring facility should automatically recover and resume mirroring when the network is operational again. This reduces the chance that keys are lost due to latency in mirroring.
Many organizations deploy complex distributed networks that require multiple secondary key servers. While most key management installations involve just one production and one secondary key server, good key management mirroring should involve the ability of a primary key server to mirror to multiple secondary key servers.
Active-Active Key Mirroring
Expanding on the topic of encryption key and access policy mirroring, it is important that key management systems fully support role-swap system recovery operations and this involves the dynamic change in roles between a primary and secondary key server. When a primary key server is unavailable a secondary key server automatically steps in to serve various encryption key functions. In this situation it is important that the secondary key server now becomes the primary key server for a period of time. New encryption keys may be created, the status of existing keys may change, and access policies may also change. A good key mirroring architecture will allow for these changes to migrate back to the original primary key server when it becomes available. This is the central feature of Active-Active mirroring implementations.
Key Management Monitoring
Because key management systems are critical infrastructure it is important to deploy monitoring tools to insure a high service level. Key management systems should generate and transmit system log information to a monitoring solution, and the key management system should enable monitoring by external monitoring applications. In the event a key server becomes unavailable it is important to identify the outage quickly.
Key Management System Logging and Audit
Another important aspect of key management business continuity is proper system logging of the key management server. Key management systems are high value targets of cyber criminals and active monitoring of key management system logs can detect an attack early in the cycle.
Additionally, key management systems should audit all management and use of encryption keys and policies. A good key management solution will audit all actions on encryption keys from creation to deletion, all changes to key access policies, and all access to keys by users and applications. These audit logs should be transmitted to a log collection or SIEM monitoring solution in real time.
Key Management Backup and Restore
As critical systems key managers must implement backup and restore functions. In the event of a catastrophic loss of key management infrastructure, restoring to a known good state is a core requirement. Good key management systems enable secure, automated backup of the data encryption keys, key encryption keys, server configuration, and access policies.
Key management systems differ from traditional business applications in one important aspect - data encryption keys should be backup separately from key encryption keys. You should be able to backup data encryption keys automatically or on demand, but you should take care to separately backup and restore key encryption keys. This is a core requirement for key management systems.
In summary, key management systems need all of the major business continuity components that you would expect in mission-critical business applications.