Data Center Redundancy: A Guide to N-Levels and Tier Classifications
To assess data center redundancy, ask detailed questions about backup systems and transition processes rather than relying solely on generic marketing terms.
Virtually all data centers offer some level of redundancy, but the extent of that redundancy varies significantly. Unfortunately, the terminology used within the industry to describe redundancy levels is often unclear and inconsistent.
Understanding data center redundancy ratings and concepts is critical when determining the appropriate level of redundancy for a given workload. This article examines common approaches to data center redundancy and provides practical guidance on selecting a redundancy strategy that aligns with your business requirements.
What Is Data Center Redundancy?
In the context of data centers, redundancy refers to the deployment of backup systems designed to keep the facility operating when a primary system fails.
For example, a data center with power system redundancy might include backup generators and uninterruptible power supply (UPS) units to ensure electricity remains available in the event of a primary power source failure. Similarly, redundancy can apply to cooling systems, networking infrastructure, and other core components of the data center.
It is essential to note that data center redundancy typically does not encompass server redundancy -- that is, backup servers that can take over in the event of a primary server failure. Server redundancy is a feature IT departments can implement as part of an IT infrastructure strategy. A redundant data center focuses solely on providing backup systems for its critical operations -- such as power, cooling, and networking -- rather than the IT equipment housed within the facility.
Ways of Measuring Data Center Redundancy
Data center redundancy is commonly measured using two approaches: N levels and data center tiers. Each method provides a framework for understanding the extent of backup systems available within a facility.
N Levels
This approach quantifies redundancy by comparing the number of components required for normal operations (represented by the letter "N") to the total number of components the data center actually has.
Thus, a data center with N+1 redundancy includes one additional component beyond what is necessary for normal operations. If a single component fails, the extra component takes over, ensuring uninterrupted service.
The highest level of redundancy expressed in N levels is typically 2N, meaning the data center has twice the number of components required for normal operations. Even if the entire set of production systems fails, a complete backup system is available to maintain operations.
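The N-level arithmetic above can be sketched in a few lines of code. This is an illustrative helper, not an industry-standard calculation; the function name and labels are invented for the example:

```python
def redundancy_label(required: int, total: int) -> str:
    """Label a redundancy level by comparing the total number of
    components to the number required for normal operations (N)."""
    if total < required:
        return "under-provisioned"
    spares = total - required
    if spares == 0:
        return "N (no redundancy)"
    if total >= 2 * required:
        return "2N (fully mirrored)"
    return f"N+{spares}"

# A facility that needs 4 generators for normal operations:
print(redundancy_label(4, 4))  # N (no redundancy)
print(redundancy_label(4, 5))  # N+1
print(redundancy_label(4, 8))  # 2N (fully mirrored)
```

The key point the code makes explicit: the label depends on the ratio of spares to required components, which is why the same "N+1" badge can mean very different things for different systems.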
Data Center Tiers
Redundancy can also be measured using data center tiers, a classification system defined by the Uptime Institute. There are four tiers in total, with higher tiers indicating greater levels of redundancy and reliability.
While the tier system does not specify exact redundancy requirements, redundancy is a key factor in determining a data center’s tier rating.
[Figure: comparison chart of the Uptime Institute's data center tier classification]
The Challenges of Assessing Data Center Redundancy
For businesses seeking to accurately assess the redundancy of a data center, both N-level ratings and data center tiers have notable limitations.
N-Level Rating: While N-level ratings offer a straightforward way to quantify redundancy, they do not always correlate with reliability. For instance, N+1 redundancy can be highly effective for certain systems, such as power systems, where a complete backup can support the entire data center. However, N+1 is less effective for systems like UPS when the facility has hundreds of units. In such cases, a single extra UPS unit might not significantly enhance overall reliability.
Data Center Tiers: While widely recognized, data center tiers are less than ideal for measuring redundancy. The Uptime Institute’s tiering system does not define redundancy requirements with exact precision, making it difficult to assess the true reliability of a facility based solely on its tier rating. Moreover, some data centers claim tier levels without undergoing a third-party evaluation, which can lead to misleading representations of their redundancy capabilities.
Additionally, even a data center with robust internal redundancy can fail due to external factors, such as natural disasters or physical attacks. Events like floods, earthquakes, or hurricanes can incapacitate an entire facility, while security breaches or physical damage can disrupt operations. No level of internal redundancy can prevent disruptions if an entire facility is wiped out or disconnected from external networks.
A Practical Approach to Data Center Redundancy
When evaluating whether a data center meets redundancy requirements, N-level ratings and data center tiers provide a helpful starting point. For workloads requiring high reliability, aim for data centers offering at least 2N redundancy or those classified as Tier IV.
That said, it’s important to look beyond generic redundancy descriptors and ask detailed questions:
Redundancy Calculations. Request specifics on how the data center calculates its redundancy figures. For example, how many spare components does it maintain as a percentage of those required for normal operations?
Backup Transition Processes. Inquire about the procedures in place for transitioning to backup systems in the event of a failure. Are these transitions seamless, or is downtime expected?
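One way to frame the "redundancy calculations" question is to normalize spares as a percentage of required components. The sketch below is a hypothetical illustration of that arithmetic, echoing the earlier UPS example, where N+1 across hundreds of units adds little headroom:

```python
def spare_percentage(required: int, total: int) -> float:
    """Spare components as a percentage of those required
    for normal operations."""
    return 100.0 * (total - required) / required

# N+1 on a fleet of 100 UPS units is only 1% spare capacity,
# while N+1 on 2 generators is 50%:
print(spare_percentage(100, 101))  # 1.0
print(spare_percentage(2, 3))      # 50.0
```

The same "N+1" label thus hides a 50-fold difference in effective headroom, which is exactly why asking for the underlying figures matters more than the label itself.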
To further safeguard against a complete facility outage, consider deploying workload replicas in an additional data center. While this approach can be expensive -- especially if it requires doubling your total data center capacity -- there are more cost-effective alternatives. These include maintaining a scaled-down copy of your production environment in a public cloud. In the event of a data center outage, you can fail over to the cloud and scale up your environment as needed.
About the Author
Technology Analyst
Christopher Tozzi is a technology analyst with subject matter expertise in cloud computing, application development, open source software, virtualization, containers and more. He also lectures at a major university in the Albany, New York, area. His book, "For Fun and Profit: A History of the Free and Open Source Software Revolution," was published by MIT Press.