Data Center Reliability Classification - The Facility

Note: A preview of Sections 1 through 5 of the newly published BICSI 002-2010 has been made available for review.

This is the first of a two-part series on Data Center Reliability Classification, which covers the latest developments in standards defining the reliability of the data center. This part discusses recent developments in data center standards for defining and classifying the reliability of the data center facility. Part 2 will discuss processes and guidelines to align the reliability criteria of the facility with the designed reliability of the critical IT systems within the data center.

Over the years, various vendors have offered numerous proprietary methods of defining the reliability of the physical attributes of the data center facility. These methods have used 3, 4 or 10 different levels of reliability. To some extent, all of these have been successful in providing value to end users in the absence of an open industry standard. However, an open standard that has been in development for several years has recently been published.

A group of data center industry experts, including end users, manufacturers and data center engineers, has teamed up with the BICSI organization to develop an open data center standard that addresses both the facility and technology physical layers. Two core ideas guided the standard’s development: (1) the development process must be open, and (2) the reliability definitions should be performance-based, not solutions-based.

Existing proprietary methods promoted by various vendors have presented a number of unnecessary challenges to the data center industry:

  1. Their closed format prevents open participation by the industry. We believe it is important to engage all stakeholders in the development of standards so that all perspectives can be analyzed.
  2. Their reliability definitions were based on specific topologies that favored “cookie cutter” solutions, which discouraged the implementation of new design options developed by engineers who understand the challenges involved.
  3. Some of the proprietary methods combined design parameters with the operational processes required to maintain a desired reliability. While we agree that both design and operational processes play a significant part in ensuring the reliability of a data center, the attempt to join these metrics added complexity to the framework, and provided no new clarity regarding how the data center should be designed or operated.

The BICSI standards committee has developed a performance-based metric to define the reliability “Classes” of the facility systems. The following is a summary of the 5 Classes that define the various levels of reliability – the “F” designation for each class is used to represent the “facility”.

Class F0 – Single Path without Alternate Power Source

  • The objective of Class F0 is to support the basic environmental and energy requirements of the IT functions without supplementary equipment. Capital cost avoidance is the major driver. There is a high risk of downtime due to planned and unplanned events. However, in Class F0 facilities, maintenance can be performed during nonscheduled hours, and downtime of several hours or even days has minimal impact on the mission.
  • A critical power distribution system separate from the general use power systems would not exist. There would be no back-up generator system. The system might deploy power conditioning or surge protective devices to allow the specific equipment to function adequately (utility grade power does not meet the basic requirements of critical equipment). No redundancy of any kind would be used for power or air conditioning for a similar reason.

Class F1 – Single Path

  • The objective of Class F1 is to support the basic environmental and energy requirements of the IT functions. There is a high risk of downtime due to planned and unplanned events. However, in Class F1 facilities, maintenance can be performed during nonscheduled hours, and the impact of downtime is relatively low.
  • The critical power distribution system would deploy a power conditioning device to allow the critical equipment to function adequately (utility grade power does not meet the basic requirements of critical equipment). No redundancy of any kind would be used for power or air conditioning for a similar reason.

Class F2 – Single Path with Redundant Components

  • The objective of Class F2 is to provide a level of reliability higher than that defined in Class F1 to reduce the risk of downtime due to component failure. In Class F2 facilities, there is a moderate risk of downtime due to planned and unplanned events. Maintenance activities can typically be performed during nonscheduled hours.
  • In this Class, the critical power system would need redundancy in those parts of the electrical distribution system that are most likely to fail. These would include any products that have a high parts count or moving parts, such as UPS, controls, air conditioning, generators or ATS. In addition, it may be appropriate to specify premium quality devices that provide longer life or better reliability.

Class F3 – Concurrently Maintainable

  • The objective of Class F3 is to provide additional reliability and maintainability to reduce the risk of downtime due to natural disasters, human-driven disasters, planned maintenance, and repair activities. Maintenance and repair activities will typically need to be performed during full production time with no opportunity for curtailed operations.
  • The critical power system in a Class F3 facility must provide for reliable, continuous power even when major components (or, where necessary, major subsystems) are out of service for repair or maintenance. To protect against unplanned downtime, the power system must be able to sustain operations while a dependent component or subsystem is out of service.

Class F4 – Fault Tolerant

  • The objective of Class F4 is to eliminate downtime through the application of all tactics to provide continuous operation regardless of planned or unplanned activities. All recognizable single points of failure from the point of connection to the utility to the point of connection to the critical loads are eliminated. Systems are typically automated to reduce the chances for human error and are staffed 24×7. Rigorous training is provided for the staff to handle any contingency. Compartmentalization and fault tolerance are prime requirements for a Class F4 facility.
  • The critical power system in a Class F4 facility must provide for reliable, continuous power even when major components (or, where necessary, major subsystems) are out of service for repair or maintenance. To protect against unplanned downtime, the power system must be able to sustain operations while a dependent component or subsystem is out of service.
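The five classes above can be read as a simple ordered lookup. The following is a minimal sketch in Python, not part of BICSI 002-2010. The names `FacilityClass`, `ClassSummary`, `SUMMARIES` and `lowest_class_meeting` are hypothetical, and the boolean attributes are a simplification of the summaries above. In particular, it assumes that each class includes the capabilities of the classes below it, and that Class F1 and higher have a dedicated critical power distribution system, which the summaries imply but do not state outright.

```python
from dataclasses import dataclass
from enum import IntEnum


class FacilityClass(IntEnum):
    """Facility reliability classes, ordered from lowest to highest."""
    F0 = 0
    F1 = 1
    F2 = 2
    F3 = 3
    F4 = 4


@dataclass(frozen=True)
class ClassSummary:
    title: str
    dedicated_critical_power: bool   # assumption for F1 and above
    redundant_components: bool       # assumption: cumulative from F2 upward
    concurrently_maintainable: bool  # assumption: cumulative from F3 upward
    fault_tolerant: bool


# Titles come from the class headings; the flags are illustrative only.
SUMMARIES = {
    FacilityClass.F0: ClassSummary(
        "Single Path without Alternate Power Source", False, False, False, False),
    FacilityClass.F1: ClassSummary(
        "Single Path", True, False, False, False),
    FacilityClass.F2: ClassSummary(
        "Single Path with Redundant Components", True, True, False, False),
    FacilityClass.F3: ClassSummary(
        "Concurrently Maintainable", True, True, True, False),
    FacilityClass.F4: ClassSummary(
        "Fault Tolerant", True, True, True, True),
}


def lowest_class_meeting(*, dedicated_critical_power=False,
                         redundant_components=False,
                         concurrently_maintainable=False,
                         fault_tolerant=False):
    """Return the lowest class whose summary has every requested capability."""
    required = {
        "dedicated_critical_power": dedicated_critical_power,
        "redundant_components": redundant_components,
        "concurrently_maintainable": concurrently_maintainable,
        "fault_tolerant": fault_tolerant,
    }
    for facility_class in FacilityClass:  # iterates F0, F1, ..., F4
        summary = SUMMARIES[facility_class]
        if all(getattr(summary, name)
               for name, needed in required.items() if needed):
            return facility_class
    raise ValueError("no class satisfies the requested capabilities")


if __name__ == "__main__":
    chosen = lowest_class_meeting(concurrently_maintainable=True)
    print(chosen.name, "-", SUMMARIES[chosen].title)
```

Run as a script, this prints `F3 - Concurrently Maintainable`. A real selection would also weigh the tolerable downtime, maintenance windows and cost drivers described for each class, which this sketch leaves out.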

This now takes us to Part 2 of this series on Data Center Reliability Classification: the availability/reliability classifications for the critical IT systems within the data center. Reliable Resources has developed processes and guidelines to align the reliability criteria of the facility (as defined in BICSI 002-2010) with the critical IT systems within the data center.

PDF Download: The above summary and a more complete description of the Availability/Reliability classifications are available in PDF format.

A preview of Sections 1 through 5 of the BICSI 002-2010 has been made available to review the overall content covered. If you are interested in an electronic or hard copy of the BICSI 002-2010 Data Center Design and Implementation Best Practices standard, it can be purchased from BICSI.