Data Center Assessments: An Overview
Data Center Assessments provide owners with two key benefits:
- They give owners a clearer picture of the data center’s capacity, reliability, and vulnerabilities.
- They are an important early step in developing a road map for future growth, upgrades, and expansion.
Assessments can focus on a specific technical feature or system, or may be broader in scope to encompass the full array of critical and non-critical systems that support the IT enterprise, including copper and fiber communications and network/storage infrastructure. Assessments are often initiated in response to an event such as a power or cooling shortfall or a system failure, or in anticipation of a change such as a new computing initiative or a change in management.
Reliable Resources tailors each assessment to the client’s specific needs and budget. While we use a consistent approach for all of our clients, the focus will vary depending on the specific set of circumstances that exist for your organization.
An assessment can identify capacity shortfalls and gaps in what was thought to be a robust redundant power or cooling topology. These gaps represent vulnerabilities that erode both reliability and availability, putting the facility at risk of unplanned outages. Gaps are identified through a combination of physical verification/observation and document review. We examine electrical one-line diagrams, utility bills, cooling system schematics, control diagrams, architectural and structural drawings, and the actual physical installation. We meet with facilities and IT staff to engage them in a dialogue around how the facilities actually function and how they are operated and maintained. The outcome of these meetings and examinations is an Assessment Report that documents the facility capacity and capabilities in the context of industry-recognized measures of data center reliability and availability, and ranks the severity of vulnerabilities.
Basic Elements of Assessments
Assessment scope can be broken down into five key areas of inquiry:
We examine the electrical one-line diagram to identify single points of failure and significant operational constraints. For example, redundant UPS modules may be provided but lack an effective maintenance bypass feature (even if a maintenance bypass is indicated on drawings), or redundant UPS modules may feed a single critical output switchboard. “A” and “B” side power distribution may exist to the UPS and beyond, but mechanical loads may be fed such that effective cooling is not available if one side of the power distribution system is out of service. All critical electrical systems are evaluated, including critical and standby power generation and distribution, utility service entrance, emergency power off (EPO), electrical power monitoring, transient voltage surge suppression, lightning protection, grounding, fire alarm, and security/access control.
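As a rough illustration of the single-point-of-failure idea, a one-line diagram can be treated as a directed graph of power paths, and each component tested by asking whether the load can still be reached without it. The sketch below is a simplified model, not a description of our assessment tooling; the component names (feed_a, ups_a, output_swbd, and so on) and the topology are hypothetical, chosen to mirror the "redundant UPS modules feeding a single output switchboard" case described above.

```python
from collections import deque


def reachable(graph, sources, target, removed=None):
    """Return True if `target` can be reached from any source,
    treating the component `removed` as out of service."""
    seen = set()
    queue = deque(s for s in sources if s != removed)
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            if nxt != removed and nxt not in seen:
                queue.append(nxt)
    return False


def single_points_of_failure(graph, sources, load):
    """List every component whose loss alone cuts all power paths to `load`."""
    nodes = set(graph) | {n for outs in graph.values() for n in outs}
    candidates = nodes - {load}
    return sorted(
        n for n in candidates
        if not reachable(graph, sources, load, removed=n)
    )


# Hypothetical topology: two feeds and two UPS modules,
# but both UPS modules land on one output switchboard.
example_graph = {
    "feed_a": ["ups_a"],
    "feed_b": ["ups_b"],
    "ups_a": ["output_swbd"],
    "ups_b": ["output_swbd"],
    "output_swbd": ["it_load"],
}

spofs = single_points_of_failure(example_graph, ["feed_a", "feed_b"], "it_load")
# spofs == ["output_swbd"]
```

A graph check of this kind only reflects what the drawings say. It does not replace physical verification, which is where discrepancies such as a bypass shown on drawings but not installed are found.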
Mechanical systems receive the same level of scrutiny as electrical systems. Cooling system capacity and redundancy are closely examined, and failover scenarios analyzed. Chilled water systems are more complex than the DX (direct expansion) air-cooled refrigeration systems typically used in smaller data centers. Redundancy in chilled water systems goes beyond the chillers, piping, and heat rejection and includes commonly overlooked items such as cooling tower makeup water and control system configuration. Mechanical systems subject to review include cooling and temperature/humidity control, air distribution/ventilation, space pressurization control, fire suppression, battery room ventilation, generator fuel storage and delivery systems, generator cooling systems, plumbing/drainage systems, and building automation/controls.
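A failover scenario can be reduced to simple arithmetic: remove the failed equipment and compare the remaining cooling capacity with the load. The sketch below checks two scenarios, loss of the largest cooling unit and loss of one side of the power distribution system. The unit names, capacities, power-side assignments, and load are assumed values for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoolingUnit:
    name: str
    capacity_kw: float
    power_side: str  # which power distribution side feeds the unit, "A" or "B"


def worst_case_unit_loss(units, failures=1):
    """Capacity remaining after the `failures` largest units are lost."""
    caps = sorted((u.capacity_kw for u in units), reverse=True)
    return sum(caps[failures:])


def capacity_by_side_loss(units):
    """Capacity remaining when each power side is taken out of service in turn."""
    sides = sorted({u.power_side for u in units})
    return {
        side: sum(u.capacity_kw for u in units if u.power_side != side)
        for side in sides
    }


# Hypothetical installation: four 100 kW units, three of them fed from side A.
units = [
    CoolingUnit("CRAH-1", 100.0, "A"),
    CoolingUnit("CRAH-2", 100.0, "A"),
    CoolingUnit("CRAH-3", 100.0, "A"),
    CoolingUnit("CRAH-4", 100.0, "B"),
]
load_kw = 250.0  # assumed cooling load

unit_loss_ok = worst_case_unit_loss(units) >= load_kw          # 300 kW remains
side_loss_ok = {
    side: remaining >= load_kw
    for side, remaining in capacity_by_side_loss(units).items()
}  # losing side A leaves only 100 kW
```

In this assumed case the room looks redundant when counting units, yet losing side A of the power system leaves too little cooling, which is the kind of gap a unit count alone does not reveal.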
The building’s architectural and structural elements contribute significantly to a data center’s overall degree of robustness, even though they are not specifically addressed in the commonly used systems of Tier levels. Wall and roof construction materials, presence of windows, floor capacity, adjacency of spaces relative to other critical and non-critical support functions, and the general arrangement relative to other building functions unrelated to the data center are examined and documented. Spaces such as loading docks, IT storage and staging, burn-in, test/development, NOC, and relationships with the critical mechanical/electrical rooms are considered.
The building site is evaluated to identify vulnerabilities and characteristics that may impact data center operations. Building access and proximity to hazards such as rivers, rail lines, and airports are just a few of the site considerations that impact a data center.
Evaluation of communications infrastructure addresses flexibility and scalability to support current and future technology trends. Communication cable plant pathways, from the property line through to the carrier’s point of presence and on to the raised floor, may be included in an assessment. Other characteristics reviewed include communications cable plant media types, termination methods, routing and cable management, configuration of existing equipment racks and cabinets, suitability to support existing and future server and storage technology, airflow management, and power and communication cable management. We can also provide an in-depth analysis of carrier provisioning of communication circuits and of LAN/SAN core routing/switching architecture.
Alignment of IT objectives with facility infrastructure is critical to successful design and operation of a data center. The goal is to consistently deliver the required level of reliability and eliminate single points of failure, while providing concurrent maintainability to reduce downtime to a level that does not compromise the end-user’s business operations. A comprehensive assessment includes an examination of the IT organization’s business objectives to determine appropriate reliability and availability goals for the facility infrastructure. Alignment is achieved through a systematic process of establishing reliability goals that support the IT and business objectives of the organization, and using those goals to inform the planning and design of the supporting critical infrastructure. Alignment between critical mechanical and electrical infrastructure capacity and reliability is also essential.
IT Capacity/Growth Planning
An impending shortfall of space, power, or cooling is often the precipitating condition that drives organizations to expand or build a new data center. The question is always “How much space and power do we need?” (followed or preceded closely by “Here’s how much money we have”).
Assessments are commonly performed at the outset of long-term planning initiatives, and serve as a critical step in developing a credible growth model that informs the size and capacity questions. Too often, planning and design for a data center upgrade or expansion is undertaken with only a vague idea of how much space, power, or cooling will be necessary to support the business over the life of the facility. Much discussion in early data center planning meetings centers on whether the facility should be designed for 150 W/sf or 200 W/sf, and whether 10,000 sf or 15,000 sf is the right size. These discussions miss the point. The real question is this: What platforms and technologies will be deployed to support the IT goals over the life of the facility, and how much space and power are required to support these platforms? Considering the size of the capital investment, it is unwise to proceed with a facilities upgrade without a firm basis for the plan.
Reliable Resources leads the IT organization through a rational evaluation of IT goals and objectives, examining historic and anticipated growth patterns to develop a model for data center capacity planning. This model expresses data center size and capacity in terms of square footage to support the anticipated technology deployment, with an associated power requirement. Average facility watts per square foot is an outcome, not a design goal.
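The idea that watts per square foot falls out of the model, rather than driving it, can be shown with a small bottom-up calculation. The sketch below is a simplified illustration under stated assumptions, not the actual planning model: the platform names, rack counts, growth rates, floor area per rack, and power per rack are all hypothetical inputs, and growth is assumed to compound annually.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str
    racks_today: int
    annual_growth: float  # assumed compound growth, e.g. 0.10 for 10% per year
    sf_per_rack: float    # assumed floor area per rack, including aisle space
    kw_per_rack: float    # assumed power draw per rack


def projected_racks(platform, years):
    """Rack count at the end of the planning horizon, rounded up."""
    return math.ceil(platform.racks_today * (1 + platform.annual_growth) ** years)


def capacity_model(platforms, years):
    """Sum space and power per platform, then derive average W/sf."""
    space_sf = 0.0
    power_kw = 0.0
    for p in platforms:
        racks = projected_racks(p, years)
        space_sf += racks * p.sf_per_rack
        power_kw += racks * p.kw_per_rack
    watts_per_sf = power_kw * 1000 / space_sf if space_sf else 0.0
    return {"space_sf": space_sf, "power_kw": power_kw, "watts_per_sf": watts_per_sf}


# Hypothetical deployment plan over a five-year horizon.
plan = [
    Platform("blade servers", racks_today=20, annual_growth=0.10,
             sf_per_rack=30.0, kw_per_rack=12.0),
    Platform("storage", racks_today=10, annual_growth=0.0,
             sf_per_rack=30.0, kw_per_rack=5.0),
]

result = capacity_model(plan, years=5)
```

Here the space and power totals are built up from the platforms to be deployed, and the average W/sf is simply the ratio of the two. Changing the platform mix changes the ratio, which is why fixing W/sf first tends to produce the wrong facility.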
Assessments provide a data center owner/manager with a benchmark document that can be used as the departure point for mitigating vulnerabilities, improving reliability and availability, and long-term planning.
This White Paper was prepared by Reliable Resources, Inc. Reliable Resources is a multi-disciplinary subject matter expert firm integrating IT consulting with data center facilities planning, evaluation, and design. Reliable Resources focuses solely on data centers and control centers.
For more information on how Reliable Resources can help you get the most out of your Data Center critical infrastructure, please contact us directly at (612) 279-0411, or visit our website at www.relres.com.