Improving Computer Room efficiency with freecooling – National Centre case study • M W Brown CEng MIEE, EPCC, University of Edinburgh • Facility Manager: Advanced Computing Facility • June 2008
Overview of the Advanced Computing Facility • The problem • "hector" – outline of requirements • The solution • Initial results • Summary
Advanced Computing Facility • Constructed 1976 for the University of Edinburgh: • 1 x 600 m² Computer Room • 24-stage DX-based cooling (R12!) servicing the room through 4 vast walk-in air-handling units • "conventional" downflow system • Refurbished 2004 as the Advanced Computing Facility: • 2 x 300 m² Computer Rooms (one active, one empty concrete shell) • all new chilled-water based plant services, with capacity of 1.2MW • Major expansion 2007 for "hector" (UK national service): • 2nd Computer Room brought into operation • new-build external plant room to support massive uplift in required capacity • new HV electrical provision (up to 7MW)
Computer Room 2, ACF • General-purpose Computer Room • laid out with 10 x 6m equipment rows, with alternating "hot/cold" aisles • 500mm subfloor • 4m from floor level to ceiling • 10 x 60 kW capacity CCU's arranged along "long walls", supplied from 8° flow/14° return chilled-water system • dual 3-ph underfloor busbars supply power to each row • Large mix of equipment from many suppliers • designed for approx 400 kW heat-rejection to air
General-purpose computer room layout • A typical computer room may be arranged with alternating hot/cold aisles 2 x 600mm tiles wide • Chilled air is supplied through vented floor tiles • Rack-mounted equipment draws in air from the cold aisle through the front, and vents out the back • Room A/C units (chilled-water or DX) arranged along the side walls, typically taking in return air about 2m from floor
Problems with this layout • Supply air gets mixed with room air raising its temperature prior to being captured by the inlet fans • Incomplete rows allow leakage from cold aisle to hot aisle, thereby wasting chilled air • Racks at the ends of the aisles may suffer from: • leakage of warmer air from the side aisles • starvation of chilled air as the underfloor air is forced into the centre by the CCU fans • Return air into the CCU's has mixed with high-level room air and has thus cooled: • this means that the return air onto the coil is cooler, hence narrower (and less efficient) Δt across the coil • the returning air has transferred some of its heat directly to the room air, thus contributing to the inefficient pre-warming of the supply air
Problems with this layout • Recent measurements at ACF Computer Room 2 (conventional layout): • Cold aisle temps (midway) in the range: 16.4° to 18.5° • Hot aisle temps (midway) in the range: 26.5° to 31.2° • Side aisle just 1 tile (600mm) off end of cold aisle: 20.4° • CCU inlet temps (2.2m off ground): mean of 24°
Problems with this layout • To maximise the efficiency of air-side cooling, you need to separate supply and return air as far as possible • However, this is not easy in a general-purpose room designed for flexibility, which may therefore contain a variety of equipment with different loads, different rack designs and dimensions, and from a range of suppliers • A general-purpose room is by definition a compromise, but recent developments in water-assisted racking systems should go far towards enabling that supply/return air separation
Improvements • Replacing multiple independent DX-based room-units with chilled-water units serviced from remote central plant • Having an effective BMS system that can measure room conditions as a whole and adjust local plant (CCU's) and remote plant (chillers etc) without the inefficiencies of multiple independent room units hunting against each other • Improving airflow: • avoiding short-circuits into and between aisles • careful selection of placement of vented floor tiles • good underfloor depth with a minimum of obstructions • reduction of return-air mixing by increasing height of CCU inlets
Improvements • Selection of CCU's with VSD control of their fans reduces energy when the preference is to run all units concurrently • Selection of cooling towers with VSD control of their fans allows towers to ramp up and down according to load without big fans kicking in and out • Careful selection of chilled-water flow/return temps, and also condenser water temps – allowing a lower condenser water inlet temp to the chillers may increase fan power to the towers, but compressors then may not have to work so hard in compensation
Air versus water cooling • However, power/space density is going up. . . • RCO Building, University of Edinburgh (1976): • designed round a power/space density of approx 0.5kW/m² • Daresbury Laboratory C Block refurbishment (2002): • designed round a power/space density of approx 2.5kW/m² • ACF (phase 1), University of Edinburgh (2004): • designed round a power/space density of approx 2.5kW/m² • ACF (phase 2), "Hector" UK National Service (2007): • designed round a power/space density of approx 7kW/m²
Air versus water cooling • Rack power is going up: • 2002: IBM p690 (HPC-X UK National Service at Daresbury): 10kW per rack • 2007: Cray XT4 ("hector" UK National Service at Edinburgh): 18kW per rack • 2008: Cray XT5 (various HPC sites in US and elsewhere): 38kW per rack • This is now at (or beyond) the effective limits of direct air-cooling • Suppliers now must either move towards efficient packaging with water-assisted cooling directly in the racking, or more radical methods of direct liquid cooling
Air versus water cooling • Water is a far more efficient heat-transfer medium than air • Why try and cool the entire volume of a Computer Room when most of that air is not being used in the cooling of the equipment ? • Huge amounts of energy are used just moving air around. . .
Air versus water cooling But . . . • Water-cooling infrastructure requires central plant with high capital cost both in plant and physical external space for that plant • Water and expensive electronics are not a good mix, nor are water and high-power electrical supplies. . .
"hector" • UK national HPC service, Oct 2007 – Oct 2013 • Funded by central Government, with EPSRC as the managing agent • £113M project (capital & recurrent) in 3 x 2-yr phases • Technology (phase 1 & 2) provided by Cray • Science Support provided by NAG Ltd • Facility operations by partnership of University of Edinburgh and STFC (Daresbury Laboratory) • Physical location: secure site operated by UoE
"hector" • Phase 1 (accepted: Sep 07): • 60 TFlop Cray XT4 • approx max input power of 1.2MVA • approx cooling load of 1.2MW (heat rejection directly to air) • Phase 2 (installation: summer 09): • ~60 TFlop Cray XT4 (quadcore upgrade) • ~200 TFlop Cray (tba) • approx input power of 1.8MVA • approx cooling load of 300kW (heat rejection directly to air) • approx cooling load of 1500kW (to water via R134a loop) • Phase 3 (installation: summer 11): • technology supplier subject to future tender • anticipate infrastructure requirements approx as per Phase 2
"hector" • We were given a very short time to prepare a computer room specifically to support the three phases of "hector" • Energy efficiency was an obvious requirement – even though as an operator we were unable to accept the risk on energy pricing – wisely as it has turned out. . . • Maximising efficiency became a key design goal in order to: • meet University requirements regarding energy efficiency • be compliant with Government policy regarding energy efficiency in public-sector projects • reduce recurrent expenditure, thereby saving taxpayers' money • common sense!
The solution • Phase 1 infrastructure requirements • Outline design for specialised Computer Room • Specification of plant services • Project timeline • Computer Room design details • Chilled-water system design details • Free cooling design and operation
Phase 1 infrastructure requirements • 60 x Cray XT4 (dualcore) systems • input power: in the range 18 -> 20 kVA each • all heat rejected to air • chilled air (recommended on-temp of 13°) drawn in directly from sub-floor by large 3-phase variable-speed blower • heated air ejected directly out of the top of the cabinet (typically at 42°)
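As a rough cross-check, the cabinet count and per-cabinet input power quoted for Phase 1 can be multiplied out. This is a minimal sketch using only the figures on the slide; the function name is illustrative and not part of any facility tooling.

```python
def total_input_kva(cabinets: int, kva_low: float, kva_high: float) -> tuple[float, float]:
    """Return the (low, high) total input power in kVA for identical cabinets."""
    return cabinets * kva_low, cabinets * kva_high


# 60 cabinets at 18 to 20 kVA each, as quoted for Phase 1
low, high = total_input_kva(60, 18.0, 20.0)
print(f"Phase 1 total input power: {low:.0f} to {high:.0f} kVA")
```

The upper end of this range lines up with the approximate 1.2MVA maximum input power quoted for Phase 1.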
Phase 2 infrastructure requirements • 16 x Cray XT4 (upgraded to quadcore) systems • input power: in the range 14 -> 20 kVA each • all heat rejected to air • chilled air (recommended on-temp of 13°) drawn in directly from sub-floor by large 3-phase variable-speed blower • heated air ejected directly out of the top of the cabinet (typically at 42°) • 24 x New Generation Cray cabinets • input power: expected to be ~40 kVA each • phase-change evaporative cooling – air within each cabinet drawn across evaporator pipework containing R134a and returned to room • 1 x XDP (HX) per 4 cabinets • R134a condensed by chilled water (planning assumption: 10°/16°)
Computer room – outline design • Required infrastructure must be able to cope with both Phase 1 and Phase 2 cooling requirements • High-capacity chilled-water main supplying water at 8° to 14 x 80kW capacity CCU's set to supply air off-coil at 13° (+/- 0.4°) • Valved connections installed for 12 x XDP HX units for Phase 2 • Install lowered ceiling designed to capture exhaust air from XT4's, with inlets to CCU's ducted directly from ceiling void • Aim to maximise return air temp to widen Δt across coil and minimise interaction/mixing with room air
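The value of widening Δt across the coil can be illustrated with the sensible-heat relation Q = ρ · cp · V · Δt: for a fixed heat load, the air volume that must be moved falls in proportion to Δt. The sketch below is illustrative only. The air density and specific heat are assumed textbook values, the 100 kW load is arbitrary, and pairing the 13° supply with the 24° CCU inlet temperature measured in the conventional room is a comparison made purely for illustration. In practice the air reaching the coil will be cooler than the 42° cabinet exhaust because some mixing remains.

```python
AIR_DENSITY = 1.2  # kg/m^3, assumed typical value for room air
AIR_CP = 1.005     # kJ/(kg.K), assumed specific heat of dry air


def airflow_m3s(heat_kw: float, supply_c: float, return_c: float) -> float:
    """Volumetric airflow (m^3/s) needed to carry heat_kw at the given air temperatures."""
    delta_t = return_c - supply_c
    if delta_t <= 0:
        raise ValueError("return air must be warmer than supply air")
    return heat_kw / (AIR_DENSITY * AIR_CP * delta_t)


# Same illustrative 100 kW load, two return-air temperatures
mixed_return = airflow_m3s(100.0, 13.0, 24.0)     # return air diluted by room air
captured_return = airflow_m3s(100.0, 13.0, 42.0)  # exhaust captured with no mixing

print(f"Airflow ratio (mixed / captured): {mixed_return / captured_return:.2f}")
```

The ratio depends only on the two Δt values (29 K against 11 K), so it holds whatever load is chosen.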
Computer Room - outline design • 700mm between top of cabinets and ceiling void – to minimise mixing of exhaust air and room air • VFD control on CCU's, modulated to supply 60m³/sec into the floor void (capability: 120 m³/sec) • At normal operation, chilled-water flow rate is around 40 l/s with 8° flow and 14° return • No room conditioning – control only the supply air into the sub-floor. Room ambient maintained at a comfortable level through minor leakage via cable-ways
Specification of plant services • Central plant was required to provide cooling of up to 2.6MW (with at least N+1 redundancy in all key elements) • Security of electrical supplies and protection against their diminished quality required significant enhanced electrical provision • Maximising of operating efficiency was a key objective
Chilled water system design details • 3 x parallel 1.2MW capacity chillers (duty, standby, reserve) with triple chilled-water circulation pumps (VSD-controlled) always running. 8° flow/14° return • Variable-flow through CCU's and chillers • 6 dry cooling towers for condenser water, with triple condenser water circulation pumps (VSD-controlled) always running. VSD-controlled fans on towers. 32° flow/27° return • 2 x 27,000 lit capacity buffer-vessels
Plant Room B • New 470m² Plant Room constructed Jan-Jul 07 to supply services solely for the "hector" service • In perspective: the Plant Room is 1.5 x the area of the room it services! • Contains all HV switchgear, 4 x transformers, 3200kVA UPS modules, chillers, condenser water/chilled water pumps and main controls • "Lights out" operation – no plant operators
Project timeline • 27 Jan 07: cut ground for construction of 470m² Plant Room B • mid Mar 07: walls to full height • 24 Mar 07: steelwork for roof structure completed • 08 May 07: Computer Room 1 refurbishment completed • 25 May 07: HV switchroom commissioned • mid Jun 07: Cooling towers installed • 02 Jul 07: Plant set to work – final commissioning tests (1MW loadbanks) • 26 Jul 07: Start of delivery/installation of Cray XT4 • Aug 07: Cray XT4 installation/commissioning • 12 Sep 07: Entered final acceptance • 01 Oct 07: Service commenced
Protection against power instability • UPS (static, 10-20 mins autonomy) for Computer Room loads only. Principally for providing a clean, high-quality 3-ph/50Hz supply • Multiple 400kVA (2004) and 800kVA (2007) units supplied from different sides of their LV boards • MUST keep cooling running when the UPS is maintaining power to the Computer Room • Standby 500kVA generators supply power to "essential" services only (pumps, CCU's, MCC panel etc). Load shed everything else
Electrical provision • 2 incomers to dedicated 11kV HV sub-network for the facility • 6 x transformers • 2 x 1.5 MVA supply original (phase 1) parts of building • 2 x 1.6/2.4 MVA supply "hector" UPS switchboard and hence Computer Room connected loads • 2 x 1.6 MVA supply all mechanical services for "hector" • 3 x dual-section LV boards, each supplied by 2 x TX • 2 x 500 kVA diesel generators • 8 x static UPS modules: • 2 x 100 kVA (for "hector" MCC panel and chilled-water circ. pumps) • 2 x 400 kVA (for Computer Room 2) • 4 x 800 kVA (for Computer Room 1)
Cooling system performance • The average off-coil air temperature is maintained with ease in the range: 12.7° - 13.3° (in excess of design spec) • The average chilled-water flow temperature is maintained in the range: 7.7° - 8.3° (load independent) • The average chilled-water return temperature is maintained in the range: 13.7° - 14.3° • 60 m³ per sec of air at mean 13° is supplied into the sub-floor • Chilled-water flow rate is maintained at 40 lit per sec
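The chilled-water figures can be sanity-checked with Q = m · cp · Δt. This is a rough sketch: the density and specific heat of water are assumed textbook values, and the function name is illustrative.

```python
WATER_DENSITY = 1.0  # kg per litre, assumed
WATER_CP = 4.186     # kJ/(kg.K), assumed specific heat of water


def water_heat_kw(flow_l_per_s: float, flow_temp_c: float, return_temp_c: float) -> float:
    """Heat (kW) carried away by a chilled-water loop at the given flow rate and temperatures."""
    mass_flow = flow_l_per_s * WATER_DENSITY  # kg/s
    return mass_flow * WATER_CP * (return_temp_c - flow_temp_c)


# 40 l/s at 8 deg flow / 14 deg return, as quoted on the slide
heat_kw = water_heat_kw(40.0, 8.0, 14.0)
print(f"Heat carried by chilled water: {heat_kw:.0f} kW")
```

With these assumed properties the loop carries roughly 1 MW, the same order as the Phase 1 cooling load.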
Free cooling design and operation • Stage 1: (when OAT < 13°) • valves open to allow return chilled-water to divert via secondary cooling towers • fan-speeds on all towers set to 30% • mechanical services power drops by about 10% (200 kW to 180 kW) • Stage 2: (when return chilled-water off towers < 9.7°) • fans modulate between 30% and 70% (aim to achieve 8°) • duty chiller backs right off unless chiller entering temp > 9.7° • further power reduction of about 15% (180 kW to about 150 kW) • Stage 3: (when return chilled-water off towers < 8.5°) • duty chiller setpoint raised to 11.5° to keep chiller off • max power reduction down to around the 60 kW baseload required to maintain flows of air and water
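The staging thresholds above can be written down as a simple decision function. This is a simplified sketch, not the actual BMS logic: it assumes Stages 2 and 3 apply only once Stage 1 is active (OAT below 13°), it ignores hysteresis and the chiller entering-temperature override, and the function names are hypothetical. The power figures are the approximate values quoted on the slide.

```python
# Approximate mechanical services power per stage (kW), taken from the slide
NOMINAL_POWER_KW = {0: 200, 1: 180, 2: 150, 3: 60}


def freecooling_stage(oat_c: float, off_tower_c: float) -> int:
    """Return 0 (no free cooling) or the stage 1-3 implied by the slide's thresholds.

    oat_c: outside air temperature
    off_tower_c: return chilled-water temperature leaving the towers
    """
    if oat_c >= 13.0:
        return 0  # too warm outside: chillers carry the full load
    if off_tower_c < 8.5:
        return 3  # towers alone can meet the 8 deg flow target
    if off_tower_c < 9.7:
        return 2  # tower fans modulate, duty chiller backs off
    return 1      # towers pre-cool the return water only


def saving_percent(stage: int) -> float:
    """Percentage reduction in mechanical services power relative to no free cooling."""
    base = NOMINAL_POWER_KW[0]
    return 100.0 * (base - NOMINAL_POWER_KW[stage]) / base


for stage in (1, 2, 3):
    print(f"Stage {stage}: about {saving_percent(stage):.0f}% below the 200 kW baseline")
```

A real controller would add deadbands around each threshold so the plant does not hunt between stages.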
Free cooling design and operation • Stage 1 freecooling commences when the OAT is < 13° • Stage 2 freecooling is load-dependent but appears to take over from Stage 1 when OAT is around 6° • On observed loadings, the chiller appears to shut down when the OAT is around 2.5°, but typically the chiller is held off until the temperature has risen to around 4° • Despite this being mid-summer week, Stage 3 freecooling was engaged between 21:45 on the 17th and 08:15 on the 18th
"Hector" Phase 2 • Planning underway for the technology refresh due in mid 2009 • Ongoing discussions with Cray on the operating parameters for their XDP heat-exchanger unit – we are hoping to influence their design such that the chilled-water off temperatures can be maximised, thereby increasing the possibility of "free cooling"
Conclusions • Annual savings of energy in Gigawatt hours are projected • "Hector" efficiencies are due to: • extensive use of VSD on pumps and fan motors • maximising the separation of supply/room air through direct injection into the base of the cabinets and effective capture of the exhaust air • careful selection of chilled water flow/return temperatures that maximises chances of being able to "free cool" • optimising the design for the specific (albeit perhaps unusual) requirements of the Cray XT4 system • provision of secondary loops through the cooling towers giving efficient mode of "free cooling" • being at 56 degrees North !
Acknowledgements • People too numerous to mention have supplied me with information for this presentation, but we should acknowledge: • David Barratt (Engineering Services Manager, University of Edinburgh) • David Somervell (Energy Manager, University of Edinburgh) • Lawrence Valentine (Crown House Technology) • The bulk of the design of the "hector" cooling infrastructure flowed from the pen of Lawrence Valentine, and significant energy efficiencies have been the direct result of his skills