Published by Darleen Hudson. Modified over 9 years ago.
1
UPS Monitoring with Sensaphone: A cost-effective solution
Tony Wong (BNL)
HEPiX Fall 2014
2
Background
- Facility built piecemeal over the years
  - Old data center dates back to the ~1970s
  - New floor space (~60% of total) built or refurbished since 2008
- UPS power provided for most of the RACF
  - 1 MW (battery only runs for ~30 min) for old data center
  - 1.3 MW (flywheel + generator runs for ~days) for new floor space
- Direct monitoring of battery UPS with proprietary software
- No direct monitoring for flywheel + generator system
- Operational oversight at BNL
- Expensive, proprietary proposed solutions were rejected
3
Old Data Center
4
New Data Center
5
UPS monitoring in new data center
Requirements include:
- Must be fully configurable and controlled by RACF
- Alarm notification mechanism over multiple channels
  - Cell phone, SMS/text messaging and email
- Direct interface with monitoring computer
- Commercially available and supported
- Cheap (i.e., no yearly maintenance contracts)
- Stand-alone battery back-up (in case of power loss)
- Ability to integrate with existing battery UPS monitoring system
6
Sensaphone
- Found with a simple Google search
- Purchased model IMS-1000 (single room system)
  - Cheap (~US$900 per unit)
  - Requires some electrical work to install
  - Connects to phone line and internet
- Installed one unit in Sigma-7 in 2010 and another in CDCE in 2011
- Initially configured to notify over phone and email only (no integration with existing auto-shutdown mechanism)
- Call-down list feature with auto-escalation enabled (i.e., if person A doesn't acknowledge the alarm, the system calls person B, etc.)
  - Supervisor on call-down list: an effective way to motivate staff
- After extensive testing, no further development for several years (other priorities took over)
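The call-down list with auto-escalation described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the IMS-1000's actual firmware or API: the `Contact` class, the `run_call_down` function and the `place_call` callback are hypothetical names, and the phone numbers are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Contact:
    name: str
    phone: str  # placeholder values only


def run_call_down(
    contacts: List[Contact],
    supervisor: Contact,
    place_call: Callable[[Contact], bool],
) -> Optional[Contact]:
    """Call each contact in order until one acknowledges the alarm.

    place_call is a hypothetical callback that returns True when the
    person acknowledges. The supervisor is tried last, after everyone
    on the list has failed to answer. Returns the acknowledging
    contact, or None if nobody answered.
    """
    for person in contacts + [supervisor]:
        if place_call(person):
            return person
    return None


# Example with a fake dialer: only the third contact picks up.
staff = [Contact("A", "x1"), Contact("B", "x2"),
         Contact("C", "x3"), Contact("D", "x4")]
boss = Contact("Supervisor", "x0")
calls_made = []


def fake_dialer(person: Contact) -> bool:
    calls_made.append(person.name)
    return person.name == "C"


acknowledged_by = run_call_down(staff, boss, fake_dialer)
```

In this run the loop stops at contact C, so D and the supervisor are never called. Putting the supervisor at the end of the list is what gives the earlier contacts an incentive to answer.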
7
Anatomy of BNL Configuration
[Diagram of the IMS-1000 unit with the following labels]
- Utility power and default power source for IMS-1000 unit
- UPS 1
- UPS 2
- Battery back-up for IMS-1000 unit
- Alarm signal via telephone line
- Alarm signal via Internet
- Inputs
- Outputs
8
Alarm & Notification Mechanism
[Flow diagram]
- UPS Alarm
- Inform 1st contact person
- Begin countdown for automatic shutdown
- Alert data center supervisor
9
Alarm & Notification Mechanism
[Flow diagram]
- UPS Alarm
- Inform 1st contact person
- If no answer, escalate to 2nd, 3rd and 4th contacts
- Begin countdown for automatic shutdown
- Alert data center supervisor
10
Alarm & Notification Mechanism
[Flow diagram]
- UPS Alarm
- Inform 1st contact person
- If no answer, escalate to 2nd, 3rd and 4th contacts
- If no answer from any responder, call the boss
- Begin countdown for automatic shutdown
- Alert data center supervisor
11
Alarm & Notification Mechanism
[Flow diagram]
- UPS Alarm
- Inform 1st contact person
- If no answer, escalate to 2nd, 3rd and 4th contacts
- If no answer from any responder, call the boss
- Begin countdown for automatic shutdown
- Alert data center supervisor
- Shutdown worker nodes and non-critical servers
12
Work Timeline
[Flow diagram with timeline labels "Before Summer 2014" and "After Summer 2014"]
- UPS Alarm
- Inform 1st contact person
- If no answer, escalate to 2nd, 3rd and 4th contacts
- If no answer from any responder, call the boss
- Begin countdown for automatic shutdown
- Alert data center supervisor
- Shutdown worker nodes and non-critical servers
13
IMS-1000 Unit Wall-mounted box Installed IMS-1000 unit
14
IMS-1000 Unit (continued) Close-up view of unit Input sensor (UPS)
15
IMS-1000 Web Interface
16
UPS and Cooling
- Most of the IT equipment is connected to UPS-backed power, but the CRAC (Computer Room Air Conditioning) units are not
- Dangerous overheating can occur in a matter of minutes
17
March 18, 2014
[Figure: plot with time markers at 6:20 am and 6:40 am]
18
Recent developments
- The cooling incident on March 18 gave us tangible evidence that investing a little more time in configuring Sensaphone is a good idea
- UPS monitoring via Sensaphone was integrated with the existing auto-shutdown of (most) IT equipment due to cooling or utility power interruptions; completed summer 2014
  - Beyond email/phone alarm acknowledgement
  - Brookhaven's utility division on-call staff notified
  - Shutdown if temperature passes threshold or time limits
- Selected CRAC units now on UPS back-up power and domestic water back-up (for utility chilled water); completed September 2014
- Plan to add more CRAC units to UPS and domestic water back-up in next 2-3 years
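The rule "shutdown if temperature passes threshold or time limits" can be written as a small decision function. The sketch below is an assumption-laden illustration, not the RACF auto-shutdown code: `ShutdownPolicy`, `should_shut_down` and both limit values are invented for the example, and the slides do not give the real thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShutdownPolicy:
    """Limits for automatic shutdown (illustrative values only)."""
    max_temperature_c: float = 35.0   # assumed room-temperature limit
    max_alarm_seconds: float = 600.0  # assumed time limit after the alarm starts


def should_shut_down(policy: ShutdownPolicy,
                     temperature_c: float,
                     seconds_since_alarm: float) -> bool:
    """Return True when either limit has been reached or passed."""
    too_hot = temperature_c >= policy.max_temperature_c
    too_long = seconds_since_alarm >= policy.max_alarm_seconds
    return too_hot or too_long


policy = ShutdownPolicy()
hot_room = should_shut_down(policy, temperature_c=38.0, seconds_since_alarm=60.0)
long_outage = should_shut_down(policy, temperature_c=24.0, seconds_since_alarm=900.0)
brief_glitch = should_shut_down(policy, temperature_c=24.0, seconds_since_alarm=60.0)
```

Either condition alone triggers the shutdown, so a long outage in a cool room and a short outage in a hot room are handled the same way, while a brief glitch with normal temperature leaves the equipment running.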
19
September 2014 Temperature fluctuations resulting from engineering work to integrate CRAC units to domestic back-up water (in case utility chilled water plant is down)
20
Conclusions
- Sensaphone is a low-cost solution for UPS monitoring of a data center
- Easy to configure (ours was done by a technician and a summer student)
- Portable and flexible
  - Can monitor multiple power sources if needed
  - Can monitor other parameters such as humidity, temperature, etc.
- Durable: has worked quietly and reliably for ~4 years
- Free technical support (via email and phone) available