Environmental Monitoring and Alerting for Computing Room Facilities Wednesday, November 17, 2004 9:00 am – 10:00 am Gerry Bellendir, Jack MacNerland, David Ritchie, and Mark Thomas

Presentation transcript:

Environmental Monitoring and Alerting for Computing Room Facilities
Wednesday, November 17, 2004, 9:00 am – 10:00 am
Gerry Bellendir, Jack MacNerland, David Ritchie, and Mark Thomas

Agenda
FCC
New Muon -> LCC
HDCF -> GCC
Futures
Vulnerabilities
Discussion, Questions, etc.

FCC
Presented by Jack MacNerland
Smoke detection
Sprinklers
Under Floor Fire Suppression
Tape robot fire suppression
Power Logic Electrical Panel Monitoring
Security at FCC

FCC (cont’d)
Presented by Mark Thomas
Firus
–New developments
–Installed a FIRUS terminal in the OPS office so that the chillers at New Muon can be monitored.
–Set up a page to show critical info for FCC, New Muon, HDCF, and Casey’s Pond.
–Com Center monitors at night; FESS monitors during the day.
–CD/OPS monitors also.

FCC (cont’d)
Presented by David Ritchie
Metasys
–UPS and Generator Monitoring and Alerting via Metasys
–Current and future status (see Appendix A)
Other Monitoring
–CSS (Stan Naymola): Two types… lm_sensors.
  –Can shut down systems that are hot.
  –Self-contained; works independently of any other system.
  –If >50% of the nodes are down, it notifies.
  –Single nodes that turn themselves off are recorded in logs for investigation.
–CSS (cont’d): Independent temperature monitor located in the top of a rack.
  –Recorded in ganglia as a record of room temperature.
  –Sends email when the temperature crosses its high and low limits.
  –Does not page.
–CDF (Glenn Cooper): CDF nodes just have straight lm_sensors, using the RPM put together by the Farms group.
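The two CSS rules above (shut down nodes that are hot; notify when more than 50% of the nodes are down) can be sketched roughly as follows. This is an illustration only, not the CSS scripts themselves: the `NodeReading` type, the function names, and the 70 °C limit are assumptions made for the example. Only the 50% figure comes from the slide.

```python
from dataclasses import dataclass
from typing import List

HOT_LIMIT_C = 70.0    # hypothetical per-node limit; the slide gives no value
DOWN_FRACTION = 0.5   # "if >50% of the nodes are down, it notifies"


@dataclass
class NodeReading:
    """One node's state; the fields are assumptions for this sketch."""
    name: str
    temp_c: float
    is_up: bool = True


def nodes_to_shut_down(readings: List[NodeReading]) -> List[str]:
    """Return the names of running nodes whose temperature exceeds the limit."""
    return [r.name for r in readings if r.is_up and r.temp_c > HOT_LIMIT_C]


def should_notify(readings: List[NodeReading]) -> bool:
    """True when strictly more than DOWN_FRACTION of the nodes are down."""
    if not readings:
        return False
    down = sum(1 for r in readings if not r.is_up)
    return down / len(readings) > DOWN_FRACTION
```

Keeping the per-node decision separate from the cluster-wide notification mirrors the slide: a single node that turns itself off is only logged, while a majority of nodes going down triggers a notification.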

New Muon -> LCC
Presented by Jack MacNerland
Smoke detection
Sprinklers
Under Floor Fire Suppression
Security (Pegasys)

New Muon (cont’d)
Presented by Mark Thomas
Firus
–The usual fire protection system
–Chillers

New Muon (cont’d)
Presented by David Ritchie
Metasys
–See Appendix A.
Other
–CDF: see above lm_sensors discussion
Other – Lattice QCD (Don Holmgren)…
–Omega temp. box
  In use for a couple of years
  Alarms on high/low temperature, dry contact input.
  Only notification: dial out to 4 phone number rotation until acknowledged.
  Currently: Call Center, Amitoj's office number, DH office number.
–Discussion…
  Vulnerability: Omega box not able to reach someone (pre-call-center, post-operator-exit)
  Addressed with Netbotz unit
  –connects to the network, can send email, push files via FTP, and serve data via HTTP.
  –has a "last call" pager for power loss.
  Have not switched to the Netbotz for notifying the call center; still use the Omega box.
  Have the Netbotz unit set to send email to lqcd principals on various alarms.
  Also have trend plots and live web page…
Other – Lattice QCD (cont’d)
–IPMI
  Reads out cpu and system temperatures, fans.
  Includes vendor-specified thresholds.
  –When a sufficient number of nodes are over temperature, we automatically declare an alarm and shut down…
    »Batch queues,
    »Operating systems, and
    »Power off the nodes via IPMI.
  –Independently, the Netbotz and Omega boxes can trigger an alarm which causes the LQCD and/or ISA groups to manually initiate shutdowns if necessary.
  We maintain trend plots for all measured quantities, and have automated mailings listing nodes with bad fans and/or high temperatures.
  The trend plots are available by clicking on the vertical bars on: or via individual nodes, bin/stat?health=MRTG=qcd bin/stat?health=MRTG=qcd0102
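Two of the mechanisms on this slide lend themselves to a short sketch: the Omega box's dial-out rotation and the IPMI-driven staged shutdown. The code below is a hedged illustration, not the LQCD group's software. The callbacks `place_call`, `stop_queues`, `halt_os`, and `power_off` and the parameters `max_rounds` and `min_hot` are hypothetical names. The slide says the rotation continues until acknowledged; the sketch bounds it with `max_rounds` only so that it always terminates. Which nodes get shut down once the alarm is declared is left to the caller, since the slide does not say.

```python
from typing import Callable, Optional, Sequence


def dial_until_acknowledged(
    numbers: Sequence[str],
    place_call: Callable[[str], bool],
    max_rounds: int = 3,
) -> Optional[str]:
    """Cycle through the numbers in order until one call is acknowledged.

    place_call is a hypothetical callback returning True on acknowledgement.
    Returns the acknowledging number, or None if max_rounds is exhausted.
    """
    for _ in range(max_rounds):
        for number in numbers:
            if place_call(number):
                return number
    return None


def staged_shutdown(
    nodes: Sequence[str],
    hot_nodes: Sequence[str],
    min_hot: int,
    stop_queues: Callable[[], None],
    halt_os: Callable[[str], None],
    power_off: Callable[[str], None],
) -> bool:
    """Declare an alarm once at least min_hot nodes are over temperature.

    Stages run in the order listed on the slide: batch queues first, then
    operating systems, then power. Returns True if the alarm was declared.
    """
    if len(hot_nodes) < min_hot:
        return False
    stop_queues()
    for node in nodes:
        halt_os(node)
    for node in nodes:
        power_off(node)
    return True
```

Passing the actions in as callbacks keeps the ordering logic testable without a telephone line or IPMI hardware; a real deployment would supply functions that talk to the batch system and to the nodes' management controllers.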

HDCF -> GCC
Presented by Jack MacNerland
Smoke detection
Sprinklers
Under Floor Fire Suppression
Security at GCC (Pegasys)
Presented by Mark Thomas
Firus
UPS Monitoring and Alerting via Metasys
–Connection under development
–See Appendix A.

HDCF -> GCC (cont’d)
Presented by David Ritchie
Other – lm_sensors (see above)
Other – auto-shutdown when UPS goes to batteries.
–Zonatherm / Liebert have automatic shutdown capability; it may be acceptable to shut down the PCs in GCC upon the UPS going to batteries.
  Involves:
  –Agent PC running Liebert-provided software which senses UPS dry contact status.
  –Software (SNMP) notifying each PC, IP address by IP address, that it should shut down.
  –Cost ~$5,000.
–Outstanding issues
  Must hand-install s/w on all ~1400 PCs and
  Must manually enter 1400 IP addresses
–Liebert seems interested in joint effort.
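The address-by-address notification described above can be sketched as follows. This is not the Liebert software, whose interface is not described in the slides; `expand_hosts`, `notify_all`, and the `send_shutdown` callback are hypothetical names. Generating the host list from address blocks is an assumption about how manually entering ~1400 addresses might be avoided, and it only applies if the PCs occupy contiguous blocks. The addresses in the test are documentation-range placeholders.

```python
import ipaddress
from typing import Callable, Iterable, List


def expand_hosts(cidr_blocks: Iterable[str]) -> List[str]:
    """Expand CIDR blocks into individual host addresses, in order."""
    hosts: List[str] = []
    for block in cidr_blocks:
        network = ipaddress.ip_network(block)
        hosts.extend(str(host) for host in network.hosts())
    return hosts


def notify_all(
    on_battery: bool,
    hosts: Iterable[str],
    send_shutdown: Callable[[str], None],
) -> List[str]:
    """Tell each host to shut down when the UPS is on battery.

    send_shutdown is a hypothetical callback that raises OSError when a
    host cannot be reached. Returns the hosts that could not be notified
    so that they can be followed up by hand.
    """
    if not on_battery:
        return []
    failed: List[str] = []
    for host in hosts:
        try:
            send_shutdown(host)
        except OSError:
            failed.append(host)
    return failed
```

Collecting failures rather than stopping at the first unreachable PC matters at this scale: with roughly 1400 hosts, some will usually be offline, and the remaining hosts should still receive the shutdown notice.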

Other Matters
Futures
–Facilities Environmental Event Notification Scheme
–Next Generation Metasys
Vulnerabilities
–FCC has loss of Casey's Pond water or anything in that causality chain as its main vulnerability (JM)
–New Muon has loss of electrical and/or loss of water as its primary vulnerability (age?, ownership?) (JM/DR)
–HDCF has loss of cooling without consequent loss of power as its main vulnerability (JM/DR)
Discussion, Questions, etc.

Metasys – Current
FESS (Mike Michalak) – Status as of 11/12:
–FCC is operational. (Power Logic panel monitoring work still required?)
–HDCF network connected to Metasys panel
  Mike: should have HDCF up on the Metasys System Extended Architecture (Next Generation) next week (week of 11/15?).
  Power outage required to tie in the power meters.
–New Muon has no Metasys.
  NAE purchased for New Muon
  Ready to plan connections at New Muon.
  Network connection will be required.
  Metasys System Extended Architecture (MSEA) will be installed with the new CRAC units as part of the New Muon project which started on 11/15.
  Monitoring of chilled water temperature, chiller status, and pump status on MSEA will then begin.

Metasys – Future
FESS (Ted Thorson) – Technology: Status is:
–New Metasys system is ready for deployment, awaiting the approval of the Critical System Plan, a prerequisite to buying the PIX firewall and VPN concentrator.
–All existing equipment on Ethernet and
–All existing equipment migrated to the new system.
–However, no one will be able to see the equipment at HDCF or New Muon until the new system can be deployed.
FESS (Roger Slisz) – Critical System Coordinator: To do list is:
–Procure a VPN concentrator and a PIX firewall device.
–Secure VPN accounts for initial round of named users
–Complete third draft of the CSP
–Train initial round of named users on how MSEA works and what they can and cannot do with it.
This has been a long, complex project, begun in February. It is now perhaps close to first deployment.