NERC Published Lessons Learned


1 NERC Published Lessons Learned
May 2017

2 NERC Lessons Learned May 2017
Four new NERC Lessons Learned (LL) were published since March 2017:
LL – “Loss of Monitoring Due to Authentication Software Update”
LL – “Line Frequency Excursion Causes Control Center Evacuation”
LL – “Loss of SCADA Operating and Monitoring Ability”

3 Loss of Monitoring Due to Authentication Software Update
Entity was making a planned change to the Active Directory domain used to authenticate users on the SCADA/EMS
New domain was successfully verified in the test environment, the operator training simulator, and the production environment with the Energy Control Center (ECC) as the enabled site
After failing over to the Backup Control Center (BCC) during a final test of the updated authentication software, operator logins were rejected and support staff were unable to force a transfer back to the ECC

4 Loss of Monitoring Due to Authentication Software Update
Entity lost remote monitoring and control
RTU polling ceased, and the ICCP data link to its neighboring TOP and RC was not operational
Entity activated its contingency plan, which required manning critical stations and using manual methods to maintain visibility of those sites and staffed generators
EMS/SCADA vendor was contacted to identify a recovery method
System was fully functional 51 minutes after the initial event

5 Loss of Monitoring Due to Authentication Software Update
Investigation found the authentication failure at the BCC was caused by incorrect configuration of site localization settings for the SCADA/EMS system at the BCC
The incorrect configuration caused a mismatch between the authentication files on disk (which did match between sites) and the authentication records in the system database (which differed between sites), leaving operators unable to log in to the system

6 Loss of Monitoring Due to Authentication Software Update
Document each localization setting and its purpose, and review settings before and after upgrades or major changes
Provide for site failover testing in the test environment
Use recorded lines to coordinate staff during testing
Have a recovery procedure for all-servers-down startup
Include support staff in training on loss of supervisory control so they understand the timings of the process
Maintain at least one “local” user account that is not authenticated via the SCADA AD domain
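The first recommendation, reviewing localization settings before and after a change, can be partly automated. The following is a minimal sketch in Python, assuming the settings can be exported as simple name/value pairs. The function name, setting names, and values are hypothetical illustrations and are not taken from the lesson or from any SCADA/EMS product.

```python
def diff_settings(before, after):
    """Return {name: (old, new)} for settings added, removed, or changed.

    A setting missing from one snapshot is reported with None on that side.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name)
        new = after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Hypothetical snapshots of one site's localization settings.
bcc_before = {"site_name": "BCC", "auth_domain": "old.example", "db_host": "bcc-db"}
bcc_after = {"site_name": "BCC", "auth_domain": "new.example", "db_host": "ecc-db"}

for name, (old, new) in diff_settings(bcc_before, bcc_after).items():
    print(f"{name}: {old!r} -> {new!r}")
```

In this made-up example the domain change is expected, while the changed database host is the kind of unintended difference a post-upgrade review would flag. The same comparison could be run between sites rather than between points in time.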

7 Line Frequency Excursion Causes UPS Shutdown and Control Center Evacuation
Energy Control Center (ECC) primary source frequency spiked to 65 Hz due to events in a neighboring entity’s system
ECC UPSs shut down and went to bypass because of settings that limit input frequency to protect the UPS
There was no voltage loss, so the UPSs did not change to battery backup and the diesel generators did not start
Power supplies in SCADA workstations and other computer equipment tripped to protect themselves
Several business servers and HVAC units remained connected to the critical bus on bypass power

8 Line Frequency Excursion Causes UPS Shutdown and Control Center Evacuation
ECC switched over to its secondary source following operation of backup protection on the primary source’s transformer
UPSs remained in bypass and would not pick up critical bus load
ECC personnel evacuated to their alternate control center
To protect the UPSs from feeding a large surge load, their internal logic would not allow them to pick up load immediately when reenergized
Business servers and HVAC were still on the critical bus, fed by the secondary source with the UPSs in bypass, which prevented the UPSs from returning
After the internal logic was discovered, all load was powered off
With all load removed, the UPSs were restarted and critical bus load was added slowly, ultimately restoring the ECC on secondary source power to the UPSs

9 Line Frequency Excursion Causes UPS Shutdown and Control Center Evacuation
When UPSs are in bypass and after ECC power has been switched to a known good source (secondary source or diesel generators), turn off all loads fed from the UPS bypass and then bring the UPSs back on
This is the recovery option that was actually used, and it would have taken less time than evacuating to the alternate control center if the issue had been known
Sometimes UPS logic is set up to require that associated batteries be near full charge and that an external power source be available before an off-line UPS can be restarted
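The restart conditions described on this slide can be written down as a simple pre-restart check. The sketch below is illustrative only: the function name, parameters, and the 90 percent charge threshold are assumptions, and actual restart logic is defined by the UPS vendor.

```python
def ups_restart_blockers(source_ok, bypass_load_kw, battery_charge_pct,
                         min_charge_pct=90.0):
    """List reasons an off-line UPS should not be restarted yet.

    min_charge_pct is a hypothetical threshold standing in for
    "near full charge".
    """
    blockers = []
    if not source_ok:
        blockers.append("input not on a known good source")
    if bypass_load_kw > 0:
        blockers.append("loads still fed from UPS bypass")
    if battery_charge_pct < min_charge_pct:
        blockers.append("batteries below required charge")
    return blockers


# Situation resembling the event: good secondary source, but servers and
# HVAC still on the bypass-fed critical bus (load figure is made up).
print(ups_restart_blockers(True, 12.5, 98.0))
# After all load is switched off, nothing blocks the restart.
print(ups_restart_blockers(True, 0.0, 98.0))
```

An empty list means the preconditions in this sketch are met; load would then be added back gradually, as in the recovery that was actually performed.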

10 Line Frequency Excursion Causes UPS Shutdown and Control Center Evacuation
Optimal fix is to prevent power source excursions (frequency high/low, voltage high/low) from defeating UPS power
This is done by setting up protection and controls to switch input power to another source before one of the UPS front-end protection logic set points is exceeded
Alternatively, the UPS vendor may provide an option to change the UPS logic so that it switches loads to battery power until an acceptable power source is available instead of simply going into bypass
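The source-transfer approach amounts to placing a tighter window inside the UPS's own protection window, so that the input is switched before the UPS reacts. The following sketch shows that comparison for frequency only. All set points are hypothetical; only the 65 Hz excursion comes from the lesson. Real values would come from the UPS vendor and protection studies.

```python
# Hypothetical set points in Hz.
UPS_FREQ_LIMITS = (57.0, 63.0)       # outside this window the UPS goes to bypass
TRANSFER_FREQ_LIMITS = (58.0, 62.0)  # tighter window used to switch input sources


def source_action(freq_hz):
    """Classify a measured input frequency against the two windows."""
    t_low, t_high = TRANSFER_FREQ_LIMITS
    u_low, u_high = UPS_FREQ_LIMITS
    if t_low <= freq_hz <= t_high:
        return "stay"                # normal operation on the current source
    if u_low <= freq_hz <= u_high:
        return "transfer"            # switch sources before the UPS limit is hit
    return "ups_limit_exceeded"      # transfer did not act in time


for f in (60.0, 62.5, 65.0):
    print(f, source_action(f))
```

The sketch ignores timing. A fast excursion could pass through the transfer band before a switch completes, which is one reason the vendor option of riding through on battery is offered as an alternative.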

11 Line Frequency Excursion Causes UPS Shutdown and Control Center Evacuation
Despite having four sources of power available, ECC evacuation was initiated due to a transient that their UPSs were not set up to handle
Vulnerability may not be unique to that ECC and could be present in other locations using UPS power (data centers, generating station control rooms, etc.)
Entities need to ensure they understand their UPS settings, their UPSs’ limitations, and how to recover from various potential failure scenarios

12 Loss of SCADA Operating & Monitoring Ability – Description
During building elevator maintenance, power was inadvertently removed from telecommunications equipment that fed data to an entity’s SCADA system
When UPS batteries were exhausted, the Energy Control Center (ECC) and commercial operations center lost SCADA operating and monitoring capability
Entity’s state estimator and contingency analysis were rendered inoperable for a 50-minute period
Loss of SCADA connectivity (e.g., monitoring and control) was also experienced with the entity’s HVD/LVD substations and the RC

13 Loss of SCADA Operating & Monitoring Ability - CA
Install alarms for power supplies for critical telecom equipment
Identify all critical equipment that needs alarming
Coordinate critical equipment work between maintenance and the ECC prior to and during work
Include requirements in contract language with service providers to coordinate critical equipment work with the ECC prior to and during work

14 Loss of SCADA Operating & Monitoring Ability - CA
Update facility maintenance procedures to establish semiannual alignment meetings for Facilities, System Control, Merchant Operations, and IT to review the critical asset list and update it based on current requirements
Identify and label critical equipment needing UPS with the message “Do not de-energize. Prior to performing work on this equipment, contact XXXX.”
Develop naming conventions and identify types of equipment for each criticality level
Update Facilities “Significant Event Job Aid” with additional contacts for work in critical equipment rooms

15 Loss of SCADA Operating & Monitoring Ability - CA
Preparation and planning of non-routine maintenance work require an additional level of caution to ensure all potentially impacted stakeholders and critical assets are properly identified
Due to changing regulatory requirements, entities need to establish a process for periodic review and updating of critical asset inventories and backup power needs

16 Ways to Access Lessons Learned
On the NERC website, click on the “Program Areas & Departments” tab and click “Reliability Risk Management”
Then on the left side menu under “Event Analysis” click “Lessons Learned”
Link to Lessons Learned at the lower left corner of the NERC.com homepage: a/Pages/Lessons-Learned.aspx

17 Contact Hassan Hamdar (hamdar@frcc.com) or
FRCC EA Contract NERC LL:

