1 TDTWG Report to RMS
SCR Addressing ERCOT System Outages
Tuesday, May 10
2 Background
- The original design for the ERCOT system architecture did not include “high availability” for ERCOT systems.
- ERCOT built its systems to comply with ERCOT-specific Protocol timing.
- The increased criticality of Market process timing and transaction volume has driven the need for ERCOT systems to be more robust than originally designed and built.
- Several Market system changes for ERCOT have been approved, and many have been implemented, but they did not include “high availability” for the ERCOT systems supporting the Retail Market.
- ERCOT unplanned system outages have become burdensome for Market Participants and have in some cases impacted processes that directly support customers.
3 Background continued…
- While it is difficult to determine exactly how, or to what extent, a customer or Market Participant has been impacted by an ERCOT system outage, it is apparent that these outages, if allowed to grow in number, may eventually harm the Texas Retail Market.
- RMS asked TDTWG to review the reasons for the outages and determine whether anything can be done.
- TDTWG has completed a review of each outage and of the activities necessary to restore successful system processing in each event.
- The product of that work is the SCR being presented at this meeting.
4 TDTWG Approach
The TDTWG approach included reviewing system procedures to ensure that all aspects surrounding an ERCOT outage were taken into consideration. To support this, ERCOT IT provided multiple overviews of processes and systems so that TDTWG members could take an informed approach before proceeding with their analysis. These included:
- Presentation of the existing system architecture
- Understanding of current processing capability and limitations
- Details of transaction process timing
- Overview and discussion of the internal processes supporting the systems
TDTWG completed a detailed review of system outages covering: date, length of outage, system affected, description of the outage, and the action necessary to resolve the outage, including what is needed to keep the outage from happening again. This information is contained in the “Outage Appendix” of the SCR.
5 Existing NAESB
- The ERCOT network provides Internet redundancy in the form of dual ISP connections to disparate providers, in addition to high-speed metro links between sites.
- The ERCOT network provides high availability in the form of redundant firewalls, switches, and routers.
- NAESB is currently available only in Taylor. There is only one sender and one receiver. However, test systems are available in Austin with the capability of back-end connectivity to Taylor.
- A single server failure can take NAESB offline.
- Network maintenance can render NAESB unavailable.
6 Recommendation to Improve NAESB Reliability
While ERCOT should complete a full evaluation of systems and processes, TDTWG agrees on the following as the recommended enhancement for NAESB reliability. TDTWG would like ERCOT to take this recommendation into consideration while developing its evaluation of systems.
- Install an extra sender and receiver at each site.
- Load balance these new servers behind a redundant set of content switches.
- A single server failure can then no longer take NAESB completely offline.
- Firewall or router maintenance will no longer render NAESB unavailable.
- This will mimic the failover functionality of our other high-availability systems.
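The effect of the recommendation can be illustrated with a minimal sketch. It is a toy model, not ERCOT's configuration: the `Server` and `ContentSwitch` classes, the round-robin policy, and the server names are all hypothetical, chosen only to contrast a single receiver with a load-balanced pair.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List, Optional


@dataclass
class Server:
    name: str             # hypothetical label, not a real ERCOT host name
    healthy: bool = True  # set to False to simulate a failure or maintenance


class ContentSwitch:
    """Toy round-robin balancer that skips unhealthy servers (assumed policy)."""

    def __init__(self, servers: List[Server]) -> None:
        self.servers = servers
        self._ring = cycle(servers)

    def route(self) -> Optional[Server]:
        # Try each server at most once per request.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate.healthy:
                return candidate
        return None  # no healthy server left: the service is offline


# Existing setup: one receiver, so a single failure takes the service offline.
existing = ContentSwitch([Server("receiver-1")])
existing.servers[0].healthy = False
existing_result = existing.route()

# Recommended setup: an extra receiver behind the switch absorbs the failure.
proposed = ContentSwitch([Server("receiver-1"), Server("receiver-2")])
proposed.servers[0].healthy = False
proposed_result = proposed.route()

print("existing:", existing_result)
print("proposed:", proposed_result.name if proposed_result else None)
```

The same reasoning applies to the content switches themselves, which is why the recommendation calls for a redundant set of them rather than a single device in front of the servers.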
7 Next Steps
- SCR process to begin
- TDTWG will follow the SCR through the process
- ERCOT will complete its evaluation and respond to the Market
- Market workshops may be held throughout the process as needed
8 Timeline and Process for SCR
- SCR to be posted by May 20
- SCR to be reviewed by the Market for 21 days
- Following the 21-day comment period, ERCOT Market Rules will send all comments received for the SCR to the RMS listserv
- SCR and comments will be presented at the June 15 RMS meeting
- RMS may vote to move forward with the SCR at the June 15 RMS meeting
- Following the RMS vote, ERCOT will begin an impact analysis
- Impact analysis will be presented at the July 13 RMS meeting; RMS may vote to approve the SCR, which will include the impact analysis
- Market workshop may be called to review
- At that point, if approved, ERCOT will update the SCR with the RMS recommendation and send it to PRS for prioritization
- August 4: TAC considers for approval
- Following August 4, ERCOT posts the TAC recommendation, followed by a 30-day period for review and comments
- October ERCOT Board meeting: Board to consider
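The date arithmetic behind this timeline can be checked with a short sketch. Several things here are assumptions rather than facts from the slides: the year (the deck gives none, and 2005 is used only as a placeholder), posting on the May 20 deadline itself, and the 30-day review period starting on August 4.

```python
from datetime import date, timedelta

YEAR = 2005  # assumption: the slides do not state a year

scr_posted = date(YEAR, 5, 20)  # assumes posting on the latest allowed day
comment_period_ends = scr_posted + timedelta(days=21)

rms_first_consideration = date(YEAR, 6, 15)
rms_second_consideration = date(YEAR, 7, 13)
tac_consideration = date(YEAR, 8, 4)

# Days between the two RMS meetings, available for the impact analysis.
impact_analysis_window = (rms_second_consideration - rms_first_consideration).days

# Assumes the 30-day review period starts on the TAC meeting date.
review_period_ends = tac_consideration + timedelta(days=30)

print("Comment period ends:", comment_period_ends.isoformat())
print("Closes before first RMS meeting:", comment_period_ends <= rms_first_consideration)
print("Days between RMS meetings:", impact_analysis_window)
print("30-day review ends:", review_period_ends.isoformat())
```

Under these assumptions the comment period closes on June 10, five days before the June 15 RMS meeting, and the 30-day review ends on September 3, ahead of the October Board meeting.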
9 Section 21, System Change Requests
Timeline for the SCR XXXXX
[Timeline diagram. Recoverable labels: ERCOT Posts SCR by 5/20; 21-Day Comment Period; June 15th RMS Consideration (1st Consideration); 25-Day IA Period; July 13th RMS Consideration (2nd Consideration); ERCOT Updates RMS Rec, send to PRS for priority; August 4th TAC Consideration; ERCOT Posts TAC Rec; 30-Day IA Period; October BOD Consideration; ERCOT Posts BOD Decision. Also shown: May 15th; a second 21-Day Comment Period; ERCOT Posts SCR Rec.]
10 Questions?