Information Technology Outage Report Dave Pagliai Manager, IT Support Services October 2015 ERCOT Public.

Presentation transcript:

Background

UNIX Servers
- Multiple hosts per frame
- Largely database servers
- Four frames per Production data center
- Vendor identifies potential hardware issue with ERCOT frames
- ERCOT executing a plan to replace all impacted frame hardware

ERCOT Application Classifications
- Core: Grid and Market
- Non-Core: Market Data Transparency
- Commercial Systems: Retail, Settlements, Web Services

Timeline

09/20/15
- 17:42 UNIX Production frame failure in Taylor data center
- Impacts:
  - Components of the ERCOT Market Information System (MIS) including:
    - Report publishing to the MIS
    - Downloading of extracts and reports from MIS and External Web Services (EWS)
  - MIS and ERCOT.com dashboards and displays
  - MOTE
  - NMMS
  - Find ESIID (intermittent)
- Vendor contacted for frame hardware replacement

09/21/15
- 04:10 Frame hardware replacement completed, all impacted hosts back online
- 05:00 All impacted databases online
- Vendor identified further errors on the impacted frame, requested additional downtime to address
- ERCOT planned for site failover of impacted applications/databases to Bastrop data center, beginning at 17:00 to minimize Market impact
- ERCOT Release 5 implementation schedule delayed
- 17:00 – 19:30 ERCOT executes site failover of impacted applications/databases to Bastrop data center

Timeline

09/22/15
- 01:28 ERCOT/vendor complete repair of impacted frame
- ERCOT Release 5 implementation resumed
- ERCOT accelerates plan to replace remaining impacted frame hardware

10/02/15
- ERCOT executes site failover of applications impacted by 09/20/15 failure to Taylor data center

10/03/15
- ERCOT/vendor complete repair of two frames in Bastrop data center

Future efforts: 10/08/15 – 10/11/15
- Site failovers of applications from Taylor data center to Bastrop data center
- Repair all impacted frames in Taylor data center
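
The timestamps in the timeline can be turned into elapsed times with a few lines of Python. This is only an illustrative sketch: the helper name `elapsed` and the milestone labels are assumptions made here, not part of the ERCOT report, and the figures are time-to-milestone from the initial failure, not measured outage durations for any particular service (the slides do not state when each service was restored).

```python
from datetime import datetime

# Timestamp layout used on the slides, e.g. "09/20/15 17:42".
FMT = "%m/%d/%y %H:%M"


def elapsed(start: str, end: str) -> str:
    """Return the time between two slide timestamps as 'Hh MMm'."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60}h {minutes % 60:02d}m"


# Initial frame failure in the Taylor data center (from the timeline).
FAILURE = "09/20/15 17:42"

# Milestones copied from the timeline; the labels are paraphrased.
MILESTONES = [
    ("Frame hardware replaced, hosts online", "09/21/15 04:10"),
    ("All impacted databases online", "09/21/15 05:00"),
    ("Repair of impacted frame complete", "09/22/15 01:28"),
]

for label, timestamp in MILESTONES:
    print(f"{label}: {elapsed(FAILURE, timestamp)} after failure")

# Duration of the planned site failover to the Bastrop data center.
print("Site failover window:", elapsed("09/21/15 17:00", "09/21/15 19:30"))
```

Under these assumptions, the hosts were back online about 10h 28m after the failure, the databases after 11h 18m, and the frame repair was completed 31h 46m after the failure; the site failover window itself was 2h 30m.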

Questions