Published by Opal Oliver. Modified over 9 years ago.
1
Retail Transaction Processing Year End Review and Recent Issues RMS January 2007
2
2006 Net Service Availability
3
December 2006 Net Service Availability
4
December 2006 Outage Analysis – Breakdown Not Complete
5
Retail Transaction Processing Service Availability
Workshop to be scheduled for February or March, depending on availability of space and schedules
To be discussed:
–Raising the service availability target to 99.9%
–Effective date of the availability target increase
–Addition of service degradation metrics and reporting
–Market participant input
Changes will be made and presented to RMS for approval
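To give a sense of scale for the proposed 99.9% target, the permitted downtime can be worked out directly. The sketch below assumes availability is measured as uptime divided by total minutes in the reporting period; the slides do not define the metric, and the function names and the 31-day period are illustrative only.

```python
def allowed_downtime_minutes(target_pct: float, period_days: int) -> float:
    """Minutes of outage permitted in a period at a given availability target.

    Assumption: availability = uptime / total minutes in the period.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - target_pct / 100)


def availability_pct(outage_minutes: float, period_days: int) -> float:
    """Availability achieved in a period, given total outage minutes."""
    total_minutes = period_days * 24 * 60
    return 100 * (total_minutes - outage_minutes) / total_minutes


# A 31-day month has 44,640 minutes, so 99.9% leaves roughly 44.6 minutes.
print(round(allowed_downtime_minutes(99.9, 31), 2))
```

Under this assumed definition, a single two-hour outage in a 31-day month would already put the month at about 99.73%, below the proposed target.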
6
December Transaction Processing Issues
867 Transaction Processing Issues (multiple occurrences from 12/15 – 12/27)
–Transactions completed ANSI compliance checks but failed to complete TX Set checks
Market Participant Impact
–Impacted 867 transactions were not forwarded
–Potential delay in completion of service orders
–Potential delay in MP invoicing
–Delayed usage loading to Lodestar could potentially impact initial settlements
–1 to 7 day delay in reprocessing; the majority were reprocessed within 2 days
Root Cause
–PaperFree file server failures; current analysis points to the same root cause as the duplicates issue
Solution
–Architecture change tested. Phase one migration to production was attempted on 1/8 but rolled back due to problems with implementation. Migration to production is now planned for Sunday, January 14th
Market Notices
–12/18 - 4:53 pm - Retail Transaction Processing - 867 Transactions
–12/28 - 4:37 pm - Update: Retail Transaction Processing - 867 Transactions
–12/29 - 2:35 pm - Update: Retail Transaction Processing - 867 Transactions
–01/02 - 4:51 pm - Update: Retail Transaction Processing - 867 Transactions
7
December Transaction Processing Issues
NAESB Outage (12/5)
–Outage attributed to PaperFree file server failures; current analysis points to the same root cause as the duplicates issue and the 867 processing issue
–Market notice sent 12/5
–Attempted fix to be implemented on January 14th
RBP Stabilization Code Releases (12/14) – Controlled outage
–Multiple outages while code fixes were migrated into production following RBP implementation
–Market notice sent 12/14
TIBCO Database Outage (12/14)
–Partitioning error occurred in the PaperFree to TIBCO database
–Market notice sent 12/14
–Fix complete
TIBCO Adapter (12/21)
–Following an emergency TIBCO migration to fix a customer care process issue, TIBCO adapters were not turned back on; attributed to a training issue and the learning curve with TIBCO software
–Market notice sent 12/22
–Fix complete
8
January Transaction Processing Issues
Database Indexing (1/2)
–Q1 2007 partition was added to an incorrect table space with insufficient space available
–Inbound transactions were held while the database partition was pointed to the correct table space and tables were re-indexed
–Fix complete
–Market notices sent 1/2 – 1/4
Siebel Batch (1/3)
–A service order without an ESI_ID caused the batch to hold; the record was manually skipped to allow batch processing to continue
–Root cause unknown; this problem had not been seen before
–SIR written to allow batch processing to continue if this is encountered in the future
–Market notice sent 1/4
PaperFree and NAESB Servers Memory Failure (1/3) – Controlled Outage
–Memory failure occurred and the cluster failed over as designed. During replacement of the failed hardware, the cluster did not recognize the new hardware, and the replacement required a total outage to reconfigure the cluster
–Fix complete - hardware replaced in approximately two hours
–Market notice sent 1/4
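The Siebel Batch item above describes a batch that stopped on one service order with no ESI_ID, and an SIR to let processing continue in that case. A minimal sketch of that general idea follows, assuming a simple in-memory batch; the `ServiceOrder` type, its fields, and `run_batch` are hypothetical and do not reflect the actual Siebel implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ServiceOrder:
    # Hypothetical record shape; real service orders carry many more fields.
    order_id: str
    esi_id: Optional[str]


def run_batch(orders: List[ServiceOrder]) -> Tuple[List[str], List[str]]:
    """Process a batch, setting aside orders that lack an ESI_ID.

    Returns (processed order ids, skipped order ids) so that skipped
    records can be reviewed instead of holding up the whole batch.
    """
    processed: List[str] = []
    skipped: List[str] = []
    for order in orders:
        if not order.esi_id:  # None or empty string
            skipped.append(order.order_id)
            continue
        processed.append(order.order_id)
    return processed, skipped
```

Returning the skipped identifiers, rather than silently dropping them, keeps the bad records visible for follow-up while the rest of the batch completes.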
9
January Transaction Processing Issues
Siebel to Lodestar Batch (1/3)
–‘ESIID service history’ table partition was split to allow Q1 2007 data to populate the database. The split should have worked as performed; however, a bug in Oracle 9.2.04 caused a problem
–Market notice not sent because the problem was fixed before the market was impacted
–Fix complete
NAESB to PaperFree Communication Failures (1/5)
–PaperFree file server having difficulties pulling data from the NAESB server
–Still under investigation; analysis not complete
–Market notice sent 1/5
10
Architecture Change & Attempted Fix
Analysis and troubleshooting of the duplicates, PaperFree file server, and 867 forwarding problems has pointed to a potential cause
The communication protocol used between the PaperFree file servers and the PaperFree processor servers drops connections randomly and intermittently, and the PaperFree application experiences multiple problems when this occurs
Implemented in phases, this architecture change would remove the need for this communication protocol in the retail environment
–Phase One – January 14th
–Phase Two – 7 to 14 days after phase one
Key points:
–Phased approach to ensure the change is effective and to avoid multiple simultaneous changes
–Following the change, redundancy would still be in place for the retail environment
11
Architecture Change & Attempted Fix
[Diagram: current architecture alongside the Phase One and Phase Two changes. Labels: PF File Server; PaperFree Process Servers; Failover Server; Communication Protocol Problems; Phase One; Phase Two; PF File & Process Server; Process Servers; Clustered]
12
Retail Transaction Processing Questions?