Infrastructure Reliability. Common Systems Group, UW Madison. Roger Hanson, 5 Jan 2005.

1 Infrastructure Reliability
Common Systems Group Experience @ UW Madison
Roger Hanson
5 Jan 2005

2 University of Wisconsin-Madison

3 Overview
Basics – Redundant Hardware
Test Environments
Change Management
Version control
Testing processes
Collaboration
Service Management

4 Background
MyUW Portal
WiscMail campus Mail Service
In Production in 2001
New Complex Environments
– Layer 4 Switching
– Directory Enabled Systems
– ES Storage Area Networks

5 Campus Portal
Access to over 130 modules
1.8M Logins in Sept. 04
49K+ Unique Logins in Sept.

6 Hardware - Portal

7 Campus Mail system
Nearly 90K accounts
Daily Message Peak over 3M messages
Service objective
– Never down
– Message delivery in less than 2 minutes
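One way to check the delivery objective above is to compute the share of messages delivered in under 2 minutes. The sketch below is only an illustration: the function name and the (accepted, delivered) record shape are assumptions, not WiscMail's actual log format or tooling.

```python
from datetime import datetime, timedelta

# Target taken from the slide: message delivery in less than 2 minutes.
DELIVERY_TARGET = timedelta(minutes=2)


def fraction_within_target(deliveries, target=DELIVERY_TARGET):
    """Return the share of messages whose delivery latency is below target.

    `deliveries` is a list of (accepted_at, delivered_at) datetime pairs.
    This record shape is hypothetical, chosen only for the example.
    """
    if not deliveries:
        return 1.0  # no traffic, so no missed objective
    on_time = sum(
        1 for accepted, delivered in deliveries
        if delivered - accepted < target
    )
    return on_time / len(deliveries)


# Made-up sample data: two messages under 2 minutes, two at or over it.
t0 = datetime(2005, 1, 5, 9, 0, 0)
sample = [
    (t0, t0 + timedelta(seconds=30)),
    (t0, t0 + timedelta(seconds=119)),
    (t0, t0 + timedelta(seconds=120)),
    (t0, t0 + timedelta(minutes=5)),
]
on_time_share = fraction_within_target(sample)
```

A message delivered in exactly 120 seconds counts as a miss here, because the objective is stated as "less than 2 minutes".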

8 Hardware - Email

9 Basics – Redundant Hardware
Clustered Server Environment
Spares (Hot/Warm/Cold)
Automated Load Balancing
Automated fail over
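The idea behind automated load balancing with fail over can be sketched as a round-robin pool that skips servers marked unhealthy. This is a toy model under stated assumptions: the class and server names are hypothetical, and the deck does not describe how the Layer 4 switches used at UW Madison actually make this decision.

```python
class RoundRobinPool:
    """Toy round-robin balancer that routes around failed servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cursor = 0

    def mark_down(self, server):
        # A health check (not modeled here) would call this on failure.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Try each server at most once, starting from the cursor.
        for _ in range(len(self.servers)):
            candidate = self.servers[self._cursor]
            self._cursor = (self._cursor + 1) % len(self.servers)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


# Hypothetical server names, for illustration only.
pool = RoundRobinPool(["portal-a", "portal-b", "portal-c"])
first = pool.next_server()       # portal-a
second = pool.next_server()      # portal-b
pool.mark_down("portal-c")       # simulate a failure
third = pool.next_server()       # portal-c is skipped, so portal-a
```

Traffic keeps flowing as long as one server stays healthy; a hot spare would simply be another entry in the pool.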

10 Test Environments
Test Cycle
– Test
– Development
– QA
– Production
QA (also called Integrated Test Environment)

11 Change Management
Use of Change Information System
– Tracking
– Notification
Use of Code Migration Request process
– Files promoted
– Configuration steps
– Test process
– Backout plans
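The four parts of a Code Migration Request listed above map naturally onto a small record with a completeness check. The sketch below is hypothetical: the class, field names, and the rule that configuration steps may be empty are assumptions for illustration, not the actual UW Madison form.

```python
from dataclasses import dataclass, field


@dataclass
class CodeMigrationRequest:
    """Hypothetical record mirroring the four items on the slide."""
    files_promoted: list
    test_process: str
    backout_plan: str
    # Assumption: some changes need no configuration steps.
    configuration_steps: list = field(default_factory=list)

    def missing_sections(self):
        """Return the names of required sections that are still empty."""
        missing = []
        if not self.files_promoted:
            missing.append("files promoted")
        if not self.test_process.strip():
            missing.append("test process")
        if not self.backout_plan.strip():
            missing.append("backout plan")
        return missing

    def ready_for_promotion(self):
        return not self.missing_sections()


# Made-up request with no backout plan written yet.
request = CodeMigrationRequest(
    files_promoted=["portal/login.jsp"],
    test_process="Run written test plan in QA",
    backout_plan="",
)
gaps = request.missing_sections()
```

Refusing promotion while `gaps` is non-empty is one simple way to make the backout plan a precondition rather than an afterthought.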

12 Version Control
Use CVS
– http://www.gnu.org/software/cvs/
– Develop in private or shared environments
– Code is published into repository
– Code is then copied to environment (dev, test, qa, and prod)

13 Testing Process
Unit testing
Integrated Testing (QA)
Log analysis from testing
Written test plans
Load Tests
Testing tools (Empirix)
System Monitoring (Wiley Introscope)

14 Collaboration
Wiki
Document Repository/Sharing
Email Lists
IM
E-mail

15 Service Management
Major direction at UW to improve reliability
CIO asking for five 9s (99.999% availability) on key systems
Consulting assistance
Manage the service, not the servers
Adopt customer’s perspective
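To make the five 9s request concrete, an availability target can be converted into a yearly downtime budget. The short calculation below is a sketch; the function name is illustrative, and it assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(nines):
    """Yearly downtime budget for an availability of N nines.

    Three nines means 99.9%, five nines means 99.999%, and so on.
    """
    unavailability = 10 ** (-nines)
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes per year")
```

Five 9s leaves roughly 5.26 minutes of downtime per year, which is why the "Never down" objective depends on automated fail over rather than manual recovery.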

16 Service Management
Models
– Information Technology Infrastructure Library (ITIL)
– Originated with the UK government's Central Computer and Telecommunications Agency (CCTA)
– Service Support processes
Incident management
Problem management
Change management
Release management
Configuration management

17 Service Management
Models
– Microsoft Operations Framework
Combines ITIL processes with recommendations for technical processes
http://www.microsoft.com/mof

18 Next steps
Define service level objectives for key services
Determine how to measure service reliability
Engage Data Center staff
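One possible way to measure service reliability is to derive availability from recorded outage intervals over a reporting period. The sketch below assumes outages are non-overlapping and fall inside the period; the function name and the sample outage are made up for illustration.

```python
from datetime import datetime, timedelta


def measured_availability(period_start, period_end, outages):
    """Fraction of the period during which the service was up.

    `outages` is a list of (start, end) datetime pairs. They are assumed
    to be non-overlapping and to lie within the reporting period.
    """
    total_seconds = (period_end - period_start).total_seconds()
    down_seconds = sum(
        (end - start).total_seconds() for start, end in outages
    )
    return 1 - down_seconds / total_seconds


# Hypothetical 30-day month with a single outage of 43 minutes 12 seconds.
start = datetime(2004, 9, 1)
end = start + timedelta(days=30)
outage_start = datetime(2004, 9, 14, 2, 0, 0)
outages = [(outage_start, outage_start + timedelta(minutes=43, seconds=12))]

availability = measured_availability(start, end, outages)
```

Comparing this measured value against a service level objective shows whether the objective was met for the period. What counts as an outage from the customer's perspective still has to be defined separately.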

19 Observations
Infrastructure complexity
– Teams of specialists
Funding for environments
Staffing
Process costs

20 Questions
Roger Hanson
Internet Infrastructure Applications
rlhanson@wisc.edu

