Published by Lilian Wiggins. Modified over 9 years ago.
1
PPD Computing “Business Continuity”
David Kelsey
3 May 2012
2
The RAL electrical work and risks
SSE will replace two old HV switch-boards in RAL main sub-station
  – Will take ~6 months from mid May 2012
Normally we have two 132 kV supplies and 11 kV transformers
  – One is sufficient to power RAL so we have a live spare
During the work
  – Only one transformer is live
  – If that fails we have no fast failover
  – But no digging allowed near the underground cables from Harwell
Estimated time for SSE to patch to second supply is <48 hours
Increased risk of power outages during this period
  – Increased risk is difficult to quantify
Bottom line
  – Need to plan for short breaks in electrical power and possibly up to ~48 hours
03/05/2012 Kelsey, PPD IT continuity
3
PPD Business Continuity planning
PPD has a Business Continuity Plan
  – Started with the Y2K problem
  – And Disaster Recovery plan
  – This is good practice and useful anyway
      E.g. What do we do if R1 burns down? Or RAL is closed for other reasons?
As part of this plan
  – PPD Computing Group has plans
  – for different time-scales
      1-2 days; ~1 week; several weeks or more
This is a good time to review and revise the plans!
4
If RAL power is off …
Services UP (generators)
  Core network
    – Parts of R26, parts of R89
    – Off-site connections (JANET and DL)
  CLRC Windows Domain
  Exchange mail servers
  VPN? (not yet sure?)
  Also failover of some services to DL (e.g. Exchange servers)
    – We can VPN in to DL to access SSC services (from home)
  Central STFC web server
    – For advice about RAL status
Most Services are DOWN
  Telephones
    – Landlines, Vodafone mast
  Access control & gates
  Fire Alarms
  Catering
  Water pumps
  Many computer services
  Etc etc etc
  NO COFFEE :=(
RAL WILL BE SHUT!
  – Access only for small number of authorised staff
5
What will be down in PPD (R1)?
R1 will have no power
We (Computing Group) will not be here!
  – Unless coming in to retrieve machines and/or backups
Machine rooms will be down (we have no generators)
No PPD Windows or Linux servers (including file servers)
  – No H drive, No T drive, etc.
  – No web servers
PPD Windows domain will be down
No network
No printers
No Scientific Computing Tier 2/3 compute service
No dCache service – no access to scientific data
No video conferencing
Pointsec recovery will be unavailable
6
What is computing group doing?
Identifying those things that can be done now in advance
  – E.g. Check and test configuration of our UPS units (for orderly shutdown)
We will provide best efforts support to keep PPD working from homes or other institutes
  – But without PPD compute servers being up
Make changes in advance to help make laptops useable from elsewhere while PPD is down
  – E.g. Sophos (Windows) already reconfigured to failover to Sophos site for updates
Provide documentation in advance
  – How to re-configure devices
      Windows security updates etc
  – Advice on failover to Exchange at DL
  – Etc
  – To be automatically copied to laptops
7
What should PPD groups do?
We (CG) cannot make IT service plans for individuals or groups
Develop your own Business Continuity Plan
  – Only you know which services are critical
Establish communication means with all members of your group
  – Phone, non-STFC email
Plan for lack of PPD computing services
  – Mission-critical software, data, computer power
      E.g. just before conferences!
Access to high-speed networking, videoconferencing, printing, web services not available
  – Negotiate alternative work locations for staff
This is all part of the wider PPD Business Continuity Plan
8
What do individuals need to do?
Have access to a laptop (or home PC)
Have a copy of all important files (H and T drives)
  – E.g. via Windows Offline Files
  – or rsync copy on Macs
  – And paper files from your office!
Have current documentation and contact details
For regular PPD Tier 3 analysis users
  – Make a plan
      What data do you need? How much CPU? Can you submit elsewhere? (the Grid or CERN or Amazon?)
  – Do not leave everything until the very last minute :=)
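The slide above suggests Windows Offline Files or an rsync copy for keeping important files on a laptop. Where neither is available, the same idea can be sketched in a few lines of Python. This is only an illustration, not a Computing Group tool: the function name `mirror_tree`, the two-second timestamp tolerance and the example paths are all assumptions, and unlike rsync it never deletes anything from the destination.

```python
import shutil
from pathlib import Path

# Assumption: timestamps within 2 seconds count as equal, to allow for
# filesystems that store modification times coarsely.
MTIME_TOLERANCE_S = 2.0


def mirror_tree(src, dst):
    """Copy files from src to dst when they are missing or look changed.

    Returns the relative paths (POSIX style) of the files that were copied.
    Files that exist only in dst are left alone.
    """
    src, dst = Path(src), Path(dst)
    copied = []
    for src_file in sorted(src.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        if dst_file.exists():
            s, d = src_file.stat(), dst_file.stat()
            same_size = s.st_size == d.st_size
            not_older = d.st_mtime + MTIME_TOLERANCE_S >= s.st_mtime
            if same_size and not_older:
                continue  # destination copy looks up to date
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also copies the timestamps
        copied.append(rel.as_posix())
    return copied
```

For example, `mirror_tree("H:/", "C:/Users/me/ppd-copy/H")` (hypothetical paths) would need to be run while the H drive is still reachable, i.e. before any outage, and repeated periodically to pick up changes.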
9
Communication
Cascade: STFC senior management -> Director -> Div Heads -> Group leaders -> all staff
Collect and store important contact details
  – Phone numbers
  – Non-STFC email addresses
  – Contact details for Computing Group
  – And not just kept on the PPD file server!
10
PPD IT Forum
A meeting of the “PPD IT Forum” (i.e. All Staff and Visitors welcome!) planned for
  – Thursday 17th May 2012
  – CR03 R61
  – 11:00 to 12:30
To present more details and discuss issues and concerns
Please come!