Published by Georgia Jacobs. Modified over 8 years ago.
1
September 2006
Database Projects
J. Trumbo, CSS-DSG
Sept. 19, 2006
2
Outline
What's included
Issues
  – SAN technology for databases
  – Infrastructure machines
  – Health of d0ora2
Oracle Contract Status
Oracle 10 upgrade
Advanced Security Option
Backup & Recovery
Accomplishments in a Nutshell
OEM 10 installation
Cad: 2 new database machines, 2 new Windows machines
Minos gets new machines
D0 luminosity
SDSS
Freeware
Unexpected Pleasures
Moving Forward
3
Move to SAN?
The Clarion array will reach end of life in Aug. 07. Over the last couple of years this array has had a history of hardware problems, and those problems have been difficult to diagnose. We must address this quickly.
New SAN technology is available. SAN would be our direction in future, replacing the current local disk arrays.
Benefits include:
  – Clustering
  – Dynamic allocation of storage
  – Central management
4
Benefits of Current CSS SAN
Conventional RAID groups
  – Standard LUNs (<=2TB)
Centralized management
Dynamically allocated storage
Clusterable
5
Unaddressed Issues
Downtimes
  – Maintenance
      – Tuning/optimization
      – Restructuring
  – Unscheduled
Amount of storage use
Cost
6
Next Generation Arrays
3PAR Data and Compellent
  – Reduce amount of storage use
      – Thin provisioning
      – R/W snapshots
  – Reduce maintenance outages
      – Dynamic optimization/tuning
      – Non-disruptive upgrades
  – Reduce cost
      – Non-disruptive tiered storage
7
Reducing Amount of Storage Use: Thin Provisioning
[Chart: storage for Existing Projects and New Projects over Year 1 to Year 4, with capacities between 1TB and 6TB]
8
Reducing Amount of Storage Use: R/W Snapshots
Traditional: Production, Development and Integration are each full copies. 3x storage needed.
New: Production, plus a Development R/W snapshot (delta from developers) and an Integration R/W snapshot (delta from integrators). 1+x storage needed.
  – Refreshing the DB for Dev and Int is much faster
  – Snapshots are much smaller in size than the production DB
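The 3x versus 1+x comparison can be made concrete with a little arithmetic. The sketch below is only an illustration: the function names, the 2 TB production size and the 5% and 10% delta fractions are assumptions, not figures from the slide.

```python
from typing import Sequence


def full_copy_storage_tb(prod_tb: float, environments: int = 2) -> float:
    """Traditional layout: production plus one full-size copy per environment."""
    return prod_tb * (1 + environments)


def snapshot_storage_tb(prod_tb: float, delta_fractions: Sequence[float]) -> float:
    """Snapshot layout: production plus only the changed fraction of each snapshot."""
    return prod_tb * (1 + sum(delta_fractions))


# Hypothetical sizes: 2 TB production; dev changes 5% of blocks, int changes 10%.
prod_tb = 2.0
traditional = full_copy_storage_tb(prod_tb)
with_snapshots = snapshot_storage_tb(prod_tb, [0.05, 0.10])
print(f"traditional: {traditional:.2f} TB, snapshots: {with_snapshots:.2f} TB")
```

The "x" in 1+x is the sum of the delta fractions, so the saving shrinks as developers and integrators change more of the data.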
9
Reduce Amount of Storage Use
Thin provisioning
  – Recovers unused storage that has been allocated
  – Allows projects to purchase storage as needed vs. all up front
R/W snapshots
  – Copy of actual data, but much smaller
  – Can be used for development, integration and/or disaster recovery
  – Guidelines for using snapshot storage systems for Oracle databases: http://www.oracle.com/technology/deploy/availability/pdf/oscp_snapshot_use.pdf
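To illustrate "purchase storage as needed vs. all up front", here is a minimal sketch comparing the two purchase schedules. The yearly usage figures and function names are hypothetical and are not taken from the chart on the thin provisioning slide.

```python
from typing import List, Sequence


def upfront_purchases_tb(projected_use_tb: Sequence[float]) -> List[float]:
    """Buy the peak projected capacity in year 1 and nothing afterwards."""
    peak = max(projected_use_tb)
    return [peak] + [0.0] * (len(projected_use_tb) - 1)


def as_needed_purchases_tb(projected_use_tb: Sequence[float]) -> List[float]:
    """Buy only the extra capacity required in each year."""
    purchases = []
    owned = 0.0
    for need in projected_use_tb:
        buy = max(0.0, need - owned)
        purchases.append(buy)
        owned += buy
    return purchases


# Hypothetical usage projection in TB for years 1 to 4.
use_tb = [1.0, 2.0, 3.0, 4.0]
upfront = upfront_purchases_tb(use_tb)
as_needed = as_needed_purchases_tb(use_tb)
print("up front: ", upfront)
print("as needed:", as_needed)
```

Both schedules end with the same capacity in this example; the difference is that the as-needed schedule defers spending and never pays for storage that a project turns out not to use.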
10
Reducing Maintenance Downtimes
Traditional (4+1 RAID5 group, I/O bound or disk change): add storage, copy data off, recreate RAID group, repopulate new storage. Downtime.
Dynamic optimization (4+1 RAID5 group, I/O bound or disk change): add storage, dynamically optimize across new storage. NO DOWNTIME.
11
Reducing DR Recovery: R/W Snapshots for near CDP
[Timeline: snapshots taken every 15 minutes from 2:00PM to 3:45PM]
  – New S/W installed @ 2:35PM
  – Production DB corruption at ~3:00PM?
  – Detection at 3:15PM
  – Mount the 2:45PM and 3:00PM snapshots for analysis
  – Determine course of action
  – Recover to the 2:30PM snapshot
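The recovery decision in this timeline can be sketched in a few lines: recover to the last snapshot taken before the suspected cause, and mount the snapshots between that point and detection for analysis. The helper names and the calendar date are assumptions for illustration only; the clock times are the ones on the slide.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def snapshot_times(start: datetime, end: datetime, interval_min: int = 15) -> List[datetime]:
    """Snapshot timestamps from start to end inclusive, at a fixed interval."""
    times = []
    t = start
    while t <= end:
        times.append(t)
        t += timedelta(minutes=interval_min)
    return times


def last_snapshot_before(snapshots: List[datetime], event: datetime) -> Optional[datetime]:
    """Latest snapshot taken strictly before the event, or None if there is none."""
    earlier = [s for s in snapshots if s < event]
    return max(earlier) if earlier else None


day = datetime(2006, 9, 19)  # arbitrary date, not from the slide
snaps = snapshot_times(day.replace(hour=14), day.replace(hour=15, minute=45))
new_software = day.replace(hour=14, minute=35)  # suspected cause
detected = day.replace(hour=15, minute=15)

recover_to = last_snapshot_before(snaps, new_software)
to_analyze = [s for s in snaps if recover_to < s < detected]
print("recover to:", recover_to.strftime("%H:%M"))
print("analyze:   ", [s.strftime("%H:%M") for s in to_analyze])
```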
12
Reduce Maintenance Outages
Dynamic optimization/tuning
  – Increase bandwidth, IOPS and size of storage dynamically without disrupting the application
  – Move from one RAID level to another as needed
Non-disruptive upgrades
  – F/W upgrades are non-disruptive to the application if multi-attached
13
Reduce Cost: Non-disruptive Tiered Storage
[Diagram: high-value data on Fibre Channel storage, older data on lower-cost FATA storage]
  – New, high-value data on Fibre Channel disk
  – Move older data to lower-tier storage
  – Hold current FC growth
  – Expand only in lower tier?
  – ~2.5:1 cost advantage
14
Reduce Cost
Non-disruptive tiered storage
  – Tiered storage = place data on the appropriate storage type relative to the data's importance and requirements
  – Move from expensive Fibre Channel disk to lower-cost Fibre ATA drives as data ages. Movement is transparent to applications.
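A rough sketch of the tiering arithmetic follows. It assumes the "~2.5:1 cost advantage" quoted on the previous slide means Fibre Channel costs about 2.5 times as much per TB as FATA; that reading, the relative cost units, the data sizes and the two-year cutoff are all assumptions for illustration.

```python
from typing import Dict, Tuple

FC_COST_PER_TB = 2.5    # relative units, assumed reading of the ~2.5:1 ratio
FATA_COST_PER_TB = 1.0  # relative units


def split_by_age(data_tb_by_age: Dict[int, float], cutoff_years: int) -> Tuple[float, float]:
    """Return (fc_tb, fata_tb): data younger than the cutoff stays on Fibre Channel."""
    fc_tb = sum(tb for age, tb in data_tb_by_age.items() if age < cutoff_years)
    fata_tb = sum(tb for age, tb in data_tb_by_age.items() if age >= cutoff_years)
    return fc_tb, fata_tb


def storage_cost(fc_tb: float, fata_tb: float) -> float:
    """Relative cost of holding the given capacities on each tier."""
    return fc_tb * FC_COST_PER_TB + fata_tb * FATA_COST_PER_TB


# Hypothetical data volumes in TB, keyed by age in years.
data = {0: 2.0, 1: 2.0, 2: 3.0, 3: 3.0}
all_fc_cost = storage_cost(sum(data.values()), 0.0)
fc_tb, fata_tb = split_by_age(data, cutoff_years=2)
tiered_cost = storage_cost(fc_tb, fata_tb)
print(f"all FC: {all_fc_cost:.1f}, tiered: {tiered_cost:.1f}")
```

The saving depends entirely on how much data is old enough to move, which is why the cutoff age is the main knob in a tiering policy.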
15
SAN Strategic Direction
[Diagram: four layers labeled Virtualization, Storage Pool, Connection and Access]
RAID storage with dynamic resize, dynamic tuning, thin provisioning, automated tiering and snapshots
Connection: FCP or TCP/IP
Access:
  – NAS: file protocol (NFS/CIFS/FTP), ~200MB/s, campus wide
  – SAN: block protocol (SCSI-3), >200MB/s, inside FCC
16
Questions
Cost?
  – Similar to a comparable EMC storage array
Will it scale to a very large database?
  – 3PAR reference: "Yes"
  – Compellent reference: questionable, no comparable customer
Do R/W snapshots work with Oracle?
  – Reference customers say "Yes"
Longevity of company
  – Relatively new
  – Privately held
  – Street rumors of "IPO soon"
17
Infrastructure Machines
CST applications are being defined as 'major' applications and should be moved to a more secure dev/int/prd environment.
Currently, fncdug1 has 3 production and 3 integration instances. Not terabytes of data, but lots of users, lots of applications and 6 instances on 1 machine.
Separate the database from the applications on g1. Apps include several apache servers, miser, matrix, users and growing.
We would like to move the databases to a dedicated database server machine, to release the database from the 3rd-party dependencies as well as maximize the database resources.
18
Plans for Infrastructure Applications
Separate instance for high-availability applications (helpdesk).
Remove Remedy's dependency on the MISCOMP and ESHTRK databases.
Separate boxes for prod/int/devl databases and web servers.
Increase capacity and improve response time to accommodate new applications (leave requests, effort reporting, budget).
19
Health of D0ora2
The CPU load on d0ora2 has been increasing. It is often at 100%. There is an increasing load on the box: lots of grid jobs, lots of dbserver jobs, a growing database.
DSG's dream solution: move the database to a standalone database machine.
Otherwise:
  – Must move to the Oracle 10 client from the Oracle 8 client on the apps side. (This should be done in any case.)
  – Add more memory. We can double the memory from 16g to 32g (fcdfora4 is a 32g box). This will allow us to increase Oracle's SGA.
  – Removed int & prod cron jobs that are not utilized.
  – Remove the dbservers from d0ora2.
  – Fix the queries that come out of the dimensions code so they do not traverse the same table twice.
20
Oracle Contract Status
On June 1, 2005 the lab initiated 2 new contracts with Oracle.
The 1st contract covers all Fermi employees. This was needed to accommodate training and (upcoming) payroll functions that are done individually. This contract includes RAC (Real Application Clusters), as well as unlimited use of databases and most Oracle tools.
21
Oracle Contract Status
The 2nd contract covers all non-employee users who access Oracle dbs, mostly our experiment users. This agreement is an annual lease of licenses, trued up by a count of kerberos users.
This contract runs for 5 years. June 1, 2006 is the start of year 2.
This contract acknowledges the ever-changing user community. It is not on a per-name basis; it is simply a count. The 2006 contract purchased an additional 176 users over 2005.
This contract does not include RAC.
22
Oracle 10 Upgrade
The upgrade to Oracle 10 is complete except for the infrastructure databases (miscomp). That upgrade is delayed by its dependency on 3rd-party application upgrades, particularly Matrix. The operating systems on fncduh1/g1 must also be upgraded.
23
Oracle Advanced Security Option
Advanced Security is an Oracle product. The purpose of Advanced Security is to kerberize database logons.
We have done a proof of product, and several limitations prevent us from implementing it.
June Futamasa is Oracle's DOE rep. She has promised help in getting ASO fixed.
We have also contacted Mary Ann Davidson, Chief Security Officer, Oracle Corp. She too has made promises, but there are no deliverables thus far.
24
Oracle Advanced Security
Organized a working group including:
  – Argonne
  – Fermi
  – Sandia
  – Los Alamos
  – CERN
  – U.S. Navy
  – Lawrence Livermore
Lack of response from Oracle has hindered this. Fermi will be pushing harder now that the Oracle 10 upgrade is finishing.
25
Accomplishments in a Nutshell
Standardized large backups to SAN and dcache/enstore
Oracle Enterprise Manager (OEM 10) on new machines
Cad with Oracle database on new machines
Minos dev/int/prod on new machines
D0 Luminosity dev/int/prod on new machines
Upgrade to Oracle 10 almost complete, including:
  – Installation and testing of OEM 10
  – Installation of Oracle Application Server machines
  – Streams testing and planning for Oracle 10
  – Linux testing and planning for Oracle 10
26
Accomplishments in a Nutshell
Continued maintenance, patching, refreshing, accounts, etc. of operating systems and databases with > 99% uptime. In fact, I believe cdfonprd had 100% uptime last year.
Continued maintenance of the Sam schema for 3 experiments.
Continued consulting with application owners on schema design and implementation.
Incorporation and review of security issues raised by the site assist team.
Keeping pace with the SDSS load schedule and releases, and starting new initiatives including runs_db and monitoring.
27
Backup & Recovery
Backup and recovery for very large databases is always a huge issue.
No space to test full recoveries (though this is changing as Ray gets additional space).
Moving to SAN storage for rman backups. There is a growing need for additional SAN space to accommodate growing backup files. The SAN solution is more economical.
Currently doing rman backups to SAN for:
  – D0ora2, d0 offline production (there is no longer space on d0ora2 for a full backup)
  – D0lum2, d0 luminosity production
  – Bzora1, cdf online production (keeping 1 on local disk, 1 on SAN)
28
Backup & Recovery
DSG has been moving to dcache/enstore for tape backups of rman files for our larger databases. Tibs is handling the small databases: infrastructure, cad.
Our homegrown product rman_dcache sends and retrieves rman files from dcache. Rman_dcache still needs development work. We have been too short-handed to truly finish the product; work is done as resources allow.
Since the revamp of stken, and minos moving to its own area, the backups to dcache have been increasingly dependable. Thanks to the isa-group, they have been great.
The isa-group is suggesting that databases get a dedicated dcache pool in FY07 to minimize problems.
29
Backup/Recovery
Database backups going to dcache/enstore:
  – D0ofprd1 (d0 offline)
  – D0ofprd1_readonly (d0 offline events)
  – D0oflump (d0 luminosity)
  – Minosprd (minos sam)
  – Cdfonprd (cdf online)
  – Cdfofpr2 (cdf offline)
  – D0ofprd1, READ ONLY EVENT TABLESPACES
We hope dcache/enstore will eventually allow us to retire the aging tape robot currently attached to d0ora2.
30
Replacement of OEM boxes
Oracle Enterprise Manager 10 has been installed on new machines running RH Enterprise Linux. OEM 10 was a prerequisite for Oracle 10. However, we have run into several issues with OEM 10:
  – No charts.
  – Some functionality lost relative to OEM9.
  – Must move the oms daemon to a 32-bit machine until 64-bit is an Oracle-supported configuration. We have been waiting for a year. Rumor has it that support is due Dec. 06.
Working with Oracle.
31
Cad
Cad got 2 new database machines and 2 new middle-tier Windows machines this year.
SKovich and NStanfield set up the new Sun boxes for the databases; ARomero is supporting the Windows boxes.
DSG & CSI host biweekly meetings with TParker & LCarpenter to plan & push the project forward. Shooting for implementation by Dec. 2006.
Issues include:
  – Software delivery delays
  – Dependency on stakeholders' testing
  – Dependency on stakeholders' data scrubbing
Stakeholders are PPD, TD, CD, ADMS, ADCRYO.
32
Minos
Minos has the first installation of sam on a 64-bit Linux Sun box with Oracle 10. This installation has been smooth and without incident. Dev/int/prod are now all running on their new hardware.
33
D0 Luminosity
The D0 luminosity application has been installed on 2 new Linux Sun boxes (dev/prod). Luminosity went into production in Dec. 05.
The D0 luminosity team is helpful and communicates freely when it has special needs such as huge data loads.
34
SDSS
S.Lebedeva has become the lead dba for SDSS. She has been able to cross train A.Kumar to assist.
This year SDSS has:
  – Loaded DR4 & 5 and backed them up to enstore.
  – Initiated loading of the new RunsDb.
  – Written extensive documentation where there was none.
  – Backed up DR1-5 to enstore.
  – Worked on the web interface.
  – Begun using ngop and remedy.
  – Completed the web interface and configurations.
  – Supported image preparation on the linux side.
New hire J.Platson is cross training full time on SDSS.
35
Freeware
DSG is continually attempting to find time and resources to:
  – Update web documentation.
  – Cross train dbas as time allows.
  – Establish test freeware databases.
Since the demand for freeware support is not as high as for Oracle (yet), and DSG currently supports no production freeware database, DSG staff attempt to train as time allows.
DSG still depends primarily on S.Lebedeva as our freeware expert.
36
Unexpected Pleasures
DSG was down 3 people for most of FY06. It has been difficult to hire; the market is tough and we support a diverse set of databases. Even so, loads of work have been accomplished, thanks to the dbas and sysadmins.
We have not been able to accomplish several other things:
Cross training
  – DSG is too dependent on individuals with expertise in specific areas. There has been no time to cross train.
Kerberization of Oracle
  – RJetton was handling this. AKumar will now have to come up to speed and really PUSH Oracle Corporation. We are waiting on Oracle.
37
Unexpected Pleasures
RAC testing
  – Began investigation. DSG does not have hardware to test on. Will be budgeting for an inexpensive test suite.
New responsibilities
  – Have gotten a bit of experience with ESH databases on Windows. Still working on cross training and mastering the new platform.
Training and development
  – Group members have indicated there is no time to learn new technologies, take classes, etc. They are correct. Hopefully this will change with the new hire.
38
Moving Forward
New employee and cross training.
Replacement of g1/h1.
Adding additional structure to the business end of CD. It's not just miscomp any more.
CST major apps machines.
A 24x7 app machine.
Replacement of aging disk arrays, particularly the Clarion array on d0ora2.
Expecting to use SAN technology to store database files as well as rman files.
D0 online responsibilities.
Upgrade infrastructure databases to Oracle 10.
Nova? Neutron Therapy?
39
Columns: Proposed SYSADM, Proposed DBA, YTD (rows list the values as they appear on the slide)
CAD: .5, .17
CMS: .25, 0
SDSS: 1.75, .67
CDF: 1.0, 1.5, .13
D0: .75, .24
MISCOMP: .5, 0
ESH: .25, 0
MINOS: 0
DB Admin: 1.12
Exp Support: 1.67
Total: 3.0, 5.0, 4.0