Databases Project Update, J.Trumbo, LSC/DBI/DBA, February 27, 2007


1 Databases Project Update
J.Trumbo, LSC/DBI/DBA
February 27, 2007

2 Outline
What’s included
– SAN technology for databases
– Infrastructure machines
– Health of D0ora2
– Oracle 10 upgrade
– Advanced Security Option
– D0 Online transition
– Backup & Recovery
– Cad
– Minos
– D0 luminosity
– SDSS
– Freeware
– Nova
– ESH
– Training
– Accomplishments in a Nutshell
– Moving Forward

3 SAN Technology
D0ora2 Disks
D0 experienced data corruption on the Clarion array over the holidays.
Abandoning the mount point that was a common thread in the corruptions seems to have addressed the root cause.
This issue amplified the urgency of purchasing new storage for d0 offline.
A new SAN has been requisitioned to replace the Clarion array on d0ora2.
The initial hardware purchase is minimally sized: enough to move off d0ofprd1, excluding event data.
Longer term, with additional purchases, more database instances can be added.
Only database files will be on this SAN; no backups or other app files.
We will start planning soon!

4 Hardware Purchased
1x S400 Storage Array
– 2 nodes (controllers)
– 2 disk chassis
– 32x 146GB FC disks
– 16x 500GB FATA disks
– dynamic optimization
– virtual copy
– thin provisioning
– 1 year 24x7 maintenance and installation
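
As a rough check on what this purchase provides, the raw capacity implied by the disk counts above can be worked out directly. This is only a sketch: the function and variable names are illustrative, and usable capacity after RAID, sparing and formatting (none of which is stated on the slide) will be lower.

```python
# Raw (unformatted) capacity implied by the disk counts on the slide.
# RAID level, spares and formatting overhead are NOT given in the source,
# so they are deliberately left out of this calculation.

FC_DISKS, FC_SIZE_GB = 32, 146      # Fibre Channel disks
FATA_DISKS, FATA_SIZE_GB = 16, 500  # FATA disks


def raw_capacity_gb(disk_count: int, disk_size_gb: int) -> int:
    """Return the raw capacity of one disk tier in GB."""
    return disk_count * disk_size_gb


fc_raw = raw_capacity_gb(FC_DISKS, FC_SIZE_GB)
fata_raw = raw_capacity_gb(FATA_DISKS, FATA_SIZE_GB)
total_raw = fc_raw + fata_raw

print(f"FC tier:   {fc_raw} GB raw")    # 4672 GB
print(f"FATA tier: {fata_raw} GB raw")  # 8000 GB
print(f"Total:     {total_raw} GB raw") # 12672 GB
```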

5 SAN Features
Next generation array
– Reduce amount of storage used
    Thin provisioning
    R/W snapshots
– Reduce maintenance outages
    Dynamic optimization/tuning
    Non-disruptive upgrades
– Reduce cost
    Non-disruptive tiered storage

6 Infrastructure Machines
The new infrastructure machines requested were not purchased last year; we will try again this year.
CST applications are being defined as ‘major’ applications and should be moved to a more isolated dev/int/prd hardware environment.
A separate instance is needed for high-availability applications (Helpdesk/Remedy), removing Remedy’s dependency on the MISCOMP and ESHTRK databases so Remedy is unaffected by miscomp downtimes. Separate hardware for this would be ideal.

7 Plans for Infrastructure Applications
Purchase new dev/prod database server boxes for infrastructure databases.
– Fncduh1/g1 are 5-6 years old.
– Currently, fncdug1 has 3 production and 3 integration instances. Not terabytes of data, but lots of users, lots of applications and 6 instances on 1 machine. 2 int dbs are shut down to preserve resources for production.
– G1 apps include several apache servers, miser, matrix, users, and growing; resources are tight.
Allow g1 to continue to serve applications, but…
– Move the databases off g1 & h1 to a new box. Moving the databases to an exclusive database server machine releases the database from 3rd party dependencies and maximizes the database resources. This new production box will use the SAN for disk.

8 Health of D0ora2, improved!
At the last report the CPU load on d0ora2 was often at 100%. We still hit 100%, but are no longer consistently there.
DBI/DBA’s dream is to make d0ora2 a database server machine, period, with no other applications running on it. Till then...
Actions included:
– S.White has implemented a new version of dbserver that uses the Oracle 10 client instead of the Oracle 8 client on the apps side. Use of Oracle 8 client apps is minimized and being deprecated.
– Doubled the memory from 16g to 32g.
– Removed int & prod cron jobs that are not utilized.
– Removed most of the dbservers from d0ora2.
– R.Herber’s thorough investigation into the queries on datafiles found full table scans being used because so many filenames are identical in their first 14 characters, which rendered the index histograms from data analysis faulty. A special analysis on datafiles that removes the histogram has been invoked.
– Started tracking long transactions and addressing them with users.
– Discontinued event recording to the database on Feb 6, 2007.
What else was on the list?
– Fix the queries that come out of the dimensions code so they do not traverse the same table 2x.
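
The histogram problem described above can be illustrated with a toy model. This is a sketch of the general mechanism only, not of Oracle's actual histogram implementation: it assumes a histogram that distinguishes values by their first 14 characters (the figure from the slide), and the file names are made up for the example.

```python
from collections import Counter

PREFIX_LEN = 14  # figure quoted on the slide; Oracle's real internals are not modeled


def prefix_histogram(values, prefix_len=PREFIX_LEN):
    """Count rows per bucket, where a bucket is identified only by a value's prefix."""
    return Counter(v[:prefix_len] for v in values)


def estimated_rows(histogram, value, prefix_len=PREFIX_LEN):
    """Row estimate an optimizer would derive for `column = value` from the histogram."""
    return histogram[value[:prefix_len]]


# Hypothetical file names: identical first 14 characters, unique afterwards.
filenames = [f"reco_all_00001{i:06d}.raw" for i in range(1000)]

hist = prefix_histogram(filenames)
probe = filenames[42]

buckets = len(hist)                      # every name falls into the same bucket
estimate = estimated_rows(hist, probe)   # histogram claims all rows match
actual = filenames.count(probe)          # in reality exactly one row matches

print(f"buckets={buckets} estimated={estimate} actual={actual}")
```

With every row landing in one bucket, an equality lookup looks as if it returns the whole table, so a full table scan appears cheaper than the index. Dropping the histogram, as the slide describes, lets the estimate fall back to the number of distinct values instead.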

9 Oracle v10 Upgrade
Completed the upgrade to Oracle 10 on all databases with the exception of the infrastructure (miscomp) instances.
Upgrade included:
– Completely new OEM (monitoring) tool.
– New streams functionality.
– New tuning parameters.
– New security methods.
– New optimizer.
– Rman configuration modifications.
Infrastructure databases cannot be upgraded to v10 till Matrix is retired. They have been upgraded to the terminal release of v9.

10 Oracle Advanced Security Option
Advanced Security (ASO) is the Oracle product to kerberize Oracle database access.
ASO does not adhere to MIT kerberos standards and thus has been unusable at Fermilab.
Oracle’s June Futamasa is the DOE rep. She has promised help in getting ASO fixed.
Oracle has gotten our issues to ‘bug’ status.
We have prepared the test environment and have deployed three ASO bug patches.
We have been assigned a developer at Oracle.
We are continuing work with J.Futamasa and the Oracle development team, holding regular meetings.

11 D0 Online Transition
The transition of the D0 online databases and machines from D0 to DBI/DBA is under way. Steps include:
– Adding additional space for both dev and prod.
– Moving the production database that has been running on the dev machine, specifically histdb, to the production machine.
– Bringing machines to a current patch level (databases have been maintained to current patch levels). This is actually an OS upgrade, needed for improved cluster functionality and because of the aging version. However, RedHat provides no upgrade patch for clustering. RH’s suggested upgrade path is ‘buy 3 new machines and reinstall’. We do not have hardware for that suggested upgrade path.
– Getting rman backups to enstore established and tested.
– Understanding and documenting failover technology.

12 Backup & Recovery
DBI/DBA has standardized on dcache/enstore for tape backups of rman files for our larger databases. Tibs is handling the small databases (<50G), infrastructure, cad.
Our homegrown product rman_dcache sends and retrieves rman files from dcache.
Rman_dcache still needs a bit of development work. We have been too short-handed to truly finish the product and make it database platform independent. Work is done as resources allow.
The isa-group suggested last fall that the databases get a dedicated dcache pool in FY07 to minimize problems. This solution was introduced a few weeks ago, backed out, and will be rescheduled at a later date.
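
The split between the two backup paths can be written down as a simple rule. The sketch below is one reading of the slide, not the group's actual tooling: the 50G threshold and the infrastructure/cad exception come from the text above, while the function name, category labels and example sizes are hypothetical.

```python
SMALL_DB_LIMIT_GB = 50                      # "<50G" threshold from the slide
TIBS_CATEGORIES = {"infrastructure", "cad"}  # handled by Tibs per the slide


def backup_target(size_gb: float, category: str = "") -> str:
    """Pick a backup path for a database (illustrative interpretation only)."""
    if size_gb < SMALL_DB_LIMIT_GB or category.lower() in TIBS_CATEGORIES:
        return "tibs"
    return "dcache/enstore"


# Hypothetical databases: names and sizes are invented for the example.
examples = [
    ("large_experiment_db", 900, "experiment"),
    ("small_app_db", 12, "experiment"),
    ("cad_db", 120, "cad"),
]

for name, size_gb, category in examples:
    print(f"{name}: {backup_target(size_gb, category)}")
```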

13 Backup/Recover
Database backups going to dcache/enstore:
– D0ofprd1 (d0 offline)
– Halted D0ofprd1_readonly (d0 offline events deprecated)
– D0oflump (d0 luminosity)
– Minosprd (minos sam)
– Cdfonprd (cdf online)
– Cdfofpr2 (cdf offline)
– D0onl (d0 online) … not quite there yet
– D0onl (d0 online readonly) … not quite there yet

14 Cad
Supporting Cad’s 2 database machines and 2 new middle tier Windows machines this year. S.Kovich and N.Stanfield set up the new Sun boxes for the databases; A.Romero is supporting the Windows boxes.
Larry Carpenter, the cad manager, has left the lab. T.Parker is interim manager.
Continuing to host biweekly meetings with cad.
TD’s goal of implementing Team Center by Dec. 2006 was missed. Data scrubbing is still not complete; however, a test implementation of Ideas to Team Center was about to be launched when L.Carpenter left. T.Parker will need to pick up the pieces and move forward.
Stakeholders are PPD, TD, CD, ADMS, ADCRYO.

15 Minos
Minos sam is running just fine.
L.Buckley-Geer has asked DBI/DBA to take on Minos’ MySql database. This is under discussion.

16 D0 Luminosity
The D0 luminosity application is running with no real issues for DBI/DBA.
D0 lum application owners have been testing constant code changes.
Space is tight on dev for testing.
Additional disks have been purchased for the production machine. These additional disks were planned at the time of the machine purchase. They have arrived in receiving; I believe they are in prep. We will add these disks to the machine as soon as the array becomes available.

17 SDSS
S.Lebedeva is lead DBI/DBA; J.Platson is a DBI/DBA in training for SDSS.
In the last 6 months SDSS dbas have:
– Loaded DR6, backed up raw data to enstore.
– Used SqlServer 2005 for DR6, a major upgrade.
– Continued documentation.
– Worked on the web interface.
– Reviewed monitoring tools.
– Purchased & installed the Idera sql server monitoring tool.
– Incorporated the Blue Arc into SDSS processes, adding flexibility and cheaper disk availability.

18 Freeware
DBI/DBA is continually attempting to find time and resources to:
– Update/maintain web documentation.
– Cross train DBI/DBAs as time allows.
– Establish test and production freeware database environments.
With the reorg, DBI/DBA took over support for the MySql database server, which services ~12 users. We would like to host a comparable service for Postgres, but need hardware, training and people. We are working toward this.
DBI/DBA plans to establish a deeper background and support level for freeware soon. Assuming no unforeseen issues, we should be able to start putting some effort into our freeware environment.
The 5 year leased Oracle licenses being used to service non-Fermi employees expire June 1, 2010.
If serious consideration is being given to deprecating the Oracle lease license by the 2010 expiration, the project to move SAM (and possibly other apps) to freeware needs to be resurrected and given resources. The project to prove Sam under Postgres was on the Taking Stock task list for several years, and was officially removed at the last Taking Stock meeting. A Postgres Sam schema exists, but has never been proven.
Otherwise, modify Sam and possibly other experiment apps to allow only Fermi employees to access the database.

19 Nova
DBI/DBA has begun attending regularly scheduled meetings to discuss Nova.
Work thus far includes requirements documents for the online database and design discussion at both the application and database levels.
This project is in its infancy but is progressing. It is expected to demand additional resources.
DBI/DBA intends to provide support and direction as needed as the project progresses.
No doubt, there will be more on this next report!

20 ESH
Standardized ESH environment on Windows.
Completed general system documentation.
Established rman backups for recovery consistency. Exports are continued, used to refresh dev with prod as needed.
Tested recoveries.
Set up recovery scripts.
Participated in audit of ESH database.
Documented recovery testing for audit.

21 Training
6 months ago I reported ‘DSG is too dependent on individuals with specific expertise in areas. There has been no time to cross train.’
Though SDSS is in much better shape, there has been no improvement in other areas. However, I have hope this situation will soon be easing.
DBI/DBA needs to have resources to:
– Cross train existing responsibilities
– Train, practice & master new technologies
– Attend classes

22 Accomplishments in a Nutshell
Upgraded databases to Oracle v10.
Upgraded OEM grid control to v10.2.
Established a Cad working group, meeting every other Monday, making small steps to transition to Team Center.
Requisitioned SAN storage for D0ora2.
D0 online transition under way.
Established standard environment for ESH.
Set up a production USCMS oracle instance to accommodate tier 1 data transfer, job request data.

23 Accomplishments in a Nutshell
Moved the CMS Pixel databases to Cern.
Continued maintenance, patching, refreshing, accounts, etc. of operating systems and databases with > 99% uptime. I believe cdfonprd has continued at a 100% level for over 1 year.
Continued maintenance of Sam schema for 3 experiments.
Deployed modifications to the Sam Request schema.
Continued consulting with application owners on schema design and implementation.

24 Accomplishments in a Nutshell
SDSS
Smooth transition of SDSS to SqlServer 2005 from 2000.
Implementation of DR6.
New monitoring software for SDSS.
Jim Gray offered to include S.Lebedeva as co-author of the Microsoft Tech Report "SkyServer Traffic Report - The First Five Years": http://research.microsoft.com/research/pubs/view.aspx?type=Technical%20Report&id=1236
With help from R.Pasetes & A.Romero, resolved issues with using Blue Arc with SDSS Windows, saving time and space and preventing fragmentation on SDSS boxes.

25 Moving Forward
Replacement of the Clarion array on d0ora2, moving the database to the new SAN.
New hardware: miscomp database server machines for dev/int & prod. G1/h1 retained to serve applications.
New hardware for CST major apps machines.
A 24x7 database app machine (Remedy and others).
New hardware for ESH oracle instances to move them off Windows.
Kerberizing Oracle.
Continue migration of D0 online responsibilities.
Nova.

26 Moving Forward
Continue attempting to find resources to cross train.
Minos MySql responsibilities and transition.
Freeware
More in-depth training in freeware.
– Dba training.
– Establish a dev/prod Postgres environment for Fnal small database users, modeled after the existing MySql environment. Requires hardware.
– Establish and publish standards, procedures and best practices documents for freeware databases.
– Strengthen security baselines.

