Database Competence Centre
openlab Major Review Meeting, February 2012
Maaike Limper, Zbigniew Baranowski, Luigi Gallerani, Mariusz Piorkowski, Anton Topurov, Nicolas Bernard Marescaux
Outline
Progress on:
- Integrated Virtualization
- Enterprise Manager 12c
- Replication Technologies
- 11g deployment for WLCG
Outlook for openlab IV new projects:
- New openlab fellow: Maaike Limper
- Accelerator logging data mining
CERN openlab major review, February 2012
Oracle Enterprise Manager: Infrastructure Provisioning and Self Service Portal tests
EM 12c Infrastructure Provisioning
- Clean install of EM 12c (released version)
- OVM 3.0 installed on 3 hosts
- Re-testing of problematic features: most of the reported errors fixed
- Much faster setup than in Beta
EM 12c Structure
EM 12c Self Service Portal
- Not tested during Beta
- Allows end users to provision their own VMs
- Self Service Admin controls available actions: VM images available for deployment; disk, memory and CPU allowances; scheduling windows, VM expirations, etc.
- Works well from both the admin and the end-user side
EM 12c Self Service Portal
Replication technologies update
Continuation of replication testing
Tested technologies: Streams, GoldenGate, Active Data Guard
Goal:
- Validation of the above technologies against all of CERN's streamed production workloads
- Direct performance comparison on identical hardware and software configuration, the same as will be used in production
Final results
[Performance comparison charts: one "lower is better", one "higher is better"]
Workload descriptions:
- LFC for LHCb: 13 days of production data; redo volume 15 GB; 7,235,172 changes; ~2M transactions
- ATLAS COOL: 3 days of production data; redo volume 22.4 GB; 15,313,748 changes; 63,481 transactions
- ATLAS PVSS: 3 days of production data; redo volume 11.1 GB; 8,162,001 changes; 567,227 transactions
Summary
- All Oracle replication solutions have been validated with CERN's production datasets
- ADG was the fastest in each case
- GoldenGate software is developing very fast: all data types used by CERN are now supported! We could not benefit from the parallelism optimization, as it is not compatible with the production sets
- Streams issues discovered during previous testing have been reported to Oracle Support; the fixes are included in the latest Oracle version and we are ready for deployment
Next steps – Active Data Guard
- ADG appears to be not only the fastest replication technology but also the easiest to set up and maintain; this makes it the best candidate for online-to-offline replication
- The first production setup will be deployed for CMS and ALICE in February
- Opens possibilities for offline analysis (data mining) of large data sets such as accelerator logs
11g Deployment for WLCG
11g deployment overview
- Narrow upgrade window during the LHC shutdown (January – March): all experiment databases will be upgraded to 11g by mid-February at T0 and by the beginning of March at the T1s
- Additionally tested with RAT (Real Application Testing)
- Replication-related challenges due to the tight schedule: upgrades cannot be performed in the order recommended by Oracle, and upgrades of T1 targets require coordination and actions at T0
- A report covering the problems experienced during the Streams replication upgrades will be provided to Oracle
11g deployment in worldwide databases distribution
ATLAS – the core is already running 11g
[Diagram: T0 and T1 databases connected via Streams replication, annotated with 11g upgrade status]
11g deployment in worldwide databases distribution
LHCb online database – 6th of February
[Diagram: LHCb replication topology, annotated with 11g upgrade status]
openlab IV projects
openlab IV
New Oracle openlab fellow: Maaike Limper, started January 2012
Project outline: investigate the possibility of doing LHC-scale data reconstruction and physics analysis within an Oracle database
Focus on physics analysis:
- Analysis is mainly done from ROOT n-tuples
- The tree structure of ROOT n-tuples could be (easily?) converted into a database structure
- An external procedure agent on the database could handle C++ physics analysis code
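As a minimal sketch of the n-tuple-to-database mapping mentioned above (table and column names are invented for illustration, and SQLite stands in for Oracle): one row per event, with each variable-length object vector flattened into a child table keyed back to its event.

```python
import sqlite3

# Hypothetical sketch: map a ROOT-style n-tuple (one row per event,
# variable-length particle vectors) onto two relational tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One row per event.
cur.execute("CREATE TABLE event (event_id INTEGER PRIMARY KEY, run INTEGER)")
# A variable-length vector (e.g. muon kinematics) becomes a child table,
# one row per object, keyed back to its event.
cur.execute("""CREATE TABLE muon (
    event_id INTEGER REFERENCES event(event_id),
    idx INTEGER, pt REAL, eta REAL)""")

# Load two toy events with 2 and 1 muons respectively.
cur.executemany("INSERT INTO event VALUES (?, ?)", [(1, 100), (2, 100)])
cur.executemany("INSERT INTO muon VALUES (?, ?, ?, ?)",
                [(1, 0, 45.2, 0.3), (1, 1, 22.1, -1.2), (2, 0, 60.5, 2.0)])

# Analysis-style query: events with at least one muon above 40 GeV.
rows = cur.execute(
    "SELECT DISTINCT event_id FROM muon WHERE pt > 40").fetchall()
print(rows)  # → [(1,), (2,)]
```

The separate-table layout keeps per-object cuts expressible in plain SQL; the BLOB alternative discussed on the next slide trades that queryability for storage compactness.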
Physics Analysis and Databases
Challenges:
- Optimize database structure: physics objects can be defined as vectors-of-vectors with varying length per event; these could be stored as BLOBs or in separate tables
- Flexibility for different types of analysis: subsets of data objects could be defined through "views" for the different physics groups (SUSY, Higgs, etc.)
- Analysis time: a large number of users will access the same data at the same time; performance must be optimized to be competitive with the current analysis scheme
- Database size: could take significantly more space than ROOT files; data for results with many different versions of the reconstruction software must be stored
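The per-group "views" idea can be sketched as follows (again with SQLite standing in for Oracle; the table name, columns and cut values are invented): each physics group gets a view that applies its own selection over the shared object table, without copying any data.

```python
import sqlite3

# Hypothetical sketch: expose a per-physics-group subset of a shared
# object table through a view.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE jet (event_id INTEGER, pt REAL, eta REAL, btag REAL)")
cur.executemany("INSERT INTO jet VALUES (?, ?, ?, ?)",
                [(1, 120.0, 0.5, 0.9), (1, 35.0, 2.8, 0.1), (2, 80.0, 1.1, 0.95)])

# A 'SUSY' selection: central, high-pT jets only (cut values are invented).
cur.execute("""CREATE VIEW susy_jets AS
               SELECT event_id, pt, eta FROM jet
               WHERE pt > 50 AND ABS(eta) < 2.5""")

# Group members query the view exactly like a table.
n = cur.execute("SELECT COUNT(*) FROM susy_jets").fetchone()[0]
print(n)  # → 2
```

Because a view stores only the defining query, each group's selection can evolve independently of the underlying data layout.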
Accelerator logging data mining
- Problem solving and early detection of accelerator issues: identify non-optimal responses, e.g. a discharge trajectory different from the expected envelope, the response of valves, etc.
- Make use of technology: Active Data Guard to analyze data at the same time as it is captured/loaded; RAC and parallel query for processing; large cache and a high number of cores with fast interconnect
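The envelope check described above can be illustrated with a small sketch (the function name, signal values and tolerance are invented): compare a logged trajectory sample-by-sample against the expected model and flag deviations.

```python
# Hypothetical sketch of the envelope check: flag samples of a logged
# signal that fall outside an expected band around the model trajectory.
def outside_envelope(samples, expected, tolerance):
    """Return indices where |sample - expected| exceeds the tolerance."""
    return [i for i, (s, e) in enumerate(zip(samples, expected))
            if abs(s - e) > tolerance]

# A discharge trajectory that deviates from the model at sample 3.
logged = [10.0, 8.1, 6.4, 9.0, 4.1]
model  = [10.0, 8.0, 6.4, 5.1, 4.1]
bad = outside_envelope(logged, model, 0.5)
print(bad)  # → [3]
```

In the setup sketched on this slide, such a scan would run on the Active Data Guard standby (e.g. as a parallel query), so the detection workload never touches the primary that is capturing the logging data.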
Outreach
Presentations
All given at UKOUG, Birmingham, 5-7 December 2011:
- "GoldenGate vs Oracle Streams for Worldwide Distribution of Experimental Physics Data" – Eva Dafonte Pérez & Zbigniew Baranowski
- "Testing Storage for Oracle RAC 11g with NAS, ASM and SSD Flash Cache" – Luca Canali & Dawid Wojcik
- "Oracle Enterprise Manager: The Key to Building a Virtualized Stack" – Anton Topurov
- "Going deeper into Real Application Testing" – Mariusz Piorkowski
Questions?