CERN/IT/DB
DB US Visit
Oracle Visit, August 20 – 24, 2001 [plus related news]

Introduction
• Oracle Strawman Model
• Issues & Concerns
• Current Status & Future Directions
• AOB

LHC Datatypes & Oracle
• RAW: 1PB/yr, ~1 ‘DB’ / month
• ESD: ~100TB/yr, ~1 ‘DB’ / year
• AOD: ~10TB/yr, ~1 ‘DB’
• TAG: ~100GB-1TB/yr, ~1 ‘DB’ combined with AOD
Maybe possible to soften these to ~1 ‘DB’ for all ESD. Would there be a strong advantage? Different ‘DB’s have different access patterns, access control, schema, etc.
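As a rough cross-check of the "~1 ‘DB’ / month" figure for RAW, the sketch below divides each yearly volume by a database size of ~100TB, the building-block size quoted elsewhere in these slides. The decimal units (1PB = 1000TB), the choice of the upper end of the TAG range, and all names in the code are assumptions made for illustration.

```python
# Rough sizing sketch: how many ~100TB "databases" each data type fills per year.
# Assumptions (not from the slides): decimal units (1 PB = 1000 TB) and the
# upper end of the TAG range (1 TB/yr).
DB_SIZE_TB = 100  # "~100TB databases" building block

yearly_volume_tb = {
    "RAW": 1000,  # 1 PB/yr
    "ESD": 100,
    "AOD": 10,
    "TAG": 1,
}


def databases_per_year(volume_tb, db_size_tb=DB_SIZE_TB):
    """Return the number of databases of size db_size_tb filled in one year."""
    return volume_tb / db_size_tb


for name, volume in yearly_volume_tb.items():
    print(f"{name}: {databases_per_year(volume):g} DB/yr")
```

Under these assumptions RAW fills about ten databases a year, which is close to one per month, while ESD fills about one per year and AOD plus TAG together stay well inside a single database.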

Oracle Deployment
[Diagram. Recoverable labels: DAQ cluster (current data, no history); export tablespaces to RAW cluster; RAW cluster to/from MSS; ESD cluster: 1/year? 1?; AOD/TAG: 1 total?; to RCs; to/from RCs; reconstruct; ‘shift’; analysis.]

[Diagram. Recoverable labels: RAW (1PB/yr), ESD (100TB/yr), AOD (10TB/yr), TAG (1TB/yr); access: random, seq.; Data; Users; Tier0; Tier1.]

Building Blocks
• ~100TB “databases” – clusters?
• OCCI / OTT
• Transportable tablespaces & other techniques for data import / export / exchange

BT Visit – July 2001
• Oracle VLDB site: enormous Proof of Concept test in 1999
• 80TB disk, 40TB mirrored, 37TB usable
• Performed using Oracle 8i, EMC storage
• “Single instance”, i.e. not a cluster
• Based on the same techniques as identified on paper by IT-DB
• Demonstrated > 2 years ago!
• No concerns for building 100TB today!

Size of the Largest RDBMS in Commercial Use for DSS
[Chart. Axis: Terabytes; years shown: 1996, 2000, 2005; values shown: 3, 50, 100; label: Projected By Respondents.]
Source: Database Scalability Program 2000

Issues & Concerns
• VLDB support
• Cluster issues (RAC on Linux etc.)
• C++ binding / object model definition
• Storage issues

VLDB Support
• RAW DB model revised to: active partitions as part of DB (catalog) + offline partitions (not part of DB) + “historical data” (maybe in separate DB?)
• Such a strategy is used by many Data Warehouse sites in production today
• Does not require any special features
• But Oracle likes the suggestion of extending “resumable statements” to provide “automatic but controlled” access to offline data
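The active / offline split and the "automatic but controlled" access idea can be pictured with a small toy model. This is only a sketch of the concept, not Oracle's resumable-statement mechanism: the class, function and partition names are hypothetical, and the `approve` callback stands in for whatever policy would decide whether an offline partition may be recalled.

```python
from dataclasses import dataclass, field


class PartitionOffline(Exception):
    """Raised when a read touches a partition that is not part of the DB."""


@dataclass
class PartitionCatalog:
    active: set = field(default_factory=set)   # partitions that are part of the DB
    offline: set = field(default_factory=set)  # partitions held outside the DB

    def read(self, partition):
        if partition in self.active:
            return f"data from {partition}"
        if partition in self.offline:
            raise PartitionOffline(partition)
        raise KeyError(partition)

    def bring_online(self, partition):
        # Move a partition from the offline set into the active set.
        self.offline.remove(partition)
        self.active.add(partition)


def read_with_recall(catalog, partition, approve):
    """Read a partition; if it is offline, recall it only when approve() agrees."""
    try:
        return catalog.read(partition)
    except PartitionOffline:
        if not approve(partition):
            raise  # controlled: the policy refused the recall
        catalog.bring_online(partition)  # automatic: recall, then retry
        return catalog.read(partition)


catalog = PartitionCatalog(active={"raw_2001_08"}, offline={"raw_2001_01"})
print(read_with_recall(catalog, "raw_2001_01", approve=lambda p: True))
```

In this toy version the read is retried once after the recall; a real system would presumably suspend the statement while data is restaged, which the sketch does not attempt to model.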

VLDB Support cont.
• Oracle addressing limits of current architecture
• Already permits 2EB databases
• Limits on e.g. # files, partitions etc. are expected to be significantly increased beyond Oracle 9i
• An area of work, but not a concern…

Cluster Issues
• Real Application Clusters = RAC = significant advance over previous OPS
• Should be good fit for HEP read-mostly data
• Supported on Linux by COMPAQ, FastTango
• Not critical to overall model, but could simplify deployment significantly
• e.g. small number of clusters: 1 / data type
• COMPAQ Oracle competency centre in Valbonne…

OCCI / OTT
• On-going work with Oracle developers to fix bugs / provide enhancements to meet HEP requirements
• Fixes prior to start of beta, during beta & post beta …
• Currently using HEP data models
• More examples welcome…
• Enhancements via pre-releases, dot releases and 9i R2

Storage Issues
• Oracle number format provides greater precision than IEEE double
• Solutions being investigated to allow efficient storage of floats / doubles / ints without user specifying precision / range
• Target: next major Oracle release?
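The precision gap can be illustrated without a database. The sketch below assumes the documented maximum of 38 decimal digits for Oracle's NUMBER type and uses Python's `decimal` module purely as a stand-in for decimal arithmetic; it does not reproduce Oracle's storage format, and the sample value is arbitrary.

```python
from decimal import Decimal, Context

# Assumption: Oracle NUMBER carries up to 38 decimal digits of precision.
NUMBER_DIGITS = 38
number_ctx = Context(prec=NUMBER_DIGITS)

value = Decimal("1234567890.123456789012345678")  # 28 significant digits

as_number = number_ctx.plus(value)  # round to 38 digits: nothing is lost here
as_double = Decimal(float(value))   # exact value of the nearest IEEE double

print(as_number == value)  # True: fits within 38 decimal digits
print(as_double == value)  # False: a 53-bit mantissa cannot hold all 28 digits
print(as_double)           # shows where the double starts to diverge
```

The flip side, which the slide is concerned with, is that a value that only ever needs float or double precision gains nothing from the wider decimal format.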

Side Visits
• FNAL
• Objectivity
• IBM
• COMPAQ
• FastTango / NetAppliance

Summary
• Oracle demonstrably interested in continuing to work with HEP on VLDB / LHC issues, both regarding 9i and also future Oracle versions
• Strong support from Oracle server group
• Strong interest from Oracle Grid team

