Glast Collaboration Data Server and Data Catalog


1 Glast Collaboration Data Server and Data Catalog
Tony Johnson, DC2 Planning Meeting, June 2005

2 Contents
- What Exists
  - Ntuple Pruner/Peeler
  - Data Server (for Internal Collaboration Use)
- What is Planned
- What is Wanted?

3 Data Server Portal
- Web Portal provides access to existing data server functionality
- Currently:
  - NTuple Pruner (Tom Glanzman)
    - Selection of data via cuts on Merit tuple
    - Works with datasets in pipeline data catalog
    - Download of data via FTP after submission of batch job
    - Allows access to Root Merit tuple
  - Event Peeler (coming very soon) (Tom Glanzman)
    - Selection of data via run/event number (uploaded file)
    - Access to Root Merit tuple and/or full Root tuple
  - “Data Server” (Jean-Paul LeFevre)
    - Allows rapid selection of events based on Energy, Origin (decl, ra), Time, Gamma Quality
      - Stored in “meta-data” database
    - Additional Merit tuple cuts
    - Supports adding cuts to personal “favorites” list
    - Currently configured to work with DC1 Root Merit tuple only
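The two selection modes on this slide (cuts on tuple columns for the pruner, an uploaded run/event list for the peeler) can be illustrated with a minimal Python sketch. This is not the pruner or peeler code: the events are plain dictionaries standing in for Merit tuple rows, and the column names (`run`, `event`, `energy`), the one-pair-per-line upload format, and the function names are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, Set, Tuple

Event = dict  # stand-in for one Merit tuple row (column names are hypothetical)


def prune(events: Iterable[Event], cut: Callable[[Event], bool]) -> Iterator[Event]:
    """Pruner-style selection: keep events that pass a cut on tuple columns."""
    return (ev for ev in events if cut(ev))


def parse_run_event_list(text: str) -> Set[Tuple[int, int]]:
    """Parse an uploaded list with one 'run event' pair per line (assumed format)."""
    pairs = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        run, event = line.split()
        pairs.add((int(run), int(event)))
    return pairs


def peel(events: Iterable[Event], wanted: Set[Tuple[int, int]]) -> Iterator[Event]:
    """Peeler-style selection: keep events whose (run, event) pair was requested."""
    return (ev for ev in events if (ev["run"], ev["event"]) in wanted)


# Made-up sample data for illustration only.
events = [
    {"run": 1, "event": 10, "energy": 250.0},
    {"run": 1, "event": 11, "energy": 40.0},
    {"run": 2, "event": 5, "energy": 900.0},
]
high_energy = list(prune(events, lambda ev: ev["energy"] > 100.0))
wanted = parse_run_event_list("# run event\n1 11\n2 5\n")
peeled = list(peel(events, wanted))
```

Holding the requested pairs in a set keeps the peeler lookup constant-time per event, which matters when the uploaded list is long.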

4 Screen Shots

5 Data Server Issues
- Currently trying both Oracle (10g) and MySQL (5) using “spatial” extensions
  - Do not fully support spherical geometry
    - Forced to make rectangular selections rather than circular
    - Problems at poles
  - Performance seems OK, at least for 50 million events
    - Selection performance scales by number of events selected, rather than total events in database
  - Indexes seem very slow to build
    - Many hours to add 1,000,000 events, and this seems to scale by total database size
    - Still under investigation, maybe can be improved by tuning
- Need to decide very soon how much effort to put into database vs. a custom solution
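One common way to get circular selections out of a database that only supports rectangular ones is to query a bounding box in (ra, dec) and then apply an exact great-circle test to the candidates. The Python sketch below illustrates that idea under stated assumptions; it is not what the data server does. The function names and the `ra`/`dec` keys are hypothetical, angles are in degrees, and the box simply widens to the full RA range when the cone touches a pole.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def bounding_box(ra0, dec0, radius):
    """Rectangle (ra_min, ra_max, dec_min, dec_max) enclosing a cone.

    If the cone reaches a pole, every RA is possible, so the RA range
    becomes the full circle. ra_min/ra_max may fall outside [0, 360).
    """
    dec_min = max(-90.0, dec0 - radius)
    dec_max = min(90.0, dec0 + radius)
    if dec0 - radius <= -90.0 or dec0 + radius >= 90.0:
        return 0.0, 360.0, dec_min, dec_max
    # Half-width in RA of a small circle of this radius centred at dec0.
    dra = math.degrees(
        math.asin(math.sin(math.radians(radius)) / math.cos(math.radians(dec0)))
    )
    return ra0 - dra, ra0 + dra, dec_min, dec_max


def in_box(ra, dec, box):
    """Rectangular test, handling RA wrap-around at 0/360 degrees."""
    ra_min, ra_max, dec_min, dec_max = box
    if not (dec_min <= dec <= dec_max):
        return False
    if ra_max - ra_min >= 360.0:
        return True
    return (ra - ra_min) % 360.0 <= (ra_max - ra_min)


def cone_search(events, ra0, dec0, radius):
    """Coarse rectangular prefilter followed by an exact circular cut."""
    box = bounding_box(ra0, dec0, radius)
    return [
        ev for ev in events
        if in_box(ev["ra"], ev["dec"], box)
        and angular_separation_deg(ra0, dec0, ev["ra"], ev["dec"]) <= radius
    ]
```

In a database setting the rectangular test would be the indexed query and the exact cut would run on the returned rows, so the extra cost scales with the number of candidates rather than the table size.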

6 Data Catalog Plans
- Working on new “Glast Data Catalog”
  - Less tightly coupled to pipeline than current catalog
  - Allows domain specific, user-defined, hierarchical “meta-data” to be associated with each dataset, e.g.
    - Simulation physics, test setup parameters
    - Pointers to pipeline task
    - Pointers to e-logbook entries
  - Web interface will allow browsing data hierarchy or searching based on meta-data
  - Implementation based on earlier “Grid” data catalog developed at SLAC
    - Uses XML for import/export of data (stored in Oracle XML database)
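As a rough illustration of a hierarchical catalog with user-defined meta-data, search, and XML import/export, here is a minimal Python sketch. It does not reflect the actual catalog schema: the `Node` class, the `node`/`meta` element names, and the sample dataset names and keys are all invented for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A folder or dataset in the hierarchy, with free-form meta-data."""
    name: str
    meta: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return child


def search(node: Node, criteria: Dict[str, str], prefix: str = "") -> List[str]:
    """Return paths of all nodes whose meta-data matches every criterion."""
    path = f"{prefix}/{node.name}"
    hits = []
    if all(node.meta.get(k) == v for k, v in criteria.items()):
        hits.append(path)
    for child in node.children:
        hits.extend(search(child, criteria, path))
    return hits


def to_xml(node: Node) -> ET.Element:
    """Export a subtree as XML (element and attribute names are made up)."""
    elem = ET.Element("node", name=node.name)
    for key, value in node.meta.items():
        ET.SubElement(elem, "meta", key=key).text = value
    for child in node.children:
        elem.append(to_xml(child))
    return elem


def from_xml(elem: ET.Element) -> Node:
    """Import a subtree previously written by to_xml."""
    node = Node(elem.get("name"))
    for m in elem.findall("meta"):
        node.meta[m.get("key")] = m.text or ""
    for c in elem.findall("node"):
        node.children.append(from_xml(c))
    return node


# Hypothetical hierarchy: names and meta-data keys are illustrative only.
root = Node("DC2")
sim = root.add(Node("allGamma", {"type": "simulation"}))
sim.add(Node("run0001", {"type": "merit", "pipelineTask": "task-42"}))
xml_text = ET.tostring(to_xml(root), encoding="unicode")
```

Keeping meta-data as free-form key/value pairs on each node is what lets the same structure carry simulation parameters, pipeline pointers, or logbook references without schema changes.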

7 DC2 Data Catalog?

8 Data Server Plans
- Continue to enhance pruner/peeler
  - Add TCut capability to peeler
  - Add access to other data types (SVAC tuple)
- Data Server
  - Enhance ergonomics of web interface
  - Support search using sky catalog
  - Work on handling larger data volumes
  - Add ability to download events in different formats
    - FITS, run/event #, different Root tuples
  - Add ability to browse events using event display
- Use xrootd server to stream data
  - Eliminate waiting for batch job and FTP transfer
- Experiment with SLAC “Peta-Cache” system
  - Initially use xrootd to serve existing Root tuples
  - Highest performance may require storing tuples in some other format

9 Data Pump: Streaming data directly to users
[Diagram labels: Data Server; Multiple Threads, each with a TCut and a Format Converter; xrootd; Root Files]
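The diagram's idea of several threads, each applying a cut and a format conversion before streaming results to the user, could be sketched in Python as follows. This is only an analogy: in-memory lists stand in for Root files served by xrootd, a Python predicate stands in for a TCut, CSV stands in for the output format, and every name here is hypothetical.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def convert_to_csv(events, columns):
    """Format-converter stage: serialise selected events as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for ev in events:
        writer.writerow([ev[c] for c in columns])
    return buf.getvalue()


def process_file(events, cut, columns):
    """One worker: apply the cut to a file's events, then convert them."""
    return convert_to_csv((ev for ev in events if cut(ev)), columns)


def pump(files, cut, columns, workers=3):
    """Yield one converted chunk per input file, processed in parallel.

    Executor.map returns results in input order, so chunks are streamed
    in file order even if a later file finishes first.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        yield from pool.map(lambda f: process_file(f, cut, columns), files)


# Made-up stand-ins for three Root files.
files = [
    [{"run": 1, "event": 10, "energy": 250.0}, {"run": 1, "event": 11, "energy": 40.0}],
    [{"run": 2, "event": 5, "energy": 300.0}],
    [{"run": 3, "event": 7, "energy": 20.0}],
]
streamed = "".join(pump(files, lambda ev: ev["energy"] > 100.0, ["run", "event", "energy"]))
```

Because the consumer receives chunks as they become available, it does not have to wait for a batch job to finish and a separate transfer to start.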

10 Conclusions
- Initial data server available
  - Would like some people to try it and give feedback
- Lots of work to do
- Need to set goals/priorities for DC2 work
  - Understand timescales
  - Understand what data volume will be
  - Understand what typical queries will be

11 Hierarchical Data Catalog

