Published by Damian Haynes. Modified over 9 years ago.
1
Application Use Cases
NIKHEF, Amsterdam, December 12-13
2
Use Cases
part of a development process
– requirements gathering
– use cases
– architectural design
– fast prototyping
– implementation
text describing a real case
3
D0 @Fermi
4
A D0 use case
Produce 1 million events using the Pythia event generator and the GEANT D0 detector simulation program. Add pile-up events during digitisation and before reconstruction. After reconstruction and analysis, all raw and resulting data are conserved. Any production should be exactly reproducible.
5
D0 MC Flow Chart
[Flow chart. Processing stages: Pythia, GEANT-3 Simulation, Reconstruction, Analysis. Other labels: cards, min. bias, geometry, INPUT, OUTPUT]
6
Requirements Gathering
– discussing use cases often delivers system requirements
Design & Architecture
– new components
– extending existing components (standards!)
– CRCs
7
Refinements
number of min.bias events depends on luminosity
add 0 < n < 10 min.bias events per event
any event should be (exactly) reproducible
min.bias events are also generated with Pythia
min.bias events could also be measured data
– could be in a file of ## events
– could be stored one by one in a database
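A minimal Python sketch of one way to keep the min.bias overlay exactly reproducible: derive a private random generator from a run seed and the event number, so the choice for an event does not depend on processing order. The names `pileup_indices`, `run_seed`, `pool_size` and `max_pileup` are hypothetical, the uniform draw only stands in for a luminosity-dependent distribution, and this is not D0 production code.

```python
import random


def pileup_indices(run_seed, event_number, pool_size, max_pileup=9):
    """Pick which min.bias events (by index in a pool) to overlay on one event."""
    # Seeding from (run_seed, event_number) makes the result independent of
    # the order in which events are processed or the worker that handles them.
    rng = random.Random(f"{run_seed}:{event_number}")
    # Assumption: uniform draw with 0 < n < 10; a real setup would sample
    # n from a distribution that depends on luminosity.
    n = rng.randint(1, max_pileup)
    return [rng.randrange(pool_size) for _ in range(n)]


# Re-running the same event with the same seed gives the same overlay.
first = pileup_indices(run_seed=2002, event_number=17, pool_size=1000)
second = pileup_indices(run_seed=2002, event_number=17, pool_size=1000)
print(first == second)
```

Storing the run seed with the production record is what allows a later rerun to pick the same min.bias events, provided the pool itself (file or database) is unchanged and the same Python version is used.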
8
Options
min.bias events generated on the fly
min.bias events from a file on the grid
min.bias events from a file on the CE
min.bias events from a generated file
etc.
and always (exactly) reproducible!
9
HEPCAL Document
requirements from all 4 LHC experiments
requirements from EO and Bio
~50 use cases
discussed within Architectural Task Force
design is input for Global Grid Forum
always follow standard protocols
10
ATLAS
11
An implementation of distributed analysis in ALICE using natural parallelism of processing
[Diagram labels: Local, Remote, Selection Parameters, Procedure, Proc.C, PROOF, CPU, Tag DB, RDB, DB 1 to DB 6]
Bring the job to the data and not the data to the job
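The "bring the job to the data" idea can be sketched in a few lines of Python: group the input files by the site that holds them and create one job per site, instead of copying files to a central place. This is only an illustration; `assign_jobs`, the file names and the site names are hypothetical and do not reflect the PROOF or ALICE interfaces.

```python
def assign_jobs(file_locations):
    """Group input files by hosting site: one job per site, run where the data is."""
    jobs = {}
    for filename, site in file_locations.items():
        jobs.setdefault(site, []).append(filename)
    return jobs


# Hypothetical catalogue mapping each file to the site that stores it.
catalogue = {
    "run1.root": "site_a",
    "run2.root": "site_b",
    "run3.root": "site_a",
}
for site, files in assign_jobs(catalogue).items():
    print(site, files)
```

Only the selection parameters and the procedure travel to each site; the results, which are usually much smaller than the input, travel back.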
14
ATLAS/LHCb Software Framework (Based on Services)
The Gaudi/Athena Framework: services will interface to the Grid (e.g. Persistency)
[Diagram labels: Application Manager; Algorithm; Converter; Event Data Service with Transient Event Store; Detector Data Service with Transient Detector Store; Histogram Service with Transient Histogram Store; Persistency Service and Data Files behind each store; Message Service; JobOptions Service; Particle Prop. Service; Other Services]
16
A CMS Data Grid Job
The vision for 2003
17
Common Applications Work
Several discussions between application WPMs and technical coordination to consider the common needs of all applications
[Diagram layers: HEP, EO, Bio; Common applicative layer; EDG software; Globus]
18
Data Handling and Computation for Physics Analysis
[Diagram labels: detector; event filter (selection & reconstruction); raw data; reconstruction; event summary data; processed data; event reprocessing; event simulation; simulation; batch physics analysis; analysis objects (extracted by physics topic); analysis; interactive physics analysis]
les.robertson@cern.ch, CERN
19
LCG/Pool on the Grid
[Diagram labels: User Application, Experiment Framework, LCG POOL, File Catalog, Collections, RootI/O, Grid Middleware, Replica Location Service, Replica Manager, Grid Dataset Registry, Grid Resources]
20
Applications in DataGrid
HEP
Bio Informatics and Health
Earth Observation
21
Challenges for a biomedical grid
The biomedical community has NO strong center of gravity in Europe
– No equivalent of CERN (High-Energy Physics) or ESA (Earth Observation)
– Many high-level laboratories of comparable size and influence without a practical activity backbone (EMB-net, national centers, …), leading to:
  Little awareness of common needs
  Few common standards
  Small common long-term investment
The biomedical community is very large (tens of thousands of potential users)
The biomedical community is often distant from computer science issues
22
Biomedical requirements
Large user community (thousands of users)
– anonymous/group login
Data management
– data updates and data versioning
– large volume management (a hospital can accumulate TBs of images in a year)
Security
– disk / network encryption
Limited response time
– fast queues
High priority jobs
– privileged users
Interactivity
– communication between user interface and computation
Parallelization
– MPI site-wide / grid-wide
– thousands of images
– operated on by tens of algorithms
Pipeline processing
– pipeline description language / scheduling
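As a rough illustration of the "pipeline description / scheduling" requirement, the Python sketch below describes a pipeline as a mapping from each stage to the stages it depends on and derives a valid execution order with the standard-library `graphlib` module (Python 3.9+). The stage names and the `schedule` helper are hypothetical; the slides do not specify a pipeline description language.

```python
from graphlib import TopologicalSorter

# Hypothetical image-processing pipeline: stage -> stages it depends on.
PIPELINE = {
    "denoise": [],
    "register": ["denoise"],
    "segment": ["register"],
    "measure": ["segment", "register"],
}


def schedule(pipeline):
    """Return the stage names in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


print(schedule(PIPELINE))
```

A grid scheduler could apply the same ordering per image, which leaves the thousands of images free to be processed in parallel.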
23
Biomedical projects in DataGrid
Distributed Algorithms. New distributed "grid-aware" algorithms (bio-info algorithms, data mining, …)
Grid Service Portals. Service providers taking advantage of the DataGrid computational power and storage capacity.
Cooperative Framework. Use the DataGrid as a cooperative framework for sharing resources and algorithms, and for organizing experiments in a cooperative manner.
[Diagram labels: WP10 applications; Cooperative Framework; Grid Service Portals; Distributed Algorithms; EDG Middleware]
24
The grid impact on data handling
DataGrid will allow mirroring of databases
– An alternative to the current costly replication mechanism
– Allowing web portals on the grid to access updated databases
[Diagram labels: Biomedical Replica Catalog; Trembl (EBI); Swissprot (Geneva)]
25
Web portals for biologists
Biologist enters sequences through web interface
Pipelined execution of bio-informatics algorithms
– Genomics comparative analysis (thousands of files of ~Gbyte)
  Genome comparison takes days of CPU (~n**2)
– Phylogenetics
– 2D, 3D molecular structure of proteins…
The algorithms are currently executed on a local cluster
– Big labs have big clusters…
– But growing pressure on resources: the Grid will help
  More and more biologists compare larger and larger sequences (whole genomes)… to more and more genomes… with fancier and fancier algorithms!
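The ~n**2 cost comes from comparing every genome with every other one: n genomes give n(n-1)/2 pairs. Because the pairs are independent, they can be split into batches and run as separate grid jobs. A minimal Python sketch, where `comparison_jobs`, `pairs_per_job` and the genome names are hypothetical:

```python
from itertools import combinations


def comparison_jobs(genomes, pairs_per_job):
    """Split all pairwise comparisons into batches, one batch per grid job."""
    pairs = list(combinations(genomes, 2))  # n * (n - 1) / 2 pairs
    return [pairs[i:i + pairs_per_job]
            for i in range(0, len(pairs), pairs_per_job)]


genomes = ["g1", "g2", "g3", "g4", "g5"]
jobs = comparison_jobs(genomes, pairs_per_job=4)
print(len(jobs), "jobs covering", sum(len(job) for job in jobs), "pairs")
```

Doubling the number of genomes roughly quadruples the number of pairs, which is why a local cluster comes under pressure as more whole genomes are added.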
26
The Visual DataGrid Blast, a first genomics application on DataGrid
A graphical interface to enter query sequences and select the reference database
A script to execute the BLAST algorithm on the grid
A graphical interface to analyze results
Accessible from the web portal genius.ct.infn.it
27
Summary of added value provided by Grid for BioMed applications
Data mining on genomics databases (exponential growth).
Indexing of medical databases (TB/hospital/year).
Collaborative framework for large scale experiments (e.g. epidemiological studies).
Parallel processing for
– Database analysis
– Complex 3D modelling
28
Earth Observation (WP9)
Global Ozone (GOME) Satellite Data Processing and Validation by KNMI, IPSL and ESA
The DataGrid testbed provides a collaborative processing environment for 3 geographically distributed EO sites (Holland, France, Italy)