Particle Physics Data Grid
Richard P. Mount, SLAC
DoE NGI Program PI Meeting, October 1999 / Grid Workshop, Padova, February 12, 2000

PPDG: What It Is Not
Not a physical grid:
– Network links, routers and switches are not funded by PPDG.

Particle Physics Data Grid
Universities, DoE Accelerator Labs, DoE Computer Science.
Particle physics: a network-hungry collaborative application
– Petabytes of compressed experimental data;
– Nationwide and worldwide university-dominated collaborations analyze the data;
– Close DoE-NSF collaboration on construction and operation of most experiments;
– The PPDG lays the foundation for lifting the network constraint from particle-physics research.
Short-term targets:
– High-speed site-to-site replication of newly acquired particle-physics data (>100 Mbytes/s);
– Multi-site cached file access to thousands of ~10 Gbyte files.

PPDG Collaborators
                    Particle    Accelerator    Computer
                    Physics     Laboratory     Science
ANL                    X                          X
LBNL                   X                          X
BNL                    X             X            x
Caltech                X                          X
Fermilab               X             X            x
Jefferson Lab          X             X            x
SLAC                   X             X            x
SDSC                                              X
Wisconsin                                         X

PPDG Funding
FY 1999:
– PPDG NGI Project approved with $1.2M from the DoE Next Generation Internet program.
FY 2000:
– DoE NGI program not funded;
– Continued PPDG funding being negotiated.

Particle Physics Data Models
Particle physics data models are complex!
– Rich hierarchy of hundreds of complex data types (classes);
– Many relations between them;
– Different access patterns (multiple viewpoints).
[Diagram: an Event owns a TrackList, Tracker and Calorimeter; the TrackList holds Tracks, each with a HitList of Hits.]
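Purely as an illustration of the kind of class hierarchy the diagram above names (this is not PPDG or experiment code, and the attribute names are assumptions), a minimal sketch in Python:

```python
from dataclasses import dataclass, field
from typing import List

# Minimal sketch of the hierarchical event model named in the diagram above.
# Class and attribute names are illustrative assumptions, not PPDG interfaces.

@dataclass
class Hit:
    channel: int          # detector channel that fired
    energy: float         # deposited energy, arbitrary units

@dataclass
class Track:
    momentum: float
    hits: List[Hit] = field(default_factory=list)       # the track's HitList

@dataclass
class Event:
    event_id: int
    tracks: List[Track] = field(default_factory=list)   # TrackList
    calorimeter_hits: List[Hit] = field(default_factory=list)

# "Multiple viewpoints": one analysis walks Event -> Track -> Hit, while
# calibration code may scan calorimeter_hits across millions of events.
```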

Data Volumes
– Quantum physics yields predictions of probabilities;
– Understanding the physics means measuring probabilities;
– Precise measurements of new physics require analysis of hundreds of millions of collisions (each recorded collision yields ~1 Mbyte of compressed data).
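A back-of-the-envelope check of the scale these numbers imply (the 500 million collision count is an illustrative stand-in for "hundreds of millions"; the rest is arithmetic):

```python
# Rough scale implied by the slide: hundreds of millions of recorded
# collisions at ~1 Mbyte of compressed data each.
collisions = 500e6              # illustrative value for "hundreds of millions"
bytes_per_collision = 1e6       # ~1 Mbyte

total_bytes = collisions * bytes_per_collision
print(f"{total_bytes / 1e12:.0f} Tbytes")   # 500 Tbytes, i.e. approaching a petabyte
```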

Access Patterns
Typical particle physics experiment: one year of acquisition and analysis of data.
Data products and sizes:
– Raw Data: ~1000 Tbytes
– Reconstructed data (Reco-V1, Reco-V2): ~1000 Tbytes each
– Event Summary Data (ESD-V1.1, ESD-V1.2, ESD-V2.1, ESD-V2.2): ~100 Tbytes each
– Analysis Object Data (AOD): ~10 Tbytes
Access rates (aggregate, average):
– 100 Mbytes/s (2-5 physicists)
– 1000 Mbytes/s (10-20 physicists)
– 2000 Mbytes/s (~100 physicists)
– 4000 Mbytes/s (~300 physicists)

Data Grid Hierarchy: Regional Centers Concept
LHC Grid Hierarchy Example (5 levels in total):
– Tier 0: CERN
– Tier 1: National "Regional" Center
– Tier 2: Regional Center
– Tier 3: Institute Workgroup Server
– Tier 4: Individual Desktop

PPDG as an NGI Problem
PPDG goals:
– The ability to query and partially retrieve hundreds of terabytes across Wide Area Networks within seconds;
– Making effective data analysis possible from ten to one hundred US universities.
PPDG is taking advantage of NGI services in three areas:
– Differentiated Services: to allow particle-physics bulk data transport to coexist with interactive and real-time remote collaboration sessions and other network traffic;
– Distributed caching: to allow rapid data delivery in response to multiple "interleaved" requests;
– "Robustness", i.e. matchmaking and request/resource co-scheduling: to manage workflow, use computing and network resources efficiently, and achieve high throughput.
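For concreteness, one common way to let bulk transport coexist with interactive traffic is to mark the bulk flow with a DiffServ code point at the sender. The sketch below is a hedged illustration, not PPDG code, and the DSCP value chosen is an assumption.

```python
import socket

# Hedged illustration: mark a bulk-transfer TCP socket with a DiffServ code
# point so that DiffServ-enabled routers can schedule it separately from
# interactive traffic. DSCP 8 (Class Selector 1) is an assumed value; the
# actual service classes used on ESnet were a matter of negotiation.
DSCP_BULK = 8
TOS_VALUE = DSCP_BULK << 2      # the DSCP occupies the upper 6 bits of the IP TOS byte

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_VALUE)
# ...connect() and stream the bulk data as usual; only the IP header marking changes.
```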

First Year PPDG Deliverables
Implement and run two services in support of the major physics experiments at BNL, FNAL, JLAB and SLAC:
– "High-Speed Site-to-Site File Replication Service": data replication at up to 100 Mbytes/s;
– "Multi-Site Cached File Access Service": based on deployment of file-cataloging, transparent cache-management and data-movement middleware.
First year: optimized cached read access to files in the range of 1-10 Gbytes, from a total data set of order one Petabyte, using middleware components already developed by the proponents.
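To show what "transparent cached file access" means in the smallest possible terms, here is a hedged sketch; the cache path, function name and use of scp as the mover are assumptions, since the real service was built from the middleware catalogued later in this talk.

```python
import os
import subprocess

CACHE_DIR = "/scratch/ppdg-cache"    # assumed local cache location

def cached_open(logical_name: str, primary_host: str):
    """Return a local file handle, fetching from the primary site on a cache miss."""
    local_path = os.path.join(CACHE_DIR, logical_name.lstrip("/"))
    if not os.path.exists(local_path):
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        # Placeholder transfer step: a real deployment would use the PPDG file
        # movers (SRB or site-specific tools) rather than a generic scp copy.
        subprocess.run(["scp", f"{primary_host}:{logical_name}", local_path], check=True)
    return open(local_path, "rb")

# Hypothetical usage:
# with cached_open("/babar/run1/events-00042.db", "datamover.slac.stanford.edu") as f:
#     header = f.read(1024)
```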

PPDG Site-to-Site Replication Service
From a primary site (data acquisition, CPU, disk, tape robot) to a secondary site (CPU, disk, tape robot):
– Network protocols tuned for high throughput;
– Use of DiffServ for (1) predictable high-priority delivery of high-bandwidth data streams and (2) reliable background transfers;
– Use of integrated instrumentation to detect, diagnose and correct problems in long-lived high-speed transfers [NetLogger + DoE/NGI developments];
– Coordinated reservation/allocation techniques for storage-to-storage performance.
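As an illustration of the sort of integrated instrumentation meant here, a sketch that emits timestamped events around each block of a long-lived transfer, in the spirit of NetLogger; the event names and log format are assumptions, not the NetLogger wire format.

```python
import sys
import time

def log_event(event: str, **fields):
    """Emit a timestamped, NetLogger-style event line for later throughput analysis."""
    extras = " ".join(f"{k}={v}" for k, v in fields.items())
    sys.stderr.write(f"DATE={time.time():.6f} EVNT={event} {extras}\n")

def instrumented_copy(src, dst, block_size=8 * 1024 * 1024):
    """Copy the src file object to dst, logging each block so that stalls and
    throughput dips in a long-lived transfer can be located afterwards."""
    total = 0
    log_event("transfer.start")
    while True:
        t0 = time.time()
        block = src.read(block_size)
        if not block:
            break
        dst.write(block)
        total += len(block)
        rate_mb = len(block) / max(time.time() - t0, 1e-9) / 1e6
        log_event("transfer.block", bytes=total, mbytes_per_s=f"{rate_mb:.1f}")
    log_event("transfer.end", bytes=total)
```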

Typical HENP Primary Site ~Today (SLAC)
– 15 Tbytes disk cache;
– 800 Tbytes robotic tape capacity;
– 10,000 SPECfp95 / SPECint95;
– Tens of Gbit Ethernet connections;
– Hundreds of 100 Mbit/s Ethernet connections;
– Gigabit WAN access.

PPDG Multi-Site Cached File Access System
[Diagram: a primary site (data acquisition, tape, CPU, disk, robot) serving several satellite sites (tape, CPU, disk, robot) and universities (CPU, disk, users).]

PPDG Middleware Components

First Year PPDG "System" Components
Middleware components (initial choice); see PPDG Proposal, page 15:
– Object- and file-based application services: Objectivity/DB (SLAC-enhanced); GC Query Object, Event Iterator, Query Monitor; FNAL SAM system
– Resource management: start with human intervention (but begin to deploy resource discovery and management tools)
– File access service: components of OOFS (SLAC)
– Cache manager: GC Cache Manager (LBNL)
– Mass storage manager: HPSS, Enstore, OSM (site-dependent)
– Matchmaking service: Condor (U. Wisconsin)
– File replication index: MCAT (SDSC)
– Transfer cost estimation service: Globus (ANL)
– File fetching service: components of OOFS
– File mover(s): SRB (SDSC); site-specific
– End-to-end network services: Globus tools for QoS reservation
– Security and authentication: Globus (ANL)

DoE NGI Program PI Meeting, October 1999Particle Physics Data Grid Richard P. Mount, SLAC Request Interpreter Storage Access service Request Manager Cache Manager Request to move files {file: from,to} logical request (property predicates / event set) Local Site Manager To Network File Access service Fig 1: Architecture for the general scenario - needed APIs files to be retrieved {file:events} Logical Index service Storage Reservation service Request to reserve space {cache_location: # bytes} Matchmaking Service File Replica Catalog GLOBUS Services Layer Remote Services Resource Planner Application (data request) Client (file request) Local Resource Manager Cache Manager Properties, Events, Files Index

PPDG First Year Milestones
– Project start: August 1999
– Decision on existing middleware to be integrated into the first-year Data Grid: October 1999
– First demonstration of high-speed site-to-site data replication: January 2000
– First demonstration of multi-site cached file access (3 sites): February 2000
– Deployment of high-speed site-to-site data replication in support of two particle-physics experiments: July 2000
– Deployment of multi-site cached file access in partial support of at least two particle-physics experiments: August 2000

Longer-Term Goals (of PPDG, GriPhyN...)
Agent computing on virtual data.

Why Agent Computing?
LHC Grid Hierarchy Example (5 levels in total):
– Tier 0: CERN
– Tier 1: National "Regional" Center
– Tier 2: Regional Center
– Tier 3: Institute Workgroup Server
– Tier 4: Individual Desktop

Why Virtual Data?
[Same figure as the "Access Patterns" slide: data products from Raw Data (~1000 Tbytes) and Reco versions (~1000 Tbytes each) through ESD versions (~100 Tbytes each) down to AOD (~10 Tbytes), with aggregate average access rates ranging from 100 Mbytes/s (2-5 physicists) to 4000 Mbytes/s (~300 physicists), for one year of a typical experiment's acquisition and analysis.]

Existing Achievements
– SLAC-LBNL memory-to-memory transfer at 57 Mbytes/s over NTON;
– Caltech tests of writing into an Objectivity/DB at 175 Mbytes/s.

Cold Reality (Writing into the BaBar Object Database at SLAC)
– 60 days ago: ~2.5 Mbytes/s
– 3 days ago: ~15 Mbytes/s

Testbed Requirements
Site-to-Site Replication Service:
– The 100 Mbytes/s goal is possible through the resurrection of NTON (SLAC, LLNL, Caltech and LBNL are working on this).
Multi-Site Cached File Access System:
– Will use OC12, OC3, even T3 links as available (and even ~20 Mbits/s international links);
– Needs a "Bulk Transfer" service: latency is unimportant, but Tbytes/day throughput matters (prioritized service is needed to achieve this on international links);
– Coexistence with other network users is important (this is the main PPDG need for differentiated services on ESnet).
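For orientation, the arithmetic relating a Tbytes/day goal to sustained link rates (the 1 Tbyte/day figure is only an example):

```python
# Plain arithmetic relating the "Tbytes/day" goal to sustained link rates.
SECONDS_PER_DAY = 24 * 3600

def mbits_per_s_for(tbytes_per_day: float) -> float:
    return tbytes_per_day * 1e12 * 8 / SECONDS_PER_DAY / 1e6

def tbytes_per_day_for(mbits_per_s: float) -> float:
    return mbits_per_s * 1e6 * SECONDS_PER_DAY / 8 / 1e12

print(f"{mbits_per_s_for(1.0):.0f} Mbits/s")       # ~93 Mbits/s sustained for 1 Tbyte/day
print(f"{tbytes_per_day_for(20):.2f} Tbytes/day")  # ~0.22 Tbytes/day on a full 20 Mbits/s link
```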