Slide 1: Grid3 and Open Science Grid
Paul Avery, University of Florida (avery@phys.ufl.edu)
Mardi Gras Conference, Louisiana State University, Baton Rouge, February 3, 2005

Slide 2: Data Grids & Collaborative Research
- Scientific discovery increasingly dependent on collaboration
  - Computationally & data intensive analyses
  - Resources and collaborations distributed internationally
- Dominant factor: data growth (1 Petabyte = 1000 TB)
  - 2000: ~0.5 Petabyte
  - 2005: ~10 Petabytes
  - 2012: ~100 Petabytes
  - 2018: ~1000 Petabytes?
- Drives need for powerful linked resources: "Data Grids"
  - Computation: massive, distributed CPU
  - Data storage and access: distributed high-speed disk and tape
  - Data movement: international optical networks
- Collaborative research and Data Grids
  - Data discovery, resource sharing, distributed analysis, etc.
- How to collect, manage, access and interpret this quantity of data?
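The growth figures above imply a roughly exponential trend; a back-of-envelope check of the implied annual growth factor and doubling time (my arithmetic, not a number stated on the slide):

```python
# Back-of-envelope check of the data-growth figures quoted above
# (0.5 PB in 2000 to ~1000 PB in 2018). The growth rate itself is an
# inference from those endpoints, not a figure stated on the slide.
import math

start_pb, end_pb = 0.5, 1000.0
years = 2018 - 2000

annual_factor = (end_pb / start_pb) ** (1.0 / years)   # ~1.5x per year
doubling_time = math.log(2) / math.log(annual_factor)  # ~1.6 years

print(f"implied annual growth factor: {annual_factor:.2f}x")
print(f"implied doubling time: {doubling_time:.1f} years")
```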

Slide 3: Data Intensive Disciplines
- High energy & nuclear physics
  - Belle, BaBar, Tevatron, RHIC, JLab
  - Large Hadron Collider (LHC)
- Astronomy
  - Digital sky surveys, "virtual" observatories
  - VLBI arrays: multiple-Gbps data streams
- Gravity wave searches
  - LIGO, GEO, VIRGO, TAMA, ACIGA, ...
- Earth and climate systems
  - Earth observation, climate modeling, oceanography, ...
- Biology, medicine, imaging
  - Genome databases
  - Proteomics (protein structure & interactions, drug delivery, ...)
  - High-resolution brain scans (1-10 μm, time dependent)

Slide 4: Background: Data Grid Projects
- U.S. projects
  - GriPhyN (NSF)
  - iVDGL (NSF)
  - Particle Physics Data Grid (DOE)
  - Open Science Grid
  - UltraLight
  - TeraGrid (NSF)
  - DOE Science Grid (DOE)
  - NEESgrid (NSF)
  - NSF Middleware Initiative (NSF)
- EU, Asia projects
  - EGEE (EU)
  - LCG (CERN)
  - DataGrid (EU)
  - EU national projects
  - DataTAG (EU)
  - CrossGrid (EU)
  - GridLab (EU)
  - Japanese, Korean projects
- Not exclusively HEP (but many driven/led by HEP; the slide marks a subset as "HEP led projects")
- Many 10s x $M brought into the field
- Large impact on other sciences, education

Slide 5: U.S. "Trillium" Grid Consortium
- Trillium = PPDG + GriPhyN + iVDGL
  - Particle Physics Data Grid: $12M (DOE), 1999-2004+
  - GriPhyN: $12M (NSF), 2000-2005
  - iVDGL: $14M (NSF), 2001-2006
- Basic composition (~150 people)
  - PPDG: 4 universities, 6 labs
  - GriPhyN: 12 universities, SDSC, 3 labs
  - iVDGL: 18 universities, SDSC, 4 labs, foreign partners
  - Experiments: BaBar, D0, STAR, JLab, CMS, ATLAS, LIGO, SDSS/NVO
- Complementarity of projects
  - GriPhyN: CS research, Virtual Data Toolkit (VDT) development
  - PPDG: "end to end" Grid services, monitoring, analysis
  - iVDGL: Grid laboratory deployment using VDT
  - Experiments provide frontier challenges
- Unified entity when collaborating internationally

Slide 6: Goal: Peta-scale Data Grids for Global Science
[Architecture diagram: single researchers, workgroups and production teams use interactive user tools, virtual data tools, request planning & scheduling tools, and request execution & management tools; these sit on resource management services, security and policy services, and other Grid services, over distributed resources (code, storage, CPUs, networks) and raw data sources, with transforms connecting the layers. Target scale: PetaOps, Petabytes, performance.]

Slide 7: Trillium Science Drivers
- Experiments at the Large Hadron Collider: 100s of Petabytes, 2007-?
- High energy & nuclear physics experiments: ~1 Petabyte (1000 TB), 1997-present
- LIGO (gravity wave search): 100s of Terabytes, 2002-present
- Sloan Digital Sky Survey: 10s of Terabytes, 2001-present
- Future Grid resources
  - Massive CPU (PetaOps)
  - Large distributed datasets (>100 PB)
  - Global communities (1000s)
[Chart: data growth and community growth over 2001-2009]

Slide 8: Sloan Digital Sky Survey (SDSS): Using Virtual Data in GriPhyN
[Figure: galaxy cluster size distribution derived from Sloan data]

Slide 9: The LIGO Scientific Collaboration (LSC) and the LIGO Grid
- LIGO Grid: 6 US sites plus 3 EU sites (Cardiff/UK, AEI/Germany)
  - LHO, LLO: observatory sites
  - LSC: LIGO Scientific Collaboration, iVDGL supported
- iVDGL has enabled the LSC to establish a persistent production grid
[Map labels: Cardiff, Birmingham, AEI/Golm]

Slide 10: Large Hadron Collider (LHC) @ CERN
- Search for origin of mass & supersymmetry (2007-?)
- 27 km tunnel in Switzerland & France
- Experiments: ATLAS, CMS, ALICE, LHCb, TOTEM

Slide 11: LHC Data Rates: Detector to Storage
- Physics filtering starts at ~TBytes/sec off the detector (40 MHz collision rate)
- Level 1 Trigger (special hardware): 40 MHz reduced to 75 kHz, 75 GB/sec
- Level 2 Trigger (commodity CPUs): 75 kHz reduced to 5 kHz, 5 GB/sec
- Level 3 Trigger (commodity CPUs): 5 kHz reduced to 100 Hz; 0.1-1.5 GB/sec of raw data to storage (+ simulated data)
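Dividing the quoted bandwidths by the quoted event rates gives the implied average event size at each trigger stage; this is a back-of-envelope consistency check, not a figure from the slide:

```python
# Implied average event size at each trigger stage, derived only from
# the rates and bandwidths quoted on the slide (back-of-envelope).
stages = [
    ("Level 1 output",        75e3, 75e9),   # 75 kHz, 75 GB/s
    ("Level 2 output",        5e3,  5e9),    # 5 kHz, 5 GB/s
    ("Level 3 output (low)",  100,  0.1e9),  # 100 Hz, 0.1 GB/s
    ("Level 3 output (high)", 100,  1.5e9),  # 100 Hz, 1.5 GB/s
]

for name, rate_hz, bytes_per_s in stages:
    mb_per_event = bytes_per_s / rate_hz / 1e6
    print(f"{name}: ~{mb_per_event:.1f} MB/event")
```

The first two stages come out at roughly 1 MB/event, consistent with the raw event sizes quoted elsewhere for the LHC experiments.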

Slide 12: Complexity: Higgs Decay into 4 Muons
- 10^9 collisions/sec; selectivity: 1 in 10^13
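Taken at face value, 10^9 collisions/sec with a selectivity of 1 in 10^13 means roughly one selected event every few hours; a quick illustration of that arithmetic (mine, not the slide's):

```python
# Rough rate of selected events implied by the numbers on the slide.
collision_rate = 1e9   # collisions per second
selectivity = 1e-13    # 1 event kept in 10^13 collisions

selected_per_second = collision_rate * selectivity    # 1e-4 per second
seconds_per_event = 1.0 / selected_per_second         # 10,000 s
print(f"~1 selected event every {seconds_per_event / 3600:.1f} hours")
```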

Slide 13: LHC: Petascale Global Science
- Complexity: millions of individual detector channels
- Scale: PetaOps (CPU), 100s of Petabytes (data)
- Distribution: global distribution of people & resources
- BaBar/D0 example (2004): 700+ physicists, 100+ institutes, 35+ countries
- CMS example (2007): 5000+ physicists, 250+ institutes, 60+ countries

Slide 14: LHC Global Collaborations (ATLAS, CMS)
- 1000-4000 collaborators per experiment
- USA is 20-25% of total

Slide 15: CMS Experiment: LHC Global Data Grid (2007+)
- 5000 physicists, 60 countries
- 10s of Petabytes/yr by 2008 (a quick consistency check follows below); 1000 Petabytes in < 10 yrs?
[Diagram: tiered data grid - the CMS online system feeds the CERN Computer Center (Tier 0) at 0.1-1.5 GB/s; Tier 1 national centers (USA, Korea, Russia, UK); Tier 2 centers (Caltech, UCSD, U Florida, Iowa, Maryland); Tier 3 physics caches and Tier 4 PCs (e.g., FIU); link speeds range from 2.5-10 Gb/s through >10 Gb/s to 10-40 Gb/s]
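As a rough check of the "10s of Petabytes/yr" figure (my arithmetic, not the slide's): a stream at the quoted online rates, if sustained year-round, fills tens of petabytes per year.

```python
# Sustained throughput -> yearly volume, as a sanity check of the
# "10s of Petabytes/yr" figure. The online stream is not actually
# sustained year-round, so this is only an order-of-magnitude check.
SECONDS_PER_YEAR = 365 * 24 * 3600

for gb_per_s in (0.1, 1.0, 1.5):  # range quoted for the online stream
    pb_per_year = gb_per_s * SECONDS_PER_YEAR / 1e6   # 1 PB = 1e6 GB
    print(f"{gb_per_s} GB/s sustained ~= {pb_per_year:.0f} PB/year")
```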

Slide 16: CMS: Grid Enabled Analysis (GAE) Architecture
- Clients talk standard protocols (HTTP, SOAP, XML-RPC) to the "Grid Services Web Server" (a minimal client sketch follows this slide)
- Simple Web service API allows simple or complex analysis clients
- Typical clients: ROOT, Web browser, ...
- Clarens portal hides complexity (certificate-based access, ACL management, discovery)
- Key features: global Scheduler, Catalogs, Monitoring, Grid-wide Execution service
[Architecture diagram components: Analysis Client; Grid Services Web Server; fully-abstract, partially-abstract and fully-concrete planners; Scheduler; catalogs (metadata, virtual data, replica, applications); Monitoring; Data Management; Execution Priority Manager; Grid-Wide Execution Service. Named tools include Clarens, Chimera, Sphinx, MonALISA, ROOT (analysis tool), Python, Cojac (detector viz), IGUANA (CMS viz), MCRunjob, BOSS, RefDB, MOPDB, POOL, ORCA, FAMOS, VDT-Server]
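Since the slide says clients talk standard protocols (HTTP, SOAP, XML-RPC) to the Clarens "Grid Services Web Server", a minimal client might look like the sketch below. The server URL and the `catalog.find_dataset` method name are hypothetical placeholders for illustration, not the documented Clarens API.

```python
# Minimal sketch of an analysis client calling a Clarens-style service
# over XML-RPC. The endpoint URL and remote method name below are
# hypothetical placeholders, not the documented Clarens interface.
import xmlrpc.client

SERVER_URL = "https://clarens.example.edu:8443/clarens/"  # hypothetical endpoint

def find_datasets(pattern: str):
    """Ask the (hypothetical) catalog service for datasets matching a pattern."""
    proxy = xmlrpc.client.ServerProxy(SERVER_URL)
    return proxy.catalog.find_dataset(pattern)  # hypothetical remote method

if __name__ == "__main__":
    for dataset in find_datasets("higgs_4mu_*"):
        print(dataset)
```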

Slide 17: Collaborative Research by Globally Distributed Teams
- Non-hierarchical: chaotic analyses + productions
- Superimpose significant random data flows

Slide 18: Trillium Grid Tools: Virtual Data Toolkit
[Build-pipeline diagram: sources (CVS) and contributors (VDS, etc.) are patched into GPT source bundles, then built, tested and packaged by the NMI Build & Test facility on a Condor pool (37 computers); the VDT build is published as a Pacman cache, RPMs and binaries; test use of NMI processes comes later]

Slide 19: VDT Growth Over 3 Years
[Timeline chart: VDT 1.0 (Globus 2.0b, Condor 6.3.1); VDT 1.1.3, 1.1.4 & 1.1.5 pre-SC 2002; VDT 1.1.7 (switch to Globus 2.2); VDT 1.1.8 (first real use by LCG); VDT 1.1.11 (Grid3); VDT 1.1.14 (May 10)]

Slide 20: Packaging of Grid Software: Pacman
- Language: define software environments
- Interpreter: create, install, configure, update, verify environments
- Version 3.0.2 released Jan. 2005
- Combines and manages software from arbitrary sources: LCG/Scram, ATLAS/CMT, CMS DPE/tar/make, LIGO/tar/make, OpenSource/tar/make, Globus/GPT, NPACI/TeraGrid/tar/make, D0/UPS-UPD, Commercial/tar/make
- "1 button install" reduces the burden on administrators; remote experts define installation/config/updating for everyone at once (see the sketch after this slide):
  % pacman -get iVDGL:Grid3
[Diagram: a pacman client pulling from caches such as iVDGL, VDT, ATLAS, NPACI, D-Zero, UCHEP, CMS/DPE, LIGO]
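A "1 button install" with Pacman amounts to pointing the interpreter at a cache and package name, as in the slide's `% pacman -get iVDGL:Grid3`. The wrapper below is only a sketch of scripting that step; it assumes the Pacman interpreter is already on the PATH and that the iVDGL cache entry is reachable.

```python
# Sketch of scripting the "1 button install" shown on the slide.
# Assumes the Pacman interpreter is installed and on PATH and that the
# iVDGL cache is reachable; error handling is deliberately minimal.
import subprocess
import sys

def pacman_get(package: str = "iVDGL:Grid3") -> None:
    """Run 'pacman -get <cache>:<package>' and stop on failure."""
    result = subprocess.run(["pacman", "-get", package])
    if result.returncode != 0:
        sys.exit(f"pacman -get {package} failed (exit {result.returncode})")

if __name__ == "__main__":
    pacman_get()
```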

Slide 21: Collaborative Relationships: A CS Perspective
[Diagram linking computer science research and the Virtual Data Toolkit (via techniques & software, requirements, prototyping & experiments) to partner science, networking and outreach projects (Globus, Condor, NMI, iVDGL, PPDG, EU DataGrid, LHC experiments, QuarkNet, CHEPREO, Digital Divide) through tech transfer and production deployment, reaching the larger science community (U.S. Grids, international, outreach); other linkages: work force, CS researchers, industry]

Slide 22: Grid3: An Operational National Grid
- 35 sites, 3500 CPUs: universities + 4 national labs
- Part of the LHC Grid
- Running since October 2003
- Applications in HEP, LIGO, SDSS, genomics, CS
- http://www.ivdgl.org/grid3

Slide 23: Grid3 Applications
- High energy physics
  - US-ATLAS analysis (DIAL)
  - US-ATLAS GEANT3 simulation (GCE)
  - US-CMS GEANT4 simulation (MOP)
  - BTeV simulation
- Gravity waves
  - LIGO: blind search for continuous sources
- Digital astronomy
  - SDSS: cluster finding (maxBcg)
- Bioinformatics
  - Bio-molecular analysis (SnB)
  - Genome analysis (GADU/Gnare)
- CS demonstrators
  - Job Exerciser, GridFTP, NetLogger-grid2003

Slide 24: Grid3 Shared Use Over 6 Months
[Usage chart: CPUs in use over six months, with the CMS DC04 and ATLAS DC2 data challenges labeled; Sep 10 marked]

Slide 25: Grid3 → Open Science Grid
- Iteratively build & extend Grid3 into a national infrastructure
  - Shared resources, benefiting a broad set of disciplines
  - Realization of the critical need for operations
  - More formal organization needed because of scale
- Build OSG from laboratories, universities, campus grids, etc.
  - Argonne, Fermilab, SLAC, Brookhaven, Berkeley Lab, Jefferson Lab
  - UW Madison, U Florida, Purdue, Chicago, Caltech, Harvard, etc.
- Further develop OSG
  - Partnerships and contributions from other sciences, universities
  - Incorporation of advanced networking
  - Focus on general services, operations, end-to-end performance

Slide 26: http://www.opensciencegrid.org

Slide 27: Open Science Grid Basics
- OSG infrastructure
  - A large CPU & storage Grid infrastructure supporting science
  - Grid middleware based on the Virtual Data Toolkit (VDT)
  - Loosely coupled, consistent infrastructure: a "Grid of Grids"
  - Emphasis on "end to end" services for applications
- OSG collaboration builds on Grid3
  - Computer and application scientists
  - Facility, technology and resource providers
  - Grid3 → OSG-0 → OSG-1 → OSG-2 → ...
- Fundamental unit is the Virtual Organization (VO)
  - E.g., an experimental collaboration, a research group, a class
  - Simplifies organization and logistics
- Distributed ownership of resources
  - Local facility policies, priorities, and capabilities must be supported (a toy sketch of this idea follows below)
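One way to picture the VO as the "fundamental unit" while still honoring local site ownership is as a plain data structure that groups members and carries per-site policy. The field names below are purely illustrative, not an OSG schema:

```python
# Illustrative data structures only -- not an OSG schema. They show the
# idea that a VO groups members while each site keeps its own policy
# (priorities, caps) for that VO.
from dataclasses import dataclass, field

@dataclass
class SitePolicy:
    site: str
    priority: int   # site-local scheduling priority for this VO
    max_cpus: int   # site-local cap on concurrent CPUs for this VO

@dataclass
class VirtualOrganization:
    name: str                                   # e.g. an experiment, research group, or class
    members: list[str] = field(default_factory=list)
    site_policies: dict[str, SitePolicy] = field(default_factory=dict)

vo = VirtualOrganization(name="USCMS", members=["alice", "bob"])
vo.site_policies["UFlorida"] = SitePolicy(site="UFlorida", priority=10, max_cpus=200)
print(vo)
```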

Slide 28: OSG Integration: Applications, Infrastructure, Facilities

Slide 29: OSG Organization
[Organization chart: OSG Council (all members above a certain threshold; chair, officers); Executive Board (8-15 representatives; chair, officers); Advisory Committee; core OSG staff (a few FTEs, manager); Technical Groups and Activities (activity 1, ...); constituents include research grid projects, VOs, researchers, sites, service providers, universities, labs, and enterprise]

Slide 30: OSG Technical Groups and Activities
- Technical Groups address and coordinate a technical area
  - Propose and carry out activities related to their given areas
  - Liaise & collaborate with other peer projects (U.S. & international)
  - Participate in relevant standards organizations
  - Chairs participate in Blueprint, Grid Integration and Deployment activities
- Activities are well-defined, scoped sets of tasks contributing to the OSG
  - Each Activity has deliverables and a plan
  - ... is self-organized and operated
  - ... is overseen & sponsored by one or more Technical Groups

Slide 31: OSG Technical Groups (7 currently)
- Governance: charter, organization, by-laws, agreements, formal processes
- Policy: VO & site policy, authorization, priorities, privilege & access rights
- Security: common security principles, security infrastructure
- Monitoring and Information Services: resource monitoring, information services, auditing, troubleshooting
- Storage: storage services at remote sites, interfaces, interoperability
- Support Centers: infrastructure and services for user support, helpdesk, trouble tickets
- Education and Outreach
- Networking?

Slide 32: OSG Activities (5 currently)
- Blueprint: defining principles and best practices for OSG
- Deployment: deployment of resources & services
- Incident Response: plans and procedures for responding to security incidents
- Integration: testing, validating & integrating new services and technologies
- Data Resource Management (DRM): deployment of specific Storage Resource Management technology

Slide 33: OSG Short Term Plans
- Maintain Grid3 operations, in parallel with extending Grid3 to OSG
- OSG technology advances for Spring 2005 deployment
  - Add full Storage Elements
  - Extend Authorization services
  - Extend Data Management services
  - Interface to sub-Grids
  - Extend monitoring, testing, accounting
  - Add new VOs + OSG-wide VO services
  - Add Discovery Service
- Service challenges & collaboration with the LCG
- Make the switch to "Open Science Grid" in Spring 2005

Slide 34: Open Science Grid Meetings
- Sep. 17, 2003 @ NSF: strong interest of NSF education people
- Jan. 12, 2004 @ Fermilab: initial stakeholders meeting, first discussion of governance
- May 20-21, 2004 @ Univ. of Chicago: joint Trillium Steering meeting to define the OSG program
- July 2004 @ Wisconsin: first attempt to define the OSG Blueprint (document)
- Sep. 9-10, 2004 @ Harvard: major OSG workshop (technical, governance, sciences)
- Dec. 15-17, 2004 @ UCSD: major meeting for Technical Groups
- Feb. 15-17, 2005 @ U Chicago: integration meeting

Slide 35: Networks

Slide 36: Networks and Grids for Global Science
- Network backbones and major links are advancing rapidly
  - To the 10G range in < 3 years; faster than Moore's Law
  - New HENP and DOE roadmaps: a factor of ~1000 bandwidth growth per decade (a quick check of that pace follows below)
- We are learning to use long-distance 10 Gbps networks effectively
  - 2004 developments: 7-7.5 Gbps TCP flows over 16,000-25,000 km
- Transition to community-operated optical R&E networks
  - US, CA, NL, PL, CZ, SK, KR, JP, ...
  - Emergence of a new generation of "hybrid" optical networks
- We must work to close the digital divide
  - To allow scientists in all world regions to take part in discoveries
  - Regional, last-mile, and local bottlenecks and compromises in network quality are now on the critical path
- Important examples on the road to closing the digital divide
  - CLARA, CHEPREO, and the Brazil HEPGrid in Latin America
  - Optical networking in Central and Southeast Europe
  - APAN links in the Asia Pacific: GLORIAD and TEIN
  - Leadership and outreach: HEP groups in Europe, US, Japan & Korea
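A factor of ~1000 per decade works out to roughly a doubling every year, which is what makes it "faster than Moore's Law" (commonly quoted as doubling every 18-24 months). The arithmetic below is my own restatement of the slide's per-decade factor:

```python
# The roadmap's "factor ~1000 per decade" expressed as an annual growth
# factor and a doubling time (my arithmetic; the slide only states the
# per-decade factor).
import math

growth_per_decade = 1000.0
annual_factor = growth_per_decade ** (1 / 10)            # ~2.0x per year
doubling_years = math.log(2) / math.log(annual_factor)   # ~1.0 year

print(f"annual factor: {annual_factor:.2f}x, doubling every {doubling_years:.1f} years")
```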

Slide 37: HEP Bandwidth Roadmap (Gb/s)

Slide 38: Evolving Science Requirements for Networks (DOE High-Performance Network Workshop); end-to-end throughput today, in 5 years, and in 5-10 years
- High Energy Physics: today 0.5 Gb/s; 5 years 100 Gb/s; 5-10 years 1000 Gb/s (high bulk throughput)
- Climate (data & computation): today 0.5 Gb/s; 5 years 160-200 Gb/s; 5-10 years N x 1000 Gb/s (high bulk throughput)
- SNS NanoScience: not yet started; 5 years 1 Gb/s; 5-10 years 1000 Gb/s + QoS for control channel (remote control and time-critical throughput)
- Fusion Energy: today 0.066 Gb/s (500 MB/s burst); 5 years 0.2 Gb/s (500 MB / 20 sec burst); 5-10 years N x 1000 Gb/s (time-critical throughput)
- Astrophysics: today 0.013 Gb/s (1 TB/week); 5 years N*N multicast; 5-10 years 1000 Gb/s (computational steering and collaborations)
- Genomics (data & computation): today 0.091 Gb/s (1 TB/day); 5 years 100s of users; 5-10 years 1000 Gb/s + QoS for control channel (high throughput and steering)
See http://www.doecollaboratory.org/meetings/hpnpw/
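The "today" column mixes units (Gb/s, TB/week, TB/day, MB/s bursts); the conversions below reproduce the quoted Gb/s figures, assuming 1 TB = 10^12 bytes (small differences from the table are rounding):

```python
# Reproduce the table's "today" figures from the parenthetical units,
# assuming 1 TB = 1e12 bytes; differences vs. the table are rounding.
BITS_PER_TB = 1e12 * 8
DAY = 86400          # seconds
WEEK = 7 * DAY

print(f"1 TB/week = {BITS_PER_TB / WEEK / 1e9:.3f} Gb/s")   # ~0.013 (Astrophysics)
print(f"1 TB/day  = {BITS_PER_TB / DAY / 1e9:.3f} Gb/s")    # ~0.093 (Genomics, quoted as 0.091)
print(f"500 MB/s burst = {500e6 * 8 / 1e9:.1f} Gb/s while bursting (Fusion)")
```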

Slide 39: UltraLight: Advanced Networking in Applications
- 10 Gb/s+ network
- Caltech, UF, FIU, UM, MIT, SLAC, FNAL; international partners
- Level(3), Cisco, NLR
- Funded by ITR2004

Slide 40: Education and Outreach

Slide 41: Grids and the Digital Divide, Rio de Janeiro, Feb. 16-20, 2004
- Background: World Summit on Information Society; HEP Standing Committee on Inter-regional Connectivity (SCIC)
- Themes: global collaborations, Grids, and addressing the Digital Divide
- Next meeting: May 2005 (Korea)
- http://www.uerj.br/lishep2004
[Slide shows the workshop web page (welcome bulletin, registration, travel and program links)]

Slide 42: Second Digital Divide Grid Meeting
- International Workshop on HEP Networking, Grids and Digital Divide Issues for Global e-Science
- May 23-27, 2005, Daegu, Korea
- Prof. Dongchul Son, Center for High Energy Physics, Kyungpook National University

Slide 43: iVDGL, GriPhyN Education / Outreach Basics
- $200K/yr
- Led by UT Brownsville
- Workshops, portals
- Partnerships with CHEPREO, QuarkNet, ...

Slide 44: Grid Summer School, June 21-25
- First of its kind in the U.S. (South Padre Island, Texas)
  - 36 students, diverse origins and types (M, F, MSIs, etc.)
- Marks a new direction for U.S. Grid efforts
  - First attempt to systematically train people in Grid technologies
  - First attempt to gather relevant materials in one place
  - Today: students in CS and physics
  - Later: students, postdocs, junior & senior scientists
- Reaching a wider audience
  - Put lectures, exercises, and video on the web
  - More tutorials, perhaps 2+/year
  - Dedicated resources for remote tutorials
  - Create a "Grid book" (e.g. Georgia Tech)
- New funding opportunities
  - NSF: new training & education programs

Slide 45: CHEPREO: Center for High Energy Physics Research and Educational Outreach (Florida International University)
- Physics Learning Center
- CMS research
- iVDGL Grid activities
- AMPATH network (S. America)
- Funded September 2003: $4M initially (3 years), 4 NSF directorates!

Slide 46: QuarkNet/GriPhyN e-Lab Project

Slide 47: Chiron/QuarkNet Architecture

Slide 48: Muon Lifetime Analysis Workflow
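The slide gives only the workflow's title. For context, the end product of such a workflow is an exponential fit to muon decay-time measurements; the sketch below uses synthetic data and is not the QuarkNet/Chiron workflow code.

```python
# Minimal illustration of what a muon-lifetime analysis ends with:
# fitting an exponential decay to decay-time measurements. The data here
# are synthetic; this is not QuarkNet code or data.
import numpy as np
from scipy.optimize import curve_fit

TRUE_LIFETIME_US = 2.2                  # muon lifetime in microseconds
rng = np.random.default_rng(0)
decay_times = rng.exponential(TRUE_LIFETIME_US, size=5000)

# Histogram the decay times and fit N(t) = N0 * exp(-t / tau).
counts, edges = np.histogram(decay_times, bins=40, range=(0, 20))
centers = 0.5 * (edges[:-1] + edges[1:])

def model(t, n0, tau):
    return n0 * np.exp(-t / tau)

popt, pcov = curve_fit(model, centers, counts, p0=(counts[0], 1.0))
tau_err = np.sqrt(pcov[1][1])
print(f"fitted lifetime: {popt[1]:.2f} +/- {tau_err:.2f} microseconds")
```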

Slide 49: QuarkNet Portal Architecture
- Simpler interface for non-experts
- Builds on the Chiron portal

Slide 50: Summary
- Grids enable 21st-century collaborative science
  - Linking research communities and resources for scientific discovery
  - Needed by LHC global collaborations pursuing "petascale" science
- Grid3 was an important first step in developing US Grids
  - Value of planning, coordination, testbeds, rapid feedback
  - Value of building & sustaining community relationships
  - Value of learning how to operate a Grid as a facility
  - Value of delegation, services, documentation, packaging
- Grids drive the need for advanced optical networks
- Grids impact education and outreach
  - Providing technologies & resources for training, education, outreach
  - Addressing the Digital Divide
- OSG: a scalable computing infrastructure for science?
  - Strategies needed to cope with increasingly large scale

Slide 51: Grid Project References
- Grid3: www.ivdgl.org/grid3
- Open Science Grid: www.opensciencegrid.org
- GriPhyN: www.griphyn.org
- iVDGL: www.ivdgl.org
- PPDG: www.ppdg.net
- CHEPREO: www.chepreo.org
- UltraLight: ultralight.cacr.caltech.edu
- Globus: www.globus.org
- LCG: www.cern.ch/lcg
- EU DataGrid: www.eu-datagrid.org
- EGEE: www.eu-egee.org

