
Slide 1: Open Science Grid: an introduction. Ruth Pordes, Fermilab.

Slide 2: OSG Provenance. [Timeline, 1999-2009: PPDG (DOE), GriPhyN and iVDGL (NSF), the Trillium collaboration, Grid3, and then OSG (DOE+NSF).]

Slide 3: Introducing myself. At Fermilab for 25 years (plus two years in the "pioneer" 1970s). Started on data acquisition for High Energy Physics experiments; a "builder" of the Sloan Digital Sky Survey; led development of a common data acquisition system for six experiments at Fermilab (DART); coordinator of the CDF/D0 Joint Run II offline projects (with Dane); coordinator of the Particle Physics Data Grid SciDAC I collaboratory; founder of the Trillium collaboration of iVDGL, GriPhyN and PPDG, and of GLUE interoperability between the US and EU. Now I am variously: Executive Director of the Open Science Grid, an Associate Head of the Computing Division at Fermilab, and US CMS Grid Services and Interfaces Coordinator.

Slide 4: A Common Grid Infrastructure.

Slide 5: Overlaid by community computational environments, from single researchers to large groups, located anywhere from a single campus to worldwide.

Slide 6: Grid of Grids - from Local to Global (Community, Campus, National).

Slide 7: Current OSG deployment.
- 96 resources across the production & integration infrastructures.
- 20 Virtual Organizations plus 6 operations VOs; includes 25% non-physics.
- ~20,000 CPUs (sites range from 30 to 4,000 CPUs, shared between OSG and local use).
- ~6 PB of tape, ~4 PB of shared disk.
- Sustained through OSG submissions: 3,000-4,000 simultaneous jobs, ~10K jobs/day, ~50K CPU-hours/day; peak of ~15K short validation jobs.
- Uses production & research networks.
[Plot: jobs running on OSG over 9 months.]

Slide 8: Examples of Sharing.

Last week of ATLAS (jobs by site):
  Site                    Max # Jobs
  ASGC_OSG                9
  BU_ATLAS_Tier2          154
  CIT_CMS_T2              99
  FIU-PG                  58
  FNAL_GPFARM             17
  OSG_LIGO_PSU            1
  OU_OCHEP_SWT2           82
  Purdue-ITaP             3
  UC_ATLAS_MWT2           88
  UFlorida-IHEPA          1
  UFlorida-PG (CMS)       1
  UMATLAS
  UWMadisonCMS            594
  UWMilwaukee             2
  osg-gw-2.t2.ucsd.edu    2
  CPU-hours: 55,000

Last week at UCSD (a CMS site), jobs by VO:
  VO                      Max # Jobs
  ATLAS                   2
  CDF                     279
  CMS                     559
  COMPBIOGRID             10
  GADU                    1
  LIGO                    75
  Average # of jobs (~300 batch slots): 253
  CPU-hours: 30,000
  Jobs completed: 50,000

Slide 9: The OSG Consortium: Contributors and the Project.

Slide 10: The OSG Project.

11 11 OSG & its goals Project receiving ~$6/M/Year for 5 years from DOE and NSF for effort to sustain and evolve the distributed facility, bring on board new communities and capabilities and EOT. Hardware resources contributed by OSG Consortium members. Goals: Support data storage, distribution & computation for High Energy, Nuclear & Astro Physics collaborations, in particular delivering to the needs of LHC and LIGO science. Engage and benefit other Research & Science of all scales through progressively supporting their applications. Educate & train students, administrators & educators. Provide a petascale Distributed Facility across the US with guaranteed & opportunistic access to shared compute & storage resources. Interface, Federate and Collaborate with Campus, Regional, other national & international Grids, in particular with EGEE & TeraGrid. Provide an Integrated, Robust Software Stack for Facility & Applications, tested on a well provisioned at-scale validation facility. Evolve the capabilities by deploying externally developed new technologies through joint projects with the development groups.

Slide 12: Middleware Stack and Deployment.
- OSG middleware is deployed on existing farms and storage systems.
- OSG middleware interfaces to the existing installations of OS, utilities and batch systems.
- VOs have VO-scoped environments in which they deploy applications (and other files), execute code and store data (see the sketch after this slide).
- VOs are responsible for, and have control over, their end-to-end distributed system using the OSG infrastructure.
[Diagrams: end-to-end software stack; deployment into production. The Integration Grid has ~15 sites.]
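To make the idea of a VO-scoped environment concrete, here is a minimal, illustrative Python sketch (editorial, not taken from the slides) of how a job running on an OSG worker node might locate its VO's application, data and scratch areas. It assumes the conventional OSG-advertised environment variables OSG_APP, OSG_DATA and OSG_WN_TMP; the VO name "cms" and the fallback paths are hypothetical.

# Illustrative sketch: how a VO job on an OSG worker node might locate its
# VO-scoped application and data areas. Assumes the site advertises the
# conventional OSG locations OSG_APP, OSG_DATA and OSG_WN_TMP; check the
# site's advertised environment before relying on these names.
import os

def vo_paths(vo_name):
    """Return the application, data and scratch directories for a VO."""
    app_root = os.environ.get("OSG_APP", "/opt/osg/app")      # VO-installed software
    data_root = os.environ.get("OSG_DATA", "/opt/osg/data")   # shared data area
    scratch = os.environ.get("OSG_WN_TMP", "/tmp")            # per-job scratch space
    return {
        "app": os.path.join(app_root, vo_name),
        "data": os.path.join(data_root, vo_name),
        "scratch": scratch,
    }

if __name__ == "__main__":
    paths = vo_paths("cms")   # hypothetical VO name
    for kind, path in paths.items():
        print(f"{kind}: {path}")

The point of the convention is that the site only advertises locations; what gets installed and run under the VO's area is decided by the VO, which is what gives VOs end-to-end control over their distributed systems.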

Slide 13: OSG will support global data transfer, storage & access at GBytes/sec, 365 days a year.
e.g. CMS: data to/from tape at the Tier-1 needs to triple in ~1 year; data to disk caches for data samples. [Figure: rates of 200 MB/sec and 600 MB/sec; data distributed among CERN, ~7 Tier-1s and the Tier-2 sites.]
OSG must enable data placement, disk usage and resource management policies for 10s of Gbit/sec data movement, 10s of petabyte tape stores, and local shared disk caches of 100s of TBs across 10s of sites for >10 VOs. (A back-of-the-envelope conversion of these rates follows this slide.)
Data distribution will depend on, and integrate with, advanced network infrastructures:
- Internet2 will provide "layer 2" connectivity between OSG university sites and peers in Europe.
- ESnet will provide "layer 2" connectivity between OSG DOE laboratory sites and the EU GEANT network.
- Both include use of the IRNC link (NSF) from the US to Amsterdam.
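As a rough sense of scale, the sustained rates quoted above can be converted into yearly data volumes. The following Python sketch is editorial, not from the slides; it simply multiplies a sustained MB/sec rate by the number of seconds in 365 days.

# Back-of-the-envelope check: how much data a sustained transfer rate
# accumulates over a full year of operation.
SECONDS_PER_YEAR = 365 * 24 * 3600

def petabytes_per_year(rate_mb_per_sec):
    """Convert a sustained rate in MB/sec to PB moved in 365 days."""
    return rate_mb_per_sec * SECONDS_PER_YEAR / 1e9  # 1 PB = 1e9 MB (decimal units)

for rate in (200, 600, 1000):  # MB/sec; 1000 MB/sec is roughly the GByte/sec target
    print(f"{rate:5d} MB/sec sustained ~= {petabytes_per_year(rate):.1f} PB/year")
# ~200 MB/sec -> ~6 PB/year; ~600 MB/sec -> ~19 PB/year; ~1 GB/sec -> ~32 PB/year,
# consistent with the 10s-of-petabyte tape stores mentioned on the slide.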

Slide 14: Security Infrastructure.
- Identity: X.509 certificates. Authentication and authorization using VOMS extended attribute certificates.
- Security process modelled on NIST procedural controls (management, operational, technical), starting from an inventory of the OSG assets.
- User and VO management:
  - The VO registers with the Operations Center.
  - Users register through VOMRS or a VO administrator.
  - The site registers with the Operations Center.
  - Each VO centrally defines and assigns roles.
  - Each site provides role-to-access mappings based on VO/VO group, and can reject individuals (see the sketch after this slide).
- Heterogeneous identity-management systems: OSG vs. TeraGrid/EGEE, grid vs. local, compute vs. storage, head-node vs. ..., old version vs. new version. Issues include:
  - cross-domain rights management,
  - rights/identity management of software modules and resources,
  - error/rejection propagation,
  - solutions/approaches that work end-to-end.
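To illustrate the role-to-access mapping that each site provides, here is a simplified, hypothetical Python sketch. It is not OSG's actual authorization service (such as GUMS or a grid-mapfile); it is a toy mapping from a VOMS-style (VO, group, role) triple and a certificate subject (DN) to a local account, including the ability to reject individuals. All DNs, VO names and accounts below are made up.

# Toy site-side authorization sketch: map a VOMS attribute (VO / group / role)
# to a local account, rejecting banned certificate subjects (DNs).
BANNED_DNS = {"/DC=org/DC=example/CN=Revoked User"}

# (vo, group, role) -> local Unix account; None matches any value.
ROLE_MAP = [
    (("cms", "/cms/production", "production"), "cmsprod"),
    (("cms", None, None), "cmsuser"),
    (("ligo", None, None), "ligo"),
]

def map_to_account(dn, vo, group=None, role=None):
    """Return a local account for the presented identity, or None to reject."""
    if dn in BANNED_DNS:          # sites can reject individuals
        return None
    for (m_vo, m_group, m_role), account in ROLE_MAP:
        if m_vo == vo and m_group in (None, group) and m_role in (None, role):
            return account
    return None                   # no mapping -> access denied

print(map_to_account("/DC=org/DC=example/CN=Alice", "cms",
                     "/cms/production", "production"))   # -> cmsprod
print(map_to_account("/DC=org/DC=example/CN=Bob", "atlas"))  # -> None (no mapping)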

Slide 15: Education, Outreach, Training.
- Training: workshops for administrators and application developers, e.g. the Grid Summer Workshop (in its 4th year).
- Outreach: e.g. Science Grid This Week, now International Science Grid This Week.
- Education: through e-Labs.

Slide 16: OSG Initial Timeline & Milestones - Summary.
[Timeline chart spanning 2006-2011 (project start, end of Phase I, end of Phase II); recoverable milestone tracks:]
- LHC: simulations; contribute to the Worldwide LHC Computing Grid; LHC event data distribution and analysis; support 1,000 users and a 20 PB data archive.
- LIGO: contribute to LIGO workflow and data analysis; LIGO data run SC5; LIGO Data Grid dependent on OSG; Advanced LIGO.
- STAR, CDF, D0, Astrophysics: CDF simulation, then CDF simulation and analysis; D0 simulations; D0 reprocessing; STAR data distribution and jobs; 10K jobs per day.
- Additional science communities: +1 community each year.
- Facility security: risk assessment, audits, incident response; management, operational and technical controls; Plan V1 and 1st audit, then a risk assessment and audit each year.
- Facility operations and metrics: increase robustness and scale; operational metrics defined and validated each year; interoperate and federate with campus and regional grids.
- VDT and OSG software releases: a major release every 6 months, minor updates as needed (VDT 1.4.0, 1.4.1, 1.4.2, ...; OSG 0.6.0, 0.8.0, 1.0, 2.0, 3.0, ...); VDT incremental updates; common software distribution with TeraGrid; EGEE using VDT 1.4.X.
- Extended capabilities & increased scalability and performance for jobs and data to meet stakeholder needs: dCache with role-based authorization; SRM/dCache extensions; VDS with SRM; accounting; auditing; VO Services infrastructure; "just in time" workload management; improved workflow and resource selection; workflow; data analysis (batch and interactive); transparent data and job movement with TeraGrid; transparent data management with EGEE; federated monitoring and information services; integrated network management; work with SciDAC-2 CEDS and security with Open Science.

