Ted Hesselroth Nordugrid 2007 September 24-28, 2007 Abhishek Singh Rana and Frank Wuerthwein UC San Diego
The Open Science Grid
Ted Hesselroth, Fermilab
Slide attribution: Ruth Pordes, Miron Livny, Frank Wuerthwein, Paul Avery, Kent Blackburn, CNGrid
Map of OSG Sites
OSG is a grid organization funded by a SciDAC-2/NSF grant: 30 million dollars over five years, 33 FTE.
77 compute and 15 storage elements.
OSG Mission Statement
Practical support for end-to-end community systems in a heterogeneous global environment, to transform compute- and data-intensive science through a national cyberinfrastructure that includes organizations from the smallest to the largest.
OSG Goals – Use of Existing Resources
● Enable scientists to use and share a greater percentage of available compute cycles.
● Help scientists to use distributed systems, storage, processors, and software with less effort.
● Enable more sharing and reuse of software, and reduce duplication of effort by contributing effort to integration and extensions.
● Establish an “open-source” community working together to communicate knowledge and experience and to reduce overheads for new participants.
OSG - forming communities
OSG enables community formation to solve compute and data intensive scientific problems.
Coordinating role:
● Software Developers (e.g. Condor, Globus, SRM, …)
● Sites (e.g. BNL, FNAL, LBNL, SLAC, LHC-T2s, DISUN, …)
● Experiments (VOs) (USCMS, USATLAS, CDF, D0, LIGO, BioTech, NanoTech, …)
Principal Science Drivers
● High energy and nuclear physics
– 100s of petabytes (LHC), 2008
– Several petabytes, 2005
● LIGO (gravity wave detector)
– Several petabytes, 2002
● Digital astronomy
– 10s of petabytes, 2009
– 10s of terabytes, 2001
● Other sciences coming forward
– Bioinformatics (10s of petabytes)
– Nanoscience
– Environmental Chemistry
– Applied mathematics
– Materials Science?
Data growth, community growth
The Evolution of the OSG
Figure: timeline of PPDG, GriPhyN, iVDGL, Trillium, Grid3 and OSG (funding labels: DOE, DOE+NSF, NSF), shown alongside campus and regional grids, LHC construction and preparation, LHC Ops, LIGO preparation, LIGO operation, the European Grid + Worldwide LHC Computing Grid, and the DOE Science Grid (DOE).
VOs in OSG
* = non-physics
Example Campus Grid: Grid Laboratory of Wisconsin (GLOW)
Institutions
Green = Contributing staff
* = non-physics
China National Grid (CNGrid)
● 17 TFlops
Use of Existing Resources
● Many resources are owned by or statically allocated to one user community.
● The institutions which own resources typically have ongoing relationships with (a few) particular user communities (VOs).
● The remainder of an organization’s available resources can be “used by everyone or anyone else”.
● Organizations can decide against supporting particular VOs.
● OSG staff are responsible for monitoring and, if needed, managing this usage.
● Our challenge is to maximize good (successful) output from the whole system.
Benefits to Sites
● Increased usage of CPUs and infrastructure alone (i.e. cost of processing cycles) is not the persuasive cost-benefit value.
● The benefits come from reducing risk in, and sharing support for, large, complex systems which must be run for many years with a short life-time workforce:
– Opportunity and flexibility to distribute load and address peak needs.
– Savings in effort for integration, system and software support.
– Maintenance of an experienced workforce in a common system.
– Lowering the cost of entry to new contributors.
– Enabling of new computational opportunities for communities that would not otherwise have access to such resources.
The “don’t”s and “do”s of OSG
● The OSG Facility does not
– “Own” any compute (processing, storage and communication) resources
– “Own” any middleware
– Fund any site or VO administration/operation personnel
● The OSG Facility does
– Help sites join the OSG facility and enable effective guaranteed and opportunistic usage of their resources (including data) by remote users
– Help VOs join the OSG facility and enable effective guaranteed and opportunistic harnessing of remote resources (including data)
– Define interfaces which people can use
– Maintain and support an integrated software stack that meets the needs of the stakeholders of the OSG consortium
– Reach out to non-HEP communities to help them use the OSG
– Train new users, administrators, and software developers
What Can the OSG Offer?
● Middleware
– Packaging
– Testing
– Support
– Security operations
● Organizational support
– The OSG Consortium (brings together the stakeholders)
– The OSG Facility (brings together resources and users)
● Technical Support
– Troubleshooting distributed computing technologies
● Extensions
– Software capabilities needed by OSG
● Engagement
– Consultation on OSG participation
● Instruction
– Workshops
– Documentation
OSG Project Effort
Roughly 2/3 of leadership positions filled from outside HEP!
Benefits to HEP thus far
● LHC
– Middleware stack for the LHC distributed computing systems of USATLAS and USCMS
– Strong partner to negotiate technical and operational problems with EGEE and Nordugrid.
– Framework for integrating “Tier-3” resources.
● Tevatron and other FNAL based HEP
– CDF: MC production on OSG
– D0: reprocessing on OSG
– Other HEP benefit via FNAL campus grid
● Other HEP starting to show interest as well.
D0 Reprocessing
● D0’s own resources are committed to the processing of newly acquired data and analysis of the processed datasets.
● In Nov ‘06 D0 asked to use CPUs for 2-4 months for re-processing of an existing dataset (~500 million events), to produce science results for the summer conferences in July ‘07.
● The Executive Board estimated there were currently sufficient opportunistically available resources on OSG to meet the request; we also looked into the local storage and I/O needs.
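As a rough sizing of the D0 request: spreading ~500 million events over 2-4 months implies a sustained rate of several million events per day. The sketch below is only back-of-the-envelope arithmetic. It assumes 30-day months and perfectly even throughput, neither of which is stated in the slides, and the function and constant names are illustrative.

```python
# Rough sizing of the D0 reprocessing request.
# Assumptions (not from the slides): 30-day months, evenly spread throughput.

TOTAL_EVENTS = 500_000_000  # ~500 million events, from the slide
DAYS_PER_MONTH = 30         # assumed


def required_events_per_day(total_events: int, months: float) -> float:
    """Sustained daily rate needed to finish total_events in the given months."""
    return total_events / (months * DAYS_PER_MONTH)


if __name__ == "__main__":
    for months in (2, 4):
        rate = required_events_per_day(TOTAL_EVENTS, months)
        print(f"{months} months: about {rate / 1e6:.1f} million events/day")
```

Under these assumptions the request corresponds to roughly 4 to 8 million events per day, which is the kind of figure an estimate of opportunistically available resources would have to cover.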
D0 Reprocessing OSG Portion
LIGO: Search for Gravity Waves
● LIGO Grid
– 6 US sites
– 3 EU sites (UK & Germany): Cardiff, AEI/Golm, Birmingham
* LHO, LLO: LIGO observatory sites
* LSC: LIGO Scientific Collaboration
Sloan Digital Sky Survey: Mapping the Sky
Astronomy Experiences on the Grid
● Experience tells us that Grid is more suitable for CPU Intensive Jobs … achieve parallelism … more jobs … finish sooner
● Running locally would limit the number of jobs run simultaneously
● On OSG, can run several run-reruns, and camcols within a run-rerun, in parallel
● Current Workflow also will facilitate further analysis
                              NEO (Data & CPU Intensive)   Quasar Spectra (CPU Intensive)
Total No. of Jobs             180                          ~50000
Data Input/Job                9 Gigabytes                  1 Megabyte
Data Output/Job               12 Kilobytes                 2 Megabytes
Avg. Rate of Job Completion   per day                      ? per day
Grid Match                    Grid not very happy          Ideal for Grid
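To make the contrast between the two workloads concrete, the sketch below totals the data moved by each. It reads the slide's table as NEO = 180 jobs with 9 Gigabytes in and 12 Kilobytes out per job, and Quasar Spectra = ~50000 jobs with 1 Megabyte in and 2 Megabytes out per job. Decimal units (1 GB = 1000 MB) and all names in the code are assumptions made for illustration.

```python
# Aggregate data movement for the two astronomy workloads in the table.
# Assumption (not from the slides): decimal units, 1 GB = 1000 MB = 1_000_000 KB.

KB_PER_MB = 1000
KB_PER_GB = 1000 * 1000

workloads = {
    # name: (number of jobs, input per job in KB, output per job in KB)
    "NEO": (180, 9 * KB_PER_GB, 12),
    "Quasar Spectra": (50_000, 1 * KB_PER_MB, 2 * KB_PER_MB),
}


def total_gb(jobs: int, kb_per_job: int) -> float:
    """Total volume in GB for `jobs` jobs moving `kb_per_job` KB each."""
    return jobs * kb_per_job / KB_PER_GB


for name, (jobs, kb_in, kb_out) in workloads.items():
    gb_in = total_gb(jobs, kb_in)
    gb_out = total_gb(jobs, kb_out)
    print(f"{name}: {gb_in:.1f} GB in, {gb_out:.3f} GB out")
```

On this reading, each NEO job has to stage gigabytes of input before it can start, whereas a Quasar Spectra job needs only about a megabyte, which fits the slide's verdict that the latter is the better match for the Grid.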
Engagement
● Currently the main stakeholders are from physics: US LHC experiments, LIGO, the STAR experiment, the Tevatron Run II and Astrophysics experiments
● Active “engagement” effort to add new domains and resource providers to the OSG consortium:
– Rosetta at Kuhlman Laboratory
– Weather Research and Forecasting (WRF) model
– nanoHub applications: BioMOCA and nanoWire
– Chemistry at Harvard Macromolecular Mechanics (CHARMM)
Rosetta Protein Folding Application
● “What impressed me most was how quickly we were able to access the grid and start using it. We learned about it [at RENCI], and we were running jobs about two weeks later,” Brian Kuhlman, PI.
● 3,000 CPU hours per protein
● CASP similar protein: 3 hours on the 114 teraflops IBM Blue Gene Watson machine
Genome Analysis and Database Update system
● Runs across TeraGrid and OSG. Uses the Virtual Data System (VDS) workflow & provenance.
● 3.1 million protein sequences, 93,000 jobs.
● “During the last run in January (2006), GADU VO jobs had access to only about 8-10 OSG sites and were not authenticated by a large number of sites. With the help of the GOC, we are working on getting more sites to authenticate GADU jobs.” Dinanath Sulakhe, Argonne National Laboratory
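The figures quoted for GADU (3.1 million protein sequences, 93,000 jobs) work out to roughly 33 sequences per job. The slides do not say how sequences are assigned to jobs, so the sketch below simply assumes even, fixed-size batching. The helper names and the sample sequence IDs are hypothetical.

```python
# Sketch: splitting a large set of sequences into fixed-size batches, one per grid job.
# Assumption (not from the slides): sequences are spread evenly across jobs.
import math
from typing import Iterator, List, Sequence

TOTAL_SEQUENCES = 3_100_000  # from the slide
TOTAL_JOBS = 93_000          # from the slide


def batch_size_for(total_items: int, total_jobs: int) -> int:
    """Smallest batch size that fits total_items into at most total_jobs batches."""
    return math.ceil(total_items / total_jobs)


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


if __name__ == "__main__":
    size = batch_size_for(TOTAL_SEQUENCES, TOTAL_JOBS)
    ratio = TOTAL_SEQUENCES / TOTAL_JOBS
    print(f"about {ratio:.1f} sequences per job; batch size {size}")
    demo = [f"seq{i}" for i in range(10)]  # hypothetical sequence IDs
    print([len(b) for b in batches(demo, 4)])
```

Small batches of this kind keep individual jobs short, which suits opportunistic resources where a job may be preempted, at the cost of more scheduling overhead per sequence.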
Bioinformatics: GADU / GNARE
GNARE – Genome Analysis Research Environment
● Public Databases: genomic databases available on the web, e.g. NCBI, PIR, KEGG, EMP, InterPro, etc.
● GADU using Grid (TeraGrid, OSG, DOE SG): applications executed on Grid as workflows, and results are stored in the integrated Database.
● GADU performs:
– Acquisition: to acquire Genome Data from a variety of publicly available databases and store temporarily on the file system.
– Analysis: to run different publicly available tools and in-house tools on the Grid using Acquired data & data from Integrated database.
– Storage: store the parsed data acquired from public databases and parsed results of the tools and workflows used during analysis.
● Integrated Database includes:
– Parsed Sequence Data and Annotation Data from Public web sources.
– Results of different tools used for Analysis: Blast, Blocks, TMHMM, …
● Applications (Web Interfaces) based on the Integrated Database:
– PUMA2: Evolutionary Analysis of Metabolism
– Chisel: Protein Function Analysis Tool
– TARGET: Targets for Structural analysis of proteins
– PATHOS: Pathogenic DB for Bio-defense research
– Phyloblocks: Evolutionary analysis of protein families
● Services to Other Groups:
– SEED (Data Acquisition)
– Shewanella Consortium (Genome Analysis)
– Others..
● Bidirectional Data Flow
NanoHub
● BioMOCA (Biology Monte Carlo): transport Monte Carlo tool.
● Written at Network for Computational Nanotechnology
● PI: Umberto Ravaioli, UIUC
● Ion transfer in artificial membranes
● Job run is 8-40 days
Network Collaboration
Internet 2, National Lambda Rail, Ultralight
OSG Activities
● Facility: Software Operations Deployment Integration Troubleshooting Engagement
● Security
● Education
● Extensions: Middleware Improvement Workload Management Scalability Testing Tools and prototypes
● User Support
● Admin
The Software Stack
What is the VDT?
● A collection of software
– Grid software: Condor, Globus and lots more
– Virtual Data System: Origin of the name “VDT” (toolkit)
– Utilities: Monitoring, Authorization, Configuration
– Built for >10 flavors/versions of Linux
● Automated Build and Test: Integration and regression testing.
● An easy installation: Push a button, everything just works. Quick update processes.
● Responsive to user needs: process to add new components based on community needs.
● A support infrastructure: front line software support, triaging between users and software providers for deeper issues.
How we get to a Production Software Stack
● Input from stakeholders and OSG directors
● VDT Release
● OSG Integration Testbed Release
● OSG Production Release
● Test on OSG Validation Testbed
Troubleshooting
● GOC Tickets
– Assigns a responsible party
– Interoperability with EGEE
● Mailing Lists
● “Office Hours”
OSG Storage Activities
● Support for Storage Elements in OSG (4 FTE)
– dCache
– BeStMan
● Validation
– Tier2-level test stand
– With UCSD Tier2
● Packaging
– Installation scripts
– Through VDT
OSG Storage Activities
● Support
– Mailing list
● Tools
– For site administrators
– Will collect existing tools
● Extensions
– Space reservation
– File cleaner
– dCache logging