Using Grid Computing David Groep, NIKHEF 2002-07-15.


The Grid, But Why?
CERN LHC particle accelerator
  – operational in
  – Petabyte per year
  – 150 countries
  – > Users
  – lifetime ~ 20 years
Trigger and data flow:
  – 40 MHz (40 TB/sec) into level 1 (special hardware)
  – 75 kHz (75 GB/sec) into level 2 (embedded)
  – 5 kHz (5 GB/sec) into level 3 (PCs)
  – 100 Hz (100 MB/sec) to data recording & offline analysis
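
The figures on this slide can be cross-checked with a little arithmetic. The Python sketch below divides each quoted bandwidth by its event rate to get the implied event size, and multiplies the recording bandwidth by an assumed 10^7 seconds of data taking per year. That yearly figure is an assumption and does not come from the slides. The stage labels follow the order in which the rates appear and are descriptive only.

```python
# Event rate (Hz) and bandwidth (bytes/s) at each point of the trigger chain,
# as quoted on the slide. The stage labels are descriptive only.
TRIGGER_CHAIN = [
    ("detector output", 40_000_000, 40 * 10**12),
    ("after level 1", 75_000, 75 * 10**9),
    ("after level 2", 5_000, 5 * 10**9),
    ("after level 3 (recorded)", 100, 100 * 10**6),
]

# ASSUMPTION (not from the slides): about 1e7 seconds of data taking per year.
LIVE_SECONDS_PER_YEAR = 10**7


def event_size_bytes(rate_hz, bandwidth_bytes_per_s):
    """Average event size implied by an event rate and a bandwidth."""
    return bandwidth_bytes_per_s / rate_hz


def rejection_factor(chain):
    """Ratio between the first and the last event rate in the chain."""
    return chain[0][1] / chain[-1][1]


def recorded_bytes_per_year(chain, live_seconds=LIVE_SECONDS_PER_YEAR):
    """Data volume written per year at the final (recording) bandwidth."""
    return chain[-1][2] * live_seconds


if __name__ == "__main__":
    for label, rate, bandwidth in TRIGGER_CHAIN:
        size_mb = event_size_bytes(rate, bandwidth) / 1e6
        print(f"{label:26s} {rate:>11,d} Hz  ~{size_mb:.1f} MB/event")
    print(f"rejection factor: {rejection_factor(TRIGGER_CHAIN):,.0f}")
    petabytes = recorded_bytes_per_year(TRIGGER_CHAIN) / 1e15
    print(f"recorded per year: ~{petabytes:.1f} PB")
```

With these inputs every stage implies roughly 1 MB per event, and the recorded volume comes out at about one petabyte per year under the stated assumption.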

CPU & Data Requirements
[Figure: Estimated CPU capacity required at CERN, in K SI95 by year, for LHC experiments and other experiments]
Jan 2000: 3.5K SI95
Moore’s law: some measure of the capacity technology advances provide for a constant number of processors or investment
< 50% of the main analysis capacity will be at CERN

More Reasons Why
ENVISAT
  – 3500 MEuro programme cost
  – 10 instruments on board
  – 200 Mbps data rate to ground
  – 400 Tbytes data archived/year
  – ~100 `standard’ products
  – 10+ dedicated facilities in Europe
  – ~700 approved science user projects

And More …
Bio-informatics
For access to data
  – Large network bandwidth to access computing centers
  – Support for replicas of data banks (easier and faster mirroring)
  – Distributed data banks
For interpretation of data
  – GRID enabled algorithms: BLAST on distributed data banks, distributed data mining

Common Ground
Large amounts of data
Distributed, ad-hoc user community
Problems are distributable
Need for resources grows faster than market
Network grows faster than the application needs
Willingness to share resources …
… if security and integrity are guaranteed

The One-Liner
Resource sharing and coordinated problem solving in dynamic multi-institutional virtual organisations

What is Grid computing?
Dependable, consistent and pervasive access
Combining resources from various organizations
`Virtual Organizations’: user-based view on Grid
Technical challenges:
  – transparent decisions for the user
  – uniformity in access methods
  – secure & crack resistant
  – authentication, authorization, accounting (AAA) & quota

Grid Middleware
Globus Project started 1997
De facto standard
Reference implementation of Gridforum standards
Large community effort
Basis of several projects, including EU-DataGrid
Toolkit `bag-of-services' approach
Successful test beds, with single sign-on, etc…

In The Beginning
Distributed Computing
  – synchronous processing
High-Throughput Computing
  – asynchronous processing
On-Demand Computing
  – dynamic resources
Data-Intensive Computing
  – databases
Collaborative Computing
  – science
Ian Foster and Carl Kesselman, editors, “The Grid: Blueprint for a New Computing Infrastructure,” Morgan Kaufmann, 1999

Grid Architecture
[Figure: layered architecture]
Applications
Application Toolkits: DUROC, MPICH-G2, Condor-G, VLAM-G
Grid Services: GRAM, GridFTP, MDS, ReplicaSrv, Grid Security Infrastructure (GSI)
Grid Fabric: Condor, MPI, PBS, Internet, Linux, SUN
Make all resources talk standard protocols
Promote interoperability of application toolkits, similar to interoperability of networks by Internet standards

OGSA: new directions
Looks superficially like `web services’
Based on common standards:
  – WSDL
  – SOAP
  – UDDI
Adds:
  – Transient services
  – State of distributed activities
  – Workflow, videoconf, distributed data analysis
Management of service instances
Grid Security Infrastructure

EU DataGrid
[Figure: layered architecture]
Applications: HEP, EO, Bio
Application Toolkits: MPICH-G2, Condor-G
Grid Services: GRAM, GridFTP, MDS
Grid Fabric: Condor, PBS, Internet, Linux, SUN
Also shown: ResourceBroker, Data Replicas, Databases, Mass storage, Fabric & Network

Looking for Resources
Resource Brokerage based on matchmaking (Condor)
Information Services Mesh
  – Meta-computing directory
  – Replica Catalogues
DataGrid
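
Matchmaking can be pictured as a two-sided check: a job and a resource each advertise attributes and state requirements on the other party, and the broker picks the acceptable resource that the job ranks highest. The Python sketch below illustrates that idea only. It is not Condor's ClassAd language or the DataGrid broker, and every name and attribute in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Attrs = Dict[str, object]


def _accept_all(_other: Attrs) -> bool:
    return True


def _no_preference(_other: Attrs) -> float:
    return 0.0


@dataclass
class Ad:
    """A simplified advertisement: attributes plus constraints on the other party."""
    name: str
    attrs: Attrs
    requirements: Callable[[Attrs], bool] = _accept_all
    rank: Callable[[Attrs], float] = _no_preference


def matchmake(job: Ad, resources: List[Ad]) -> Optional[Ad]:
    """Return the mutually acceptable resource that the job ranks highest."""
    acceptable = [
        r for r in resources
        if job.requirements(r.attrs) and r.requirements(job.attrs)
    ]
    if not acceptable:
        return None
    return max(acceptable, key=lambda r: job.rank(r.attrs))


# Hypothetical advertisements, made up for illustration.
RESOURCES = [
    Ad("farm-a", {"os": "Linux", "free_cpus": 8},
       requirements=lambda job: job["vo"] in {"d0", "lhcb"}),
    Ad("farm-b", {"os": "Linux", "free_cpus": 32},
       requirements=lambda job: job["vo"] == "alice"),
    Ad("farm-c", {"os": "Solaris", "free_cpus": 64}),
]

JOB = Ad("mc-production", {"vo": "d0", "cpus": 4},
         requirements=lambda res: res["os"] == "Linux" and res["free_cpus"] >= 4,
         rank=lambda res: res["free_cpus"])

if __name__ == "__main__":
    chosen = matchmake(JOB, RESOURCES)
    print(chosen.name if chosen else "no match")
```

Here farm-b has more free CPUs but only accepts a different virtual organisation, and farm-c runs the wrong operating system, so the job lands on farm-a.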

Submitting a Job

Locating a Replica
Grid Data Mirror Package
Moves data across sites
Replicates both files and individual objects
Catalogue used by Broker
Replica Location Service (giggle)
Read-only copies “owned” by the Replica Manager
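
A replica catalogue is essentially a mapping from one logical file name to the physical copies held at different sites. The Python sketch below shows that lookup and a simple "prefer a nearby site" choice. The catalogue layout, the `lfn:` prefix, the host names and the function names are all assumptions for illustration, not the interface of the DataGrid replica services.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse

# Hypothetical catalogue: logical file name -> physical replica URLs.
CATALOGUE: Dict[str, List[str]] = {
    "lfn:run1234/events.dat": [
        "gsiftp://se.site-a.example/data/run1234/events.dat",
        "gsiftp://se.site-b.example/store/run1234/events.dat",
    ],
}


def locate(lfn: str, catalogue: Dict[str, List[str]] = CATALOGUE) -> List[str]:
    """Return all known physical replicas of a logical file name."""
    return list(catalogue.get(lfn, []))


def pick_replica(lfn: str, preferred_hosts: List[str],
                 catalogue: Dict[str, List[str]] = CATALOGUE) -> Optional[str]:
    """Pick a replica on the first preferred host that holds one.

    Falls back to the first catalogued replica, or None if the file is unknown.
    """
    replicas = locate(lfn, catalogue)
    for host in preferred_hosts:
        for url in replicas:
            if urlparse(url).hostname == host:
                return url
    return replicas[0] if replicas else None


if __name__ == "__main__":
    print(pick_replica("lfn:run1234/events.dat", ["se.site-b.example"]))
```

A broker that knows where the replicas are can use the same lookup in the other direction and send the job to a site that already holds the data.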

Sending Your Data
Tape robots, disks, etc. share GridFTP interface
Supports single-sign-on and confidentiality
Optimized for high-speed >1Gbit/s networks
In the future: automatic optimizations, bandwidth reservations, directory-enabled networking, …

Grid-enabled Databases?
SpitFire: uniform access to persistent storage on the Grid
Support for multiple roles
Compatible with GSI (single sign-on) through CoG
Uses standard stuff: JDBC, SOAP, XML
Supports various back-end databases

DataGrid Test Bed 1
DataGrid TB1:
  – 14 countries
  – 21 major sites
  – Growing rapidly
Submitting Jobs:
  – Login only once, run everywhere
  – Cross administrative boundaries in a secure and trusted way
  – Mutual authorization

DutchGrid Platform
[Map: Amsterdam, Utrecht, KNMI, Delft, Leiden, Nijmegen, Enschede]
DutchGrid:
  – Test bed coordination
  – PKI security
Participation by
  – NIKHEF: FOM, VU, UvA, Utrecht, Nijmegen
  – KNMI, SARA
  – AMOLF
  – DAS-2 (ASCI): TUDelft, Leiden, VU, UvA, Utrecht
  – Telematics Institute

And now for some Technical Details
For Users

Resources
Current startup-resources to be (ab)used:
  – NIKHEF:
      Several Globus test machines (try them now from your desk!)
      50x2 CPUs D0 cluster
      2x10x2 (=40) CPUs LHCb at NIKHEF (WCW) & VU
      10x2 CPUs Alice NIKHEF (WCW)
      ca. 4x2 CPUs Alice Utrecht
      ca. 10x2 CPUs D0 Nijmegen
      Lots of disk & dedicated 1.3 TByte cache server
  – DAS-II:
      200 dual-PIII systems & some disk (~2 TByte)
      Spread over 5 locations (NIKHEF is one!)
  – SARA: tape robot (>200 TByte), some clusters
  – More systems (NCF) to come this year …

Start using the grid
All the necessary “client tools” are on all Linux and Solaris systems
You just need:
  – Credentials/tokens for the Grid (see next slides)
  – Authorization to use resources (you get all NIKHEF resources by default)
  – Information on which resources to use effectively

Your Grid Credentials
You will use resources across several domains
  – You may not care about security and authorization
  – But the remote site admin will!
All communications are authenticated using X.509 “Public Key” Certificates
The technology used to secure credit card transactions on the web ( )
Uniquely binds name/affiliation to a digital token

Certification Authorities
CAs act as trusted third parties
Remote sites trust the CA for a proper binding
They will not do authentication again, so only authorization is left
CAs are highly valuable: crack one to impersonate others on the Grid (and abuse resources)
Registration Authorities do in-person ID checks
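
Once the CA has vouched for who you are, the remote site only has to decide what that identity may do. The slides do not describe the mechanism, so the Python sketch below is an assumption: it uses a map file that pairs a quoted certificate subject with a local account, in the style of a Globus grid-mapfile. The subject is the one shown in the grid-proxy-init example; the account names and the second entry are hypothetical.

```python
import shlex
from typing import Dict, Optional

# Illustrative entries: a quoted certificate subject followed by a local account.
MAPFILE_TEXT = '''
# subject name                                    local account
"/O=dutchgrid/O=users/O=nikhef/CN=David Groep"   davidg
"/O=dutchgrid/O=users/O=nikhef/CN=Some User"     griduser
'''


def parse_mapfile(text: str) -> Dict[str, str]:
    """Parse map-file text into {subject: local_account}."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        fields = shlex.split(line, comments=True)  # handles quotes and '#'
        if not fields:
            continue  # blank line or comment
        if len(fields) != 2:
            raise ValueError(f"malformed map entry: {line!r}")
        subject, account = fields
        mapping[subject] = account
    return mapping


def authorize(subject: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the local account for an authenticated subject, or None if not authorized."""
    return mapping.get(subject)


if __name__ == "__main__":
    table = parse_mapfile(MAPFILE_TEXT)
    print(authorize("/O=dutchgrid/O=users/O=nikhef/CN=David Groep", table))
```

The lookup never checks a password: it relies entirely on the subject name having been authenticated through the certificate first.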

CAs in DataGrid
10 National CAs (one per EU country)
Each one has a detailed policy and practice statement
NIKHEF operates the CA for DutchGrid
See
Get a “certificate” from the DutchGrid CA before you can start using the Grid
It’s valuable, protect it with a pass phrase
One cert valid for all DataGrid sites

The Proxy
A `proxy certificate’ is a limited-lifetime delegation without a pass phrase to protect it
Implements the single sign-on for Grid
Valid for 12 hours (by default)
Use it to:
  – Run your jobs
  – Get access to your data
Get it by running grid-proxy-init
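
Because a proxy expires, it is worth checking that it will outlive the job you are about to submit. The Python sketch below does that arithmetic with the 12-hour default from the slide. The function names are hypothetical, and a real client would read the remaining lifetime from the proxy file rather than compute it like this.

```python
from datetime import datetime, timedelta

# Default proxy lifetime quoted on the slide.
DEFAULT_PROXY_LIFETIME = timedelta(hours=12)


def proxy_expiry(created: datetime,
                 lifetime: timedelta = DEFAULT_PROXY_LIFETIME) -> datetime:
    """Moment at which a proxy created at `created` stops being valid."""
    return created + lifetime


def time_left(created: datetime, now: datetime,
              lifetime: timedelta = DEFAULT_PROXY_LIFETIME) -> timedelta:
    """Remaining validity at `now`, never negative."""
    return max(proxy_expiry(created, lifetime) - now, timedelta(0))


def covers_job(created: datetime, now: datetime, expected_runtime: timedelta,
               lifetime: timedelta = DEFAULT_PROXY_LIFETIME) -> bool:
    """True if the proxy stays valid for the whole expected runtime."""
    return time_left(created, now, lifetime) >= expected_runtime


if __name__ == "__main__":
    created = datetime(2002, 7, 15, 9, 0)   # hypothetical creation time
    now = datetime(2002, 7, 15, 20, 0)
    print("expires:", proxy_expiry(created))
    print("left:", time_left(created, now))
    print("2h job fits:", covers_job(created, now, timedelta(hours=2)))
```

If the job does not fit, the usual remedy is simply to run grid-proxy-init again before submitting.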

Now see for yourself

Getting a Certificate
Initialize your environment for the Grid
Use the Globus local guide from
Send the result to
You will be contacted by phone
Put the certificate (sent by mail) in your $HOME/.globus/usercert.pem
Or use the Web at

Using the Grid
Request authorization:
Look what is out there using grid-info-search or
Try some local hosts:
  – bilbo, kilogram, triangel
    kilogram:davidg:1009$ globus-job-run dommel.wins.uva.nl /usr/ucb/quota -v
    Disk quotas for random (uid 12xxx):
    Filesystem usage quota limit timeleft files quota limit timeleft
    /home/random
    kilogram:davidg:1010$
Start running your analysis/MC/other jobs

grid-proxy-init
    kilogram:davidg:1003$ grid-proxy-init
    Your identity: /O=dutchgrid/O=users/O=nikhef/CN=David Groep
    Enter GRID pass phrase for this identity: PassPhrase
    Creating proxy Done
    Your proxy is valid until Wed Sep 26 05:50:

GridFTP
Universal high-performance file transfer
Extends the FTP protocol with:
  – Single sign-on (GSI, GSSAPI, RFC2228)
  – Parallel streams for speed-up
  – Striped access (ftp from multiple sites to be faster)
Clients: gsincftp, globus-url-copy
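
The idea behind parallel streams is that one file is cut into byte ranges that travel over several connections at once and are put back together by offset. The Python sketch below shows only that partitioning step. It is a simplified illustration with a hypothetical function name, not the GridFTP wire protocol or the behaviour of any particular client.

```python
from typing import List, Tuple


def split_ranges(size: int, streams: int) -> List[Tuple[int, int]]:
    """Split `size` bytes into at most `streams` contiguous (offset, length) ranges.

    Ranges differ in length by at most one byte; empty ranges are dropped,
    so a tiny file may use fewer streams than requested.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if streams < 1:
        raise ValueError("need at least one stream")
    base, extra = divmod(size, streams)
    ranges: List[Tuple[int, int]] = []
    offset = 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    # A 1 GB file over 4 parallel streams.
    for offset, length in split_ranges(10**9, 4):
        print(f"stream reads bytes {offset}..{offset + length - 1}")
```

Striped access follows the same pattern, except that the ranges are served by different hosts rather than by different connections to one host.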

What’s Next?
Some of the nice user-features to come:
  – Finding data files by characteristics (give me all golden decays)
  – Moving your job to where the data is
  – Automatic partitioning of jobs
  – Support true-interactive work
  – Better network utilisation (faster access to data)
  – ………
If you are in the DataGrid project, ask your WP leader for authorization in TB1