GridPP, The Grid & Industry
Who we are, what it is and what we can do.
Tony Doyle, Project Leader
Steve Lloyd, Collaboration Board Chairman
Robin Middleton, Middleware Coordinator
Neasan O’Neill, Events Officer
Who are GridPP?
19 UK Universities, CERN and CCLRC (RAL & Daresbury)
Funded by PPARC:
  GridPP1 2001-2004 (£17m) “From Web to Grid”
  GridPP2 2004-2007 (£16m) “From Prototype to Production”
  GridPP3 2007-2011 (proposed) “From Production to Exploitation”
Developed a working, highly functional Grid
Web: information sharing
Invented at CERN by Tim Berners-Lee
Quickly crossed over into public use
[Figure: number of Internet hosts (millions) by year]
Agreed protocols: HTTP, HTML, URLs
Anyone can access information and post their own
Why do particle physicists need the Grid? The CERN LHC The world’s most powerful particle accelerator 4 Large Experiments
Why do particle physicists need the Grid?
Example from LHC: starting from this event
We are looking for this “signature”
~100,000,000 electronic channels
800,000,000 proton-proton interactions per second
0.0002 Higgs per second
10 PBytes of data a year (10 Million GBytes = 14 Million CDs)
One year’s data from LHC would fill a stack of CDs 20 km high
[Figure: the CD stack compared with Concorde (15 km) and Mt. Blanc (4.8 km)]
Selectivity: 1 in 10^13
Like looking for 1 person in a thousand world populations
Or for a needle in 20 million haystacks!
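The figures on this slide can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: the CD capacity (700 MB) and disc thickness (1.2 mm) are assumptions that do not appear on the slide, and the function names are invented for this example. With these assumptions the results land in the same range as the quoted numbers (about 14 million CDs, a stack approaching 20 km, and a selectivity of order 1 in 10^13).

```python
# Back-of-the-envelope check of the LHC data figures.
# Assumptions (not from the slide): 700 MB per CD, 1.2 mm per disc.

PB = 10**15                 # bytes in a petabyte (decimal)
CD_BYTES = 700 * 10**6      # assumed CD capacity
CD_THICKNESS_M = 1.2e-3     # assumed disc thickness in metres


def cds_needed(data_bytes, cd_bytes=CD_BYTES):
    """Number of CDs needed to hold data_bytes."""
    return data_bytes / cd_bytes


def stack_height_km(n_cds, thickness_m=CD_THICKNESS_M):
    """Height in km of a stack of n_cds bare discs."""
    return n_cds * thickness_m / 1000.0


def selectivity(signal_rate_hz, interaction_rate_hz):
    """Return N such that roughly one interaction in N is signal."""
    return interaction_rate_hz / signal_rate_hz


if __name__ == "__main__":
    n = cds_needed(10 * PB)
    print(f"CDs per year: {n / 1e6:.1f} million")
    print(f"Stack height: {stack_height_km(n):.1f} km")
    print(f"One Higgs per {selectivity(0.0002, 800_000_000):.1e} interactions")
```

The exact stack height depends on the assumed disc thickness, so the slide's 20 km should be read as a rounded figure.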
Solution: Build a Grid
Share more than information
Efficient use of resources at many institutes
Leverage over other sources of funding
Data, computing power, applications
Join local communities
Challenges:
  share data between thousands of scientists with multiple interests
  link major and minor computer centres
  ensure all data accessible anywhere, anytime
  grow rapidly, yet remain reliable for more than a decade
  cope with different management policies of different centres
  ensure data security
  be up and running routinely by 2007
Middleware is Everything
Middleware is the Operating System of a distributed computing system
[Figure: a single PC compared with the Grid]
Single PC: PROGRAMS (Your Program, Word/Excel, Games, Email/Web); OPERATING SYSTEM; Disks, CPU etc
Grid: Your Program; MIDDLEWARE (User Interface Machine, Resource Broker, Information Service, Replica Catalogue, Bookkeeping Service); Disk Server, CPU Cluster, CPU Cluster, CPU Cluster
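To make the role of the middleware components more concrete, here is a toy sketch of the kind of matchmaking a Resource Broker performs. It is not GridPP's or LCG's implementation: the site names, the dictionary layout and the "most free CPUs wins" rule are all assumptions chosen for illustration. The Information Service is modelled as a map from site to free CPUs, and the Replica Catalogue as a map from site to the files held there.

```python
# Toy matchmaking sketch; all site names and fields below are hypothetical.

def find_site(job, information_service, replica_catalogue):
    """Return the name of a site that can run the job, or None.

    A site qualifies if it has enough free CPUs and already holds
    every input file the job needs. Among qualifying sites, the one
    with the most free CPUs is chosen (an arbitrary illustrative rule).
    """
    candidates = []
    for site, free_cpus in information_service.items():
        has_cpus = free_cpus >= job["cpus"]
        has_data = job["input_files"] <= replica_catalogue.get(site, set())
        if has_cpus and has_data:
            candidates.append((free_cpus, site))
    if not candidates:
        return None
    return max(candidates)[1]


# Hypothetical state published by the information service: free CPUs per site.
information_service = {"site-a": 120, "site-b": 40, "site-c": 300}

# Hypothetical replica catalogue: which files are stored at which site.
replica_catalogue = {
    "site-a": {"run1.dat", "run2.dat"},
    "site-b": {"run1.dat"},
    "site-c": {"run3.dat"},
}

job = {"cpus": 10, "input_files": {"run1.dat"}}

if __name__ == "__main__":
    print(find_site(job, information_service, replica_catalogue))
```

The user only describes the job; the choice of where it runs is left to the middleware, much as an operating system decides which CPU core runs a program.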
GridPP Middleware Development Grid Data Management Network Monitoring Workload Management Information Services Security Storage Interfaces
What you need to use the Grid
1. Get a digital certificate (UK Certificate Authority). Authentication: who you are
2. Join a Virtual Organisation (VO). Authorisation: what you are allowed to do
3. Get access to a local User Interface Machine (UI) and copy your files and certificate there
4. Write some Job Description Language (JDL) and scripts to wrap your programs

############# HelloWorld.jdl #################
Executable = "/bin/echo";
Arguments = "Hello welcome to the Grid ";
StdOutput = "hello.out";
StdError = "hello.err";
OutputSandbox = {"hello.out","hello.err"};
#########################################
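When many similar jobs are needed, the JDL text can be generated by a wrapper script rather than written by hand. The following is a minimal sketch of that idea: `to_jdl_value` and `render_jdl` are hypothetical helper names, not part of any Grid toolkit, and the sketch only covers the string and list attributes used in HelloWorld.jdl (it does not escape quotes inside values).

```python
# Minimal sketch: render a dictionary of attributes as JDL text.
# The helper names are hypothetical; only strings and lists are handled.

def to_jdl_value(value):
    """Format a Python value as a JDL string or a JDL list of strings."""
    if isinstance(value, (list, tuple)):
        return "{" + ",".join(to_jdl_value(v) for v in value) + "}"
    return '"' + str(value) + '"'


def render_jdl(attributes):
    """Return one 'Name = value;' line per attribute, in insertion order."""
    return "\n".join(
        f"{name} = {to_jdl_value(value)};" for name, value in attributes.items()
    )


# The same job as HelloWorld.jdl on the slide.
hello = {
    "Executable": "/bin/echo",
    "Arguments": "Hello welcome to the Grid ",
    "StdOutput": "hello.out",
    "StdError": "hello.err",
    "OutputSandbox": ["hello.out", "hello.err"],
}

if __name__ == "__main__":
    print(render_jdl(hello))
```

Changing `Arguments` or the output file names in a loop would then produce one JDL file per job.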
International Context
GridPP is part of EGEE and LCG (currently the largest Grid in the world)
[Figure: GridPP, LCG, EGEE]
EU Enabling Grids for e-Science (EGEE) 2004-2008: Grid Deployment Project for all disciplines
LHC Computing Grid (LCG): Grid Deployment Project for LHC
UK National Grid Service: UK’s core production computational and data Grid
NorduGrid (Scandinavia): Grid Research and Development collaboration
Open Science Grid (USA): Science applications from HEP to biochemistry
The LCG Grid Status
            Sites   CPUs     Disk     CPU time
Worldwide   182     23,438   9.2 PB   2,200 years
UK          21      4,482    180 TB   593 years
What GridPP Has Done So Far
Analysed 300,000 possible drug components in the fight against the Avian Flu virus
Simulated 46 million molecules for medical research in 5 weeks, which would have taken over 80 years on a single PC
Reached transfer speeds of 1 Gigabyte per second in high speed networking tests from CERN (a DVD every 5 seconds)
Simulated 500 million particle physics collisions with the BaBar experiment
Transformed the way particle physics computing problems are approached
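Two of these figures can be sanity-checked with a short calculation. This is only an illustrative sketch: the DVD capacity of 4.7 GB (single layer) is an assumption not stated on the slide, and the function names are invented for this example. Under those assumptions, 80 single-PC years in 5 weeks corresponds to roughly 800 PCs working continuously, and 1 Gigabyte per second moves one DVD in just under 5 seconds.

```python
# Rough arithmetic behind two of the achievements listed on the slide.
# Assumption (not from the slide): a single-layer DVD holds 4.7 GB.

WEEKS_PER_YEAR = 365.25 / 7
DVD_GB = 4.7


def equivalent_pcs(single_pc_years, grid_weeks):
    """How many PCs running continuously would do the work in grid_weeks."""
    return single_pc_years * WEEKS_PER_YEAR / grid_weeks


def seconds_per_dvd(rate_gb_per_s, dvd_gb=DVD_GB):
    """Time in seconds to transfer one DVD's worth of data at the given rate."""
    return dvd_gb / rate_gb_per_s


if __name__ == "__main__":
    print(f"Equivalent PCs: {equivalent_pcs(80, 5):.0f}")
    print(f"Seconds per DVD at 1 GB/s: {seconds_per_dvd(1.0):.1f}")
```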
Who else can use a Grid? Astronomy Bioinformatics Engineering Healthcare Commerce Gaming
“UK contributes to EGEE's battle with malaria”
WISDOM (Wide In Silico Docking On Malaria): the first biomedical data challenge for drug discovery, which ran on the EGEE grid production service from 11 July 2005 until 19 August 2005.
GridPP resources in the UK contributed ~100,000 kSI2k-hours from 9 sites
BioMed: 1107 successes/day, success rate 77%
[Figure: number of biomedical jobs processed by country]
[Figure: normalised CPU hours contributed to the biomedical VO for UK sites, July-August 2005]
"GridPP has been developed to help answer questions about the conditions in the Universe just after the Big Bang," said Professor Keith Mason, head of the Particle Physics and Astronomy Research Council (PPARC). "But the same resources and techniques can be exploited by other sciences with a more direct benefit to society."
GridPP & Industry: What We Have To Offer
Our Grid
Security tools
GridSite
R-GMA
APEL accounting system
Our Grid The UK Grid (via one of the individual university sites) can be used to run applications for areas such as finance and image processing.
Security Tools & GridSite
Security tools: Digital Certificates, Certification Authority
GridSite: Grid Security for the Web, Web platforms for Grids
GridSite identifies users to websites with their digital certificates
GridSiteWiki is an extension to the tool
GridSite is open source (http://www.gridsite.org/)
GridSite users have access to the sites' web-based editing interface.
R-GMA & APEL accounting system
R-GMA (Relational Grid Monitoring Architecture): an information and monitoring system for static and dynamic information about grid resources, applications and networks
APEL (Accounting Processor for Event Logs): provides a summary of the resources consumed based on attributes such as CPU time, Wall Clock Time, Memory and grid user identity
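The kind of summary an accounting system produces can be illustrated with a few lines of code. This sketch is not APEL's implementation or record format: the field names (`user`, `cpu_s`, `wall_s`, `mem_mb`), the sample records and the choice to report peak memory are all assumptions made for illustration. It groups hypothetical job records by grid user identity and totals CPU time and wall clock time.

```python
# Illustrative per-user accounting summary; record fields are hypothetical.
from collections import defaultdict


def summarise_usage(records):
    """Aggregate job records by user.

    Returns a dict mapping each user to the number of jobs, total CPU
    seconds, total wall clock seconds and the peak memory seen (MB).
    """
    totals = defaultdict(
        lambda: {"jobs": 0, "cpu_s": 0, "wall_s": 0, "max_mem_mb": 0}
    )
    for record in records:
        entry = totals[record["user"]]
        entry["jobs"] += 1
        entry["cpu_s"] += record["cpu_s"]
        entry["wall_s"] += record["wall_s"]
        entry["max_mem_mb"] = max(entry["max_mem_mb"], record["mem_mb"])
    return dict(totals)


# Made-up job records standing in for parsed event logs.
records = [
    {"user": "alice", "cpu_s": 3600, "wall_s": 4000, "mem_mb": 512},
    {"user": "bob", "cpu_s": 1800, "wall_s": 2000, "mem_mb": 1024},
    {"user": "alice", "cpu_s": 7200, "wall_s": 7500, "mem_mb": 768},
]

if __name__ == "__main__":
    for user, usage in summarise_usage(records).items():
        print(user, usage)
```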
GridPP & Industry Current Involvement HP are sponsoring a joint project with GridPP at Bristol. GridPP has an association with IBM through collaboration on ScotGrid and R-GMA. Specific sites also have close relationships with various industrial suppliers.
GridPP & Industry Current Involvement
Presented posters at “Technology Opportunities from CERN: the impact of Big Physics on Industry”.
Attended KITE club meetings on:
  Healthcare, Medical image processing
  Film and computer games
Provided speakers at a forum on Network and Grid Security organised for the IT industry.
Future
GridPP plans to establish a small steering group to lead technology transfer activity. The group, working with various companies, would examine different methods of technology transfer and identify the GridPP activities that can be used in industry and business. Examples of methods of technology transfer include: CERN’s openlab programme; the EGEE Industry Forum; HP’s collaboration with Bristol; PPARC’s industry awards scheme; and the DTI Technology Programme.