Slide 1: The Data Deluge and the Grid
Steve Lloyd, Professor of Experimental Particle Physics
Inaugural Lecture, 24 November 2004

Slide 2: Outline
- What is Data?
- Where it comes from: e-Science
- The CERN LHC and Experiments
- What is the Grid?
- GridPP
- Challenges ahead

Slide 3: What is Data?
Anything that can be expressed as numbers: Raw Information → Numbers → Binary Digits
- Pictures: store the amount of Red, Green and Blue
- Sound: store the loudness at each time
- Electrical signals: store the voltage or current
- Text: every character has a numerical code
- Lots of pictures + sound = a DVD video

Slide 4: Digital Data
Numbers are stored as binary digits:
- 1 Bit = 0 or 1: can store yes/no or on/off
- 1 Byte = 8 bits: can store numbers from 0 to 255, enough for a character (a-z, A-Z, digits, punctuation)
  e.g. 25 = 0×128 + 0×64 + 0×32 + 1×16 + 1×8 + 0×4 + 0×2 + 1×1 = 00011001 in binary
- 1 kiloByte = ~1,000 Bytes (a typical Word document is ~30 kB)
- 1 MegaByte = ~1,000,000 Bytes (a floppy disk is ~1.4 MB; a CD ~700 MB)
- 1 GigaByte = ~1,000,000,000 Bytes (a typical PC hard drive holds tens of GB)
- 1 TeraByte = ~1,000,000,000,000 Bytes
- 1 PetaByte = ~1,000,000,000,000,000 Bytes (~1.4 million CDs)
- 1 ExaByte = ~1,000,000,000,000,000,000 Bytes (the scale of the world's annual information production; annual book production is far smaller)
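
In Python, the slide's arithmetic looks like this (a throwaway sketch, not part of the lecture):

    # 25 written out in binary: each bit is a power of two.
    assert 25 == 0*128 + 0*64 + 0*32 + 1*16 + 1*8 + 0*4 + 0*2 + 1*1
    print(format(25, '08b'))        # '00011001'

    # The decimal byte units used on the slide.
    kB, MB, GB, TB, PB = 10**3, 10**6, 10**9, 10**12, 10**15
    cd = 700 * MB                   # one CD holds ~700 MB
    print(round(PB / cd / 1e6, 2))  # ~1.43 million CDs per PetaByte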

Slide 5: Data Analysis
What is done with data? Nothing; read it; listen to it; watch it; analyse it.
Analysis means running a computer program (a "job"), e.g.:
    Read A (= 2)
    Read B (= 3)
    C = A + B
    Print C (= 5)
Bigger jobs: calculate how proteins fold, or calculate what the weather is going to do.
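
A minimal runnable version of the toy job, sketched in Python (the values are the ones on the slide):

    # Toy "job": read two values, combine them, print the result.
    a = 2          # Read A
    b = 3          # Read B
    c = a + b      # C = A + B
    print(c)       # Print C -> 5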

Slide 6: e-Science
In the UK this sort of activity has become known as "e-Science":
- "e-Science will change the dynamic of the way Science is undertaken"
- "Science [is] increasingly done through distributed global collaborations enabled by the internet, using very large data collections, terascale computing resources and high performance visualisation"
- Dr John Taylor, Director General of the Research Councils: "e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it"

Slide 7: Astronomy
[Images: the Crab Nebula in optical, radio, infra-red and X-ray; the jet in M87 from HST (optical), Gemini (mid-IR), VLA (radio) and Chandra (X-ray)]
Virtual Observatories.

Slide 8: Earth Observation
~1 TB of data per day.
[Images: an ozone map; satellite views of Ottawa and Trafalgar Square]

Slide 9: Species 2000
Aim: to enumerate all ~1.7 million known species of plants, animals, fungi and microbes on Earth for studies of biodiversity. A federation of initially 18 taxonomic databases, eventually ~200 databases. From protozoa to platypus to primates.

Slide 10: Bioinformatics

Slide 11: Healthcare
Dynamic brain atlas; breast screening; scanning; remote consultancy.

Slide 12: Collaborative Engineering
Real-time data collection, multi-source data analysis and archival storage (example: the Unitary Plan Wind Tunnel).

Slide 13: Digital Curation
Digitisation of almost anything, to create digital libraries and museums.

Slide 14: The CERN LHC
The world's most powerful particle accelerator, with 4 large experiments.

Slide 15: The ATLAS Detector
- 7,000 tonnes; 42 m long, 22 m wide, 22 m high (about the height of a 5-storey building)
- 2,000 physicists from 150 institutes in 34 countries

Slide 16: The ATLAS Pit

Slide 17: The Higgs
The primary objective of the LHC: what is the origin of mass? Is it the Higgs particle?
- A massless particle travels at the speed of light
- A low-mass particle travels slower
- A high-mass particle travels slower still

Slide 18: The LHC Data Challenge
Starting from a raw event, we are looking for a particular "signature":
- ~100,000,000 electronic channels
- 800,000,000 proton-proton interactions per second
- Selectivity: ~1 in 10^13, like looking for 1 person in a thousand world populations, or for a needle in 20 million haystacks!
- Far less than one Higgs per second
- 10 PBytes of data a year (10 million GBytes = 14 million CDs)
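
As a quick Python check of these numbers (the 1-in-10^13 selectivity is the reconstructed figure above):

    # Rough event-rate and data-volume arithmetic from the slide.
    interactions_per_s = 8e8      # proton-proton interactions per second
    selectivity = 1e-13           # ~1 interesting event in 10^13
    print(interactions_per_s * selectivity)   # ~8e-05 events per second

    year_bytes = 10e15            # ~10 PB recorded per year
    cd_bytes = 700e6              # one CD holds ~700 MB
    print(round(year_bytes / cd_bytes / 1e6)) # ~14 million CDs per year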

Slide 19: LHC Computing Requirements
- CPU power (reconstruction, simulation, user analysis etc): ~50,000 of today's PCs
- 'Tape' storage: 20 PetaBytes (= 20 million GBytes)
- Disk storage: 2.5 PetaBytes (= 2.5 million GBytes)
The answer is a distributed computing solution: "The Grid".

Slide 20: The Grid
Ian Foster / Carl Kesselman: "A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities."
'Grid' means different things to different people; all agree it's a funding opportunity!

Slide 21: The Electricity Grid
An analogy with the electricity power grid: power stations, a distribution infrastructure, and a 'standard interface' (the socket on the wall).

Slide 22: The Computing Grid
The computing equivalent: computing and data centres as the 'power stations', with the fibre optics of the Internet as the distribution infrastructure.

Slide 23: What is the Grid?
On a single PC, programs (Word/Excel, the Web, games, your program) run on an operating system, which drives the hardware (CPU, disks etc).
On the Grid, your program runs on middleware, which drives the distributed resources: a user interface machine, CPU clusters, disk clusters, a resource broker and an information service.
Middleware is the operating system of a distributed computing system.
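
A toy Python sketch of the broker idea; every site name and field here is invented for illustration, and real middleware (e.g. the EDG resource broker) is far more involved:

    # Toy resource broker: choose the site that has the job's input
    # data and the most free CPUs. Purely illustrative.
    sites = [   # the sort of thing an information service publishes
        {"name": "RAL",     "free_cpus": 120, "datasets": {"higgs-sim"}},
        {"name": "QMUL",    "free_cpus": 80,  "datasets": {"higgs-sim"}},
        {"name": "Glasgow", "free_cpus": 200, "datasets": {"z-sim"}},
    ]

    def broker(dataset):
        """Return the best site holding the dataset, or None."""
        matches = [s for s in sites if dataset in s["datasets"]]
        return max(matches, key=lambda s: s["free_cpus"], default=None)

    print(broker("higgs-sim")["name"])   # -> RAL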

Slide 24: What is the Grid?
[Images: "From this…" to "To this…"]

Slide 25: SETI@home
A distributed computing project, not really a Grid project: you pull the data from them, rather than them submitting a job to you. It analyses radio data from the Arecibo telescope in Puerto Rico.
- Users: 5,240,038
- Results received: 1,632,106,991
- Years of CPU time: 2,121,057
- Extraterrestrials found: 0
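
The "pull" model in miniature, as a hypothetical Python sketch (the work-unit format and function names are invented; a real client fetches work from the project server over the network):

    # Volunteer computing: the idle PC pulls work, analyses it,
    # and reports back; no broker pushes jobs to it.
    def fetch_work_unit():
        # A real client would download this from the project server.
        return {"id": 42, "samples": [0.1, 0.9, 0.4]}

    def analyse(unit):
        # Stand-in for the real signal processing.
        return max(unit["samples"])

    unit = fetch_work_unit()
    print(f"work unit {unit['id']}: result {analyse(unit)}")  # uploaded in reality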

Slide 26: Entropia
Uses idle cycles on home PCs for profit and non-profit projects: 60,000 machines, 1,400 years of CPU time. Rebranding!

Slide 27: GridPP
19 UK universities, CCLRC (RAL & Daresbury) and CERN. Funded by the Particle Physics and Astronomy Research Council (PPARC).
- GridPP1: £17m, "From Web to Grid"
- GridPP2: £15m, "From Prototype to Production"

Slide 28: International Collaboration
- EU DataGrid (EDG): middleware development project
- US and other Grid projects: interoperability
- LHC Computing Grid (LCG): Grid deployment project for the LHC
- EU Enabling Grids for e-Science in Europe (EGEE): Grid deployment project for all disciplines

Slide 29: Application Development
ATLAS, LHCb, CMS, BaBar (SLAC), SAMGrid (FermiLab), QCDGrid

Slide 30: Middleware Development
Configuration management, storage interfaces, network monitoring, security, information services, Grid data management

Slide 31: Tier Structure
- 'Tier-0': where the data comes from (CERN)
- 'Tier-1': major centres in large countries (UK, US, Italy, Germany, France, ...)
- 'Tier-2': smaller centres in large countries, or in smaller countries (UK Tier-2s, Spain, Poland, ...)
The tier structure is not necessarily appropriate for all disciplines.
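
The same hierarchy as a toy data structure, purely for illustration (the centre lists simply echo the slide):

    # Toy model of the LHC computing tiers named on the slide.
    tiers = {
        "Tier-0": ["CERN"],
        "Tier-1": ["UK", "US", "Italy", "Germany", "France"],
        "Tier-2": ["UK regional centres", "Spain", "Poland"],
    }
    for tier, centres in tiers.items():
        print(f"{tier}: {', '.join(centres)}")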

Slide 32: UK Tier-1/A Centre
High-quality data services with a national and international role: the UK focus for international Grid development.
- 700 dual-CPU machines
- 80 TB disk
- 60 TB tape (capacity 1 PB)
- Grid Operations Centre

Slide 33: UK Tier-2 Centres
- ScotGrid: Durham, Edinburgh, Glasgow
- NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
- SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick
- London: Brunel, Imperial, QMUL, RHUL, UCL
Mostly funded by HEFCE.

Slide 34: The Grid at QM
The Queen Mary e-Science High Throughput Cluster: 174 PCs (348 CPUs) and 40 TB of disk storage, part of the London Tier-2 Centre.

Slide 35: The LCG Grid
89 sites, 9,056 CPUs, 3 PBytes of disk.

Slide 36: Grid Snapshot

Slide 37: Challenges
[Image: a stack of CDs holding 1 year of LHC data would be ~20 km high, above (ex-)Concorde at 15 km; "we are here" at 1 km]
- Scaling to full size: ~10,000 → 100,000 CPUs
- Stability, robustness etc
- Security (a hackers' paradise!)
- Sharing resources (in an RAE environment!)
- International collaboration
- Continued funding beyond the start of the LHC!

Slide 38: Further Info