Induction: What is e-Science and Grid computing? April 26-28, 2004. Dave Berry, NeSC


1 Induction: What is e-Science and Grid computing? –April 26-28, 2004 - 1 What is e-Science and Grid computing? Dave Berry, NeSC EGEE is funded by the European Union under contract IST-2003-508833

2 What is e-Science and Grid computing?
EGEE Training Team
EGEE is funded by the European Union under contract IST-2003-508833

3 Acknowledgements
This talk includes slides from previous tutorials and talks delivered by:
  the EDG training team
  Roberto Barbera, INFN
  Ian Foster, Argonne National Laboratory
  Jeffrey Grethe, SDSC
  the National e-Science Centre
Prepared by Dave Berry, NeSC

4 Goals of this module
To introduce the concepts of e-Science and Grid computing
  Assuming no previous knowledge

5 Overview
Motivation for Grid Computing
The idea of e-Science
Global drivers for Grid and e-Science
Some examples
The basic ideas of Grid technology

6 What is Grid computing?
“Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations” (I. Foster)
  Resources are controlled by their owners
  The Grid infrastructure provides access to collaborators
A Virtual Organisation is:
  People from different institutions working to achieve a common goal
  Sharing distributed processing and data resources
Enabling People to Work Together on Challenging Projects
  Science, Engineering, Medicine, …
  Public service, commerce too!

7 The Grid Vision
The Grid: networked data processing centres and “middleware” software as the “glue” of resources.
Researchers perform their activities regardless of geographical location, interact with colleagues, share and access data
Scientific instruments and experiments provide huge amounts of data

8 The Bad Old Days
“We speak piously of taking measurements and making small studies that will add another brick to the temple of science. Most such bricks just lie around the brickyard.”
Platt, J.R. (1964) Strong Inference. Science. 146: 347-353.

9 The Grid Metaphor

10 Overview
Motivation for Grid Computing
The idea of e-Science
Global drivers for Grid and e-Science
Some examples
The basic ideas of Grid technology

11 The Emergence of e-Science
Invention and exploitation of advanced computational methods
To generate, curate and analyse research data
  From experiments, observations and simulations
  Quality management, preservation and reliable evidence
To develop and explore models and simulations
  Computation and data at extreme scales
  Trustworthy, economic, timely and relevant results
To enable dynamic distributed virtual organisations
  Facilitating collaboration with information and resource sharing
  Security, reliability, accountability, manageability and agility

12 Why use Grids for Science?
Scale of the problems
Science increasingly done through distributed global collaborations enabled by the internet
Grids provide access to:
  Very large data collections
  Terascale computing resources
  High performance visualisation
  Connected by high-bandwidth networks
e-Science is more than Grid & Web Services
  It is what you do with them that counts

13 The Emergence of Global Knowledge Communities
Slide from Ian Foster’s SSDBM 03 keynote

14 Within and Between Many Disciplines
High Energy Physics
Earthquake prediction
Climatology
Biosciences, Genetics
Earth Observation
Astronomy
Composite materials research
Engineering design
Social sciences

15 Connecting people: Access Grid
Figure labels: Microphones, Cameras

16 Overview
Motivation for Grid Computing
The idea of e-Science
Global drivers for Grid and e-Science
Some examples
The basic ideas of Grid technology

17 Global Drivers of e-Science
Collaboration
  Enabling People to Work Together on Challenging Projects
Digital technology: exponential growth
  Ubiquity & cost reduction
  Performance increase
  “Data deluge”
Consequential Investment
  EU e-Infrastructure
  UK e-Science
  USA cyberinfrastructure
  Industry

18 Exponential Growth
Figure: performance per dollar spent, plotted against number of years (0 to 5)
  Optical Fibre (bits per second): Gilder’s Law (32X in 4 yrs), doubling time 9 months
  Data Storage (bits per sq. inch): Storage Law (16X in 4 yrs), doubling time 12 months
  Chip capacity (# transistors): Moore’s Law (5X in 4 yrs), doubling time 18 months
Source: Triumph of Light, Scientific American. George Stix, January 2001
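As a rough check on the doubling times quoted on this slide, the growth over a period can be computed as 2 raised to (elapsed time / doubling time). The sketch below is illustrative only; the function name growth_factor is an assumption, not something from the talk. For four years it gives roughly 40X, 16X and 6X for doubling times of 9, 12 and 18 months, so the 32X and 5X figures on the slide appear to be rounded or based on slightly different assumptions.

```python
def growth_factor(doubling_months: float, elapsed_months: float) -> float:
    """Growth after elapsed_months for a quantity that doubles every doubling_months."""
    return 2.0 ** (elapsed_months / doubling_months)


if __name__ == "__main__":
    # Doubling times (in months) as quoted on the slide.
    doubling_times = {
        "Optical fibre (Gilder's Law)": 9,
        "Data storage (Storage Law)": 12,
        "Chip capacity (Moore's Law)": 18,
    }
    for label, months in doubling_times.items():
        # 48 months = the 4-year window used on the slide.
        print(f"{label}: about {growth_factor(months, 48):.1f}X in 4 years")
```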

19 Example: Astronomy
No. & sizes of data sets as of mid-2002, grouped by wavelength
  12 waveband coverage of large areas of the sky
  Total about 200 TB data
  Doubling every 12 months
  Largest catalogues near 1B objects
Data and images courtesy Alex Szalay, Johns Hopkins University

20 How Different 2004 is from 1994
Enormous quantities of data: Petabytes
  For an increasing number of communities
  Gating step is not collection but analysis
Ubiquitous Internet: >100 million hosts
  Collaboration & resource sharing the norm
  Security and Trust are crucial issues
Ultra-high-speed networks: >10 Gb/s
  Global optical networks
  Bottlenecks: last kilometre & firewalls
Huge quantities of computing: >100 Top/s
  Moore’s law gives us all supercomputers
  Organising their effective use is the challenge
Moore’s law everywhere
  Instruments, detectors, sensors, scanners, …
  Organising their effective use is the challenge

21 Overview
Motivation for Grid Computing
The idea of e-Science
Global drivers for Grid and e-Science
Some examples
The basic ideas of Grid technology

22 Example: Earth Observation
ESA missions:
  About 100 Gbytes of data per day (ERS 1/2)
  500 Gbytes, for the next ENVISAT mission (2002)
Grid contribution to EO:
  Enhance the ability to access high level products
  Allow reprocessing of large historical archives
  Improve Earth science complex applications (data fusion, data mining, modelling …)
Source: L. Fusco, June 2001
Federico Carminati, EU review presentation, 1 March 2002

23 Example: BioInformatics
Medical image retrieval (figure labels: Medical images, Exam, image, patient key, ACL, Metadata, Similar images, Low score images)
  1. Query the medical image database and retrieve a patient image
  2. Compute similarity measures over the database images (submit 1 job per image)
  3. Retrieve most similar cases
Bio-informatics
  Phylogenetics
  Search for primers
  Statistical genetics
  Bio-informatics web portal
  Parasitology
  Data-mining on DNA chips
  Geometrical protein comparison
Medical imaging
  MR image simulation
  Medical data and metadata management
  Mammographies analysis
  Simulation platform for PET/SPECT
Legend: Applications deployed; Applications tested on EDG; Applications under preparation
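The three-step retrieval workflow on this slide (query, one similarity job per database image, return the closest cases) might be sketched as follows. This is a minimal illustration under stated assumptions: images are stood in for by short feature vectors, the similarity score is a hypothetical negative mean absolute difference, and a local thread pool stands in for Grid job submission. None of these names or choices come from the talk.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Sequence


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical score: higher means more similar (negative mean absolute difference)."""
    return -sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def most_similar(query: Sequence[float],
                 database: Dict[str, Sequence[float]],
                 top_k: int = 2) -> List[str]:
    """Score every database image against the query and return the top_k names."""
    names = list(database)
    # One task per image; on a Grid each task would be a separately submitted job.
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda name: similarity(query, database[name]), names))
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Made-up feature vectors standing in for stored medical images.
    images = {"case-a": [1, 2, 3], "case-b": [1, 2, 4], "case-c": [9, 9, 9]}
    print(most_similar([1, 2, 3], images))
```

Because each comparison is independent of the others, the work splits naturally into one job per image, which is what makes this kind of search a good fit for a Grid.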

24 Example: High-Energy Physics
ATLAS, CMS, LHCb
~6-8 PetaBytes / year
~10^8 events/year
~10^3 batch and interactive users

25 Example: Wearable Devices
Easy Plug and Play of Sensors
Wireless connection using 802.11
Positioning information from GPS
Mobile medical technologies on a distributed Grid
Figure labels: Sensor bus, GPS aerial

26 Example: Medical Development
Example of HealthGRID application
Preparation and follow-up of medical missions in developing countries
Support to local medical centres in terms of second diagnosis, patient follow-up and e-learning
2 missions (Ibagué & Chuxiong) with the French NPO « Chaîne de l’Espoir » used as test cases
The grid impact:
  Improved telemedicine services
  Federation of patient databases
  Interactive e-learning (high bandwidth network required)
Figure labels: Ibagué, Chuxiong, Clermont-Ferrand/Paris, Medical centre, Hand surgery, Interactive e-learning, Video-conferences, Patient data, Request for 2nd diagnostic

27 Overview
Motivation for Grid Computing
The idea of e-Science
Global drivers for Grid and e-Science
Some examples
The basic ideas of Grid technology

28 Key concept
“The ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose.” (I. Foster)

29 Grids vs. Distributed Applications
Distributed applications already exist, but they tend to be specialised systems intended for a single purpose or user group
Grids go further and take into account:
  Different kinds of resources
    Not always the same hardware, data and applications
  Different kinds of interactions
    User groups or applications want to interact with Grids in different ways
  Dynamic nature
    Resources and users added/removed/changed frequently

30 Main Services of a Grid Architecture
Service providers
  Publish the availability of their services via information systems
  Such services may come and go or change dynamically
  E.g. a testbed site that offers x CPUs and y GB of storage
Service brokers
  Register and categorize published services and provide search capabilities
  E.g. 1) Resource Broker selects the best site for a “job” 2) Catalogues of data held at each testbed site
Service requesters
  Single sign-on: log into the grid once
  Use brokering services to find a needed service and employ it
  E.g. CMS physicists submit a simulation job that needs 12 CPUs for 6 hours and 15 GB, which gets scheduled, via the Resource Broker, on the CERN testbed site
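The provider, broker and requester roles on this slide can be illustrated with a toy matchmaking sketch. Everything here is hypothetical: the class and function names, the site names and capacities, and in particular the ranking rule (prefer the site with the most free CPUs), which is only one possible reading of "best site". It is not the EDG or EGEE Resource Broker.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SiteOffer:
    """What a service provider publishes: x CPUs and y GB of storage."""
    name: str
    free_cpus: int
    free_storage_gb: int


@dataclass(frozen=True)
class JobRequest:
    """What a service requester asks for."""
    cpus: int
    hours: int
    storage_gb: int


class InformationSystem:
    """Registry of published offers; sites may come and go dynamically."""

    def __init__(self) -> None:
        self._offers: Dict[str, SiteOffer] = {}

    def publish(self, offer: SiteOffer) -> None:
        self._offers[offer.name] = offer

    def withdraw(self, name: str) -> None:
        self._offers.pop(name, None)

    def offers(self) -> List[SiteOffer]:
        return list(self._offers.values())


def select_site(info: InformationSystem, job: JobRequest) -> Optional[SiteOffer]:
    """Toy broker: keep sites that satisfy the request, then pick the one with most free CPUs."""
    candidates = [o for o in info.offers()
                  if o.free_cpus >= job.cpus and o.free_storage_gb >= job.storage_gb]
    if not candidates:
        return None
    return max(candidates, key=lambda o: o.free_cpus)


if __name__ == "__main__":
    info = InformationSystem()
    info.publish(SiteOffer("cern", free_cpus=64, free_storage_gb=500))      # made-up capacities
    info.publish(SiteOffer("site-b", free_cpus=8, free_storage_gb=1000))    # too few CPUs
    info.publish(SiteOffer("site-c", free_cpus=16, free_storage_gb=10))     # too little storage
    job = JobRequest(cpus=12, hours=6, storage_gb=15)  # the CMS example from the slide
    print(select_site(info, job))
```

The requester never names a site: it describes what it needs and leaves the choice to the broker, so sites can be added or withdrawn without users changing how they submit jobs.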

31 Complex Infrastructure
Users want access to compute power and data
  With security, reliability, trust, …
This requires a complex infrastructure
  Registries
  Brokers
  Administration
  Policy
  Negotiation
  Etc.
Users shouldn’t need to know the details
  Portals
  Problem-solving environments

32 Mammography: Computation
Mammograms have different appearances, depending on image settings and acquisition systems
Compute power can address several issues
Figure labels: Standard Mammo Format, Temporal mammography, Computer Aided Detection, 3D View

33 Mammography: Data
The Logical View of this information is as a Single Resource
  Patient   Age   …   …   Image
  107258    55    …   …   1.dcm
  236008    62    …   …   2.dcm
  700266    59    …   …   3.dcm
  895301    58    …   …   4.dcm
  …         …     …   …   …
Figure labels: Grid, Data, Images, DICOM, Compute, Standard Mammo Format, Data Mining, CADe, CADi

34 Mammography: Non-Functional
Figure labels, application areas: Epidemiology, Teaching, Training, Diagnosis, Screening
Figure labels, non-functional concerns: Ethics, Legal, Security, Performance, Manageability, Scalability, Auditability, …
Figure labels, example requirements: Anonymisation, 256MB & 5 secs response, Lossless Compression, Encryption, ~100 Centres, Systems Administration, Non-Repudiation

35 Grid security
Resource providers are essentially “opening themselves up” to itinerant users
Secure access to resources is required
X.509 Public Key Infrastructure
  User’s identity has to be certified by (mutually recognized) national Certification Authorities (CAs)
  Resources (node machines) have to be certified by CAs
Temporary delegation from users to processes to be executed “in user’s name” (proxy and myproxy certificates)
Commonly agreed policies for accessing resources and handling users’ rights across different domains within Virtual Organizations
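The idea of temporary delegation through proxy credentials can be illustrated with a toy model. The sketch below involves no real cryptography and is not the X.509 or Globus API: the Credential class, the make_proxy and is_acceptable functions, the subject names and the 12-hour lifetime are all assumptions chosen for illustration. It shows only the logic that a resource accepts a short-lived proxy if the chain leads back to a user certified by a trusted CA and nothing in the chain has expired.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Set


@dataclass(frozen=True)
class Credential:
    """Toy stand-in for a certificate: no keys or signatures, just names and expiry."""
    subject: str
    issuer: str
    not_after: datetime
    parent: Optional["Credential"] = None  # credential that issued this one (None if CA-issued)


def make_proxy(user_cred: Credential, now: datetime,
               lifetime: timedelta = timedelta(hours=12)) -> Credential:
    """The user delegates to a short-lived proxy that never outlives the user credential."""
    expiry = min(now + lifetime, user_cred.not_after)
    return Credential(subject=user_cred.subject + "/CN=proxy",
                      issuer=user_cred.subject,
                      not_after=expiry,
                      parent=user_cred)


def is_acceptable(cred: Credential, trusted_cas: Set[str], now: datetime) -> bool:
    """Walk the chain: every link must be unexpired and the root issuer must be a trusted CA."""
    current: Optional[Credential] = cred
    while current is not None:
        if now > current.not_after:
            return False
        if current.parent is None:
            return current.issuer in trusted_cas
        if current.issuer != current.parent.subject:
            return False
        current = current.parent
    return False


if __name__ == "__main__":
    now = datetime(2004, 4, 26, 9, 0)
    user = Credential("/O=Example/CN=Alice", "/O=Example/CN=Example CA",
                      not_after=now + timedelta(days=365))
    proxy = make_proxy(user, now)
    print(is_acceptable(proxy, {"/O=Example/CN=Example CA"}, now))
```

Keeping the proxy lifetime short limits the damage if the proxy is stolen, which is why delegation uses a temporary credential rather than the user's long-lived one.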

36 Summary
Figure label: Internet

37 Questions?

