Download presentation
Presentation is loading. Please wait.
Published byCamron Holt Modified over 8 years ago
1
Chinese Delegation Visit High Performance Computer Mission UK e-Science & The National e-Science Centre Prof. Malcolm Atkinson Director www.nesc.ac.uk 22 nd October 2003
2
Outline What is e-Science? Our Role in Enabling e-Science Information Grids & Data Dominates
3
Foundation for e-Science sensor nets Shared data archives computers software colleagues instruments Grid e-Science methodologies will rapidly transform science, engineering, medicine and business driven by exponential growth (×1000/decade) enabling a whole-system approach Diagram derived from Ian Foster’s slide
4
e-Science and SR2002 Research Council 2004-62001-4 Medical£13.1M (£8M) Biological£10.0M (£8M) Environmental £8.0M (£7M) Eng & Phys£18.0M(£17M) HPC £2.5M (£9M) Core Prog.£16.2M + ?(£15M) + £20M Particle Phys & Astro £31.6M(£26M) Economic & Social £10.6M (£3M) Central Labs £5.0M (£5M)
5
www.nesc.ac.uk
6
Cambridge Newcastle Edinburgh Oxford Glasgow Manchester Cardiff Southampton London Belfast Daresbury Lab RAL Hinxton NeSC in the UK National e- Science Centre HPC(x) Directors’ Forum Helped build a community Engineering Task Force Grid Support Centre Architecture Task Force UK Adoption of OGSA OGSA Grid Market Workflow Management Database Task Force OGSA-DAI GGF DAIS-WG Globus Alliance
7
Events held in the 2 nd Year (from 1 Aug 2002 to 31 Jul 2003) We have had 86 events: (Year 1 figures in brackets) 11 project meetings ( 4) 11 research meetings ( 7) 25 workshops (17 + 1) 2 “summer” schools(0) 15 training sessions(8) 12 outreach events(3) 5 international meetings(1) 5 e-Science management meetings (7) (though the definitions are fuzzy!) > 3600 Participant Days Suggestions always welcome Establishing a training team Investing in community building, skill generation & knowledge development
8
Workshops Space for real work Crossing communities Creativity: new strategies and solutions Written reports Scientific Data Mining, Integration and Visualisation Grid Information Systems Portals and Portlets Virtual Observatory as a Data Grid Imaging, Medical Analysis and Grid Environments Grid Scheduling Provenance & Workflow GeoSciences & Scottish Bioinformatics Forum http://www.nesc.ac.uk/events/
9
Duties of a NeSC Research Leader encouraging the uptake of Grid technologies in Astronomy and related fields encouraging visitors, with whom you have a research overlap, to visit Edinburgh and work with you and other local colleagues organising and running research workshops assisting with the development of new core Grid and scientific database technologies promoting NeSC within the Universities of Edinburgh and Glasgow through, for example, personal presentations and more widely at conferences and workshops. …and all that in 0.5 FTE! Slide from Dr Bob Mann: 30 September 2003
10
Three-way Alliance Computing Science Systems, Notations & Formal Foundation → Process & Trust Theory Models & Simulations → Shared Data Experiment & Advanced Data Collection → Shared Data Multi-national, Multi-discipline, Computer-enabled Consortia, Cultures & Societies Requires Much Engineering, Much Innovation Changes Culture, New Mores, New Behaviours New Opportunities, New Results, New Rewards
11
Global Drivers of e-Science Collaboration Data Deluge Digital Technology Ubiquity Cost reduction Performance increase Consequential Investment UK e-Science £240 million + 80 companies EU e-Infrastructure USA cyberinfrastructure …
12
Derived from Ian Foster’s slide at ssdbM July 03 It’s Easy to Forget How Different 2003 is From 1993 Enormous quantities of data: Petabytes For an increasing number of communities gating step is not collection but analysis Ubiquitous Internet: >100 million hosts Collaboration & resource sharing the norm Security and Trust are crucial issues Ultra-high-speed networks: >10 Gb/s Global optical networks Bottlenecks: last kilometre & firewalls Huge quantities of computing: >100 Top/s Moore’s law gives us all supercomputers Organising their effective use is the challenge Moore’s law everywhere Instruments, detectors, sensors, scanners, … Organising their effective use is the challenge
13
Global Knowledge Communities Often Driven by Data: E.g., Astronomy No. & sizes of data sets as of mid-2002, grouped by wavelength 12 waveband coverage of large areas of the sky Total about 200 TB data Doubling every 12 months Largest catalogues near 1B objects Data and images courtesy Alex Szalay, John Hopkins
14
Tera → Peta Bytes RAM time to move 15 minutes 1Gb WAN move time 10 hours ($1000) Disk Cost 7 disks = $5000 (SCSI) Disk Power 100 Watts Disk Weight 5.6 Kg Disk Footprint Inside machine RAM time to move 2 months 1Gb WAN move time 14 months ($1 million) Disk Cost 6800 Disks + 490 units + 32 racks = $7 million Disk Power 100 Kilowatts Disk Weight 33 Tonnes Disk Footprint 60 m 2 May 2003 Approximately Correct See also Distributed Computing Economics Jim Gray, Microsoft Research, MSR-TR-2003-24
15
Three Pillars of e-Science Research FoundationsTechnologyApplications Apply known results Focus for new work Enable new science Steering of development Informatics DCS Physics & Astronomy EPCC ETF&Testbeds edikt Repositories Computing industry Research Departments Research Institutes Scottish universities Commercial customers
16
Deployment L2G OGSA UK Test Grid Local Grid IBM Contract Funding? Testbed?
17
Research Applications NanoCMOS Grid AstroGrid Research Applications Physics FireGrid ScotGrid GridPP QCDGrid Proposals BioInf ODD-Genes? NeuroInf e-Health
18
CS Research Projects Mobile Code CoAKTinG Scientific Data DIRC Engage Automated Synthesis Ontologies Papers Staff CS Research EQUATOR AMUSE
19
Middleware, etc. Grid Core Programme OGSA-DAI DAIT Middleware, Etc. edikt OMII eldas FirstDIG BinX PGPGrid MS.NETGrid BRIDGES SunDCG GridWeaver ODD-Genes
20
Data Access, Integration, Publication, Annotation and Curation Linguistic Corpora ArkDB Theme: DAIPAC edikt Mouse Atlas Repositories OGSA-DAI & DAIT DCC Proposal ODD-Genes Mobile Code Scientific Data CS Research GTI BRIDGES Super COSMOS AstroGrid ScotGrid GridPP QCDGrid Research Applications
21
OGSA Infrastructure Architecture OGSI: Interface to Grid Infrastructure Data Intensive Applications for Science X Compute, Data & Storage Resources Distributed Simulation, Analysis & Integration Technology for Science X Data Intensive Users Virtual Integration Architecture Generic Virtual Data Access and Integration Layer Structured Data Integration Structured Data Access Structured Data Relational XML Semi-structured- Transformation Registry Job Submission Data TransportResource Usage Banking BrokeringWorkflow Authorisation
22
Data Services GGF Data Access and Integration Svcs (DAIS) OGSI-compliant interfaces to access relational and XML databases Needs to be generalized to encompass other data sources (see next slide…) Generalized DAIS becomes the foundation for: Replication: Data located in multiple locations Federation: Composition of multiple sources Provenance: How was data generated?
23
“OGSA Data Services” (Foster, Tuecke, Unger, eds.) Describes conceptual model for representing all manner of data sources as Web services Database, filesystems, devices, programs, … Integrates WS-Agreement Data service is an OGSI-compliant Web service that implements one or more of base data interfaces: DataDescription, DataAccess, DataFactory, DataManagement These would be extended and combined for specific domains (including DAIS)
24
OGSA-DAI Approach Reuse existing technologies and standards OGSA, Query languages, Java, transport Build portTypes and services which will enable: controlled exposure of heterogenous data resources on an OGSI- compliant grid access to these resource via common interfaces using existing underlying query mechanisms (ultimately) data integration across distributed data resources OGSA-DAI (the software) seeks to be a reference implementation of the GGF DAIS WG standard Can’t keep up with frequent standard changes, so software releases track specific drafts See http://www.ogsadai.org.uk/ for details.http://www.ogsadai.org.uk/
25
1a. Request to Registry for sources of data about “x” 1b. Registry responds with Factory handle 2a. Request to Factory for access to database 2c. Factory returns handle of GDS to client 3a. Client queries GDS with XPath, SQL, etc 3b. GDS interacts with database 3c. Results of query returned to client as XML SOAP/HTTP service creation API interactions RegistryFactory 2b. Factory creates GridDataService to manage access Grid Data Service Client XML / Relationa l database Data Access & Integration Services
26
Third Party Delivery
27
GDTS 2 GDS 3 2 GDTS 1 S x S y 1a. Request to Registry for sources of data about “x” & “y” 1b. Registry responds with Factory handle 2a. Request to Factory for access and integration from resources Sx and Sy 2b. Factory creates GridDataServices network 2c. Factory returns handle of GDS to client 3a. Client submits sequence of scripts each has a set of queries to GDS with XPath, SQL, etc 3c. Sequences of result sets returned to analyst as formatted binary described in a standard XML notation SOAP/HTTP service creation API interactions Data Registry Data Access & Integration master Client Analyst XML database Relational database GDS GDTS 3b. Client tells analyst GDS 1 Future DAI Services “scientific” Application coding scientific insights Problem Solving Environment Semantic Meta data Application Code
28
Take Home Message Data is a Major Source of Challenges AND an Enabler of New Science, Engineering, Medicine, Planning, … Information Grids Support for collaboration Support for computation and data grids Structured data fundamental Integrated strategies & technologies needed OGSA-DAI is here now Join in making better DAI services & standards Bioinformatics is a Priority Application Area There are many opportunities for International collaboration
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.