UK e-Science Grid Infrastructure meets Biological Research Challenges. Malcolm Atkinson, Director of National e-Science Centre, www.nesc.ac.uk, 2nd October.

Presentation transcript:

UK e-Science Grid Infrastructure meets Biological Research Challenges
Malcolm Atkinson, Director of National e-Science Centre
2nd October 2002
The UK Biological Grid Data and Computation
The Wellcome Trust Genome Campus, Hinxton, Cambridgeshire

Overview
UK e-Science: Reminder of Investment and Infrastructure
International e-Science: Examples and Collaboration
Data Access and Integration: Lego Bricks for Scientific Application Developers
A Computer Scientist's View of Biology: Diversity and Opportunity
The Way Ahead

e-Science
Fundamentally about Collaboration
Sharing: Ideas; Thought processes and Stimuli; Effort; Resources
Requires: Communication; Common understanding & Framework; Mechanisms for sharing fairly; Organisation and Infrastructure
Scientists (Biologists) have done this for Centuries

e-Science (take 2)
Fundamentally about Collaboration
Sharing: Ideas; Thought processes and Stimuli; Effort; Resources
Requires: Communication; Common understanding & Framework; Mechanisms for sharing fairly; Organisation and Infrastructure
Text, digital media, structured, organised & curated data, computable models, visualisation, shared instruments, shared systems, shared administration, …
Nationally & Internationally Distributed, …
Routine, Daily, Automated, …
That Requires very Significant Investment in Digital Systems and their Support

e-Science (take 3)
Fundamentally about Collaboration
Sharing: Ideas; Thought processes and Stimuli; Effort; Resources
Requires: Communication; Common understanding & Framework; Mechanisms for sharing fairly; Organisation and Infrastructure
Digital networks, digital workplaces, digital instruments, …
Metadata, ontologies, standards, shared curated data, shared codes, …
Common platforms, shared software, shared training, …
The Grid SHOULD make this much easier by providing a common, supported, high level of Software and Organisational infrastructure
Authentication, Authorisation, Accounting, Provenance, Policies, …
Shared Provision of Platform,

Grid Expectations
Persistence: Always there, Always Working, Always Supported
Stability: You can build on foundations that don't move
Trustworthy & Predictable: Honours commitments; Digital policies, digital contracts, security, …; Data integrity, longevity and accessibility
Performance
High-level & Extensible: The capabilities you need are already there
Ubiquitous: Your collaborators use it

Grid Reality
Persistence: Always there, Always Working, Always Supported
Stability: You can build on foundations that don't move
Trustworthy & Predictable: Honours commitments; Digital policies, digital contracts, security, …; Data integrity, longevity and accessibility
Performance
High-level & Extensible: The capabilities you need are already there
Ubiquitous: Your collaborators use it
Political, Economic & Technical issues to Solve
Early days but Open Grid Services link with Web Services + GGF standardisation
Not yet but very substantial global effort to achieve this
Good basis for extension
Commitment to basic functionality
WS + Community effort
Global & Industrial Rallying Cry
Must work with Web Services

[Figure: UK Grid Network map. Sites: Cambridge, Newcastle, Edinburgh, Oxford, Glasgow, Manchester, Cardiff, Southampton, London, Belfast, Daresbury Lab, RAL, Hinxton. Labels: National e-Science Centre; Access Grid always-on video walls; HPC(x).]

National e-Science Centre
Events: Workshops; Research Meetings; International Meetings
History of Events: GGF5; HPDC11; Summer school; > 50 workshops held; > 1000 people in total; Many return often
Planned Events: 25 workshops; Conferences to 2005
Visitors: 3 arrived; 4 arranged
International collaboration, visits & visitors: China; Argonne National Lab; SDSC; NCSA; …
Centre Projects: Pilot Projects; Regional Support
Research Projects: EPSRC, MRC, WT, SHEFC

A day in the life of NeSC

Online Access to Scientific Instruments
DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago
[Figure labels: Advanced Photon Source; real-time collection; tomographic reconstruction; archival storage; wide-area dissemination; desktop & VR clients with shared controls]
From Steve Tuecke 12 Oct. 01

UCSF, UIUC. From Klaus Schulten, Center for Biomolecular Modeling and Bioinformatics, Urbana-Champaign

DataGrid Testbed
[Figure: map of (>40) Testbed Sites. Legend: HEP sites; ESA sites. Sites shown: Dubna, Moscow, RAL, Lund, Lisboa, Santander, Madrid, Valencia, Barcelona, Paris, Berlin, Lyon, Grenoble, Marseille, Brno, Prague, Torino, Milano, BO-CNAF, PD-LNL, Pisa, Roma, Catania, ESRIN, CERN, IPSL, Estec, KNMI.]

A Simplified Grid Anatomy
[Diagram components: Grid Plumbing & Security Infrastructure; Scheduling, Accounting, Authorisation; Monitoring, Diagnosis, Logging; Scientific Application; Data & Compute Resources]
[Diagram roles: Operations Team; Application Developers; Distributed Owners; Scientific Users]

A Biological Grid Anatomy
[Diagram components: Grid Plumbing & Security Infrastructure; Scheduling, Accounting, Authorisation; Monitoring, Diagnosis, Logging; Scientific Application; Data & Compute Resources; Data Access; Data Integration; Structured Data]
[Diagram roles: Distributed Biological Users]

Database Growth PDB protein structures

Scientific Data
Deluge of Data: Exponential growth
Doubling times:
  Astronomy: 12 months
  Bio-Sequences: 9 months
  Functional Genomics: 6 months
  Bytes/dollar: 12 to 18 months
Not How big it is but

Scientific Data
Deluge of Data: Exponential growth
Doubling times:
  Astronomy: 12 months
  Bio-Sequences: 9 months
  Functional Genomics: 6 months
  Bytes/dollar: 12 to 18 months
Not How big it is but What you do with it: Sharing; Curation; Metadata; Automated movement, access & integration; Computational Access

Scientific Data
Deluge of Data: Exponential growth
Doubling times:
  Astronomy: 12 months
  Bio-Sequences: 9 months
  Functional Genomics: 6 months
  Bytes/dollar: 12 to 18 months
Not How big it is but How you Embrace & Manage Change
The Database is a Knowledge chest
The Database is a Communication Hub
Autonomously Managed (Curated) change
An Essential part of e-BioMedical Science
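
The doubling times on the Scientific Data slides can be turned into a rough comparison of data growth against storage economics. The sketch below is illustrative only: it assumes steady exponential growth at the quoted rates, picks a 36-month horizon arbitrarily, and treats "data growth divided by bytes-per-dollar growth" as a stand-in for the relative cost of holding everything. None of these modelling choices, nor the function names, come from the talk.

```python
# Rough arithmetic on the doubling times quoted on the slide.
# Assumptions (not from the talk): steady exponential growth,
# a 36-month horizon, and cost ~ data volume / bytes-per-dollar.

DATA_DOUBLING_MONTHS = {
    "Astronomy": 12,
    "Bio-Sequences": 9,
    "Functional Genomics": 6,
}
# The slide gives a range for bytes/dollar: 12 to 18 months.
BYTES_PER_DOLLAR_DOUBLING_MONTHS = (12, 18)


def growth_factor(months: float, doubling_months: float) -> float:
    """Multiplicative growth after `months` for a quantity that doubles
    every `doubling_months`."""
    return 2.0 ** (months / doubling_months)


def relative_storage_cost(months: float, data_doubling: float,
                          storage_doubling: float) -> float:
    """Data growth divided by bytes-per-dollar growth.
    A value above 1 means holding all the data costs more than at the start."""
    return (growth_factor(months, data_doubling)
            / growth_factor(months, storage_doubling))


if __name__ == "__main__":
    horizon = 36  # months; hypothetical planning horizon
    fast, slow = BYTES_PER_DOLLAR_DOUBLING_MONTHS
    for field, doubling in DATA_DOUBLING_MONTHS.items():
        data = growth_factor(horizon, doubling)
        low = relative_storage_cost(horizon, doubling, fast)
        high = relative_storage_cost(horizon, doubling, slow)
        print(f"{field}: data x{data:.0f}, "
              f"relative cost x{low:.1f} to x{high:.1f}")
```

Under these assumptions, any field whose data doubles faster than bytes/dollar sees its storage bill grow even as hardware gets cheaper, which is one reading of the slide's point that managing the data matters more than its raw size.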

Wellcome Trust: Cardiovascular Functional Genomics
[Figure: sites Glasgow, Edinburgh, Leicester, Oxford, London, Netherlands. Labels: Shared data; Public curated data.]

Data Access & Integration
Central to e-Science: Especially Earth Sciences, Ecology, Biology & Medicine
Collaboration: Shared Databases; Curated Knowledge; Accumulated Observations; Accumulated Simulations
Computation: Data mining; Input to models; Calibration of models
Presentation: Publication of results; Visualisation

GGF DAIS WG
Chairs: Norman Paton (Manchester Uni.); Leanne Guy (CERN); Dave Pearson (Oracle UK)
Activity: BoF GGF4 Toronto; WG Meeting GGF5 Edinburgh; Papers for GGF6; Workshops & Mail lists
Goals: Agree Standards for Database Access & Integration; Freely available reference implementations; OGSA-DAI one source & focus for discussions
Norman Paton, Inderpal Narang, Leanne Guy, Susan Malaika, Greg Riccardi, …

OGSA-DAI project
Lego kit for Data Access & Integration
Components for e-Science Applications: Accelerated Application Development; Multiple Data Models; Distributed Data Access via Grid & Proxies; Integration, Translation & Transformation
Open Source Reference Implementation: For DAIS-WG standard; Trigger for Component Construction; Start a community

OGSA-DAI Partners
Partners: EPCC & NeSC; IBM UK; IBM Hursley; IBM USA; Manchester e-SC; Newcastle e-SC; Oracle
£3 million, 18 months, started February 2002
[Figure: UK map. Sites: Oxford, Glasgow, Cardiff, Southampton, London, Belfast, Daresbury Lab, RAL, Newcastle, Manchester, Cambridge, Hinxton.]

Primary Components

Advanced Components

Composed Components

Distributed Query

OGSA-DAI Time Line
[Figure: time axis Feb 02, May 02, Jul 02, Sep 02, Dec 02, Feb 03, May 03, Sep 03]
Milestones: Ship Alpha Release for GT3 Integration; RDB + GT2 / OGSA Prototypes Available; XML + OGSA Prototype Available; Design Documents & Demos for DAIS GGF5; XML + OGSA Prototypes for Early Adopters; WS + GSI UK support (> 100 downloads); Phase 2 Starts; Phase 1 Starts; Presentation & GGF7; GGF6 WG Papers & Prototypes; Productisation, RAMPS & Extension

OGSA-DAI Summary
On Schedule & Going Well: Contributions via GGF5 & 6; Releases with GT3; Releases scheduled
Status: Early Days: Released prototypes; Tested Architectural Design; Using OGSA
Working with Early Adopter Pilot Projects: AstroGrid & MyGrid
Influence OGSA-DAI direction: Via DAIS-WG & Direct messages to us

Biomedical e-Scientists
Is this one species?
Understanding bird energy
Understanding a river / ocean interaction
Understanding a biochemical pathway
Understanding a cell
Understanding a Heart or Brain
Understanding Rhododendra
Understanding Evolution
…
No One-Size fits all solutions
But sharable re-usable components

Opportunities
Many, many …
More than we can address: Compute needs; Data management needs; Data integration needs; …
Must choose some pioneers: To meet a range of common requirements; To provoke rich & high-level platform; To generate re-usable components
A Long-Term Commitment Needed

Advancing Biological Grid
[Diagram components: Grid Plumbing & Security Infrastructure; Scheduling, Accounting, Authorisation; Monitoring, Diagnosis, Logging; Scientific Application; Data & Compute Resources; Data Access; Data Integration; Structured Data; Biomedical (Grid) Application Component Library]
[Diagram roles: Distributed Biological Users]

Summary
e-Science: Data as well as Compute Challenges; Needed to be put together; Need ubiquitous supported consistent platforms
Grid: A (potentially) invaluable platform; Only show in town
Data Integration: Hard; Develop & Use Standard kit of parts; Started to build the kit
Opportunities: No one-size fits all, but re-usable subsystems; Invest in wider range of Problem driven pioneering; Strategic choices needed