Interoperability from the e-Science Perspective Yannis Ioannidis Univ. Of Athens and ATHENA Research Center

Slides:



Advertisements
Similar presentations
DRIVER Long Term Preservation for Enhanced Publications in the DRIVER Infrastructure 1 WePreserve Workshop, October 2008 Dale Peters, Scientific Technical.
Advertisements

1 Building scientific Virtual Research Environments in D4Science Paul Polydoras University of Athens, Greece.
Abstraction Layers Why do we need them? –Protection against change Where in the hourglass do we put them? –Computer Scientist perspective Expose low-level.
ASCR Data Science Centers Infrastructure Demonstration S. Canon, N. Desai, M. Ernst, K. Kleese-Van Dam, G. Shipman, B. Tierney.
The Data Lifecycle and the Curation of Laboratory Experimental Data Tony Hey Corporate VP for Technical Computing Microsoft Corporation.
High Performance Computing Course Notes Grid Computing.
EInfrastructures (Internet and Grids) US Resource Centers Perspective: implementation and execution challenges Alan Blatecky Executive Director SDSC.
1 Cyberinfrastructure Framework for 21st Century Science & Engineering (CF21) IRNC Kick-Off Workshop July 13,
Riding the Wave: a Perspective for Today and the Future APA Conference, November 2011 Monica Marinucci EMEA Director for Research, Oracle.
Connect. Communicate. Collaborate Click to edit Master title style MODULE 1: perfSONAR TECHNICAL OVERVIEW.
OASIS Reference Model for Service Oriented Architecture 1.0
Planning for Flexible Integration via Service-Oriented Architecture (SOA) APSR Forum – The Well-Integrated Repository Sydney, Australia February 2006 Sandy.
Introduction and Overview “the grid” – a proposed distributed computing infrastructure for advanced science and engineering. Purpose: grid concept is motivated.
The MashMyData project Combining and comparing environmental science data on the web Alastair Gemmell 1, Jon Blower 1, Keith Haines 1, Stephen Pascoe 2,
PAWN: A Novel Ingestion Workflow Technology for Digital Preservation
Co-funded by the European Union under FP7-ICT Alliance Permanent Access to the Records of Science in Europe Network Co-ordinated by aparsen.eu #APARSEN.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
EUROPEAN UNION Polish Infrastructure for Supporting Computational Science in the European Research Space Cracow Grid Workshop’10 Kraków, October 11-13,
Computing in Atmospheric Sciences Workshop: 2003 Challenges of Cyberinfrastructure Alan Blatecky Executive Director San Diego Supercomputer Center.
San Diego Supercomputer CenterUniversity of California, San Diego Preservation Research Roadmap Reagan W. Moore San Diego Supercomputer Center
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
Donatella Castelli CNR-ISTI
Grid – Path to Pervasive Adoption Mark Linesch Chairman, Global Grid Forum Hewlett Packard Corporation.
Linked-data and the Internet of Things Payam Barnaghi Centre for Communication Systems Research University of Surrey March 2012.
Ocean Observatories Initiative Data Management (DM) Subsystem Overview Michael Meisinger September 29, 2009.
Interoperability Grids, Clouds and Collaboratories Ruth Pordes Executive Director Open Science Grid, Fermilab.
©Ferenc Vajda 1 Semantic Grid Ferenc Vajda Computer and Automation Research Institute Hungarian Academy of Sciences.
SEEK Welcome Malcolm Atkinson Director 12 th May 2004.
Ames Research CenterDivision 1 Information Power Grid (IPG) Overview Anthony Lisotta Computer Sciences Corporation NASA Ames May 2,
Grid Computing & Semantic Web. Grid Computing Proposed with the idea of electric power grid; Aims at integrating large-scale (global scale) computing.
Presented by Scientific Annotation Middleware Software infrastructure to support rich scientific records and the processes that produce them Jens Schwidder.
© 2006 STEP Consortium ICT Infrastructure Strand.
Infrastructures for Social Simulation Rob Procter National e-Infrastructure for Social Simulation ISGC 2010 Social Simulation Tutorial.
Cyberinfrastructure What is it? Russ Hobby Internet2 Joint Techs, 18 July 2007.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
ACGT: Open Grid Services for Improving Medical Knowledge Discovery Stelios G. Sfakianakis, FORTH.
NGCWE Expert Group EU-ESA Experts Group's vision Prof. Juan Quemada NGCWE Expert Group IST Call 5 Preparatory Workshop on CWEs 13th.
Presented by Jens Schwidder Tara D. Gibson James D. Myers Computing & Computational Sciences Directorate Oak Ridge National Laboratory Scientific Annotation.
Ruth Pordes November 2004TeraGrid GIG Site Review1 TeraGrid and Open Science Grid Ruth Pordes, Fermilab representing the Open Science.
Symposium on Global Scientific Data Infrastructures Panel Two: Stakeholder Communities in the DWF Ann Wolpert, Massachusetts Institute of Technology Board.
26/05/2005 Research Infrastructures - 'eInfrastructure: Grid initiatives‘ FP INFRASTRUCTURES-71 DIMMI Project a DI gital M ulti M edia I nfrastructure.
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
Cyberinfrastructure Overview Russ Hobby, Internet2 ECSU CI Days 4 January 2008.
Cyberinfrastructure: Many Things to Many People Russ Hobby Program Manager Internet2.
GRID ANATOMY Advanced Computing Concepts – Dr. Emmanuel Pilli.
Toward a common data and command representation for quantum chemistry Malcolm Atkinson Director 5 th April 2004.
Infrastructure Breakout What capacities should we build now to manage data and migrate it over the future generations of technologies, standards, formats,
Rights Management for Shared Collections Storage Resource Broker Reagan W. Moore
The Global Scene Wouter Los University of Amsterdam The Netherlands.
1 Kostas Glinos European Commission - DG INFSO Head of Unit, Géant and e-Infrastructures "The views expressed in this presentation are those of the author.
RC ICT Conference 17 May 2004 Research Councils ICT Conference The UK e-Science Programme David Wallace, Chair, e-Science Steering Committee.
Building Preservation Environments with Data Grid Technology Reagan W. Moore Presenter: Praveen Namburi.
Collection-Based Persistent Archives Arcot Rajasekar, Richard Marciano, Reagan Moore San Diego Supercomputer Center Presented by: Preetham A Gowda.
Fedora Commons Overview and Background Sandy Payette, Executive Director UK Fedora Training London January 22-23, 2009.
Store and exchange data with colleagues and team Synchronize multiple versions of data Ensure automatic desktop synchronization of large files B2DROP is.
A 10-year Vision for Global Research Data Infrastructures Erwin Laure KTH 1st WG Meeting, London,
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No Support to scientific.
Data Grids, Digital Libraries and Persistent Archives: An Integrated Approach to Publishing, Sharing and Archiving Data. Written By: R. Moore, A. Rajasekar,
WP1:Definition & Production of the GRDI2020 Roadmap Roadmap Report To address the Technological, Organizational and Policy problems which hinder the building.
Service Oriented Architecture (SOA) Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall
Accessing the VI-SEEM infrastructure
AAI for a Collaborative Data Infrastructure
Middleware independent Information Service
Grid Computing.
University of Technology
EOSC services architecture
DataCite - A global registration agency for research data
The Anatomy and The Physiology of the Grid
The Anatomy and The Physiology of the Grid
Presentation transcript:

Interoperability from the e-Science Perspective Yannis Ioannidis Univ. Of Athens and ATHENA Research Center OGF 28

Outline Context: –Current situation, vision and future challenges Interoperability – definition Scientific interoperability / e-Infrastructure layers Approach Conclusions

Science Paradigms 1 st - Thousand years ago: science was empirical describing natural phenomena w/ some models, generalizations 2 nd - Last few hundred years: theoretical branch using models, generalizations 3

Really Early Times One scientist One location One discipline One phenomenon One pencil (… carver …) One paper (… stone …) Street announcements, e.g., Εύρηκα!

Science Paradigms 3 rd - Last few decades: a computational branch simulating complex phenomena 5

Recent Times One small group of scientists One location One discipline One phenomenon One file system One local disk with custom files Publications at refereed forums

Science Paradigms 4 th - Today: data exploration (eScience) unify theory, experiment, and simulation 7

Current Times Many/large teams of scientists Many locations Many disciplines Many phenomena Many data management systems Many data forms Web uploads for publications, data, processes, …

Current Times Many/large teams of scientists Many locations Many disciplines Many phenomena Many data management systems Many data forms Web uploads for publications, data, processes, … New order in scholarly communication Open access Creator, author, publisher, curator, preserver roles mixed up Digital libraries & repositories at centre stage

eInfrastructure Layers Network Processing Data / Info / Pubs Functionality Users Communities }

Data Services Community Support Services Astronomy Climatology Chemistry History Biology Computing Infrastructure Persistent Storage Capacity Integrity Authentication & Security API Data Discovery & Navigation Workflows Generation Demography Scientific Data (Discipline Specific) Other Data Researcher 1 Non Scientific World Scientific World Researcher 2 Aggregated Data Sets (Temporary or Permanent) Workflows Aggregation Path Source: High-level Group on Scientific Data

Interoperability DEFINITION “The ability of two or more systems or components to exchange information and to use the information that has been exchanged” (IEEE definition) “A property referring to the ability of diverse systems and organizations to work together (inter-operate)” (Wikipedia) “The capability to communicate, execute programs, or transfer data among various functional units in a manner that requires minimal knowledge of the unique characteristics of those units” (ISO/IEC Information Technology Vocabulary, Fundamental Terms)

e-Science Definition “e-Science is computationally-intensive science that is carried out in highly distributed network environments, or science that uses immense data sets that require computing.” :  astronomy  earth science  biology  chemistry  social science  particle physics  …

e-Science Interoperability “The ability of two or more e-Infrastructure systems or components to exchange Scientific information and to use the information that has been exchanged” Can be applied to different e-Infrastructure “layers” –Network interoperability Interoperability between different (sub)-networks –Computing (Grids, Clouds, etc.) Interoperability between different computing systems, e.g. Grids, Supercomputers, Clouds, etc. –Data interoperability Ability to exchange and use data e-Infrastructure service providers need to make interoperability transparent for their users

Network interoperability The ability to exchange data between different interconnected networks –TCP/IP is the de-facto networking protocol standard enabling world-wide collaboration and data exchange Required a couple of decades –However for layers 1 or 2 (end-to-end lightpaths or end-to-end switched connections), interoperability is still not plug-and-play and requires engineering

Computing systems interoperability The ability to submit jobs and exchange data between different computing systems such as Grids, Supercomputers, Clouds, etc. Indicative areas: Job submission & monitoring, authentication & authorisation, data access –No standardised middleware for such components & their interactions: still many years behind for a “TCP-IP”! –A variety of architectures, stacks, protocols for different systems –There are interoperable implementations, e.g., submitting jobs from one e-Infrastructure to another and vice-versa

Data interoperability The ability to exchange data between different systems and use them Indicative areas: information representation - profiling, discovery, security –Still many challenges ahead Heterogeneity of the exchanged information Inconsistency of the intended use of the information by the two involved systems

Technical Challenges Heterogeneity of the exchanged information objects –Syntactic heterogeneity –Structural heterogeneity –Semantic heterogeneity Inconsistency of the intended use of the information object –by the producer entity and the intended exploitation of this object by the consumer entity –the exchanged information must be complemented with some “descriptive” information, such as contextual, provenance, quality, security, privacy, etc. information –The descriptive information should be modeled by purpose- oriented descriptive data models /metadata models.

Organisational and policy challenges Interoperability is not only about technical challenges Policy and organisational challenges exist: –Interoperability entails (global / con-federated) governance and trust –Interoperability entails certification and standardisation –Interoperability entails sustained funding and support!

For interoperability a mediation concept is needed –Includes a mediation function / schema –Types of mediation: Mapping Matching Integration Consistency Checking –Mediation requires: Adequate modeling of structural, formatting, and encoding constraints of the producer entity information resources Adequate modeling of the consumer entity needs Formally defined transfer and message exchange protocols The definition of a matching relationship between the producer information resources and the consumer models Standards is usually a long and tedious process Technical approach: Mediation

Conclusions E-Science interoperability spans multiple e-Infrastructure layers In most of the layers there are still many challenges ahead: above all heterogeneity Interoperability is thus complex and never- ending For scientific users interoperability should be transparent!

CODATA 22, Cape Town, S. Africa

Thank you! Questions?