Open Data and Cloud Computing e-Infrastructure for Biodiversity Daniele Lezzi Barcelona Supercomputing Center International Workshop on Science Gateways.


Open Data and Cloud Computing e-Infrastructure for Biodiversity Daniele Lezzi Barcelona Supercomputing Center International Workshop on Science Gateways 2013

IWSG13 – 4/6/2013
EU-Brazil Open Data and Cloud Computing e-Infrastructure for Biodiversity
Combining Biodiversity Science and the Open Access Movement to deploy a joint European and Brazilian e-Infrastructure of open access resources supporting the needs of the biodiversity scientific community.
Who will benefit from EUBrazilOpenBio?
– EU & Brazilian biodiversity scientific communities
– Data and resource managers & Open Access community
– European & Brazilian policy and funding bodies
EU-Brazil OpenBio:
– Two biodiversity use cases
– Computing resources & SW platforms
– Further EU-Brazil collaboration in support of the biodiversity area & infrastructures

The Partnership
A well-balanced effort of European and Brazilian organisations:
– Europe: BSC, Spain; CNR-ISTI, Italy; Trust-IT, UK; UPVLC, Spain; SP2000, UK
– Brazil: CRIA, SP; UFF, RJ; CESAR, PE; RNP, RJ

The Technological Challenges
Nowadays science poses challenging tasks to “systems” engineers:
– highly-evolving requirements;
– large-scale distribution of resources and players;
– heterogeneity.
This often makes standard development approaches too expensive, and
– “from-scratch” development of ad-hoc solutions
– HW investment (even if needed only intermittently)
… do not result in sustainable infrastructures …

Infrastructure vs e-Infrastructure
Science has traditionally been based on infrastructures.
E-infrastructures are becoming increasingly important tools for
– addressing the complexities and challenges of scientific discovery
– enabling researchers across the world to collaborate on scientific initiatives by sharing access to unique or distributed scientific facilities (including data, instruments, applications, computing and communications) through user-friendly interfaces.

Supporting Virtual Research Environments
A Virtual Research Environment is a complete “system” consisting of hardware, data, and applications, deployed to support the needs of a user community and to promote effective and fruitful collaborations.

THE EUBRAZILOPENBIO INFRASTRUCTURE

A Hybrid Infrastructure
[Diagram layers]
– User communities: Application #1, Application #2, … Application #N
– Services: Data Services, Computing Services, Management Services
– Existing Infrastructures: Grids, Clouds, Clusters, Data Sources
– Components shown: VENUS-C, HTCondor, COMPSs, gCube storage, Usto.re, Biodiversity VRE, CoL, GBIF, EasyGrid AMS, gHNs
Integrating different technologies to make a large variety of services available for managing, manipulating and processing data and metadata within an autonomously-managed infrastructure: gCube system, openModeller, COMPSs, EasyGrid AMS, VENUS-C, HTCondor, u.store.
Leveraging existing European, Brazilian and global data sources, ranging from species data (species names, synonyms, taxonomical classifications) to literature, occurrence maps and images: Catalogue of Life, List of Species of the Brazilian Flora, speciesLink, Biodiversity Heritage Library, Bioline International, Global Biodiversity Information Facility (GBIF).
Two use cases: Taxonomy Management and Ecological Niche Modelling.

Insight into the EUBrazilOpenBio infrastructure
EUBrazilOpenBio has implemented new services and components to ease the development of applications:
– Data Access services: gCube Storage; storage connectors.
– Execution Services: COMPSs + PMES execution; OMWS2 Execution Services.
– Orchestrator Service.
– Developing portlets.
– High-end services: Species Discovery Service; ENM and XMAP service.

Accessing the infrastructure
A GUI developed using Google Web Toolkit (GWT) and Java. It is integrated with several gCube applications, such as:
– The workspace. Users can store taxonomic checklists and the results of the cross-maps in their own storage space.
– The Species Discovery Service, a gCube portlet for obtaining taxonomic checklists and occurrences from different providers.
– The gCube Information System, which lets the GUI obtain the endpoints of the cross-map and ENM web services that execute the cross-map and ENM algorithms.

General Services - The workspace
The workspace is a virtual drive to which you can upload, and from which you can download, the files needed for the services and the results.
Files can be organized into folders.
Several file types can be displayed directly: bitmaps, text, GIS data, etc.

General Services - Species products discovery
The Species products discovery enables retrieving taxa and occurrence points from a number of providers & data sources in a seamless way.
Whether to search on taxa or on occurrence points is selected in the first box.
Resulting files can be stored in the workspace and used further.

General Services - Species products discovery: Occurrence points
Species products discovery can also be used to browse occurrence points.
Scientific or common names can be used.
Data from the different databases is presented in the list.
Downloading may take a long time.

Data Services: gCube
gCube is a service-oriented framework enabling the creation and interconnection of e-Infrastructures in a controlled and highly configurable manner.
Computing, storage, data and software are made accessible by the infrastructure and are exploited by users through a thin client.

USE CASE I: INTEGRATION OF TAXONOMIES

Use case I - Integration of Taxonomies: The problem
Given two taxonomic checklists in Darwin Core Archive (DwC-A) format, the objective is to obtain the relationships between the taxa in one checklist and the taxa in the other.
The specific steps of the use case are:
– Obtain the DwC-A files of the checklists to compare.
– Import the checklists into the web service.
– Run the cross-map.
– Save the results.
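
The cross-map step can be pictured with a deliberately simplified sketch. The slides do not describe the actual cross-map algorithm, so the code below is only an assumption-laden illustration: it pairs taxa whose scientific names match after whitespace and case normalisation, and reports the taxa found in only one checklist. The `Taxon` class, `normalise` and `cross_map` are hypothetical names, not part of the EUBrazilOpenBio services; a real cross-map would also have to deal with synonyms, authorship and classification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taxon:
    """Hypothetical minimal record: an identifier plus a scientific name."""
    taxon_id: str
    scientific_name: str


def normalise(name: str) -> str:
    # Lower-case and collapse runs of whitespace so trivial variants compare equal.
    return " ".join(name.lower().split())


def cross_map(left, right):
    """Pair taxa from two checklists by normalised scientific name.

    Returns a dict with matched id pairs and the ids found in one list only.
    This is exact-name matching, NOT the project's cross-map algorithm.
    """
    right_by_name = {}
    for taxon in right:
        right_by_name.setdefault(normalise(taxon.scientific_name), []).append(taxon)

    matched, left_only, seen = [], [], set()
    for taxon in left:
        key = normalise(taxon.scientific_name)
        hits = right_by_name.get(key, [])
        if hits:
            matched.extend((taxon.taxon_id, hit.taxon_id) for hit in hits)
            seen.add(key)
        else:
            left_only.append(taxon.taxon_id)

    right_only = [
        hit.taxon_id
        for key, hits in right_by_name.items()
        if key not in seen
        for hit in hits
    ]
    return {"matched": matched, "left_only": left_only, "right_only": right_only}
```

Calling `cross_map(checklist_a, checklist_b)` on two lists of `Taxon` records yields the three groups, which correspond loosely to the "relationships" the use case asks for.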

Use case I - Integration of Taxonomies: Bottlenecks
– Currently 46 members
– There are now 100 participating databases
– Estimated 150 databases and partners
– Aim is to increase the number of members and databases
– Increasing access to the data: problems with the quality of service

USE CASE II: ECOLOGICAL NICHE MODELLING

Ecological Niche Modelling
Ecological niche: the set of ecological requirements for a species to survive and maintain viable populations over time. (Grinnell, 1917)
Species occurrence points + Environmental variables → Modelling algorithm → Projected niche model
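
To make the pipeline of occurrence points, environmental variables, algorithm and projected model concrete, here is a minimal sketch using a plain rectangular environmental envelope. This is an assumption chosen for brevity, not the algorithm used by openModeller or by the project: the model is the min/max range of each variable observed at the occurrence points, and a map cell is predicted suitable when all of its values fall inside those ranges. The function names and the variable names in the usage note are hypothetical.

```python
def fit_envelope(occurrences):
    """Fit a rectangular envelope from environmental values at occurrence points.

    occurrences: non-empty list of dicts mapping variable name -> value,
    one dict per occurrence point. Returns {variable: (min, max)}.
    """
    if not occurrences:
        raise ValueError("at least one occurrence point is required")
    variables = list(occurrences[0].keys())
    return {
        var: (
            min(point[var] for point in occurrences),
            max(point[var] for point in occurrences),
        )
        for var in variables
    }


def project(envelope, cells):
    """Project the envelope onto map cells.

    cells: list of dicts with the same variables as the envelope.
    Returns one boolean per cell: True when every variable lies within range.
    """
    return [
        all(low <= cell[var] <= high for var, (low, high) in envelope.items())
        for cell in cells
    ]
```

For example, fitting on two points with `temp` values 20.0 and 26.0 gives a `temp` range of (20.0, 26.0); a cell with `temp` 30.0 is then projected as unsuitable regardless of its other variables.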

Approach before OpenBio
[Diagram] (Brazilian Virtual Herbarium) ⇄ request / response ⇄ openModeller Web Service (single machine)
~50 min for a single species (until the final model is generated)

EU-Brazil OpenBio strategy
[Diagram labels]
– Advanced Web interface for niche modelling (OMWS+)
– Other applications (Brazilian Virtual Herbarium)
– Virtual Research Environment
– Enhanced niche modelling Web Service
– Cloud-based backends: COMP Superscalar (virtualized Condor)
– Additional improvements: EasyGrid AMS, U.Store

The COMPSs programming framework
– Platform-unaware programming model that simplifies the development of applications in distributed environments
– Low user intervention for application development
– Transparent data management and task execution
– Parallelization at task level
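
Task-level parallelism can be illustrated without the COMPSs runtime. The sketch below is not the COMPSs API; it uses Python's standard `concurrent.futures` as a stand-in to show the general idea: the programmer writes an ordinary sequential-looking function per task, and an executor decides where and when each call runs. `build_model` is a hypothetical placeholder for one per-species modelling task, and threads are used only to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def build_model(species: str) -> str:
    # Hypothetical placeholder for one independent task,
    # e.g. generating the niche model of a single species.
    return f"model({species})"


def run_all(species_list, max_workers=4):
    """Run one task per species and return results in input order.

    The caller never manages workers directly; the executor schedules
    the independent tasks, which is the essence of task-level parallelism.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(build_model, species_list))
```

Because each species is an independent task, adding workers shortens the wall-clock time without changing the calling code, which is the property the slide attributes to the framework.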

Validation of the ENM workflow
[Table: #VMs | #Cores | Cloud Time | Speedup. Five rows; surviving values: cloud times 02:00:…, …:00:…, …:33:…, …:25:…, …:23:57; speedup of the last row 5.02]
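
The speedup column is presumably the usual ratio of the baseline run time to the run time on more resources. A small helper, under that assumption, is sketched below; `to_seconds` and `speedup` are hypothetical names, and any times passed to them here are illustrative inputs, not values taken from the table.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(baseline: str, parallel: str) -> float:
    """Speedup = baseline time / parallel time, both given as 'HH:MM:SS'."""
    return to_seconds(baseline) / to_seconds(parallel)
```

With illustrative inputs, `speedup("02:00:00", "00:30:00")` evaluates to 4.0: the run on more resources takes a quarter of the baseline time.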

Collaborations
[Diagram components] Experiment Orchestrator Service, OMWS+, VENUS-C Cloud Middleware, COMPSs Workflow Orchestrator, VENUS-C Connector, OCCI, CDMI, EGI Federated Cloud
– Provision of OMWS+ to the BioVeL community to access the EGI Federated Cloud
– VENUS-C/COMPSs enables the execution of Taverna workflows thanks to interoperability features
– OMWS+ protocol officially integrated in the main release

Accessing the infrastructure
The infrastructure is available at
– e-infrastructure-gateway
You can access it from the main portal of EUBrazilOpenBio
– Registration is needed.

Inventory of components
EUBrazilOpenBio has developed a set of components on top of different technologies to make available a large variety of services for managing, manipulating and processing data and metadata within an autonomously-managed infrastructure.
More information can be found at the reference pages:
– gCube Framework
– COMPSs and VENUS-C PMES
– openModeller
– EasyGrid AMS
– u.store

Sustainability plans
– Socio-economic Analysis
– Promoting benefits for biodiversity researchers and the next generation of researchers
– Modelling important for sustainable development: conservation planning, geographic & ecological aspects of disease transmission, guiding biodiversity field surveys.
– Benefits for developers & integrators.
– Citizen scientists: raising awareness of the value of biodiversity. Sharing a passion for nature.
– Promotion of tangible assets to target audiences: integrated services, resources, use cases, eTraining Programme and hands-on tutorials.
– Detailed partner exploitation plans to exploit assets and synergies to broaden the user base, e.g. the EGI Federated Cloud use case for Ecology in synergy with BioVeL
– EUBrazilOpenBio Joint Action Plan for policy makers & funding agencies
– Horizon 2020: from e-infrastructure prototypes to sustainable services
– Analysis of the EU & Brazil policy and biodiversity landscape
– Identify actions to evolve the e-infrastructure & address biodiversity challenges for sustainable development
– Identify opportunities for enterprise participation in collaborative initiatives and new public-private partnerships

Join the EUBrazilOpenBio Online Community!
Engage in the eTraining Programme
Thanks for your attention