Published by Dinah Doreen King. Modified over 9 years ago.
1
Processing services in EUDAT
EUDAT GEF status and plans
Christian Pagé, CERFACS
Earth System Science data management Session
RDA 6th Plenary, Paris, 23-25 September 2015
www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
2
Science Drivers
- Data available for scientific analysis: a very large trend
- Limitations in data access mean limitations in data analytics and scientific results
- Download locally, then analyze: a workflow that cannot be sustained
- Climate researchers
- Impact researchers
3
EUDAT Generic GEF ideas: orchestrate / multi-communities / services
- GEF Web Service, Generic API
- iRODS data access
- Abstracting iRODS (flexibility) or community-specific federation
- ENES/ESGF or WebLicht, etc.
- Hadoop
- Data Federation
- Common Metadata Semantics: searching across communities
- Common AAI: Authentication and Authorization across communities
- Extensions
- Processing Services Catalogs: getting information about communities' services
- Common Communities-specific Processing/workflows
- Requests using PIDs
- PPIDs for identification of data products
4
EUDAT Generic GEF ideas: orchestrate / multi-communities / services
http://github.com/GEFx/gef
- Diagram: User request, GEF web service, GEF Executor, iRODS Backend, App container
- Prototype implementation
- Spec in progress
- Unclear API direction, more discussions needed
Thanks to Emanuel Dima, EKUT (CLARIN)
5
ESGF WPS API: Future computing nodes
- Goal: perform data analysis near the data storage
- Better data access
- Move away from the download/analyze workflow
6
ESGF WPS API: Future computing nodes
- Develop general APIs for exposing ESGF distributed compute resources to multiple analysis tools
- Server-side processing capabilities are not being developed yet: the focus is on the API
First Steps
- Use Case approach
- Used the Goddard Climate Data Services (CDS) API and server-side processing
- Compared different APIs
7
ESGF WPS API: Future computing nodes
Next Steps
Technology exploration for server-side processing
- Continue the exploration of HDFS and other technologies (e.g. Spark)
- Exploration of high-performance file systems
ESGF API
- Specification of an ESGF WPS API
- Test implementation
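The slides do not define the ESGF WPS API itself, so the following is only a minimal sketch of what a request to a computing node might look like, assuming the generic OGC WPS 1.0.0 key-value-pair (GET) encoding of an Execute request. The endpoint URL, the process identifier "time_average" and the input names "dataset" and "variable" are hypothetical placeholders, not part of any ESGF specification.

```python
from urllib.parse import urlencode


def build_wps_execute_url(endpoint, process_id, inputs):
    """Build a WPS 1.0.0 Execute request URL in key-value-pair (GET) form.

    endpoint, process_id and the input names are illustrative assumptions;
    a real ESGF computing node would publish its own values.
    """
    # In the WPS 1.0.0 KVP encoding, DataInputs is a ';'-separated list
    # of name=value pairs packed into a single query parameter.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    params = {
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "datainputs": data_inputs,
    }
    # urlencode percent-encodes the values, including ';' and '='.
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical usage: ask a node to average one variable of one dataset.
example_url = build_wps_execute_url(
    "https://compute.example.org/wps",
    "time_average",
    {"dataset": "hdl:00000/example", "variable": "tas"},
)
print(example_url)
```

Because only the request is sent and only the result comes back, the analysis runs next to the data storage instead of after a local download.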
8
Challenges to orchestrate too!
- Federated environment: orchestration of the calculation from the requested computing node
- Advanced scheduler needed
- Where should calculations be performed if data is available at multiple data nodes?
- Which calculation services are available, if any?
- Results from several computing nodes will need to be gathered and combined
Many challenges ahead!
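The scheduling questions above can be illustrated with a toy example. This is a sketch under stated assumptions, not a proposed design: the ComputeNode fields, the load metric and the "least loaded node wins" rule are all hypothetical, and the combination step assumes each node returns a partial mean together with its sample count.

```python
from dataclasses import dataclass


@dataclass
class ComputeNode:
    name: str
    datasets: set      # identifiers (e.g. PIDs) of data replicas held at this node
    services: set      # names of calculation services this node offers
    load: float = 0.0  # hypothetical load metric, lower is better


def choose_node(nodes, dataset, service):
    """Return the least-loaded node that holds the data and offers the service.

    Returns None when no node satisfies both conditions.
    """
    candidates = [
        n for n in nodes if dataset in n.datasets and service in n.services
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda n: n.load)


def combine_means(partials):
    """Combine (mean, count) pairs from several nodes into one overall mean."""
    total = sum(count for _, count in partials)
    if total == 0:
        raise ValueError("no samples to combine")
    return sum(mean * count for mean, count in partials) / total


# Hypothetical federation: the same dataset is replicated at two nodes.
nodes = [
    ComputeNode("node-a", {"pid-1"}, {"time_average"}, load=0.8),
    ComputeNode("node-b", {"pid-1"}, {"time_average"}, load=0.2),
    ComputeNode("node-c", {"pid-2"}, {"time_average"}, load=0.1),
]
chosen = choose_node(nodes, "pid-1", "time_average")
print(chosen.name if chosen else "no suitable node")
print(combine_means([(10.0, 2), (20.0, 2)]))
```

A real scheduler would also have to weigh network cost, authorization and node availability, which this sketch ignores.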