Primary Research Team & Capabilities
URL: http://ui.sav.sk
Institute of Informatics, Slovak Academy of Sciences
Department of Parallel and Distributed Computing
Research and Development Areas:
- Large-scale HPC, Grid and MapReduce applications
- Intelligent and Knowledge oriented Technologies
Experience from IST:
- 3 projects in FP5: ANFAS, CrossGrid, Pellucid
- 6 projects in FP6: EGEE II, K-Wf Grid, DEGREE (coordinator), EGEE, int.eu.grid, MEDIGRID
- 7 projects in FP7: Commius, Admire, Secricom, EGEE III, EGI-InSPIRE, Venis
- 4 projects in H2020: EGI-Engage, EOSC-hub, DEEP-Hybrid-DataCloud, PROCESS
- Several National Projects (SPVV, VEGA, APVT)
Intelligent Knowledge Technology Group focus: Information Processing
- (Large Scale) Graph Processing
- Information Extraction and Retrieval
- Semantic Web - Knowledge oriented Technologies
- Parallel and Distributed Information Processing
Solutions:
- SGDB: Simple Graph Database
- gSemSearch: Graph based Semantic Search
- Ontea: Pattern-based Semantic Annotation
- ACoMA: KM tool in Email
- EMBET: Recommendation System
Experts on MapReduce and IR (Nutch, Solr, Lucene)
Director & leader of PDC: Dr. Ladislav Hluchý
PROCESS Kick-off meeting 20-21 November 2017
Research Topics
- High performance and distributed computing and information processing
- Big Data oriented technology
- Data analytics, Data mining
- Machine Learning
- Semantic, Ontology, Taxonomy
- Natural Language Processing
- Knowledge base and modeling
- Artificial Intelligence
- Reasoning and Inference
- Semantic search, graph and network
- Information Retrieval
- Mobile and IoT computing
- Communication technologies
- Multi-agent systems
- Infrastructures
Large scale Text and Graph data processing
- Web Crawling: Apache Nutch + Plugins
- Information Retrieval, Full-text search: Nutch, Lucene, Solr
- Natural Language Processing: Stanford CoreNLP, Cognitive Computation Group, NLP Tools, GATE, Ontea
- Large-scale Data Processing: Hadoop, Spark, Hive, Pig, S4, HBase
- Graph processing and Querying: Simple Graph Database (SGDB), gSemSearch, Neo4j, Blueprints
- Machine Learning: VW, Weka
Related projects in FP7 and H2020
K-Wf Grid Project
- Semantic Service Oriented Architecture
- Workflows of Web Services
EGI-InSPIRE
EGI-InSPIRE: Integrated Sustainable Pan-European Infrastructure for Researchers in Europe, FP7-261323 (2010-2014)
- Collaborative effort involving more than 50 institutions in over 40 countries
- Provide sustained support for Grids of HPC/HTC, while seeking to integrate new Distributed Computing Infrastructures (DCIs), i.e. Clouds, SuperComputing, Desktop Grids
IISAS in EGI-InSPIRE:
- NGI for Slovakia
- Federated Cloud Service infrastructure
ADMIRE (Advanced Data Mining and Integration Research for Europe)
FP7 programme, ICT priority, 2008/03 – 2011/02, 6 institutions from EU
- Accelerate access to and increase the benefits from data exploitation
- Deliver consistent and easy-to-use technology for extracting information and knowledge
- Cope with complexity, distribution, change and heterogeneity of services, data, and processes, through an abstract view of data mining and integration
- Provide power to users and developers of data mining and integration processes
IISAS's role:
- Environmental data-specific methods for data integration
- Tools applying knowledge management to data mining in SOA environment
- Pilot application, user interface
EGI-Engage project
EGI-Engage: Engaging the Research Community towards an Open Science Commons, H2020-654142 (2015-2017)
- Collaborative effort involving more than 70 institutions in over 30 countries
- Expanding the capabilities of a European backbone of federated services for compute, storage, data, communication, knowledge and expertise
- Complementing community-specific capabilities
EGI-Engage objectives
- Ensure the continued coordination of the EGI Community in strategy and policy development, engagement, technical user support and operations of the federated infrastructure in Europe and worldwide.
- Evolve the EGI Solutions, related business models and access policies for different target groups, aiming at an increased sustainability of these outside of project funding.
- Offer and expand an e-Infrastructure Commons solution.
- Promote the adoption of the current EGI services and extend them with new capabilities through user co-development.
EGI Federated Cloud
- Federated multi-national cloud system
- Integrates community, private and public clouds into a scalable computing platform for research
- The federation pools IaaS, PaaS and SaaS services
- Uses a single authentication and authorization framework
- IISAS actively participated in the EGI Federated Cloud activity with three different cloud sites
Accelerated Computing in EGI FedCloud
IISAS is the leader of the Accelerated Computing activity in EGI-Engage:
- Provides accelerated computing capabilities to user communities in EGI Federated Cloud
  - IISAS-GPUCloud site
  - IISAS-Nebula
- Coordinates development and user activities related to Accelerated computing
  - Manages the Accelerated computing VO
  - Provides tutorials, user guides and training for using Accelerated computing
  - Cooperates with developers of EGI Federated Cloud to integrate Accelerated computing into the pool of Cloud services
Use Cases and Application Audience:
- Biodiversity and Ecosystem Research
- Bioinformatics and Biomolecular simulations
- Machine Learning (ML), Artificial Neural Networks (ANN) and Deep Learning (DL) applications
- Life large-scale simulation packages with dynamic progress development
DEEP-Hybrid-DataCloud project
DEEP-Hybrid-DataCloud: Designing and Enabling E-infrastructures for intensive Processing in a Hybrid DataCloud (2017-2020), H2020-777435
Partners: CSIC (Spain), LIP (Portugal), INFN (Italy), PSNC (Poland), KIT (Germany), UPV (Spain), CESNET (Czech Republic), IISAS (Slovakia), ATOS (Spain), HMGU (Germany)
Aims:
- To support intensive computing techniques that require specialized HPC hardware, like GPUs or low-latency interconnects, to explore very large datasets
- To deploy, under the common label of “DEEP as a Service”, a set of building blocks that enable the easy development of applications requiring these techniques: deep learning using neural networks, parallel post-processing of very large data, and analysis of massive online data streams
DEEP-Hybrid-DataCloud architecture Very large datasets Deep learning framework Hybrid Cloud Specialized hardware and systems
IISAS roles in DEEP-Hybrid-DataCloud
WP leader for Accelerated and HPC Computing in Cloud:
- To build on existing cloud middleware to provide bare-metal-like performance
- To provide access to accelerators (GPU, FPGA) via all software layers to applications
- To create seamless access to HPC resources from the PaaS level
IISAS roles in DEEP-Hybrid-DataCloud
DEEP as a Service:
- To create basic building blocks for Deep learning and Big data analytics
- To compose building blocks according to application requirements
- To deploy and provide DEEP as a service
Use case: analysis of massive online event streams in order to generate alerts with hard real-time constraints.
PROCESS: PROviding Computing solutions for ExaScale challengeS
Call: H2020-EINFRA-2016-2017 e-Infrastructures
Topic: EINFRA-21-2017 Platform-driven e-infrastructure innovation
PROCESS Goals
- PROCESS will deliver a comprehensive set of mature service prototypes and tools specially developed to enable extreme scale data processing in both scientific research and advanced industry settings.
- PROCESS will provide services to three communities with exceptional requirements in terms of data processing.
- The PROCESS demonstrators will pave the way towards exascale data services that will accelerate innovation and maximise the benefits of the emerging “very large data” solutions.
- The main tangible outputs of PROCESS are five data service prototypes, implemented using a mature, modular, and generalizable open source solution for user friendly exascale data.
- The services are thoroughly validated in real-world settings: in scientific research, in industry pilot deployments, and in an open data service pilot also targeting the general public.
PROCESS Partners
- Ludwig-Maximilians-Universität München - Germany
- Universiteit van Amsterdam - Netherlands
- Netherlands eScience Center - Netherlands
- Haute Ecole Spécialisée de Suisse Occidentale - Switzerland
- Lufthansa Systems GmbH & Co. KG - Germany
- Inmark Europa SA - Spain
- Ústav informatiky, Slovenská akadémia vied - Slovakia
- Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie - Poland
PROCESS UISAV Role
- Leader of JRA1 - Design and Architecture. Responsible for the design, implementation and validation of a novel SoA architecture for exascale data management and processing.
- Leader of JRA4 - Service Orchestration and User Interfaces. Responsible for the development and deployment of the service orchestration infrastructure as well as the user interface and collaboration tools.
- One of the principal designers of PROCESS data services
- Active in JRA3 - Extreme Large Computing Service-oriented Infrastructure, developing services for Big Data manipulation and processing
- Active in JRA2, integrating data management with service orchestration
- Will participate in dissemination and exploitation
PROCESS Architecture
EOSC-hub Integrating and managing services for the European Open Science Cloud 09 January, EOSC-hub Kickoff Tiziana Ferrari/EOSC-hub Project Coordinator
EOSC-hub mobilises providers from 20 major digital infrastructures, EGI, EUDAT CDI and INDIGO-DataCloud jointly offering services, software and data for advanced data-driven research and innovation. 7/25/2019
Project figures
- 100 Partners, 76 beneficiaries (75 funded)
- 3874 PMs, 108 FTEs, more than 150 technical and scientific staff involved
- €33,331,18 of which the European Commission funds €30,000,000
- €2,155,540 (~8 FTEs) are co-funded by the EGI Foundation and its participants + €1,221,094 (~5 FTEs) from EGI participants
- 36 months: Jan 2018 – Dec 2020 (18 month reporting period)
Mission
The project will create the Hub: a federated integration and management system for the future EOSC
- Resources and Services
- Federation “core” services
- Federated operations
- Processes and policies
What does the Hub provide?
- Contact point for researchers and innovators to discover, access, use and reuse a broad spectrum of resources for advanced data-driven research
- Catalogue of resources and services
  - Humanities, Engineering, Medical and Health Sciences, Natural Sciences
- Corpus of policies, processes and federation services (community-defined)
  - Principles of engagement and EOSC “core” services
  - Quality assurance reviews
  - Manage end-to-end service level management performance
- Competence Centres and a Joint Digital Innovation Hub
  - Specialized Technical support, training
The Hub and the Three Os
Open Science
- Resources and services for sharing, discovery, access, use and reuse
- Collaboration with OpenAIRE-Advance
Open Innovation
- An open collaborative effort of service providers and user communities
- Co-design of process with Competence Centres and a Joint Digital Innovation Hub
Open to the world
- Aggregates services from local, regional and national e-Infrastructures in Europe and other regions of the world
From integration to utilization
- Integrate production-ready services
- Operate and Provide
- Access and Consume
Architecture
Work packages
Service Integration and Access Management
Service Providers 1/2
e-Infra
- EGI Federation
- EUDAT CDI
Humanities
- Language and literature (CLARIN)
- Arts (DARIAH)
Engineering
- Environmental engineering (sea vessels, LNEC)
- Civil Engineering (Disaster Mitigation)
Medical and Health Sciences
- Biological Sciences (ELIXIR)
- Structural biology (WeNMR)
Service Providers 2/2: Natural Sciences
Physical Sciences
- Astronomy (LOFAR)
- Fusion (ITER)
- High Energy Physics (CMS and VIRGO)
- Space Science (EISCAT-3D)
Earth Science
- EO Pillar GEO
- Climate Research (ENES)
- Seismology (ORFEUS, EPOS)
Biological Sciences
- Marine and freshwater biology (IFREMER)
- Biodiversity conservation (LifeWatch)
- Ecology (ICOS)
EOSC Domains and the Hub: what model?
IISAS roles in EOSC-hub
Continues the research and development activities from the EGI-Engage project:
- To participate in Federated computing activities: Federated cloud, DNS as a service
- To provide Accelerated computing capabilities to scientific communities
- To provide training, guides and user support related to Accelerated computing