EGI FedCloud in Digital Humanities


1 EGI FedCloud in Digital Humanities
Davor Davidović, Ruđer Bošković Institute; DARIAH Competence Centre, EGI-Engage

2 Digital Arts and Humanities
[Diagram] Activities: search, browse, access, annotate, archive. Steps: digitization, storing, analysis. Resources: STORAGE, COMPUTE.

3 What is DARIAH-ERIC?
DARIAH, the Digital Research Infrastructure for the Arts and Humanities, aims to enhance and support digitally-enabled research and teaching across the humanities and arts. It is a connected network of tools, information, people and methodologies for investigating, exploring and supporting research across the digital arts and humanities, for researchers and humanists. DI4R conference, Krakow, 28-30.09.2016.

4 DARIAH Organization
Virtual Competency Centres (VCCs): VCC e-Infrastructure, VCC Research and Education, VCC Scholarly Content Management, VCC Advocacy. The VCCs cover strategic areas and topics, provide sustainability and incorporate the outcomes of working groups. 20 Working Groups, including: Text and Data Analytics, Natural Language Processing, Training and Education, Digital Annotation, Visual Media, Guidelines and Standards. The working groups are dynamic and flexible units with specific goals and outcomes, related to one or more VCCs.

5 DARIAH resources today
DARIAH is not a service provider and does not provide any compute or storage resources, so... Scattered resources: local, institutional, national, public; different access policies, … Allocated through projects (HaS, Ariadne, Cendari, NeDiMAH): sustainability? Limited usage of cloud technologies: national providers (e.g. DARIAH-DE); small number of available cloud-based services/applications (e.g. in EGI FedCloud). In-kind contributions.

6 A&H requirements
Storage and data capacities: digital repositories and archives; long-term data retention. Compute resources. Simple access, AAI: DARIAH IdP, eduGAIN. Training and education on using e-Infrastructure.

7 EGI
European: 32 countries (National Grid Initiatives). Grid: federating IT services. Infrastructure: compute power, storage, applications (clouds, grids, clusters, ...). Sustainability: sustainable operations. Project-driven innovation: EGI-Engage, Indigo-DataCloud, AARC, etc.

8 What does EGI offer to user communities?
Technical support. Compute resources. Storage resources. User-specific applications. Base services: AAI, monitoring, service registry.

9 EGI-Engage – DARIAH Competence Centre
Goal: widen the usage of the federated (cloud) services for A&H research. Objectives: strengthening the collaboration between EGI and DARIAH; increasing the number of running cloud-based services and applications for A&H; raising awareness of the benefits of using e-Infrastructure in A&H; providing access to EGI FedCloud resources; technical support.

10 DARIAH-CC workplan
Development phase: provide direct access to compute and storage resources (VMs, block storage, …) by establishing the DARIAH VO; develop selected services and applications (AAI: eduGAIN, OpenID); build demonstrators/examples. Dissemination phase: dissemination actions (workshops, presentations, events, ...); training and education; technical support; engaging new use cases from A&H.

11 Available (planned) FedCloud resources
Virtual organization: vo.dariah.eu. EGI-DARIAH SLA: 1/4/2016 – 1/9/2017. Resource providers: GWDG (DE), MTA SZTAKI (HU), INFN-Bari (IT), SRCE (CRO), INFN-Catania (IT). Resources: 30 vCPUs, 70 GB memory, 2 TB storage.

12 DARIAH-CC software stack
[Layered diagram] Outreach, training, user support. Community apps and services: Optical Character Recognition system, Semantic Search Engine, training platform, new apps and services. Technologies: Cloud Access, DARIAH Science Gateway, WS-PGRADE, CDSTAR. e-Infrastructure: federated DARIAH resources, AAI, EGI FedCloud infrastructure, DARIAH Virtual Organisation.

13 Services for A&H
FedCloud services, applications and tools. Generic services: end-user oriented, non-specific applications (DH Gateway, PSSE, Cloud Access, File Transfer). Developers services: development services oriented to app developers and service providers (gLibrary, CDSTAR, WS-PGRADE). Demonstrators: end-user oriented, for specific research groups and specific use cases (OCR).

14 DARIAH Science Gateway
Central access point for FedCloud resources. User login via eduGAIN (in progress) -> DARIAH IdP, OpenID. Access to DARIAH VO resources. Based on Liferay and WS-PGRADE/gUSE. Generic services: file transfer, Cloud Access. Specific A&H services: PSSE, SSE, OCR (beta).

15 DARIAH Science Gateway
[Architecture diagram] Components: Identity provider 1, Identity provider 2; DARIAH Science Gateway with PSSE portlet, OCR portlet, gLibrary portlet and further APP portlets; DCI Bridge with robot certificate; EGI FedCloud.

16 Parallel Semantic Search Engine (PSSE)
Parallel search across Open Access repositories. Search and semantically correlate contents in geographically distributed digital repositories across several different domains.
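
The parallel part of such a search can be sketched as follows. This is only an illustration under stated assumptions, not the PSSE implementation: the repositories are hypothetical in-memory lists standing in for remote Open Access repositories, the function names are invented for the example, and plain keyword matching replaces the semantic correlation step.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote repositories (assumption: in reality
# each would be queried over the network through its own search interface).
REPOSITORIES = {
    "repo_a": ["Bavarian dialect dictionary", "Medieval charters of Krakow"],
    "repo_b": ["Dialect atlas of Central Europe", "Baroque music scores"],
    "repo_c": ["Roman pottery catalogue"],
}


def search_repository(name, keyword):
    """Return (repository, title) pairs whose title contains the keyword."""
    needle = keyword.lower()
    return [(name, title) for title in REPOSITORIES[name] if needle in title.lower()]


def parallel_search(keyword):
    """Query all repositories concurrently and merge the hits in a stable order."""
    names = sorted(REPOSITORIES)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map keeps the order of `names`, so the merged result is deterministic.
        per_repo = list(pool.map(lambda n: search_repository(n, keyword), names))
    return [hit for hits in per_repo for hit in hits]


if __name__ == "__main__":
    for repo, title in parallel_search("dialect"):
        print(repo, "->", title)
```

Threads fit here because querying remote repositories is dominated by waiting on the network; the slowest repository then bounds the total search time instead of the sum of all of them.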

17 Simple cloud services
Simple Cloud Access: IaaS, a drag&drop app running on FedCloud; OpenStack from INFN Bari and Catania. FileTransfer: DataAvenue.

18 Developers services
gLibrary: repository framework developed by INFN; access to existing repositories and creation of new ones via REST API; a “tool” for creating and managing repositories. CDSTAR: Common Data Storage Architecture (GWDG); provides a system that can store, modify, search and access structured and unstructured data; already in use by DARIAH-DE, but not on FedCloud!

19 Use case 1: The Virtual Dialect Dictionary
100+ year old collection of Bavarian dialects from the Austro-Hungarian monarchy. 50,000+ records (provided by the Austrian Academy of Sciences). Organize, search, store and retrieve digital assets. Target users: lexicographers. Link:

20 Use case 2: Optical Character Recognition
Digitization of a large collection of scanned or photographed documents, with a search option. Based on the CDSTAR framework for storing and analyzing data. OCR for Big Data problems: MapReduce parallelization model. SaaS: installation possible on any cloud site. Beta version.
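
The MapReduce model mentioned on this slide can be illustrated with a minimal sketch that builds a searchable index from recognized text. It rests on assumptions that are not from the presentation: the pages are already OCR'd strings, the function names are hypothetical, and everything runs in one process, whereas a real deployment would spread the map and reduce steps over cloud workers.

```python
from collections import defaultdict


def map_page(page_id, text):
    """Map step: emit one (word, page_id) pair per word on a page."""
    return [(word.lower(), page_id) for word in text.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted page ids by word."""
    grouped = defaultdict(list)
    for word, page_id in pairs:
        grouped[word].append(page_id)
    return grouped


def reduce_word(word, page_ids):
    """Reduce step: collapse the page ids of one word into a sorted, unique list."""
    return word, sorted(set(page_ids))


def build_index(pages):
    """Run map, shuffle and reduce over a dict of {page_id: recognized_text}."""
    # Each map_page call is independent, so this loop is the part that
    # could be distributed across many workers.
    mapped = [pair for page_id, text in pages.items() for pair in map_page(page_id, text)]
    return dict(reduce_word(word, ids) for word, ids in shuffle(mapped).items())


if __name__ == "__main__":
    pages = {"p1": "Old dialect text", "p2": "dialect dictionary"}
    index = build_index(pages)
    print(index["dialect"])
```

The resulting inverted index maps each word to the pages that contain it, which is one simple way to provide the search option over a digitized collection.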

21 Indigo-DataCloud
INtegrating Distributed data Infrastructures for Global ExplOitation. Goal: develop a sustainable PaaS Cloud solution for e-Science. 26 partners, 11 countries, 11 scientific communities.

22 INDIGO DARIAH repository platform
Platform for easy creation of new repositories. Provides simple deployment and hosting of Open Access repository solutions in the Cloud: Invenio, ePrints, Islandora, OAR (Docker images). Does not require technical knowledge; makes deploying a repository in the Cloud simple. Beneficiaries: small groups, individuals, A&H-related projects. Under development.

23 DARIAH repository - scheme

24 Who needs these services?
Who are the beneficiaries of the services in DARIAH? Beneficiaries: communities, researchers, service providers. Services: services and applications (OCR, PSSE); DH Science Gateway; frameworks and engines (gLibrary, CDSTAR). Researchers -> working groups. Communities -> DARIAH-related projects. Service providers -> existing DARIAH resource providers.

25 Future plans for FedCloud in A&H
Integration of the services into the Gateway. Finish registering the Gateway as an eduGAIN SP. Support DARIAH via the “Cloud infra” working group. Engage new use cases and user communities to explore existing services (PSSE, OCR) and to provide new user-specific services. Prepare demonstrators: OCR. Workshop (hands-on): for app developers (Indigo repository, gLibrary, CDSTAR) and for end users (PSSE, OCR, Cloud Access).
