Published by Tracey Marilynn Oliver. Modified over 8 years ago.
1
DASISH: Digital Services Infrastructure for Social Sciences and Humanities
Daan Broeder, TLA - MPI for Psycholinguistics / DASISH & CLARIN
EGI Forum, Garching, March 27 2012
2
DASISH consortium
18 partners from 10 countries in the EU + Norway
5 ESFRI infrastructures: CESSDA, CLARIN, DARIAH, ESS and SHARE
DASISH budget: EUR 6 million -> 700 PMs (person-months)
Started Jan 2012
Already existing collaborations
– Partners are part of multiple research infrastructures
– ESFRI project merging (proposals), e.g. CLARIN + DARIAH -> CLARIAH
3
DASISH consortium
DASISH brings together 5 ESFRI infrastructures by focusing on common activities across disciplines and infrastructures.
DASISH aims to provide solutions to common problems that will also strengthen international collaboration.
CESSDA: Council of European Social Science Data Archives
CLARIN: Common Language Resources and Technology Infrastructure
DARIAH: Digital Research Infrastructure for the Arts and Humanities
ESS: European Social Survey
SHARE: Survey of Health, Ageing and Retirement in Europe
Two are already established ERICs; the others have (long-)established collaborations.
All aim for enhanced visibility and re-usability of digital resources, tools and services.
All are constructing digital, distributed research infrastructures that give researchers an environment for storing and retrieving data, with community-specific tools and services and persistent, high-quality common data services.
All face many identical types of challenges.
All see the advantages of cross-fertilization and synergy in the construction phase.
4
Consortium
Partners: SND, DANS, UEssex, FSD, NSD, GESIS, CITYKCL, UGOE, OEAW, MIPL, UCPH, UIB, UPF, NUIM, UMA, UNIVE, CentERdata, UT
Infrastructures: CESSDA, DARIAH, CLARIN, ESS, SHARE
5
Management
6
DASISH Mission
DASISH provides and brokers solutions for a number of common issues of the five ESFRI projects in social sciences and humanities.
DASISH identifies four major areas:
– data quality (surveys): ESS & SHARE
– data archiving
– data access
– legal and ethical issues
General procedure:
– inventory of current status
– analysis
– brokering or new implementation
– recommendations
– outreach and education
7
DASISH Challenges
Need to create common infrastructure, not just strengthen community-specific ones
Traditions vary considerably
– Between the social sciences on one side and the humanities on the other, but also within the humanities
– Some collaborations/communities have a rich history
– Others are fairly new (as an infrastructure)
– Organizational models and complexity vary and affect preferences for solutions
– Differences wrt. understanding IT issues
Language and terminology vary even more, as past discussions have taught us
8
Common Services I
(Meta)Data Quality
– Highly domain specific
Archiving
– Generic, but specific formalized policies required
Access
– Generic AAI (for SSH): SSO and single identity; FIM based on SAML2, VO platform(s); GEANT/eduGAIN
– Exceptions for very sensitive data should be possible
Persistent identifiers
– Generic, but often determined by tradition, e.g. URNs vs. HS/DOI
– Added functionality in cooperation with other e-infra projects such as EUDAT
– PID service providers such as EPIC
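To make the persistent identifier point concrete, here is a minimal sketch of how a Handle or DOI can be turned into a resolvable link through the public proxies hdl.handle.net and doi.org. The `hdl:`/`doi:` prefix convention, the function name `pid_to_url` and both example identifiers are assumptions for illustration; the slides do not prescribe a notation, and URNs are deliberately left out because they have no single global resolver.

```python
from urllib.parse import quote

# Public HTTP proxies for the Handle System (HS) and DOI.
RESOLVERS = {
    "hdl": "https://hdl.handle.net/",
    "doi": "https://doi.org/",
}


def pid_to_url(pid):
    """Turn 'hdl:<prefix>/<suffix>' or 'doi:<prefix>/<suffix>' into a resolver URL."""
    scheme, sep, identifier = pid.partition(":")
    scheme = scheme.lower()
    if not sep or scheme not in RESOLVERS:
        raise ValueError(f"unsupported PID: {pid!r}")
    if "/" not in identifier:
        raise ValueError(f"PID lacks a prefix/suffix separator: {pid!r}")
    # Keep '/' readable; percent-encode anything else that is unsafe in a URL path.
    return RESOLVERS[scheme] + quote(identifier, safe="/")


# Hypothetical identifiers for illustration only; they do not point at real objects.
print(pid_to_url("hdl:12345/example-object"))
print(pid_to_url("doi:10.12345/example.dataset"))
```

Because the resolver sits between the identifier and the storage location, an archive can move an object and update only the PID record, which is what makes the access "persistent".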
9
Common Services II
Joint Metadata Domain
– Many different metadata schemas
– Semantic interoperability is a challenge
– Large number of records; granularity problem
Annotation framework
– Relations between (parts of) on-line data objects: PIDs for parts, adequate visualization
– (Partly) data-type specific; impl. by DASISH
Workflows, examples from the SSH
– Computing (CLARIN, …): dynamic deployment of (web) services in SOA; organizational problem of service sustainability
– Storage for workspaces (CLARIN, DARIAH, …)
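One common way to approach a joint metadata domain is a crosswalk: each community schema gets a mapping onto a small set of shared fields. The sketch below illustrates the idea only. The schema names, field names and the `to_common_record` helper are all invented and are not taken from any real CESSDA, CLARIN or DARIAH schema, and the slides do not say that DASISH uses this approach.

```python
# Hypothetical crosswalks: source field name -> common field name.
# All names here are invented for illustration.
CROSSWALKS = {
    "survey_schema": {
        "studyTitle": "title",
        "principalInvestigator": "creator",
        "collectionDate": "date",
    },
    "language_schema": {
        "resourceName": "title",
        "depositor": "creator",
        "recordingDate": "date",
    },
}


def to_common_record(record, schema):
    """Map a source record onto the common fields.

    Returns (common, unmapped). Fields without a mapping are returned
    separately rather than dropped, so the information loss stays visible.
    """
    mapping = CROSSWALKS[schema]
    common = {"source_schema": schema}
    unmapped = {}
    for field, value in record.items():
        if field in mapping:
            common[mapping[field]] = value
        else:
            unmapped[field] = value
    return common, unmapped


survey = {"studyTitle": "Example Survey", "collectionDate": "2010", "sampleSize": 1500}
common, unmapped = to_common_record(survey, "survey_schema")
print(common)
print(unmapped)
```

The `unmapped` part is where the semantic interoperability challenge shows up: a field that matters in one discipline may have no counterpart in the shared catalog.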
10
DASISH Context
SSH communities wide - DASISH: common SSH metadata catalog
Community specific: CLARIN LT web service infrastructure
Network services - GEANT: Federated Identity Management
Data preservation - EUDAT: replication & preservation
Diagram labels: CLARIN, DARIAH, CESSDA, Life Watch, DASISH
11
Thank you for your attention
12
DASISH Topics
Make valuable data explicit
– Move data from backyards into data centers
– Provide persistent access using PIDs
Take care of long-term preservation issues
– Provide objects’ persistence, authenticity and integrity
– Selection and curation issues
Make valuable data visible to others
– Provide high-quality metadata with explicit syntax and declared semantics
– Joint portals to show the metadata with smart filters and proper context
Make data accessible to humans and tools
– SSO using federated identity and authentication
– Allow building virtual collections and workflows
– Increase the interoperability between resources; push pragmatic standardization
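Integrity of archived objects is commonly checked with a fixity check: a checksum is recorded at deposit time and recomputed later. The sketch below shows the idea with SHA-256 from the Python standard library. The function names and the sample bytes are assumptions for illustration; the slides do not specify a checksum algorithm or a procedure.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Compute the SHA-256 hex digest of a binary stream, reading it in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(stream, recorded_digest):
    """Return True if the stream's current digest matches the recorded one."""
    return sha256_of_stream(stream) == recorded_digest.lower()


# Invented sample content standing in for a deposited object.
deposited = b"interview recording, session 1"
recorded = sha256_of_stream(io.BytesIO(deposited))

print(verify_fixity(io.BytesIO(deposited), recorded))          # unchanged copy
print(verify_fixity(io.BytesIO(deposited + b"!"), recorded))   # altered copy
```

A matching digest only shows that the bytes are unchanged. Authenticity, meaning who deposited the object and in what state, needs provenance records on top of this.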
13
DASISH Topics
Data Representation and Validation
– Investigate current workflows and tool use wrt. formats
– Distinguish between formats for short- and long-term preservation and for shared and non-shared data
Legal & Ethical issues
– Investigate required classification of data wrt. restricted access, IPR and licensing
– Harmonization of licenses, academic use
Policies for Curation and Selection
– Selection process should be (stakeholder) policy based
– Explicit criteria for determining relevance of data
Efficient high-quality Questionnaire Creation and Processing
– Multilingual aspect using terminology registries is of interest to DASISH as a whole
Usage monitoring
Help educate a new generation of users: Training, Education, Support & Helpdesk
14
Possible use cases I
These have not yet been fleshed out. Some ideas from the CLARIN side:
Social scientists have recordings that are of interest to linguists.
– Locate these using appropriate metadata and process them with LT tools to analyze gesture
– The analysis results should be (after evaluation) deposited into an archive with proper references to the primary data
– The analysis data should again be registered with proper metadata for reuse
Use of demographic data for corpus building and use
– Give a linguist building a balanced speech corpus access to demographic data
– How many of the speakers need to be older than 65 for the corpus to be a representative sample?
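The question about speakers older than 65 can be read as proportional quota allocation: each age group gets a share of the corpus equal to its share of the population. The sketch below shows one way to compute such quotas with the largest-remainder method, so that the quotas add up exactly to the corpus size. The age groups and population counts are hypothetical and are not real demographic data; a real corpus design would also balance on other variables such as region or gender.

```python
def proportional_quotas(population_counts, corpus_size):
    """Allocate corpus_size speakers across strata in proportion to population.

    Uses integer arithmetic and the largest-remainder method, so the
    returned quotas always sum to corpus_size.
    """
    total = sum(population_counts.values())
    if total <= 0:
        raise ValueError("population counts must sum to a positive number")
    quotas = {}
    remainders = {}
    for stratum, count in population_counts.items():
        quotas[stratum], remainders[stratum] = divmod(count * corpus_size, total)
    leftover = corpus_size - sum(quotas.values())
    # Give the remaining places to the strata with the largest remainders.
    for stratum in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[stratum] += 1
    return quotas


# Hypothetical population counts (in thousands), NOT real demographic data.
population = {"under 25": 270, "25-64": 540, "65 and older": 190}
quotas = proportional_quotas(population, corpus_size=200)
print(quotas)
```

With these invented counts, 19% of the population is 65 or older, so a 200-speaker corpus needs 38 speakers from that group to match the population proportionally.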
15
DASISH Work packages
1 Management: 44
2 Architecture and Quality assessment: 83
– Liaise with other e-infra initiatives
– Get requirements for a ref. architecture
– Assess results
3 Data Quality: 200
– Improve EU-wide survey quality: terminology, translation, vocabulary normalizations
4 Archiving: 67
– State of preservation in SSH
– Assessment of deposit services, recommendations, negotiations
– Deposit service convergence
5 Data Access and Enrichment: 171
– Federated login
– PID requirements
– Metadata quality improvement
– Joint metadata domain
– Workflow use cases
– Annotation framework
6 Legal and Ethical Issues: 68
– Identify legal and ethical issues wrt. current and new SSH data types resulting from the integration, linking and archiving
– Legal & ethical VCC
7 Education and Training: 56
– Training modules, workshop program
8 Dissemination: 34
– Communication strategy and means
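The figures on the work package slide can be checked against the overall effort quoted earlier in the deck. The short sketch below sums them and shows each package's share. It assumes that the number at the end of each row is effort in person-months, which the slide does not state explicitly; the `WP1`...`WP8` labels are shorthand added here.

```python
# Effort per work package, copied from the slide.
# Assumption: the trailing number in each row is person-months (PMs).
WORK_PACKAGES = {
    "WP1 Management": 44,
    "WP2 Architecture and Quality assessment": 83,
    "WP3 Data Quality": 200,
    "WP4 Archiving": 67,
    "WP5 Data Access and Enrichment": 171,
    "WP6 Legal and Ethical Issues": 68,
    "WP7 Education and Training": 56,
    "WP8 Dissemination": 34,
}


def effort_shares(packages):
    """Return the total effort and each package's share of it as a fraction."""
    total = sum(packages.values())
    shares = {name: pm / total for name, pm in packages.items()}
    return total, shares


total, shares = effort_shares(WORK_PACKAGES)
print(f"total: {total}")
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.1%}")
```

Under that assumption the rows add up to 723, close to the 700 PMs quoted on the consortium slide, and Data Quality together with Data Access and Enrichment account for roughly half of the total.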
16
Possible use cases II
Combine maps of linguistic dialects or linguistic micro-variation with migration statistics.
– Looking at variation both in place and time should be interesting
Make metadata on medieval texts & literature available on the web and interlink it with manuscripts and transcripts available from cultural heritage institutions
– Have the possibility to add enhancements and comments from the research community
Give historians of science and ideas access to language technology to analyze historical texts
– Allows following the appearance and spread of new concepts and inventions
– The Dutch CCKC project used this to analyze the correspondence between scientists in the 17th century
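Following the appearance and spread of a concept can, in its simplest form, mean counting per year how many dated documents mention a term. The sketch below does just that with the Python standard library. It is a toy illustration and not the method of the project named on the slide; the function name and the sample letters are invented, and real historical texts would first need spelling normalization and multilingual handling.

```python
import re
from collections import Counter


def term_timeline(documents, term):
    """Count, per year, the documents that mention term as a whole word.

    documents is an iterable of (year, text) pairs; matching is case-insensitive.
    """
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    counts = Counter()
    for year, text in documents:
        if pattern.search(text):
            counts[year] += 1
    return dict(sorted(counts.items()))


# Invented snippets, not quotations from any real correspondence.
letters = [
    (1655, "I have observed the planet again through my telescope."),
    (1657, "The pendulum clock keeps time remarkably well."),
    (1658, "Your Pendulum design was shown to the academy."),
    (1658, "We discussed the pendulum and the weather."),
]

timeline = term_timeline(letters, "pendulum")
print(timeline)
print("first mention:", min(timeline))
```

The earliest year in the timeline marks the first appearance in the sample, and the growth of the counts gives a rough picture of the spread.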
17
Thank you for your attention
18
Background
DASISH – Digital Services Infrastructure for Social Sciences and Humanities
ESFRI projects / Communities
– CESSDA - Council of European Social Science Data Archives
– CLARIN - Common Language Resources and Technology Infrastructure
– DARIAH - Digital Research Infrastructure for the Arts and Humanities
– ESS - European Social Survey
– SHARE - Survey of Health, Ageing and Retirement in Europe
19
DASISH Work Packages (tentative)
WP 1 Management – U Gothenburg/CESSDA
WP 2 Architectures and Solutions – DANS/DARIAH
WP 3 Data Quality – U London/ESS
WP 4 Data Archiving – U Gothenburg/CESSDA
WP 5 Shared Access & Enrichment – MPI/CLARIN
WP 6 Legal and Ethical Issues – U Mannheim/SHARE
WP 7 Education and Training – ?/DARIAH
WP 8 Dissemination – ?/CLARIN
20
ESS – European Social Survey
Start date mentioned: 2001
Although the ESS was always intended to be a time series, it has hitherto been funded on a round-by-round basis. The central coordination and design has been funded through the European Commission’s Fifth and Sixth Framework Programmes and the European Science Foundation. The national scientific funding bodies in each country cover the costs of fieldwork. The ESS is also among the first social science projects to receive funding to support its infrastructure and in 2005 was awarded Europe’s top annual science award, the Descartes prize.
The European Social Survey (the ESS) is an academically-driven social survey designed to chart and explain the interaction between Europe's changing institutions and the attitudes, beliefs and behaviour patterns of its diverse populations. Now preparing for its fifth round, the survey covers more than 30 nations and employs the most rigorous methodologies. A repeat cross-sectional survey, it has been funded through the European Commission’s Framework Programmes, the European Science Foundation and national funding bodies in each country. The ESS is also among the first social science projects to receive funding to support its infrastructure. In 2005 the ESS was awarded Europe’s top annual science award, the Descartes prize. The ESS has also been nominated by ESFRI as a possible future European Research Infrastructure Consortium.
21
SHARE – Survey of Health, Ageing and Retirement in Europe
The Survey of Health, Ageing and Retirement in Europe (SHARE) is a multidisciplinary and cross-national panel database of micro data on health, socio-economic status and social and family networks of more than 45,000 individuals aged 50 or over. As such, it responds to a Communication by the European Commission calling to "examine the possibility of establishing, in co-operation with Member States, a European Longitudinal Ageing Survey". By now SHARE has become a major pillar of the European Research Area and in 2008 was selected as one of the projects to be implemented in the European Strategy Forum on Research Infrastructures (ESFRI).
The SHARE data collection has been primarily funded by the European Commission through the 5th framework programme (project QLK6-CT-2001-00360 in the thematic programme Quality of Life) and through the 6th framework programme (projects SHARE-I3, RII-CT-2006-062193, and COMPARE, CIT5-CT-2005-028857). Additional funding came from the U.S. National Institute on Aging.
http://www.share-project.org/