1
The DRIVER initiative for networking repositories
Wolfram Horstmann, Universität Bielefeld
2
DRIVER motivation
- Scholarly communication is shifting towards distributed provision of text, data and services
- Repositories are seen as a saviour in this development, building such a distributed system
- An infrastructure supporting distributed repositories and services is needed
(and reactions) (needs explanation)
3
Some observations on repositories
They represent a shift towards …
- open internet exposure, as opposed to closed databases ("graveyards")
- content orientation, as opposed to mere technical orientation ("web servers")
- distributed systems: centralized structures are not immediately required nowadays
4
"Everybody can be a publisher"
- Common description standards, e.g. Dublin Core Metadata Initiative
- Many subject-specific standards
- Common transfer protocols, e.g. OAI-PMH, but also FTP, XML-RPC, WS, etc.
- Searchability is possible!
- Still: many results are lost to re-use/remix
  - Closed: too sensitive, weakly described, unimportant (???)
  - Missing service frameworks / infrastructures
- Problems: data and service interoperability
- Solution: "Infrastructure"
- Repositories can solve the access problem
5
What infrastructures are: DRIVER terms
- Not an infrastructure
  - Single repository
  - Single application for search and retrieval (e.g. BASE)
  - Only local operation
  - Backwards causation on repositories is missing
- Maybe an infrastructure
  - Distributed repository landscape as a whole
  - As a capacity for emergent properties, e.g. quality and quantity incentive for data population
  - Nurturing development of service providers
- Definitely an infrastructure
  - Many service providers in one organisational and technical context (e.g. run-time environment)
  - Enabling re-use and remix of data and services
6
DRIVER Objectives
- Organisational structure for repositories, e.g. the "Confederation"
- Improving quality and standards in local repositories, e.g. validation procedures
- Building a distributed runtime system, e.g. service and data sharing
Target Groups
- Repository Managers
- Service Providers
- Information System Executives
7
The DRIVER approach is incremental
- Start with publication metadata
  - Existing distributed system, somehow connected
  - Considerable homogeneity of formats: OAI-PMH
- Extend geographical coverage
  - From 5 countries, to 10, to 27, to ???
- Extend towards other contents
  - From publication metadata to enhanced publications, i.e. representations of "texts + data"
- Learn about subject specificity
  - Data bring in disciplinary requirements
8
The DRIVER Initiative
- DRIVER-I (6/2006 – 11/2007): Organisational Models and Technical Test-Bed
- DRIVER-II (12/2007 – 11/2009): Running Organisation and Production Infrastructure
- DRIVER-Confederation (2010ff): Operations Office and Technical Deployment
NB: DRIVER is not an authoritative body; it is a liberal bottom-up initiative of stakeholders
9
DRIVER partners and related projects
- Networking, Support, Policy, Studies: Göttingen, Nottingham, SURF, Ghent, Ljubljana, Minho, Copenhagen
- Technical development and deployment: Athens, Bielefeld, Pisa, Warsaw
- Partners make links to many other things
  - OA-services: Sherpa-ROMEO, OpenDOAR, BASE…
  - Projects: Europeana, PEER, DELOS, DL.org, D4Science, PARSE-Insight, NESTOR…
  - Orgs: DINI, JISC, LIBER, SPARC, KE …
  - Platforms: DSPACE/FEDORA/OPUS/ePrints
10
DRIVER-II Midterm Review, January 30, 2009 - Pisa
Project structure
- Networking / Research / Service
- Running Infrastructure: Content & Functionality
- Construction of Services: ideas, design, development
- Technical Management
- Advocacy: attracting users, content and service providers
- Discovery: technology watch, EPs requirements
11
Some results
12
Some Results: Studies
13
Some Results: A Portal
14
Some Results: A Search
15
Some Results: Repository Registration
16
Some Results: Guidelines
- Build on knowledge from past & current IR projects (EU)
- 26 actively involved contributors (experts and repository managers) from 8 countries
- Practical answers on how to:
  - Improve full-text access
  - Standardize metadata quality
  - Create a reliable infrastructure for permanent identification, resolution, traceability and storage
  - Resolve semantic and classification issues
17
Some Results: Support structures
18
Some Results: Repositories
- 185+ harvested repositories
- 21 countries
- 856,264+ documents
19
Some Results: Service-Oriented Architecture
- 9 hosting nodes
- 25+ functionality typologies (services)
- 36 service instances
- 3 applications: DRIVER Main, Belgium, Spain-Recolecta
20
Some Results: Runtime-System & Hosting
[Diagram labels: Enabling Layer; Data Layer; EU Open Access Repositories; Functionality Layer; Administrators; End users; Advanced User Interfaces; National portals; Project Applications]
21
Another Compulsory Design Diagram
22
Some Results: A software
Meant for large service providers only!
23
Technicalities
24
DRIVER and standards
- Service Resources are implemented as Web Services and accessed through the corresponding Web Service Interface
  - Call parameters are enveloped in SOAP messages
  - The Enabling Services are also compatible with REST
- XML is the lingua franca for the whole system
  - Resource internal status, i.e. Resource profiles
  - Profiles in the Information Service use the eXist XML engine
- Vocabularies
  - Names of languages: ISO 639-2 (three letters, B/T)
  - Names of countries: ISO 3166 (two letters)
  - Date format: ISO 8601:1988 (E)
- DRIVER Aggregation
  - Harvesting according to the OAI-PMH protocol
  - Adopting OAI-Provenance best practice (OAI-about)
  - To be extended to other object models and harvesting protocols
- Queries to Search and Index conform to the SRW/CQL standard
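To make the harvesting bullet concrete, here is a minimal Python sketch of one OAI-PMH `ListRecords` round: it builds the request URL and reads record identifiers and the resumption token from a response page. The base URL, the `driver` set name and the sample XML are illustrative assumptions, not taken from the slides, and this is not the DRIVER Harvester itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it is sent alone with the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)

def parse_page(xml_text):
    """Return (record identifiers, resumption token or None) for one response page."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return ids, (token.strip() or None)

# Hypothetical response page, shortened to the elements the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://repo.example.org/oai", set_spec="driver")
ids, token = parse_page(SAMPLE)
next_url = list_records_url("https://repo.example.org/oai", resumption_token=token)
```

A harvester would repeat the request with each returned token until a page arrives with an empty or missing `resumptionToken`.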
25
Enabling Layer Developments
Service | Function | Task | Partner | Status | D-NET
IS-Store | Resource profile store | Enhanced Port (PERL > JAVA) | CNR | RC | 1.1
IS-S&N | W3C S&N/Topics | Enhanced Port (PERL > JAVA) | CNR | RC | 1.1
IS-Lookup | Resource discovery | Enhanced Port (PERL > JAVA) | CNR | RC | 1.1
IS-Registry | Resource registration/de-registration/update | Enhanced Port (PERL > JAVA) | CNR | RC | 1.1
Manager | Orchestration of DRIVER Info Space | Enhanced Port (PERL > JAVA) | CNR | RC | 1.1
Authn&Authz | Service-2-Service secure interaction/multiple applications | Enhanced Service (JAVA) | ICM | Proto | 2.0
Monitoring | Admin User Interface and autonomic administration | Novel Service (JAVA) | CNR | RC | 1.2
26
Data-Layer Developments
Service | Function | Task | Partner | Status | D-NET
Harvester | Collects arbitrary formats | Port (PERL > JAVA) | UniBi/CNR | Alpha | 2.0
Transformator | Eases arbitrary mappings | Novel service (JAVA) | UniBi/CNR | Alpha | 2.0
Feature Extraction | Executes transform.s. and utilities | Novel service (JAVA) | UniBi | Alpha | 2.0
Text-Engine | Utilities, e.g. language detection, full-text-extr. | Novel service (JAVA) | UniBi | Alpha | 1.1
MD-Store | Support special MD operations | Port (PERL > JAVA) | UniBi | Alpha | 1.1
Store | Generic store for binaries | Novel service (JAVA) | UniBi/ICM/CNR | Proto | 2.0
Index | Lookup table for stored information | Adapt from YADDA | ICM/UniBi | Prod. | 1.0
OAI-ORE Publisher | Exposure of stored information | Novel service (JAVA) | CNR | Spec. | 2.0
OAI-PMH Publisher | Exposure of stored information | -- | CNR | Prod. | 1.0
Content Service | Managing complex objects | Novel service (JAVA) | CNR | Proto | 2.0
Access Service | Generic service for using remote objects | Novel service (JAVA) | CNR | Proto | 2.0
27
Functional Layer Developments
Service | Function | Task | Partner | Status | D-NET
AID | Enhanced Publications management | Novel Service (JAVA) | NKUA | Spec. | 2.0
Advanced search | Optimized Search / Similarity Search | Enhanced Service (JAVA) / Novel Service (JAVA) | NKUA / ICM | Spec. | 2.0
User Services | Advanced personalization | Enhanced Service (JAVA) | NKUA | Spec. | 2.0
Community Service | Advanced Community management | Enhanced Service (JAVA) | NKUA | Spec. | 2.0
Web Interface | Generic to data model and services / Enhanced UIs | Enhanced Service (JAVA) | NKUA | Spec. / Spec. | 1.2 / 2.0
28
Current Work: DRIVER-II
- Networking
  - Confederation with who-is-who advisory board
  - Outreach: LIBER, SPARC, US, JAPAN etc…
- Consolidation: DRIVER-I services packaged and performing in production quality
- Enhancement: DRIVER-I services with improved indexing and data aggregation functionalities
- DRIVER-II services (D-NET v2.0): enhanced publication management and functionality
29
DRIVER II – D-NET v2.0
- Studies
  - What are "Enhanced Publications"? >> PDF
  - Technologies for "Enhanced Publications" >> PDF
  - Long-Term Preservation of "Enhanced Publications"
  - "Technology Watch": the Future >> PDF
- Demonstrators
  - "Enhanced Publications" >> Live
  - "Enhanced Publications" Long-Term Preserv. >> Film
- Infrastructure
  - Specs. ready, development in progress >> WIKI
  - D-NET v1.1: Java-Porting & Build-System
  - D-NET v1.2: New Aggregator, Installer (, Contracts)
  - D-NET v2.0: Compound Object Management
30
Outlook: Enhanced Publications
31
Based on OAI-ORE
32
The Web-Capable Model – OAI-ORE
http://www.openarchives.org/ore/
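As a rough illustration of what OAI-ORE contributes, the Python sketch below expresses an enhanced publication ("text + data") as an ORE aggregation described by a resource map, written out as plain subject/predicate/object triples. The `ore:` terms come from the OAI-ORE vocabulary; the example URIs and the function name are hypothetical, and nothing here reflects how D-NET actually serialises its objects.

```python
# OAI-ORE vocabulary namespace and the RDF type predicate.
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def resource_map_triples(rem_uri, aggregation_uri, part_uris):
    """Return the core triples of an ORE resource map as (s, p, o) tuples."""
    triples = [
        (rem_uri, RDF_TYPE, ORE + "ResourceMap"),
        (rem_uri, ORE + "describes", aggregation_uri),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
    ]
    # One ore:aggregates statement per component of the compound object.
    triples += [(aggregation_uri, ORE + "aggregates", part) for part in part_uris]
    return triples

# Hypothetical enhanced publication: one article plus one dataset.
triples = resource_map_triples(
    "https://repo.example.org/rem/42",
    "https://repo.example.org/aggregation/42",
    ["https://repo.example.org/files/article.pdf",
     "https://repo.example.org/files/dataset.csv"],
)
```

Keeping the resource map and the aggregation as separate URIs follows the ORE model, in which the map is a document on the web and the aggregation is the abstract compound object it describes.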
33
The Document Model for DRIVER
34
The Object Model – Internal Processing
- Primitives: Types, Sets and Objects
- Object: atoms, descriptions, relations
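The slide names only the primitives, so the following Python sketch is one possible reading of them rather than the DRIVER data model: objects carry a type and one of the three kinds (atom, description, relation), and a set collects objects of a single type. All class names, field names and the `isSupplementedBy` label are hypothetical.

```python
from dataclasses import dataclass, field

KINDS = {"atom", "description", "relation"}  # object kinds named on the slide

@dataclass
class DObject:
    """Hypothetical object: an atom (payload), a description (metadata) or a relation (link)."""
    oid: str
    type_name: str  # the Type this object conforms to
    kind: str
    content: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown object kind: {self.kind}")

@dataclass
class DSet:
    """Hypothetical set: a named collection of objects that share one Type."""
    name: str
    type_name: str
    members: list = field(default_factory=list)

    def add(self, obj: DObject) -> None:
        # Reject objects whose type does not match the set's type.
        if obj.type_name != self.type_name:
            raise TypeError(f"{obj.oid} is not of type {self.type_name}")
        self.members.append(obj)

def relate(oid, source, target, label):
    """Build a relation object linking two existing objects by their ids."""
    return DObject(oid, "Relation", "relation",
                   {"source": source.oid, "target": target.oid, "label": label})

# Example: an enhanced publication as text + data, linked by a relation.
article = DObject("o1", "File", "atom", {"mime": "application/pdf"})
dataset = DObject("o2", "File", "atom", {"mime": "text/csv"})
link = relate("o3", article, dataset, "isSupplementedBy")

files = DSet("publication-files", "File")
files.add(article)
files.add(dataset)
```

Modelling relations as objects in their own right, instead of as fields on the atoms, lets the same file take part in several compound objects without being copied.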
35
The DRIVER-application
36
Compound Object Management
[Diagram labels: Object Instances; DRIVER Processing; DRIVER Application; Web-Representation; Web-Processing]
37
Conclusion
38
Lessons learnt
- Distributed data infrastructure requires links between organisational and technical concepts
  - Data specialists, computer scientists, service providers
  - Guidelines / content policies as a "glue"
- In distributed data provision, quality and access measures are the most "expensive" tasks
- Distributed service operation (not data provision) can be solved but raises novel questions (SLAs)
- "Infrastructure" for novel paradigms of scholarly communication is hard to get across ;-)
39
Summary
- DRIVER tackles the data infrastructure challenge from the text-repository side (mostly OAI-PMH)
- DRIVER handshakes with primary & secondary data through "enhanced publications"
- DRIVER isn't only a project but a forum for information specialists
- "Products" include: studies, infrastructure run-time system in production, software, support …
- DRIVER has addressed many problems of data and service interoperability in a distributed repository environment and found some solutions
40
But… How could DRIVER link to serious processing of unstructured data?
41
Thanks