Published by Myron Marshall. Modified over 9 years ago.
1
IWIR-CRIS '06
Data retrieval in PURE
Data retrieval in the 4-year-old PURE CRIS project at 9 universities
2
atira Niels Jernes Vej 10 DK-9220 Aalborg +45 9635 6100 www.atira.dk
Agenda
■ Overview
■ Retrieval
  Validated manual data gathering
  Dynamic integration to local back-end systems
  Aggregation, enrichment and import of historic data
  Experiments with automated imports of historic data
■ Exposure
  Two web services
  OAI
  Z39.50
  Reports
  Portal framework
■ Archiving
■ Near future
3
Overview
■ Brief overview
■ … in order to discuss ingestion, integration, conversion and import in a specific context
4
Overview
■ Brief overview
■ History
  Development began in 2002
■ Users
  9 universities (DK+SE), several hospitals + other research institutions
■ Platform and architecture
  J2EE enterprise application
  Release management: all users run instances of the same release version, same code-base
■ Business model
  Commercial software licenses, powerful user group, shared budgets
■ Modular
  Basic module, Reporting module, Student thesis module, External publications module, Bibliometrics module, Press module
5
Overview
6
Retrieval
■ Manual data gathering
■ User roles/rights + workflow:
  = de-centralized data gathering
  = validated data gathering
  = continuous data gathering
■ GUI example
■ Management focus is necessary
  Reports and statistics, KPI-management, etc.
■ Adding value to researchers is necessary
  Instantly in Google indexes, instantly updated personal websites, instantly updated CV, increased citations (source in paper), etc.
7
Retrieval
■ Dynamic integration
■ Dynamic integration to local back-end systems:
  Personnel systems, payroll systems (for data retrieval)
  LDAPs, Active Directories (for data retrieval + authentication)
  Single sign-on systems (for authentication)
  … to automatically create object types such as “person” or “organization”
■ … and yes, PURE hosts data, too
  We need complete objects according to the meta-data model
■ Plug-in architecture in PURE:
  Pro = individually adapted integration
  Con = individually programmed plug-in necessary
  Future = GUI, standardized plug-ins
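The plug-in idea on this slide can be illustrated with a minimal sketch. Everything named here (`BackendPlugin`, `LdapLikePlugin`, `Person`, the record keys) is hypothetical and not taken from PURE, which is a J2EE application; the sketch only shows how one adapter per back-end system might turn local records into "person" objects.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Person:
    # Hypothetical minimal "person" object; the real meta-data model is richer.
    source_id: str
    name: str
    organization: str


class BackendPlugin(ABC):
    """Hypothetical plug-in contract: one subclass per local back-end system."""

    @abstractmethod
    def fetch_records(self) -> List[Dict[str, str]]:
        """Return raw records from the back-end system."""

    @abstractmethod
    def to_person(self, record: Dict[str, str]) -> Person:
        """Map one raw record onto a Person object."""


class LdapLikePlugin(BackendPlugin):
    """Stands in for an LDAP integration; reads in-memory entries, not a live directory."""

    def __init__(self, entries: List[Dict[str, str]]):
        self._entries = entries

    def fetch_records(self) -> List[Dict[str, str]]:
        return list(self._entries)

    def to_person(self, record: Dict[str, str]) -> Person:
        # Attribute names (uid, cn, ou) are assumed for illustration.
        return Person(source_id=record["uid"], name=record["cn"], organization=record["ou"])


def sync_persons(plugin: BackendPlugin) -> List[Person]:
    """Create one Person per record, whatever back-end the plug-in wraps."""
    return [plugin.to_person(record) for record in plugin.fetch_records()]


persons = sync_persons(
    LdapLikePlugin([{"uid": "jd01", "cn": "Jane Doe", "ou": "Dept. of Physics"}])
)
```

The "Pro" and "Con" on the slide both show up here: each institution gets a mapping fitted to its own systems, but every new back-end needs its own subclass.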
8
Retrieval
■ Import
■ Historic data
■ Many sources
  More or less useful data
  More or less consistent use of formats :-)
■ The PXA format
  PURE XML Archive format, .zip-based
  Meta-data, relations between entities, binary files
■ Aggregation > enrichment > conversion > import
  The process is external to PURE
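As a rough illustration of a .zip-based archive that carries meta-data, relations between entities and binary files together, the sketch below packs all three into one file. The actual PXA layout is not described on the slide, so the file names (`metadata.xml`, `files/`) and the XML element names are assumptions made for this example only.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_archive(records, relations, binaries):
    """Pack meta-data, relations and binary files into one in-memory .zip.

    The layout is invented for illustration; it is not the real PXA schema.
    """
    root = ET.Element("archive")
    for rec in records:
        item = ET.SubElement(root, "object", id=rec["id"], type=rec["type"])
        ET.SubElement(item, "title").text = rec["title"]
    for source, target, kind in relations:
        ET.SubElement(root, "relation", source=source, target=target, kind=kind)

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("metadata.xml", ET.tostring(root, encoding="unicode"))
        for name, payload in binaries.items():
            zf.writestr("files/" + name, payload)
    return buffer.getvalue()


archive_bytes = build_archive(
    records=[
        {"id": "pub1", "type": "publication", "title": "A paper"},
        {"id": "p1", "type": "person", "title": "Jane Doe"},
    ],
    relations=[("p1", "pub1", "author")],
    binaries={"paper.pdf": b"%PDF-"},
)
```

Keeping the relations in the same archive as the objects means a converter can hand over a self-contained package, which fits the slide's point that the process runs outside PURE.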
9
Retrieval
■ Experiments
■ Experiments with automated imports of historic data from specific, identified sources
■ [source format] > PXA conversion > import > enrichment/validation
■ Very poor data quality demands the concept of “draft objects” in PURE
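One way to picture "draft objects" is an import step that never rejects a record but marks incomplete ones for later enrichment and validation. The sketch below assumes a made-up list of required fields; it does not reflect PURE's actual validation rules.

```python
# Assumed required fields, chosen for illustration only.
REQUIRED_FIELDS = ("title", "year", "authors")


def import_record(record):
    """Import a converted record, marking it as a draft if fields are missing."""
    missing = [field for field in REQUIRED_FIELDS if not record.get(field)]
    status = "draft" if missing else "valid"
    return {"status": status, "missing": missing, "data": record}


complete = import_record({"title": "A paper", "year": 2001, "authors": ["Doe, Jane"]})
incomplete = import_record({"title": "Old report", "year": None, "authors": []})
```

Here `incomplete` is still stored, with `missing` listing what a person has to supply before the object can count as valid.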
10
Exposure
■ Web services
■ RPC/encoded + document/literal
■ Rich libraries of methods
■ Including format-specific methods: APA, MLA, HARVARD, VANCOUVER and CBE
■ Free and near-instant adding of methods
■ WS code example (if time)
11
Exposure
■ OAI support
■ OAI-PMH data provider
■ OAI-PMH formats
■ DC
■ DDF-MXD (Danish national format)
■ SVEP (Swedish national format)
  … more to come
■ Also used to harvest other PURE-repositories for “external publications”
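For readers unfamiliar with OAI-PMH harvesting, the sketch below builds a `ListRecords` request and pulls record identifiers out of a response. The `verb`, `metadataPrefix` and `from` parameters and the XML namespace come from the OAI-PMH 2.0 protocol; the base URL and the sample response are invented, and no real PURE endpoint is implied.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)


def record_identifiers(response_xml):
    """Extract the identifier of every record header in a response."""
    root = ET.fromstring(response_xml)
    return [e.text for e in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Hand-written sample response, trimmed to the parts used above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("https://pure.example.org/oai", from_date="2006-01-01")
identifiers = record_identifiers(SAMPLE_RESPONSE)
```

The national formats named on the slide would be requested through a different `metadataPrefix` value; the exact prefix strings are not given in the slides, so none are guessed here.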
12
Exposure
■ Z39.50
■ Enabling of searches in PURE from library systems
■ SRW/SRU
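A library system searching over SRU sends a plain HTTP request with a CQL query. The sketch below builds such a `searchRetrieve` URL; the parameter names follow the SRU 1.1 specification, while the base URL and the query are placeholders rather than anything documented for PURE.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


search_url = sru_search_url("https://pure.example.org/sru", 'dc.title = "research"', 5)

# Round-trip check: decoding the query string gives back the original CQL.
decoded = parse_qs(urlsplit(search_url).query)
```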
13
Exposure
■ Reports
■ PURE reporting module
■ GUI example
14
Exposure
■ Reference manager
■ Export of data to local Reference Manager installation
■ Using RM-formatted export file
■ Promotes registering to the repository rather than in RM
■ GUI example
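The slide does not say which file format the export uses. Assuming it is the tagged RIS format that Reference Manager imports, a record could be serialized roughly as below; the dictionary keys are hypothetical and only a handful of RIS tags are shown.

```python
def to_ris(publication):
    """Serialize one publication as a minimal RIS record (assumed export format)."""
    lines = ["TY  - " + publication.get("type", "JOUR")]  # reference type
    for author in publication["authors"]:
        lines.append("AU  - " + author)  # one AU line per author
    lines.append("TI  - " + publication["title"])
    lines.append("PY  - " + str(publication["year"]))
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines) + "\n"


ris_text = to_ris(
    {"authors": ["Doe, Jane", "Roe, Richard"], "title": "A paper", "year": 2006}
)
```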
15
Exposure
■ Portal framework
■ PUREportal – free PURE-specific framework for custom development of research exhibition portals
■ Online example
■ Typical cost scenario € 20,000
■ Typical delivery time 1 month
■ Little need for requirements specification
■ Automatic PURE-API maintenance
16
Archiving
■ Data archiving – 2 levels
■ SQL environment
■ Meta-data and relations
■ Binary files just stored in server file system
■ FEDORA via connector (not PURE-specific, Open Source)
■ Facilitates:
  Higher quality archival of binary files
  Long term preservation in general
  Adoption of PURE in institutions’ general FEDORA strategies
17
Near future
■ The near future regarding data retrieval
■ More automated imports using increasingly advanced converters
■ Automated data delivery (push and harvest) to:
  Industry specific search services (e.g. PubMed, Nordicom)
  Documentary data collections (such as clinicaltrials.org), and national collections (such as DDF (DK), ForskDok (NO), etc.)
■ Temporary import objects
  When imported data are not of sufficient quality to create valid objects
  When data cannot be properly related to other objects upon import