CRISP WP 17 1 / 2 Proposed Metadata Catalogue Architecture Document.


Work package 17 - IT & DM: Metadata Management and Data Continuum
Objectives: choose and implement data management and metadata mining services, and establish an environment permitting a data continuum from raw data to publications across the participating Research Institutes (RIs): ILL, ESRF, SLHC and EuroFEL.
Task plan:
1) Evaluate and adapt metadata catalogues according to the RIs' requirements.
2) Deploy and integrate the metadata catalogue.
3) Prototype data mining on metadata services.
Bessone Nicola - ESRF

Evaluate metadata catalogues: Use cases
Identify a list of requirements based on the ILL, ESRF and DESY use cases.
Select a list of the most suitable metadata catalogue systems on the market.
Match the requirements with the features offered by the metadata catalogues.

Evaluate metadata catalogues: Requirements
1) AAA
   1. Authentication: modular integration of different authentication systems.
   2. Authorization: customizable access control system.
   3. Accounting: granular logging information levels.
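The "modular integration of different authentication systems" requirement can be illustrated with a small plug-in registry. This is a minimal sketch, not the mechanism of any catalogue named in these slides: the registry, the `register` decorator, the `simple` mechanism and the in-memory user table are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Optional

# Hypothetical plug-in type: takes a credentials dict and returns the
# authenticated user name, or None if the credentials are rejected.
Authenticator = Callable[[Dict[str, str]], Optional[str]]

# Hypothetical registry mapping a mechanism name to its plug-in.
AUTHENTICATORS: Dict[str, Authenticator] = {}


def register(mechanism: str) -> Callable[[Authenticator], Authenticator]:
    """Decorator that adds an authenticator plug-in to the registry."""
    def decorator(func: Authenticator) -> Authenticator:
        AUTHENTICATORS[mechanism] = func
        return func
    return decorator


@register("simple")
def simple_auth(credentials: Dict[str, str]) -> Optional[str]:
    # Toy in-memory user table, for illustration only.
    users = {"alice": "secret"}
    user = credentials.get("username")
    if user in users and users[user] == credentials.get("password"):
        return user
    return None


def login(mechanism: str, credentials: Dict[str, str]) -> Optional[str]:
    """Dispatch a login request to the plug-in registered for `mechanism`."""
    authenticator = AUTHENTICATORS.get(mechanism)
    if authenticator is None:
        raise ValueError(f"unknown authentication mechanism: {mechanism}")
    return authenticator(credentials)
```

A new mechanism is added by registering another function under a new name; the `login` entry point does not change.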

Evaluate metadata catalogues: Requirements
2) Metadata model: the Core Scientific Metadata Model (CSMD) has already been developed at STFC.
Entities shown on the slide: Study, Investigation, Sample, Dataset, Datafile, Parameter
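The six entity names on this slide can be sketched as nested data classes. The slide only names the entities, so the containment relationships and every field below (for example `location` and `units`) are assumptions made for illustration; this is not the CSMD schema itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    # Assumed fields: a named value with optional units.
    name: str
    value: str
    units: str = ""


@dataclass
class Datafile:
    name: str
    location: str  # assumed: path or URI of the stored file
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Sample:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    sample: Optional[Sample] = None
    datafiles: List[Datafile] = field(default_factory=list)
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    samples: List[Sample] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class Study:
    name: str
    investigations: List[Investigation] = field(default_factory=list)


def count_datafiles(study: Study) -> int:
    """Count every datafile reachable from a study."""
    return sum(
        len(dataset.datafiles)
        for investigation in study.investigations
        for dataset in investigation.datasets
    )
```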

Evaluate metadata catalogues: Requirements
3) Searching method: fulfil users' search needs while being easy to use and to access (web). Provide data mining to facilities and scientific management about data use, access, search and modification.
4) Cross platform
5) Service API: stable set of APIs, if possible programming-language agnostic.

Evaluate metadata catalogues: Requirements
6) Sustainability
   1. Open source
   2. Project organization: actively maintained; release plan (documentation, update mechanism, backward compatibility); patch release process (security, bug fixes)
   3. Cutting-edge technology
7) License: free of charge

Evaluate metadata catalogues: Requirements
8) Data policy: dynamic authorization system.
9) Scalability & performance: ILL hosts ~2’000 experiments per year, producing ~10’000 datasets. Other facilities possibly more.
10) Data ingestion: manual and automatic, plus possible harvesting (OAI-PMH)
11) Security: protect intellectual property.
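For the harvesting item, an OAI-PMH harvester issues HTTP requests whose query string carries a `verb` and its arguments. The sketch below only builds a `ListRecords` request URL with the Python standard library; the base URL is a placeholder, and the helper name `list_records_url` is hypothetical. The `verb`, `metadataPrefix`, `from`, `until`, `set` and `resumptionToken` arguments and the `oai_dc` prefix come from the OAI-PMH 2.0 protocol.

```python
from typing import Optional
from urllib.parse import urlencode

# Placeholder endpoint; a real repository publishes its own base URL.
BASE_URL = "https://catalogue.example.org/oai"


def list_records_url(
    base_url: str = BASE_URL,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    until_date: Optional[str] = None,
    set_spec: Optional[str] = None,
    resumption_token: Optional[str] = None,
) -> str:
    """Build the URL of an OAI-PMH ListRecords request."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: it replaces all others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            params["from"] = from_date
        if until_date is not None:
            params["until"] = until_date
        if set_spec is not None:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"
```

A harvester would fetch this URL, parse the XML response, and repeat the request with the returned resumption token until none is given.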

Evaluate metadata catalogues: Metadata catalogue systems
1. ICAT
2. DSpace
3. Fedora
4. CKAN
5. Invenio
6. Tardis
7. ISPyB
8. iRODS
9. SRB-MCAT
10. MS Zentity

Evaluate metadata catalogues: Selection result
Different solutions have been explored. Among them, ICAT appears to be the only one that currently fits the data model requirements. This is the key element for a successful implementation in a reasonable time frame.

Evaluate metadata catalogues: ICAT
Authentication plug-in
Rule-based authorization mechanism
Flexible metadata model
Search method: full-text, numerical and string search, and SQL-like query syntax
Set of APIs (Java and Python)
Configurable database (Oracle, PostgreSQL and MySQL)
Federated search via TopCAT
Core Scientific Meta-Data Model (CSMD)
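To make the idea of a rule-based authorization mechanism concrete, here is a minimal sketch in which each rule grants a group a set of actions on one entity type. It does not reproduce ICAT's actual rule syntax or API; the `Rule` class, its fields and the `is_allowed` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: members of `group` may perform `actions` on `entity`."""
    group: str
    entity: str
    actions: FrozenSet[str]


def is_allowed(
    rules: Iterable[Rule], user_groups: Set[str], entity: str, action: str
) -> bool:
    """Allow the action if at least one rule matches; deny by default."""
    return any(
        rule.group in user_groups
        and rule.entity == entity
        and action in rule.actions
        for rule in rules
    )


RULES = [
    Rule("scientists", "Dataset", frozenset({"read"})),
    Rule("data-managers", "Dataset", frozenset({"read", "create", "delete"})),
]
```

Because access is denied unless a rule matches, changing a data policy amounts to adding or removing rules rather than changing code.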

Evaluate metadata catalogues: ICAT
Plug-in for DAWN/Mantid
Licence: FreeBSD
Web interface: TopCAT
In use at 11+ RIs

Evaluate metadata catalogues: ICAT
Work in progress:
– Improve web interface (TopCAT)
– Possibility to harvest (OAI-PMH)
– Installation process
– Synonym mechanism
– Integration with Umbrella authentication

Deploy and integrate ICAT: ESRF - Pilot
[Architecture diagram. Labelled components: Spec; Tomo XML; TomoDB (DB), the actual TomoDB metadata collection structure; Tomo to ICAT XML converter; ICAT XML; ICAT XML ingest; SMIS 3; SMIS to ICAT ingester; ICAT (API, RDBMS, Web Service API).]
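The "Tomo to ICAT XML converter" step in the pilot can be pictured as a field-by-field mapping between two XML layouts. The sketch below uses Python's standard `xml.etree.ElementTree`; neither XML format is described in the slides, so every element name here (`scanName`, `proposal`, `sampleName`, `dataset` and so on) is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source (Tomo) tags to target (ICAT-style) tags.
FIELD_MAP = {
    "scanName": "name",
    "proposal": "investigation",
    "sampleName": "sample",
}


def convert(tomo_xml: str) -> str:
    """Convert a Tomo-style XML record into an ICAT-style dataset record."""
    source = ET.fromstring(tomo_xml)
    dataset = ET.Element("dataset")
    for src_tag, dst_tag in FIELD_MAP.items():
        node = source.find(src_tag)
        # Skip fields that are absent or empty in the source record.
        if node is not None and node.text and node.text.strip():
            ET.SubElement(dataset, dst_tag).text = node.text.strip()
    return ET.tostring(dataset, encoding="unicode")
```

Keeping the mapping in a table means that a change in either format only touches `FIELD_MAP`, not the conversion logic.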

Deploy and integrate ICAT: ESRF - future
[Architecture diagram. Labelled components: Spec; Spec session; new sequencer; new beamline control system; experiment metadata management; scientist controlling the experiment; data manager; web interface; ICAT (API, RDBMS, Web Service API); SMIS (API, RDBMS, Web Service API).]

Deploy and integrate ICAT: ILL
Data policy published in Dec 2011
Implementation Oct 2012
ICAT deployment Dec 2012
Currently, ingestion of the data since Nov

Future work
Complete the deployment (ingestion) at the participating facilities.
Data mining
– Collect use cases from the different facilities
– Currently all use cases are technically simple (no request for correlation, for instance)
Work on the search engine (Lucene)
Reporting
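A "technically simple" data mining use case of the kind mentioned above could be a plain count of how often each dataset is accessed, with no correlation between records. The sketch below assumes a hypothetical access log given as (user, dataset) pairs; the log format and the function name are illustrative only.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def most_accessed(
    events: Iterable[Tuple[str, str]], top_n: int = 3
) -> List[Tuple[str, int]]:
    """Return the `top_n` most accessed datasets with their access counts.

    Each event is an assumed (user, dataset) pair taken from an access log.
    """
    counts = Counter(dataset for _user, dataset in events)
    return counts.most_common(top_n)


# Made-up sample log for illustration.
LOG = [
    ("alice", "dataset-1"),
    ("bob", "dataset-1"),
    ("alice", "dataset-2"),
    ("carol", "dataset-1"),
    ("bob", "dataset-2"),
    ("carol", "dataset-3"),
]
```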
