Tim Brody
University of Southampton
CiteBase Services
13/07/2001

Content
History
What is CiteBase
Problems:
  – Searching & Information Retrieval
  – OAI & Distribution Problems
  – Usage Questions
Future

History
Researcher for OpCit, summer 2000
Started CiteBase as part of my 3rd year project Sept Dienst/Santa Fe
Most work done during Spring 2001
Thesis completed May 2001

What is CiteBase
Prototype Database (MySQL)
MetaData - OAI (arXiv, cogprints)
Citation Data - OpCit (arXiv)
Ranked searches, a la Google/CiteSeer
Static hit data, demo of other criteria
Re-exports metadata + citation data via OAI, opcit_dc => AMF?
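The "ranked searches" idea on this slide can be illustrated with a minimal sketch. It assumes ranking by raw citation count only; the sample records, the citation pairs, and the function name `ranked_search` are hypothetical and are not taken from CiteBase itself.

```python
from collections import Counter

# Hypothetical sample data: record id -> title, and (citing, cited) pairs.
RECORDS = {
    "a1": "Quantum gravity in two dimensions",
    "a2": "Notes on quantum computation",
    "a3": "Gravity waves in stellar interiors",
}
CITATIONS = [("a2", "a1"), ("a3", "a1"), ("a1", "a3")]


def ranked_search(term, records, citations):
    """Return ids of records whose title contains term, most cited first."""
    # Count how many times each record appears as the cited side of a pair.
    cited_counts = Counter(cited for _citing, cited in citations)
    matches = [
        rid for rid, title in records.items() if term.lower() in title.lower()
    ]
    # Sort by descending citation count; ties are broken by record id.
    return sorted(matches, key=lambda rid: (-cited_counts[rid], rid))


print(ranked_search("gravity", RECORDS, CITATIONS))
```

A real service would combine this with other criteria (the slide mentions hit data); the sketch only shows that the ordering comes from citation data rather than from text match quality.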

Problems 1: Searching & Information Retrieval
  – Large data sets, ~170,000 records, 170 MB of searchable data, potentially millions (bigger than web?)
  – Requires custom ranking, not just best match
  – SQL search is >O(N)
  – MySQL text-index is too fuzzy
  – ARC uses Oracle, expensive!
  – SQL best solution for metadata?

Problems 2: OAI & Distribution Problems
  – Reliant upon source archives … (XML problems, format, semantic, reliability)
  – "Harvest from when?" problem
  – Identifier change problems/deletion
  – Redistribution, should datestamps be changed?
  – Subjects nice idea in practice …
  – Only texts identified, what about people/institutions/journals?
  – No clear solution for peer-archives/conflicts
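The "harvest from when?" question concerns the datestamp a harvester passes when asking an archive for new records. A minimal sketch of building such a request follows. The `ListRecords` verb and the `from` and `metadataPrefix` arguments are part of the OAI harvesting protocol; the base URL and the function name `list_records_url` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI ListRecords request, optionally limited by datestamp."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # 'from' asks only for records stamped on or after this date,
        # so the harvester must remember the date of its last harvest.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical archive address, used only for illustration.
print(list_records_url("http://archive.example.org/oai", from_date="2001-07-01"))
```

The difficulty raised on the slide is choosing `from_date`: if the source archive's datestamps are unreliable, or a re-exporting service changes them, an incremental harvest can silently miss records.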

Problems 3: Usage Questions
  – Should we store full-text?
  – Who is going to use these services?
  – Are we all things to all people, or subject specific?
  – How careful should we be with ranking (how do we prevent abuse)?
  – What archives do we expose (trust question)?
  – How to keep citation links up-to-date?
  – Multiple language handling?

Future
Implement AMF at source archives
Re-assess metadata requirements/storage
Find a better solution for I.R.: Cheshire, gnoSearch, Oracle?
Prevent abuse (self-citation etc.)
Implement usage tracking (hit ranking)
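One possible reading of "prevent abuse (self-citation etc.)" is to ignore a citation when the citing and cited papers share an author. The sketch below assumes exactly that rule and assumes author names have already been normalised; the data and the function name `count_external_citations` are hypothetical, and the slides do not say which approach was planned.

```python
def count_external_citations(citations, authors):
    """Count citations per paper, skipping pairs that share an author."""
    counts = {}
    for citing, cited in citations:
        shared = set(authors.get(citing, ())) & set(authors.get(cited, ()))
        if shared:
            # Treat any common author as a self-citation and do not count it.
            continue
        counts[cited] = counts.get(cited, 0) + 1
    return counts


# Hypothetical papers and author lists, used only for illustration.
AUTHORS = {"p1": ["A. Author"], "p2": ["A. Author", "B. Writer"], "p3": ["C. Scholar"]}
CITES = [("p2", "p1"), ("p3", "p1"), ("p1", "p3")]
print(count_external_citations(CITES, AUTHORS))
```

Matching on author names is fragile in practice, which ties back to the earlier point that only texts, not people or institutions, carry identifiers.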