Open Archives Initiative – Protocol for Metadata Harvesting Iztok Kavkler, University of Ljubljana Some slides by Stefaan Ternier, KUL Bram Vandenputte,


1 Open Archives Initiative – Protocol for Metadata Harvesting
Iztok Kavkler, University of Ljubljana
Some slides by Stefaan Ternier, KUL; Bram Vandenputte, KUL; Joris Klerkx, KUL

2 What is OAI?
Harvesting standard, documented at http://www.openarchives.org/OAI/openarchivesprotocol.html
Six service verbs
– Identify
– ListMetadataFormats
– GetRecord
– ListRecords
– ListIdentifiers
– ListSets
Allows multiple metadata formats
– DC (Dublin Core) format mandatory

3 How OAI works
OAI “verbs”
– Identify
– ListMetadataFormats
– GetRecord
– ListIdentifiers
– ListRecords
– ListSets
[Diagram: the harvester (OAI service provider) sends an HTTP request carrying an OAI verb to the repository (metadata provider); the repository answers with an HTTP response containing valid XML.]
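
The request half of the diagram can be sketched in a few lines of Java. This is only an illustration: the class OaiRequestUrl and its build method are hypothetical helpers, not part of OAICat, and the base URL is the local demo address used elsewhere in these slides.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not an OAICat class): builds the URL a harvester requests.
class OaiRequestUrl {

    // baseUrl is the repository endpoint, verb is one of the six OAI verbs,
    // args holds the remaining request arguments in the order they should appear.
    static String build(String baseUrl, String verb, Map<String, String> args) {
        StringBuilder url = new StringBuilder(baseUrl).append("?verb=").append(encode(verb));
        for (Map.Entry<String, String> e : args.entrySet()) {
            url.append('&').append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return url.toString();
    }

    // Percent-encodes one query component.
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
    }

    public static void main(String[] unused) {
        Map<String, String> args = new LinkedHashMap<>();
        args.put("metadataPrefix", "oai_dc");
        System.out.println(build("http://localhost:8080/OAILREApp/oaiHandler", "ListIdentifiers", args));
    }
}
```

The harvester then issues an HTTP GET on the resulting URL and parses the XML that comes back.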

4 Try it
Install Apache Tomcat or any other Java servlet container
Download the WAR file from http://fire.eun.org/Iztok/OAILREApp.war
Deploy the WAR
Open the demo HTML page at http://localhost:8080/OAILREApp/
Or type a service verb, e.g. http://localhost:8080/OAILREApp/oaiHandler?verb=Identify

5 The raw XML
By default, the resulting XML has a stylesheet attached for pretty rendering
To remove the stylesheet, comment out the line
    OAIHandler.styleSheet=testoai/oaicat.xsl
in the file oaicat.properties (in the WAR file or the web-app dir)

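6 OAI XML example
<responseDate>2007-06-11T06:48:58Z</responseDate>
<request>http://localhost:8080/OAILREApp/oaiHandler</request>
<header>
  <identifier>oai:oai.xyz-repository.com:exercises/112553</identifier>
  <datestamp>2007-06-09T22:38:28Z</datestamp>
  <setSpec>exercises</setSpec>
</header>
.......
<resumptionToken expirationDate="2007-06-11T07:48:58Z" completeListSize="42" cursor="10">1181544538265</resumptionToken>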

7 OAICat - a Java implementation
OAICat home at http://www.oclc.org/research/software/oai/cat.htm
Takes care of
– web service details
– OAI XML specification
The implementer has to provide three classes
– RepositoryOAICatalog
– RepositoryRecordFactory
– Repository2oai_dc (lom,...) - usually more than one

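8 A sample implementation
(Source code and libs in http://fire.eun.org/Iztok/OAILREApp.zip)
Create a new web module
Add servlet oaiHandler to web.xml
    <servlet>
      <servlet-name>LreOAIHandler</servlet-name>
      <servlet-class>ORG.oclc.oai.server.OAIHandler</servlet-class>
      <load-on-startup>5</load-on-startup>
    </servlet>
    <servlet-mapping>
      <servlet-name>LreOAIHandler</servlet-name>
      <url-pattern>/oaiHandler</url-pattern>
    </servlet-mapping>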

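9 (cont)
Define properties file location
    <context-param>
      <param-name>properties</param-name>
      <param-value>oaicat.properties</param-value>
    </context-param>
Welcome file for testing
    <welcome-file-list>
      <welcome-file>testoai/index.html</welcome-file>
    </welcome-file-list>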

10 Sample record
A record with the basic fields id, url, title, descr and date
SampleOAICatalog contains an array with 3 sample records
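
A record of this shape might look roughly as follows. This is a hypothetical stand-in, not the SampleRecord class from the downloadable source; only the five field names come from the slide, and the three sample values are made up for illustration.

```java
// Hypothetical stand-in for the SampleRecord class named on the slide.
class SampleRecord {
    final String id;     // local identifier
    final String url;    // location of the described resource
    final String title;
    final String descr;  // free-text description
    final String date;   // datestamp in ISO 8601 format

    SampleRecord(String id, String url, String title, String descr, String date) {
        this.id = id;
        this.url = url;
        this.title = title;
        this.descr = descr;
        this.date = date;
    }

    // Three made-up records, mirroring the array the catalog is said to hold.
    static SampleRecord[] samples() {
        return new SampleRecord[] {
            new SampleRecord("id001", "http://example.org/1", "First exercise", "A first sample", "2007-06-01"),
            new SampleRecord("id002", "http://example.org/2", "Second exercise", "A second sample", "2007-06-05"),
            new SampleRecord("id003", "http://example.org/3", "Third exercise", "A third sample", "2007-06-09")
        };
    }
}
```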

11 SampleOAICatalog.listIdentifiers
Parameters
– from – date to harvest from (String in ISO 8601 format); date or datetime, depending on the repository's granularity
– to – date to harvest to (the OAI-PMH request argument is named until)
– set – a set name; list only records from this set (if null, list all records)
    set names classify objects in natural groups
    every record may belong to multiple sets (or none)
– metadataPrefix – list only records that support this format (sample formats: oai_dc, oai_lom,...)
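
How these parameters narrow the result can be sketched as below. The HarvestFilter and Rec classes are hypothetical, and the sketch assumes that datestamps and bounds all use the same granularity and are in UTC, so that plain string comparison orders them correctly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of selective harvesting; not OAICat code.
class HarvestFilter {

    // Minimal record: local id, ISO 8601 datestamp, names of the sets it belongs to.
    static class Rec {
        final String id;
        final String datestamp;
        final List<String> sets;

        Rec(String id, String datestamp, String... sets) {
            this.id = id;
            this.datestamp = datestamp;
            this.sets = Arrays.asList(sets);
        }
    }

    // A null argument means "no restriction"; both date bounds are inclusive.
    static List<Rec> select(List<Rec> all, String from, String until, String set) {
        List<Rec> out = new ArrayList<>();
        for (Rec r : all) {
            if (from != null && r.datestamp.compareTo(from) < 0) continue;   // too old
            if (until != null && r.datestamp.compareTo(until) > 0) continue; // too new
            if (set != null && !r.sets.contains(set)) continue;              // not in the set
            out.add(r);
        }
        return out;
    }
}
```

A real catalog would also check metadataPrefix against the crosswalks available for each record.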

12 SampleOAICatalog.listIdentifiers
Must return a map with two entries
– headers – a String iterator of OAI headers
– identifiers – a String iterator of OAI identifiers
Both are created by the call (rec is a SampleRecord)
    String[] header = getRecordFactory().createHeader(rec);
    headers.add(header[0]);
    identifiers.add(header[1]);
Create the result
    Map listIdMap = new HashMap();
    listIdMap.put("headers", headers.iterator());
    listIdMap.put("identifiers", identifiers.iterator());
    return listIdMap;
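
Put together, the loop and the result map might look like the following self-contained sketch. The createHeader method here is a simplified stand-in for getRecordFactory().createHeader(rec), and records are passed as {local id, datestamp} pairs purely to keep the example short; the identifier prefix is the example one from these slides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real catalog delegates header creation to the record factory.
class ListIdentifiersSketch {

    // Stand-in for createHeader: returns { header XML, OAI identifier }.
    static String[] createHeader(String localId, String datestamp) {
        String oaiId = "oai:oay.rep.com:" + localId;
        String xml = "<header><identifier>" + oaiId + "</identifier>"
                   + "<datestamp>" + datestamp + "</datestamp></header>";
        return new String[] { xml, oaiId };
    }

    // Each element of records is { localId, datestamp }.
    static Map<String, Iterator<String>> listIdentifiers(String[][] records) {
        List<String> headers = new ArrayList<>();
        List<String> identifiers = new ArrayList<>();
        for (String[] rec : records) {
            String[] header = createHeader(rec[0], rec[1]);
            headers.add(header[0]);
            identifiers.add(header[1]);
        }
        Map<String, Iterator<String>> listIdMap = new HashMap<>();
        listIdMap.put("headers", headers.iterator());
        listIdMap.put("identifiers", identifiers.iterator());
        return listIdMap;
    }
}
```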

13 getRecordFactory().createHeader(rec)
Creates the header by calling methods in SampleRecordFactory
– String getOAIIdentifier(Object rec) – returns the full OAI identifier, e.g. “oai:oay.rep.com:id001”
– String getDatestamp(Object rec) – returns the date in ISO 8601 format
– Iterator getSetSpecs(Object rec)
    ArrayList list = new ArrayList();
    list.add(...);
    return list.iterator();
– Iterator getAbouts(Object rec)
– String fromOAIIdentifier(String id) – helper method that converts an OAI identifier to a local id
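
The two identifier conversions are inverse operations, which a short sketch makes concrete. The OaiIds class is hypothetical; the prefix is taken from the example identifier on the slide, and returning null for a foreign identifier is a choice made for this sketch only.

```java
// Hypothetical helper showing the local id <-> OAI identifier round trip.
class OaiIds {
    // Assumed repository prefix, copied from the slide's example identifier.
    static final String PREFIX = "oai:oay.rep.com:";

    // Local id -> full OAI identifier.
    static String toOaiIdentifier(String localId) {
        return PREFIX + localId;
    }

    // Full OAI identifier -> local id, or null if it does not belong to this repository.
    static String fromOaiIdentifier(String oaiId) {
        if (oaiId == null || !oaiId.startsWith(PREFIX)) {
            return null;
        }
        return oaiId.substring(PREFIX.length());
    }
}
```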

14 SampleOAICatalog.listSets
Takes no parameters, returns the list of all sets in this repository
– each ListIdentifiers or ListRecords query may contain a set name, limiting the results to just one set

15 SampleOAICatalog.getSchemaLocations
Like GetRecord, but returns the Vector of all metadata schema locations the record supports
– to obtain them, just call getRecordFactory().getSchemaLocations(rec);

16 SampleOAICatalog.getRecord
String getRecord(String id, String metadataPrefix)
– find the record and convert it to an XML string (<record> element)
– id is in global format; to get the local value call getRecordFactory().fromOAIIdentifier(id)
– throw IdDoesNotExistException if the record is not found
– to generate the XML use constructRecord
    constructRecord(rec, metadataPrefix)
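
The lookup logic can be sketched without OAICat on the classpath. Everything below is a stand-in: the nested IdDoesNotExistException only mimics the OAICat exception of the same name (and is unchecked here for brevity), the prefix is the example one from these slides, and the returned XML is a simplified placeholder for what constructRecord would produce.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the getRecord flow: convert id, look up, fail or render.
class GetRecordSketch {

    // Local stand-in so the sketch compiles alone; not the OAICat class.
    static class IdDoesNotExistException extends RuntimeException {
        IdDoesNotExistException(String id) {
            super("No record with identifier " + id);
        }
    }

    static final String PREFIX = "oai:oay.rep.com:"; // assumed repository prefix

    private final Map<String, String> titlesByLocalId = new LinkedHashMap<>();

    void put(String localId, String title) {
        titlesByLocalId.put(localId, title);
    }

    String getRecord(String oaiId, String metadataPrefix) {
        // Global identifier -> local id (null if the prefix does not match).
        String localId = oaiId.startsWith(PREFIX) ? oaiId.substring(PREFIX.length()) : null;
        String title = (localId == null) ? null : titlesByLocalId.get(localId);
        if (title == null) {
            throw new IdDoesNotExistException(oaiId);
        }
        // Placeholder for constructRecord(rec, metadataPrefix); metadata part omitted.
        return "<record><header><identifier>" + oaiId + "</identifier></header></record>";
    }
}
```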

17 SampleOAICatalog.listRecords
Just like ListIdentifiers, only it generates a list of XML elements
Return a map with one entry
    Map listRecMap = new HashMap();
    listRecMap.put("records", records.iterator());
    return listRecMap;

18 Crosswalks
Conversions of the native record type to XML, like Sample2oai_lom or Sample2oai_dc
Only two methods per implementation
– boolean isAvailableFor(Object rec)
– String createMetadata(Object rec)
    SampleRecord record = (SampleRecord) rec;
    return LOMFormat.writeStringWithSchema(record.toLOM());
Throw CannotDisseminateFormatException if the metadata is not available in this format
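
For the mandatory Dublin Core format, a crosswalk can be as small as the sketch below. It is a hand-rolled illustration that builds the XML with string concatenation rather than with LOM-j or the OAICat base classes; the class name and the field-to-element mapping are assumptions, and the xsi:schemaLocation attribute a full oai_dc record carries is left out for brevity.

```java
// Hypothetical Dublin Core crosswalk sketch; maps three record fields to oai_dc XML.
class SampleToOaiDc {

    // Escapes the characters that are special in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // title -> dc:title, descr -> dc:description, url -> dc:identifier (assumed mapping).
    static String createMetadata(String url, String title, String descr) {
        return "<oai_dc:dc xmlns:oai_dc=\"http://www.openarchives.org/OAI/2.0/oai_dc/\""
             + " xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
             + "<dc:title>" + escape(title) + "</dc:title>"
             + "<dc:description>" + escape(descr) + "</dc:description>"
             + "<dc:identifier>" + escape(url) + "</dc:identifier>"
             + "</oai_dc:dc>";
    }
}
```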

19 SampleRecord.toLOM
Uses the LOM-j lib to quickly hack together LOM, http://sourceforge.net/projects/lom-j/
– automatic serialization/deserialization of LOM and DC XML formats
Example
    lom.newGeneral().newIdentifier(0).newCatalog().setString("lre");
    lom.newGeneral().newIdentifier(0).newEntry().setString("sample:" + id);
    lom.newTechnical().newLocation(-1).setString(url);
    lom.newGeneral().newTitle().newString(0).newLanguage().setValue("en");
    lom.newGeneral().newTitle().newString(0).setString(title);

20 Resumption
A repository usually has a fixed limit on the number of records to return in one call
– if more are available, it returns a resumption token, which lets the harvester request the next batch
– implemented by the functions listIdentifiers(String resumptionToken) and listRecords(String resumptionToken)
– see XYZOAICatalog for details
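
One simple way to implement resumption is to remember, per issued token, the position at which the next batch starts. The sketch below does this in memory; it is not the XYZOAICatalog code, the class and method names are hypothetical, the page size of 10 is an assumption, and IllegalArgumentException stands in for OAICat's bad-token exception.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory paging sketch with single-use resumption tokens.
class ResumptionSketch {
    static final int PAGE_SIZE = 10; // assumed per-call limit

    // One batch of results plus the token for the next batch (null on the last batch).
    static class Page {
        final List<String> items;
        final String resumptionToken;

        Page(List<String> items, String resumptionToken) {
            this.items = items;
            this.resumptionToken = resumptionToken;
        }
    }

    private final List<String> identifiers;
    private final Map<String, Integer> cursors = new HashMap<>(); // token -> start index
    private long nextToken = 1;

    ResumptionSketch(List<String> identifiers) {
        this.identifiers = identifiers;
    }

    // First call of a harvest: start at the beginning.
    Page first() {
        return page(0);
    }

    // Follow-up call: the token is consumed, so it cannot be replayed.
    Page resume(String token) {
        Integer start = cursors.remove(token);
        if (start == null) {
            throw new IllegalArgumentException("bad resumption token: " + token);
        }
        return page(start);
    }

    private Page page(int start) {
        int end = Math.min(start + PAGE_SIZE, identifiers.size());
        String token = null;
        if (end < identifiers.size()) { // more records remain, so issue a token
            token = Long.toString(nextToken++);
            cursors.put(token, end);
        }
        return new Page(new ArrayList<>(identifiers.subList(start, end)), token);
    }
}
```

A production repository would also expire tokens after a while, as the expirationDate attribute in the XML example suggests.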

21 References
http://www.openarchives.org/OAI/openarchivesprotocol.html
http://www.fmf.uni-lj.si/~kavkler/
http://www.oclc.org/research/software/oai/cat.htm
http://www.cs.kuleuven.ac.be/~hmdb/SqiOaiMelt
http://sourceforge.net/projects/lom-j/
SIO/Trubar OAI URL: http://sio.edus.si/LreTomcat/

