Stuart Jeffrey, Julian Richards, Fabio Ciravegna, Stewart Waller, Sam Chapman, Ziqi Zhang, Tony Austin. STAR/Archaeotools Workshop, York, 9th May 2008.

Presentation transcript:

Stuart Jeffrey, Julian Richards, Fabio Ciravegna, Stewart Waller, Sam Chapman, Ziqi Zhang, Tony Austin. STAR/Archaeotools Workshop, York, 9th May 2008. The Archaeotools project: faceted classification and natural language processing in an archaeological context.

AHRC-EPSRC-JISC eScience research grants scheme
AIM: To allow archaeologists to discover, share and analyse datasets and legacy publications that have hitherto been very difficult to integrate into existing digital frameworks.
BUILDS UPON: Common Information Environment Enhanced Geospatial browser
PARTNERS: Natural Language Processing Research Group, Department of Computer Science, University of Sheffield; Joint Information Systems Committee

Three distinct Workpackages:
–Workpackage 1 - Advanced Faceted Classification / Geo-spatial browser: 1m+ records; 4 primary facets (What, Where, When and Media)
–Workpackage 2 - Natural language processing / Data-mining of Grey Literature; plus tagging
–Workpackage 3 - Data-mining of Historic Literature; plus geoXwalk

Datasets include:
–National Monuments Records (Scotland, Wales, England)
–Excavation Index (EH)
–Archive Holdings
–Local Authority Historic Environment Records
Thesauri include:
–Thesaurus of Monument Types (TMT)
–Thesaurus of Object Types
–MIDAS Period list
–UK Government list of administrative areas: County, District, Parish (CDP) (not MIDAS)

System architecture diagram (labels only): Input; Oracle RDBMS; MIDAS XML Record; XML Docs of Thesaurus; Information Extraction; RDF Resource; Knowledge triple store; When, Where, What ontologies as entries to faceted index; Query; User Interface.

“WHAT” (1,001,407 records)
–Records that have no subject information: 19,269 records (2%)
–Records that use terms not found in TMT, so these records cannot be indexed (6,442 unique terms): 101,507 records (10.1%)
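The distinction drawn on this slide, between records with no subject term and records whose terms are not in the thesaurus, can be illustrated with a minimal Python sketch. It is not the project's implementation: the three-term `THESAURUS`, the function names and the sample records are all hypothetical, and it assumes that a record counts as indexable if at least one of its terms matches.

```python
from collections import Counter

# Hypothetical miniature thesaurus standing in for TMT preferred terms.
THESAURUS = {"BARROW", "HILLFORT", "VILLA"}


def classify_record(subject_terms, thesaurus=THESAURUS):
    """Return 'missing', 'unresolved' or 'indexed' for one record's subject terms."""
    # Normalise case and drop empty strings before the lookup.
    terms = [t.strip().upper() for t in subject_terms if t and t.strip()]
    if not terms:
        return "missing"      # no subject information at all
    if any(t in thesaurus for t in terms):
        return "indexed"      # assumption: one matching term is enough
    return "unresolved"       # terms present, but none found in the thesaurus


def facet_report(records):
    """Count records per status and give each count as a percentage of the total."""
    counts = Counter(classify_record(r) for r in records)
    total = len(records)
    return {status: (n, round(100 * n / total, 1)) for status, n in counts.items()}


# Hypothetical sample: one list of subject terms per record.
sample = [["Barrow"], [], ["round howe"], ["villa", "bath house"]]
report = facet_report(sample)
print(report)
```

The same three-way split applies to the When and Where facets, with the MIDAS period list or the CDP list in place of the thesaurus.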

“WHEN” (1,001,407 records)
–Records that have no temporal information: 292,793 records (29.2%)
–Records that use period terms not found in MIDAS, so these records cannot be indexed (457 types of irresolvable dates): 114,505 records (11.4%)
–e.g. 1066, 11th Centuary, C11, 11C, Eleventh Century
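The date variants listed on this slide all point at the same century, which is why they are hard to index without normalisation. The sketch below shows one way such strings could be reduced to a century number with regular expressions. It is an illustration only, not the method used by the project: `to_century` and the small `ORDINAL_WORDS` table are hypothetical, and mapping the result onto a MIDAS period term would be a further lookup not shown here.

```python
import re

# Hypothetical lookup of spelled-out ordinals; extend as needed.
ORDINAL_WORDS = {"tenth": 10, "eleventh": 11, "twelfth": 12}


def to_century(text):
    """Return the century (AD) as an int for a few common date forms, else None."""
    s = re.sub(r"\s+", " ", text.strip().lower())

    # Abbreviations such as "C11" or "11C".
    m = re.fullmatch(r"c(\d{1,2})", s) or re.fullmatch(r"(\d{1,2})c", s)
    if m:
        return int(m.group(1))

    # "11th century", "11 th centuary" (tolerates stray spaces and misspellings).
    m = re.fullmatch(r"(\d{1,2}) ?(?:st|nd|rd|th)? ?cent\w*", s)
    if m:
        return int(m.group(1))

    # "eleventh century": spelled-out ordinal looked up in the table above.
    m = re.fullmatch(r"([a-z]+) cent\w*", s)
    if m:
        return ORDINAL_WORDS.get(m.group(1))

    # A bare three- or four-digit year such as "1066".
    if re.fullmatch(r"\d{3,4}", s):
        return (int(s) - 1) // 100 + 1

    return None


variants = ["1066", "11 th Centuary", "C11", "11C", "Eleventh Century"]
centuries = [to_century(v) for v in variants]
print(centuries)
```

Strings that match none of the patterns return `None`, which corresponds to the slide's category of dates that cannot be resolved.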

“WHERE” (1,001,407 records)
–Records that have no spatial information: 11,126 records (1.1%)
–Records that use terms not found in CDP, so these records cannot be indexed: 245,601 records (24.5%)
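The figures on the WHAT, WHEN and WHERE slides imply how many records remain indexable under each facet. The following sketch does that arithmetic from the quoted counts. It assumes the two problem groups for a facet do not overlap, which the slides do not state; `indexable` and `FACETS` are names introduced here for illustration.

```python
TOTAL = 1_001_407  # total records, as quoted on each slide

# The two problem counts quoted for each facet (assumed not to overlap).
FACETS = {
    "WHAT": (19_269, 101_507),
    "WHEN": (292_793, 114_505),
    "WHERE": (11_126, 245_601),
}


def indexable(total, problem_counts):
    """Return (count, percentage) of records left after removing the problem groups."""
    n = total - sum(problem_counts)
    return n, round(100 * n / total, 1)


coverage = {facet: indexable(TOTAL, counts) for facet, counts in FACETS.items()}
for facet, (n, pct) in coverage.items():
    print(f"{facet}: {n:,} indexable records ({pct}%)")
```

On these assumptions the When facet has the lowest coverage of the three, at a little under 60% of records.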