When ontology and reality collide:

When ontology and reality collide: The Archaeotools project, faceted classification and natural language processing in an archaeological context.
Stuart Jeffrey, Julian Richards, Fabio Ciravegna, Stewart Waller, Sam Chapman, Ziqi Zhang, Tony Austin.
CAA Budapest, 5th April 2008

AHRC-EPSRC-JISC eScience research grants scheme:
PARTNERS: Natural Language Processing Research Group, Department of Computer Science, University of Sheffield
AIM: To allow archaeologists to discover, share and analyse datasets and legacy publications which have hitherto been very difficult to integrate into existing digital frameworks
BUILDS UPON: Common Information Environment Enhanced Geospatial browser
Joint Information Systems Committee

Three distinct Workpackages:
Workpackage 1 - Advanced Faceted Classification / Geo-spatial browser – 1m+ records; 4 primary facets (What, Where, When and Media).
Workpackage 2 – Natural language processing / Data-mining of Grey Literature; plus tagging
Workpackage 3 – Data-mining of Historic Literature; plus geoXwalk
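As an illustration of what browsing by the four primary facets involves, here is a minimal Python sketch. It is not the Archaeotools implementation: the facet names (What, Where, When, Media) come from the slide, while the records, their values and the function names `select` and `facet_counts` are invented for the example.

```python
from collections import Counter

# Hypothetical records: only the four facet names are taken from the slide.
RECORDS = [
    {"what": "barrow", "where": "North Yorkshire", "when": "Bronze Age", "media": "text"},
    {"what": "barrow", "where": "Dorset", "when": "Bronze Age", "media": "image"},
    {"what": "villa", "where": "North Yorkshire", "when": "Roman", "media": "text"},
]


def select(records, **chosen):
    """Keep the records that match every chosen facet value."""
    return [r for r in records if all(r.get(f) == v for f, v in chosen.items())]


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(r[facet] for r in records if facet in r)


# Narrow by one facet, then show what remains available under another.
barrows = select(RECORDS, what="barrow")
print(facet_counts(barrows, "where"))
```

Each selection narrows the record set, and the counts for the remaining facets tell the user which further refinements still return results.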

Datasets include: National Monuments Records (Scotland, Wales, England); Excavation Index (EH); Archive Holdings; Local Authority Historic Environment Records
Thesauri include: Thesaurus of Monument Types (TMT); Thesaurus of Object Types; MIDAS Period list; UK Government list of administrative areas, County, District, Parish (CDP) – Not MIDAS

[Diagram: system architecture. Labels: Input; Oracle RDBMS; MIDAS XML Record; XML Docs of Thesaurus; Information Extraction; RDF Resource; Knowledge triple store; When, Where, What ontologies as entries to faceted index; Query; User Interface]
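To make the step from an XML record to entries in a triple store concrete, the following Python sketch turns one record into subject-predicate-object triples and runs a simple query over them. Everything specific here is an assumption: the element names are illustrative and do not follow the MIDAS XML schema, the record values are invented, and plain tuples stand in for a real RDF store.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element names and values are illustrative only.
RECORD_XML = """
<record id="r1">
  <name>Example barrow</name>
  <what>barrow</what>
  <when>Bronze Age</when>
  <where>North Yorkshire</where>
</record>
"""


def record_to_triples(xml_text):
    """Return (subject, predicate, object) tuples, one per non-empty child element."""
    root = ET.fromstring(xml_text)
    subject = root.get("id")
    return [
        (subject, child.tag, child.text.strip())
        for child in root
        if child.text and child.text.strip()
    ]


def query(triples, predicate, obj):
    """Return the subjects that have the given predicate and object."""
    return {s for s, p, o in triples if p == predicate and o == obj}


triples = record_to_triples(RECORD_XML)
print(query(triples, "where", "North Yorkshire"))
```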

Search Demo 1: Click to zoom in to England

Search Demo 1: Click to choose ‘Results’ tab

Search Demo 1: Click to view ‘EVAN HOWE, North Yorkshire’ record.

Search Demo 1: Click RESET to go back to CIE root slide

“WHAT” (of 1,001,407 records)
Records that have no subject information: 19,269 records (2%)
Records that use terms not found in TMT, so these records cannot be indexed (6,442 unique terms): 101,507 records (10.1%)
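The figures on this slide are counts against the 1,001,407-record total. The Python sketch below shows one way such a breakdown could be produced by checking each record's subject term against a controlled vocabulary. The field name `what`, the sample records and the tiny vocabulary are assumptions for illustration; the project's own matching rules are not described in the slides.

```python
def classify(records, vocabulary):
    """Split records into indexed, no-term and unmatched-term groups."""
    vocab = {term.casefold() for term in vocabulary}
    groups = {"indexed": [], "no_term": [], "unmatched": []}
    for rec in records:
        term = (rec.get("what") or "").strip()
        if not term:
            groups["no_term"].append(rec)       # nothing to index on
        elif term.casefold() in vocab:
            groups["indexed"].append(rec)       # term found in the thesaurus
        else:
            groups["unmatched"].append(rec)     # term present but not in the thesaurus
    return groups


def share(count, total):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical vocabulary and records.
groups = classify(
    [{"what": "Barrow"}, {"what": ""}, {"what": "lumpy field"}, {}],
    ["barrow", "villa"],
)
print({name: len(recs) for name, recs in groups.items()})

# The slide's own counts reproduce its percentage for unmatched terms.
print(share(101_507, 1_001_407))
```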

“WHEN” (of 1,001,407 records)
Records that have no temporal information: 292,793 records (29.2%)
Records that use period terms not found in MIDAS, so these records cannot be indexed (457 types of irresolvable dates): 114,505 records (11.4%)
Examples: 1066, 1001-1100, 11th Centuary, C11, 11C, Eleventh Century
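The six example strings on this slide all point at the same century but share no common form. As a rough sketch of the kind of normalisation this calls for, the Python function below maps each of them to a century number. It is an assumption-laden illustration, not the project's method: the patterns cover only the forms shown on the slide, and `century_of`, `year_to_century` and the word list are names invented here.

```python
import re

# Spelled-out ordinals, limited to a few centuries for the sketch.
ORDINAL_WORDS = {"tenth": 10, "eleventh": 11, "twelfth": 12, "thirteenth": 13}


def year_to_century(year):
    """Map a calendar year to its century, so 1001 to 1100 all give 11."""
    return (year - 1) // 100 + 1


def century_of(text):
    """Return the century a date string refers to, or None if unrecognised."""
    s = text.strip().casefold()
    # Year range such as "1001-1100": accept only if both ends share a century.
    m = re.fullmatch(r"(\d{3,4})\s*-\s*(\d{3,4})", s)
    if m:
        first, last = (year_to_century(int(y)) for y in m.groups())
        return first if first == last else None
    # Single year such as "1066".
    if re.fullmatch(r"\d{3,4}", s):
        return year_to_century(int(s))
    # Abbreviations such as "C11" or "11C".
    m = re.fullmatch(r"c(\d{1,2})|(\d{1,2})c", s)
    if m:
        return int(m.group(1) or m.group(2))
    # "11th Century", tolerating misspellings that still begin with "cent".
    m = re.fullmatch(r"(\d{1,2})(?:st|nd|rd|th)\s+cent\w*", s)
    if m:
        return int(m.group(1))
    # "Eleventh Century".
    m = re.fullmatch(r"([a-z]+)\s+cent\w*", s)
    if m:
        return ORDINAL_WORDS.get(m.group(1))
    return None


for example in ["1066", "1001-1100", "11th Centuary", "C11", "11C", "Eleventh Century"]:
    print(example, century_of(example))
```

Strings the function cannot place return `None`, which corresponds to the records the slide counts as unindexable.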

“WHERE” (of 1,001,407 records)
Records that have no spatial information: 11,126 records (1.1%)
Records that use terms not found in CDP, so these records cannot be indexed: 245,601 records (24.5%)

XML tagging of semantic content
CIDOC CRM

University researchers; local authority curators

http://ads.ahds.ac.uk/project/archaeotools/