What lies beneath? Building a semantic web-ready repository for complex collections Louise Corti UKDA Agostina Martinez, Patrick Carmichael, CARET, Cambridge.

Presentation transcript:

What lies beneath? Building a semantic web-ready repository for complex collections Louise Corti UKDA Agostina Martinez, Patrick Carmichael, CARET, Cambridge IASSIST 2009

The Ensemble Project Semantic Technologies for the Enhancement of Case Based Learning 3 Year, £1.5 Million ESRC/EPSRC Project: Research, Development and Implementation ( ) working with teachers and students in undergraduate and postgraduate courses to explore both the nature and role of the cases around which learning is focused and the part that emerging semantic web technologies can play in supporting this learning a big, happy interdisciplinary and multi-institutional extended family … website:

Pedagogy examining teaching and learning in complex, politically or ethically contentious, and rapidly-evolving fields where case-based learning is the pedagogical approach of choice how do teachers and learners design, develop, describe and reconstruct cases, and how do these processes contribute to academic and professional outcomes? the learning technologies need to be robust yet flexible enough to support teachers and learners as they grapple with complex situations and develop creative solutions and they need to be able to easily access, adapt and manage their case-based learning … a pedagogical challenge!

The settings where reflective processes allow learners to achieve the higher levels of understanding and capability that characterise the ‘expert’ or the ‘virtuoso’ advanced undergraduate, taught postgraduate and professional development courses (6 groups) teachers and learners are taking part in ‘case-building’ activities in which semantic web tools and digital repositories are used to support engagement with rich case data data differently structured and represented and in which alternative constructions of cases are possible

Technical aims repurposing, reconfiguring and enhancing existing repositories and other data sources aims to easily ‘translate’ research data in a repository for integration into applications which use semantic or 'Web 3.0' technologies: –federated searches –visualisation tools –collaborative working environments allow end-users to engage in flexible discovery, aggregation, representation and visualisation of data using: –topic maps, tag clouds, timelines and maps –VLEs and wikis to share data, interpretation and analysis

One Semantic Web Vision Tim Berners-Lee’s 2001 ‘vision’ of the SW - personalisation of services through seamless integration of web-based systems “At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times supplied by the agents …” Berners-Lee et al, 2001 The general tone is not unlike that of upbeat 1950s films about the promise of futuristic kitchens, full of labour-saving devices and intelligent fridges Source: Stellman & Greene

Our semantic web application Backend: archiving systems and tools for data management –digital repositories and libraries, with data and/or metadata in differing formats –Web services: lookups, converters, searches (i.e. external data providers) Middleware: data aggregation and semantic data management –Triplestore: large data aggregators containing data, metadata, vocabularies, ontologies and sets of rules –Endpoints and APIs to allow querying the Triplestore Frontend: presentation and visualization of data –Web interfaces, portals, visualization tools, personal information managers

A semantic web application

The technologies we are using Our back end repository: Fedora open source digital repository framework specifically oriented towards supporting semantic web applications (Fedora 3.0 represents a major upgrade) stores digital objects and manages external references enforces no specific collection structure and allows multiple metadata schemes to be used to describe specific resources

Fedora’s SW potential also allows in-line RDF semantic data to be stored in a digital object these can be streamed directly to other applications can search across the repository using exposed metadata AND semantic information if present relationships among digital resources need to be defined to enable this e.g. just like DDI3 is doing

Data out convert data to RDF/XML using an RDFizer –Triplify or RDF123 –e.g. Excel to RDF, PDF to RDF and so on metadata record (in RDF/XML) accompanies data – with permanent address to dataset using the Fedora Resource Index module to index relationships among objects (contained in the inline RDF datastreams - RDF/XML) now available to aggregators, triplestores, reasoners we are storing and synchronising the metadata in every object into a Mulgara Triplestore
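As an illustration of the metadata record that accompanies each dataset, here is a minimal Python sketch that writes a Dublin Core description as RDF/XML using only the standard library. It is not the project's RDFizer; the dataset URI, the field values and the function name `metadata_record` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def metadata_record(dataset_uri, fields):
    """Build an RDF/XML description of one dataset from Dublin Core fields."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # rdf:about carries the permanent address of the dataset being described
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": dataset_uri}
    )
    for name, value in fields.items():
        ET.SubElement(desc, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


record = metadata_record(
    "http://example.org/datasets/42",  # hypothetical permanent address
    {"title": "Interview transcripts", "creator": "Example Archive"},
)
print(record)
```

Each Dublin Core field becomes one child element of `rdf:Description`, so an aggregator reading the record sees one triple per field with the dataset address as subject.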

Triplify small plugin which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data
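The idea of exposing relational content as triples can be sketched in a few lines of Python. This is not Triplify's own code or configuration, only an illustration of the mapping: each row becomes a subject, each column a predicate and each cell a value. The base URI, table and column names are assumptions.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base URI for minted identifiers


def rows_to_triples(conn, table, key):
    """Yield (subject, predicate, object) for every non-key cell of a table."""
    # The table name is interpolated directly, so it must come from trusted code
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [c[0] for c in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        for column, value in record.items():
            if column != key and value is not None:
                yield (subject, f"{BASE}vocab/{column}", str(value))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plant (id INTEGER PRIMARY KEY, name TEXT, latin TEXT)")
conn.execute("INSERT INTO plant VALUES (1, 'Aleppo Pine', 'Pinus halepensis')")
triples = list(rows_to_triples(conn, "plant", "id"))
for triple in triples:
    print(triple)
```

A real deployment would map columns to established vocabularies rather than minting a predicate per column name.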

Mulgara Semantic Triplestore is a large database optimised for very rapid searching and pattern matching It does this by rendering all data into ‘triples’ - a record of information in the form of subject - predicate - object e.g. URL - property of the resource - value of that property can be used to describe connectedness of objects a single bibliographic record is represented by about triples a Triplestore can contain hundreds of millions of triples N3 format (Notation3) is a compact and readable alternative to RDF's XML syntax
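To make the subject - predicate - object idea concrete, the following Python sketch holds a handful of triples in a list and matches them against a pattern in which `None` acts as a wildcard. It says nothing about how Mulgara stores or indexes data; the prefixed names (`ex:`, `dc:`) and the sample records are assumptions.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every non-None position of the pattern."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical sample data: two records sharing one creator
store: List[Triple] = [
    ("ex:record1", "dc:title", "Interview transcripts"),
    ("ex:record1", "dc:creator", "ex:archive"),
    ("ex:record2", "dc:creator", "ex:archive"),
]

# Which records are connected to the same creator?
same_creator = match(store, p="dc:creator", o="ex:archive")
print([t[0] for t in same_creator])
```

Because every fact has the same three-part shape, connectedness between objects falls out of simple pattern queries like this one.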

SPARQL endpoints emerging W3C standard for semantic data management, aggregation, selection and querying semantic triplestores exploration of SPARQL as a basis for user interaction with data sets and a means of exposing repository content for querying, reuse and repurposing we have implemented this as a set of predefined queries running across the Triplestore results are formatted on the fly for the visualisation tools at hand with SPARQL, Web applications can be constructed without extensive additional templating or scripting - 'lowering the bar'
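A predefined query of the kind described above might look like the following Python sketch, which builds the GET request URL defined by the SPARQL protocol (the query text travels in a `query` parameter) without sending it. The endpoint address and the query itself are assumptions, not the project's actual configuration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://localhost:8080/sparql"  # hypothetical endpoint address

# A predefined query: list up to ten resources with their Dublin Core titles
TITLES_QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?resource ?title
WHERE { ?resource dc:title ?title }
LIMIT 10
"""


def query_url(endpoint, query):
    """Return the SPARQL protocol GET URL for a query string."""
    return f"{endpoint}?{urlencode({'query': query.strip()})}"


url = query_url(ENDPOINT, TITLES_QUERY)
print(url)

# Decoding the URL again recovers the original query text
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

Keeping queries as named constants lets a web front end offer them as fixed choices, so users never have to write SPARQL themselves.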

Fedora Configuration 3: Custom search [Diagram; labels: Fedora API-A “GET”, OAI-PMH feed, DC, RELS-EXT, XLS, inline RDF, Mulgara Triplestore, Custom Search, SPARQL Endpoint]

Visualisation tools Using SIMILE toolkit based at MIT and supported by W3C and Hewlett-Packard labs SIMILE tools: –customisable browser LONGWELL – aggregates RDF content from multiple sources and presents them through a faceted browser –can then display through catalogues, maps, timelines, network views, e.g. using Web widgets such as SIMILE’s Exhibit geo representations and Timeline
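One way query results could be reshaped for a widget such as Exhibit is sketched below in Python: it takes a document in the W3C SPARQL JSON results layout and turns each binding into an item with `label` and `type` fields inside an `items` array. The sample data, the function name `bindings_to_items` and the item type `Plant` are assumptions, and the project's own formatting step may differ.

```python
import json


def bindings_to_items(results, label_var, item_type):
    """Flatten SPARQL JSON result bindings into an Exhibit-style item list."""
    items = []
    for binding in results["results"]["bindings"]:
        # Each bound variable becomes a plain property on the item
        item = {var: cell["value"] for var, cell in binding.items()}
        item["label"] = item.get(label_var, "")
        item["type"] = item_type
        items.append(item)
    return {"items": items}


# Hypothetical response from a SPARQL endpoint
sample = {
    "head": {"vars": ["name", "latin"]},
    "results": {
        "bindings": [
            {
                "name": {"type": "literal", "value": "Aleppo Pine"},
                "latin": {"type": "literal", "value": "Pinus halepensis"},
            }
        ]
    },
}

exhibit_data = bindings_to_items(sample, label_var="name", item_type="Plant")
print(json.dumps(exhibit_data, indent=2))
```

Every bound variable then becomes a property that a faceted browser can filter on.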

What Kinds of Questions? What is the Latin name for Aleppo Pine? What does an Aleppo Pine look like? How do Aleppo Pines reproduce? Show me a map of their distribution? Is this a picture of an Aleppo Pine? Tell me about Aleppo Pines? Show me examples of plants which frequently inhabit the same environment as Aleppo Pines What insect life do Aleppo Pines support? What do people from Aleppo call Aleppo Pines? Source: PlantWiki

Geo visualisation

Exhibit faceted browsing

Interactivity and creativity encourages students to experiment, construct their own evidence-based cases appreciate new data sources, be more adventurous, have more fun! discuss findings with fellows using social networking tools and so on and give back newly constructed datasets

Summary Fedora Digital Repository provides a framework to store large and heterogeneous data –not only access to the metadata descriptions but access to the data itself data structured and defined in semantic-ready format –triplestores like Mulgara make it possible to aggregate and reason across different data sources visualization and presentation tools –process semantic-ready data and present the information in different formats The Ensemble Project, 2009

Implications for the likes of us? access to generically applicable and well documented tools & scripts, APIs in an open access Tools Library need help implementing such tools using the experience of existing implementers We need to know: –what technical skills does one need and what will it cost? –how much manual data manipulation needs to be done? –how easy is it to integrate these tools into existing systems and platforms e.g. VREs and VLEs? –and so on