Microsearch and SearchMonkey Interfaces for Semantic Search Peter Mika Researcher, Data Architect Yahoo! Research.

Presentation transcript:

Slide 1: Microsearch and SearchMonkey Interfaces for Semantic Search
Peter Mika, Researcher, Data Architect, Yahoo! Research

Slide 2 (Updated: 30-May-09): Previously in search
Horizontal search
– Yahoo…
– Keyword-based indexing
– Minimal natural language processing
– Limited experiments with ontologies (query expansion)
Vertical search
– e.g. shopping.com, Kelkoo
– Faceted search, browsing
– Fixed ontology
Combinations
– Google Base, Google Co-op
  Web-scale, but fixed ontologies
  Proprietary technology
Can we do better with the Semantic Web?
– Address the long tail of queries (88% of queries)
– Use standard technology
Not a new question. But the answer may be new.

Slide 3: Which Semantic Web?
Two visions
– Data Web
  Bringing the content of databases to the Web (linkeddata.org)
  Rich data, heavyweight semantics
  Deep Web
– Annotated Web
  Annotating the content of Web resources (documents, multimedia)
  Simple data, lightweight semantics
  Shallow Web
This presentation is about the Annotated Web.

Slide 4: Brief history of the Annotated Web
1995: HTML meta tags
1996: Simple HTML Ontology Extensions (SHOE)
1998: RDF/XML
– RDF/XML in HTML
– RDF linked from HTML
2003: Web 2.0
– Tagging
– Microformats
– Metadata in Wikipedia
– Machine tags in Flickr
2005: eRDF
2008: RDFa

Slide 5: HTML meta tags
<LINK rel="meta" type="application/rdf+xml" title="FOAF" href=" …
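The link element on this slide points a crawler at an external RDF document. A minimal sketch of how a consumer might discover such links, using only the Python standard library. The class and function names are illustrative, and the URL in the sample is a placeholder, not the one elided on the slide.

```python
from html.parser import HTMLParser


class MetaLinkFinder(HTMLParser):
    """Collect <link rel="meta"> elements that advertise external RDF/XML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <LINK> matches too.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "meta" in rel_values and attributes.get("type") == "application/rdf+xml":
            self.links.append(
                {"title": attributes.get("title"), "href": attributes.get("href")}
            )


def find_rdf_links(html):
    """Return the title and href of every RDF/XML link found in the page."""
    finder = MetaLinkFinder()
    finder.feed(html)
    return finder.links


# Hypothetical page; example.org stands in for the truncated URL.
SAMPLE = (
    '<html><head><LINK rel="meta" type="application/rdf+xml" '
    'title="FOAF" href="http://example.org/foaf.rdf"></head></html>'
)
print(find_rdf_links(SAMPLE))
```

The crawler would still have to fetch and parse the linked RDF/XML separately; this sketch only covers discovery.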

Slide 6: SHOE example (Heflin & Hendler, 1996)
My name is George Cook and I live at...

Slide 7: SHOE system

Slide 8: SHOE text-based query interface

Slide 9: SHOE graphical query interface

Slide 10: Example: Creative Commons
Embedding CC license in HTML (now deprecated):
… … <!-- <rdf:RDF xmlns=" xmlns:dc=" xmlns:rdf=" The Law of Averages...because eventually i&apos;ll be right... -->

Slide 11: Example: Creative Commons
Current: rel attribute (HTML4)
This work is licensed under a Creative Commons Attribution 3.0 United States License.
Use of the “rel” attribute for semantic annotation is the birth of the microformat…

Slide 12: Example: microformats
<a class="fn url" rel="friend colleague met" href=" Meyer wrote a post ( Tax Relief ) about an unintentionally humorous letter he received from the Internal Revenue Service.
Joe Friday Area Administrator, Assistant
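The anchor on this slide combines hCard class names (fn, url) with XFN relationship values in the rel attribute. A rough sketch of pulling both out of a page, assuming the name sits directly inside the anchor. The parser class, the sample person and the URL are hypothetical, and real microformat parsers handle many more cases.

```python
from html.parser import HTMLParser


class HCardXfnParser(HTMLParser):
    """Extract name, URL and XFN relationships from <a class="fn ..."> anchors."""

    def __init__(self):
        super().__init__()
        self.people = []
        self._current = None  # record for the anchor being read, if any

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if "fn" in classes:
            self._current = {
                "name": "",
                # The href counts as the person's URL only if class has "url".
                "url": attributes.get("href") if "url" in classes else None,
                "xfn": (attributes.get("rel") or "").split(),
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["name"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["name"] = self._current["name"].strip()
            self.people.append(self._current)
            self._current = None


def parse_people(html):
    parser = HCardXfnParser()
    parser.feed(html)
    return parser.people


# Hypothetical markup modelled on the slide; name and URL are placeholders.
SAMPLE = (
    '<p><a class="fn url" rel="friend colleague met" '
    'href="http://example.org/jane">Jane Doe</a> wrote a post.</p>'
)
print(parse_people(SAMPLE))
```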

Slide 13: microformats
microformats.org
– Originated by Tantek Çelik and others
Agreements on the way to encode certain kinds of metadata in HTML
– Reuse of semantic-bearing HTML elements
– Based on existing standards
– Community process
– Persons, events, listings etc., but also syntactic metadata: licenses, tags
Microformats have no shared syntax
– Each microformat has a separate syntax tailored to the vocabulary
Microformats are not ontologies
– No formal descriptions of schema, only text
– Limited reuse, extensibility of schemas
– No datatypes
No namespaces, unique identifiers (URIs)
– No interlinking
– Mapping between instances is required
Relationship to page context is unclear
Widely used in millions of documents
– User-generated as well as automatically generated

Slide 14: Example: tags and machine tags

Slide 15: Example: Tags and machine tags
Tags
– User defined keywords
– Minimal agreement
  Is ‘rock’ on Flickr the same as ‘rock’ on myspace?
  Is ‘rock’ by me on Flickr the same as ‘rock’ by you on Flickr?
  Is ‘rock’ by me on Flickr today the same as ‘rock’ by me on myspace tomorrow?
Machine tags
– User defined values for user defined properties
– Possibility to define the namespace (but not enforced)
– Limited use
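Flickr machine tags are written as namespace:predicate=value, which is what lets a user attach a value to a self-defined property. A small sketch of telling them apart from plain tags follows. The regular expression is a simplifying assumption about which characters are allowed, not Flickr's exact grammar, and the sample tags are made up.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed shape: letters, digits and underscores for namespace and predicate.
_MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.*)$")


def parse_tag(tag: str) -> Optional[MachineTag]:
    """Return a MachineTag for 'ns:pred=value', or None for a plain tag."""
    match = _MACHINE_TAG.match(tag.strip())
    if match is None:
        return None
    return MachineTag(*match.groups())


for raw in ["rock", "geo:lat=37.77", "music:genre=rock"]:
    print(raw, "->", parse_tag(raw))
```

Because nothing enforces the namespace, two users can still mean different things by music:genre; the syntax only makes the intended property explicit.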

Slide 16: RDF-based annotation #1: eRDF
eRDF
– Ian Davis (Talis)
– Embedding RDF in HTML
  Straightforward mapping to RDF triples (XSLT available)
  HTML4 compatible
– More complex than microformats
  Use any RDF/OWL vocabulary
  Reuse of semantic-bearing HTML elements is limited
– More limited than RDF
  No blank nodes
  No data types
  No statements about subjects other than the current document
– Limited usage

Slide 17: RDF-based annotation #2: RDFa
RDFa
– World Wide Web Consortium (W3C) last call document
– Similar intent to eRDF, but full RDF support
  Requires XHTML
– Big question: user complexity (→ data quality)
Jo Smith. Web hacker at Example.org. You can contact me via ....
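RDFa carries triples in attributes such as about, property and content. The sketch below reads only literal-valued properties from a well-formed XHTML fragment and leaves prefixes like foaf: unresolved, so it is an illustration of the idea rather than a conforming RDFa processor. The sample markup is a guess at what such a snippet could look like, not the markup stripped from the slide.

```python
import xml.etree.ElementTree as ET


def extract_literals(xhtml, base_subject=""):
    """Return (subject, property, literal) tuples from a well-formed fragment."""
    root = ET.fromstring(xhtml)
    triples = []

    def walk(element, subject):
        # An 'about' attribute sets the subject for this element and its children.
        subject = element.get("about", subject)
        prop = element.get("property")
        if prop is not None:
            # A 'content' attribute, when present, overrides the visible text.
            value = element.get("content")
            if value is None:
                value = "".join(element.itertext()).strip()
            triples.append((subject, prop, value))
        for child in element:
            walk(child, subject)

    walk(root, base_subject)
    return triples


# Hypothetical fragment; the property names are real FOAF terms used as CURIEs.
SAMPLE = (
    '<div about="#me">'
    '<span property="foaf:name">Jo Smith</span>, also known as '
    '<span property="foaf:nick" content="jo">Jo</span>.'
    "</div>"
)
print(extract_literals(SAMPLE))
```

Relying on an XML parser mirrors the slide's point that RDFa requires XHTML: markup that is not well-formed fails at the parsing step.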

Slide 18: Metadata is out there
Question:
– Just how much data is out there?
– What is the quality?
Idea: bring metadata to the surface of search
How does it work?
– User enters query
– Metadata is extracted dynamically
– Entity reconciliation
– Metadata is used to display
  rich abstracts, related pages
  spatial, temporal visualization
Microsearch prototype
– Play at
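The slides do not say how Microsearch reconciles entities. As one simple possibility, the sketch below merges records extracted from different result pages when their normalized names match. The matching rule, field names and sample records are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase and collapse whitespace so trivially different names match."""
    return " ".join(name.lower().split())


def reconcile(records):
    """Merge records that share a normalized 'name' into one entity each."""
    merged = defaultdict(dict)
    for record in records:
        key = normalize(record["name"])
        for field, value in record.items():
            # The first value seen wins; later pages only fill in missing fields.
            merged[key].setdefault(field, value)
    return dict(merged)


# Hypothetical metadata extracted from two different result pages.
extracted = [
    {"name": "Jane  Doe", "homepage": "http://example.org/jane"},
    {"name": "jane doe", "city": "Springfield"},
]
print(reconcile(extracted))
```

Name matching alone would conflate different people with the same name, which is one way the kind of failure shown later in the deck can arise.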

Slide 19: Example: ivan herman
– Rich abstract
– Related pages based on metadata
– Events from personal calendar, conferences, and bio from LinkedIn
– Geolocation

Slide 20: Example: peter site:flickr.com
Flickr users named “Peter” by geography

Slide 21: Example: san francisco conference
Conferences in San Francisco by date

Slide 22: Example: greater st. peter
– Save to address book
– Call phone number
– (other actions)

Slide 23: Where it fails
User’s expected date of death is predicted

Slide 24: Lessons
More metadata than we expected
– 53% of unique queries have at least one metadata-enabled page in top 10 (n=7848)
Performance is poor
– Metadata needs to come from the index for performance
Metacrap does exist
– Users have to see metadata to spot mistakes in their markup, warn others
RDF templating is hard
– Adds extra complexity
Scalability
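The 53% figure is a coverage measure: the share of unique queries whose top 10 results contain at least one page with metadata. A sketch of how such a number could be computed follows. The function, the toy data and the has_metadata predicate are hypothetical, since the slides do not describe the measurement code. With n=7848, 53% corresponds to roughly 4,159 queries, bearing in mind that the percentage is rounded.

```python
def metadata_coverage(results_by_query, has_metadata, k=10):
    """Fraction of queries with at least one metadata-enabled page in the top k."""
    if not results_by_query:
        return 0.0
    hits = sum(
        1
        for urls in results_by_query.values()
        if any(has_metadata(url) for url in urls[:k])
    )
    return hits / len(results_by_query)


# Toy data: two queries, only one of which returns an annotated page.
toy_results = {
    "q1": ["http://a.example/1", "http://b.example/2"],
    "q2": ["http://c.example/3"],
}
annotated_pages = {"http://b.example/2"}

print(metadata_coverage(toy_results, lambda url: url in annotated_pages))  # 0.5

# Approximate number of queries behind the reported 53% of n=7848.
print(round(0.53 * 7848))  # 4159
```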

Slide 25: SearchMonkey
Creating an ecosystem of publishers, developers and end-users
– Motivating and helping publishers to implement semantic annotation
– Providing tools for developers to create compelling applications
– Focusing on end-user experience
  Rich abstracts as a first application
Addressing the long tail of query and content production
Standard Semantic Web technology
– dataRSS = Atom + RDFa
– Industry standard vocabularies

Slide 26: What is SearchMonkey?
An open platform for using structured data to build more useful and relevant search results
Before / After

Slide 27: Enhanced Result
– image
– deep links
– name/value pairs or abstract

Slide 28: Infobar

Slide 29: SearchMonkey
1. Site owners/publishers share structured data with Yahoo!.
2. Site owners & third-party developers build SearchMonkey apps.
3. Consumers customize their search experience with Enhanced Results or Infobars.
Diagram labels: Acme.com’s database, Acme.com’s Web Pages, Index, RDF/Microformat Markup, DataRSS feed, Web Services, Page Extraction

Slides 30-34: Developer tool

Slide 35: Future Work
Encouraging reception from publishers and developers
Next steps
– Increasing relevance through better understanding of query intent
– Focusing on task completion instead of single results
– Opening up more of search, other parts of Yahoo!
– Increasing the complexity of applications
Contact
– developer.yahoo.com/searchmonkey/
– mailing lists
– forums
– Semantic Web FAQ

Slide 36: the monkey is out!