Making the Web searchable, or the Future of Web Search Peter Mika Yahoo! Research Barcelona.

Presentation transcript:

Making the Web searchable, or the Future of Web Search Peter Mika Yahoo! Research Barcelona

Overview Why a new vision? Context –Semantic Web: metadata infrastructure –Web 2.0: user-generated metadata Thesis: making the Web searchable Research challenges (SW & IR) Conclusion

Motivation 1. State of Web search Picked the low hanging fruit –Heavy investments, marginal returns –High hanging fruits Hard searches remain… 2. The Web has changed…

Hard searches Ambiguous searches –Paris Hilton Multimedia search –Images of Paris Hilton Imprecise or overly precise searches –Publications by Jim Hendler –Find images of strong and adventurous people (Lenat) Searches for descriptions –Search for yourself without using your name –Product search (ads!) Searches that require aggregation –Size of the Eiffel Tower (Lenat) –Public opinion on Britney Spears Queries that require a deeper understanding of the query, the content and/or the world at large –Note: some of these are so hard that users don’t even try them any more

Example…

The Semantic Web (1996-…) Making the content of the Web machine processable through metadata –Documents, databases, Web services Active research, standardization, startups –Ontology languages (RDF, OWL family), query language for RDF (SPARQL) –Software support (metadata stores, reasoners, APIs)

Problem: difficulties in deployment Not enough take-up in the Web community at large –Technological challenges Discovery Ontology learning Ontology mapping –Lack of attention to the social side Over-estimating complexity for users Need for supporting ontology creation and sharing  Focus shifts from documents to databases -- the Web of Data  Enterprise/closed community applications

Web 2.0 (2003-) Simple, nimble, socially transparent interfaces Simplified KR –e.g. tagging, microformats, Wikipedia infoboxes  In exchange for a better experience, users are willing to Provide content, markup and metadata Provide data on themselves and their networks Rank, rate, filter, forward Develop software and improve your site …

Problem: lack of foundations No shared syntax or semantics No linking mechanism Example: tag semantics –flickr:ajax = del.icio.us:ajax ? –flickr:ajax:Peter = flickr:ajax:John ? –flickr:ajax:Peter:1990 = flickr:ajax:Peter:2006 ? Microformats –Separate agreement required for each format

Thesis: making the Web searchable The Web has changed –Content owners want their content to be found (Web 2.0) Cf. findability (Peter Morville), reusability (mashups), open data movement –Foundations are laid for a Semantic Web We need to –Combine the best of Web 2.0 and the Semantic Web –Reconsider Web IR in this new world

Semantic Web 2.0 Getting the representation right –RDF++ –RDFa (RDF-in-HTML) Innovations on the interface side –Semantic Wikis New methods of reasoning –Semantics = syntax + statistics Bottom-up, emergent semantics Methods of logical reasoning combined with methods of graph mining, statistics –Scalability Giving up soundness and/or completeness –Dealing with the mess Social engineering –Collaborative spaces for creating and sharing ontologies, data –Connecting islands of semantics –Best practices, documentation, advocacy

Example: Freebase

Example: machine tags
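Machine tags of the kind popularized by Flickr follow the pattern namespace:predicate=value. The following is a minimal sketch of how such a tag could be split into its parts; the function name `parse_machine_tag` and the sample tags are illustrative assumptions, not taken from the slides.

```python
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Split 'namespace:predicate=value' into parts; return None for plain tags."""
    # Split on the first '=' so that values may themselves contain ':' or '='.
    head, sep, value = tag.partition("=")
    if not sep:
        return None
    namespace, sep, predicate = head.partition(":")
    if not sep or not namespace or not predicate:
        return None
    return MachineTag(namespace, predicate, value)


# Hypothetical sample tags: one machine tag, one ordinary tag.
parsed = parse_machine_tag("geo:lat=48.8583")
plain = parse_machine_tag("paris")
```

A plain tag such as "paris" carries no namespace or predicate, so the parser returns None for it, which is the gap machine tags are meant to close.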

Example: folksonomies Simplified view: “tags are just anchortext” Can be used to generate simple co-occurrence graphs [Figure: graph over the tags paris, hilton, eiffel and the pages url1, url2, url3]
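The simplified view can be sketched in a few lines: if tags are treated like anchor text on a page, two tags co-occur whenever they are attached to the same URL. The annotation data below is hypothetical (the slide's actual edges are not recoverable from the transcript), and `cooccurrence` is an illustrative helper name.

```python
from collections import Counter
from itertools import combinations

# Hypothetical (url, tag) annotations; users are ignored in this simplified view.
annotations = [
    ("url1", "paris"), ("url1", "hilton"),
    ("url2", "paris"), ("url2", "eiffel"),
    ("url3", "paris"), ("url3", "eiffel"),
]


def cooccurrence(pairs):
    """Count, for each pair of tags, how many URLs carry both of them."""
    tags_by_url = {}
    for url, tag in pairs:
        tags_by_url.setdefault(url, set()).add(tag)
    weights = Counter()
    for tags in tags_by_url.values():
        # Sort so that each undirected edge has one canonical key.
        for a, b in combinations(sorted(tags), 2):
            weights[(a, b)] += 1
    return weights


graph = cooccurrence(annotations)
```

With this sample data, "paris" is linked to both "hilton" and "eiffel", while "hilton" and "eiffel" never meet, which hints at the two senses of the ambiguous tag.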

The more complete picture Folksonomies as tripartite graphs of users, urls and tags [Figure: tripartite graph with the users user1, user2, user3, the pages url1, url2, url3 and the tags paris, hilton, eiffel]
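One way to work with the tripartite picture is to store each annotation as a (user, url, tag) triple and "fold" the graph onto one node type through another. The sketch below assumes hypothetical triples and an illustrative helper `fold`; it is not the exact procedure from the talk.

```python
from collections import Counter
from itertools import combinations

# Position of each node type inside a (user, url, tag) triple.
FIELDS = {"user": 0, "url": 1, "tag": 2}

# Hypothetical annotations.
triples = [
    ("user1", "url1", "paris"), ("user1", "url1", "hilton"),
    ("user2", "url2", "paris"), ("user2", "url2", "eiffel"),
    ("user3", "url3", "paris"), ("user3", "url3", "eiffel"),
    ("user3", "url1", "hilton"),
]


def fold(triples, keep, via):
    """Link two `keep` nodes once for every `via` node they share."""
    k, v = FIELDS[keep], FIELDS[via]
    groups = {}
    for t in triples:
        groups.setdefault(t[v], set()).add(t[k])
    weights = Counter()
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += 1
    return weights


tags_via_urls = fold(triples, "tag", "url")    # tags linked by shared pages
tags_via_users = fold(triples, "tag", "user")  # tags linked by shared taggers
users_via_tags = fold(triples, "user", "tag")  # users linked by shared vocabulary
```

The three projections answer different questions: the first reproduces the anchor-text view, the second reflects how a community of users associates concepts, and the third yields a social network based on shared vocabulary.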

Community-based ontology mining Opportunities for mining community-specific interpretations of the world Peter Mika. Ontologies are us: A unified model of social networks and semantics. Journal of Web Semantics 5 (1), pages 5-15, 2007

Web IR 2.0 Keep on improving machine technology –NLP –Information Extraction Exploit the users for the tasks that are hard for the machine –Encourage and support users –Exploit user-generated metadata in any shape or form Support standards of the SW architecture

Vision: ontology-based search Query: at the knowledge level –Partial description of a class/instance Mapping of queries and resources in the conceptual space –Computing relevance in semantic terms Novel user interfaces
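As a rough illustration of querying "at the knowledge level", the sketch below treats a query as a partial description (a set of property/value constraints) and ranks instances by the fraction of constraints they satisfy. The triples, the `ex:` names and the scoring rule are all assumptions made for the example; a real system would compute relevance in far richer semantic terms.

```python
# Hypothetical mini knowledge base of (subject, predicate, object) triples.
KB = [
    ("ex:EiffelTower", "rdf:type", "ex:Monument"),
    ("ex:EiffelTower", "ex:locatedIn", "ex:Paris"),
    ("ex:ParisHilton", "rdf:type", "ex:Person"),
    ("ex:Louvre", "rdf:type", "ex:Museum"),
    ("ex:Louvre", "ex:locatedIn", "ex:Paris"),
]


def describe(kb):
    """Group triples into one set of (predicate, object) pairs per subject."""
    descriptions = {}
    for s, p, o in kb:
        descriptions.setdefault(s, set()).add((p, o))
    return descriptions


def search(kb, partial):
    """Rank subjects by the fraction of query constraints they satisfy."""
    wanted = set(partial)
    if not wanted:
        return []
    scored = []
    for subject, props in describe(kb).items():
        score = len(wanted & props) / len(wanted)
        if score > 0:
            scored.append((subject, score))
    # Best match first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Partial description: "a monument located in Paris".
query = [("rdf:type", "ex:Monument"), ("ex:locatedIn", "ex:Paris")]
results = search(KB, query)
```

Here the query never mentions the string "Eiffel", yet the tower ranks first because it matches the description, while the Louvre is returned as a partial match.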

Ideal world Plenty of precise metadata to harvest User intent can be captured directly as a SPARQL query Single ontology used both by the query and the knowledge base Executed on a single knowledge base, gives the correct, single answer

Technical challenges Query interface Data quality –Cleaning up metadata, tags –Spam Ontology mapping and entity resolution Ranking across types Results display –How do you avoid information overload? –How do you display information you partially understand?

Social challenges Getting the users on your side –Users are unwilling to submit large amounts of structured data to a commercial entity (Google Base) –Provide a clear motivation and/or instant gratification Trust them… but not too much (Mahalo)

Example: Technorati and microformats Semantic Web

Example: openacademia.org and RDFa <span class="foaf:Person" property="foaf:name" about="#peter_mika"> Peter Mika </span>
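To show how embedded metadata of this kind could be harvested, here is a minimal sketch using only Python's standard `html.parser`. It collects (about, property, text) triples from elements that carry both attributes. The class name `PropertyExtractor` is hypothetical, the input mirrors the slide's span written with an explicit closing tag, and the sketch assumes well-nested markup; it is not a conforming RDFa processor (no prefix resolution, no base IRI handling, no type extraction).

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (subject, property, literal) triples from annotated elements."""

    def __init__(self):
        super().__init__()
        self.triples = []
        # One frame per open element: [subject, property, text chunks] or None.
        self._stack = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "about" in a and "property" in a:
            self._stack.append([a["about"], a["property"], []])
        else:
            self._stack.append(None)

    def handle_data(self, data):
        # Text belongs to every annotated element that is currently open.
        for frame in self._stack:
            if frame is not None:
                frame[2].append(data)

    def handle_endtag(self, tag):
        if not self._stack:
            return
        frame = self._stack.pop()
        if frame is not None:
            subject, prop, chunks = frame
            self.triples.append((subject, prop, "".join(chunks).strip()))


snippet = (
    '<span class="foaf:Person" property="foaf:name" about="#peter_mika">'
    " Peter Mika </span>"
)
extractor = PropertyExtractor()
extractor.feed(snippet)
extractor.close()
triples = extractor.triples
```

The result is a single statement saying that the resource #peter_mika has the foaf:name "Peter Mika", which a search engine could index alongside the page text.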

Conclusion Why a new vision? The opportunity: convergence –Semantic Web: metadata infrastructure –Web 2.0: user-generated metadata Thesis: making the Web searchable Research challenges

Making the Web searchable Encourage the emergence of ontologies, the creation of metadata Support standards for the discovery of embedded metadata and the querying of ontology stores Harvest and actively use user-generated metadata

What is there to gain? Knowledge-based search –Sorting out hard searches –Creating new information needs Beyond search –Analysis, design, diagnosis etc. on top of aggregated data Personalization –Rich user profiles Monetization –No more “buy virgins on eBay”

Questions? Peter Mika. Social Networks and the Semantic Web. Springer, July, Special Issue on the Semantic Web and Web 2.0, Journal of Web Semantics, December, 2007.

The open data movement Public money, public information User generated content, owned by the community Personal information, owned by the user + the law cannot follow anyway

Caveats 99% of the Web is Web 1.0 –And will stay like that Market leaders will not join the standards –Have most to lose, can achieve the same effect through proprietary agreements with large data providers –Technological rationale also dictates that it’s likely to break through in domain-specific search engines

Examples Swoogle, SWSE, Freebase, dbpedia, pediaX, Technorati microformat search, Semantic Wiki, Foafing the music