A Web of Concepts Dalvi, et al. Presented by Andrew Zitzelberger.


Vision
Transform hyperlinked bags of words into a semantically rich aggregate view of information on the web.

Concept
Things of interest
– Searching for information
– Accomplishing a task (reservations, etc.)

Instances
Record of a concept
– Restaurant: Gochi (19980 Homestead Rd Cupertino CA)
– Academia? Publications, research institutions

Instance Representation
Loosely-structured record (lrec)
– Attribute-key, value pairs
– Unique id field
  Entity matching problem
– Metadata
  Attribute list
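One way to picture an lrec is the minimal Python sketch below. The class name `LRec`, its field names, and the `add` and `attribute_list` helpers are assumptions made for illustration, not the representation used by Dalvi et al. The sample values come from the Gochi example on the Instances slide; the id `restaurant-1` is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LRec:
    """Hypothetical loosely-structured record: an id plus free-form attributes."""
    id: str        # unique id field (assumed here to be a string)
    concept: str   # the concept this record is an instance of
    attributes: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, key: str, value: str) -> None:
        # No fixed schema: any attribute key is allowed, and a key may repeat.
        self.attributes.setdefault(key, []).append(value)

    def attribute_list(self) -> List[str]:
        # Metadata in this sketch is just the sorted list of keys seen so far.
        return sorted(self.attributes)


gochi = LRec(id="restaurant-1", concept="restaurant")
gochi.add("name", "Gochi")
gochi.add("address", "19980 Homestead Rd Cupertino CA")
print(gochi.attribute_list())
```

Storing each value as a list keeps the record loose: two sources that disagree on an address can both be kept until entity matching decides whether they describe the same instance.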

Domain
Set of related concepts
– Academic community domain = {publications, people, conferences}

Usage Study: Instance vs. Concept Search
yelp.com
– Month of queries resulting in a click (restaurants)
– 59% specific business URL
– 19% search URL (either a specific business or a group)
– 11% specific group URL

Usage Study: Concept Attribute Search
Remove restaurant name and location information from query
Co-occurring words:
– Menu (3%), coupons (1.8%), online, weekly specials, locations (1.5%)
– Nutrition, to go, delivery, careers, cod
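The counting step described above can be sketched roughly as follows. This is an illustration under assumptions, not the procedure used in the study: the function name `attribute_word_shares`, the whitespace tokenization, and the sample queries are all hypothetical, and a share is taken here to mean the fraction of queries that contain the word.

```python
from collections import Counter
from typing import Dict, Iterable, List


def attribute_word_shares(queries: List[str],
                          name_tokens: Iterable[str],
                          location_tokens: Iterable[str]) -> Dict[str, float]:
    """Fraction of queries containing each word once name/location words are removed."""
    removed = {t.lower() for t in name_tokens} | {t.lower() for t in location_tokens}
    counts: Counter = Counter()
    for query in queries:
        # Use a set so a word repeated inside one query is counted once.
        remaining = {w for w in query.lower().split() if w not in removed}
        counts.update(remaining)
    total = len(queries)
    return {word: n / total for word, n in counts.items()}


# Made-up queries for illustration only.
sample_queries = [
    "gochi cupertino menu",
    "gochi menu",
    "gochi coupons",
    "gochi cupertino",
]
shares = attribute_word_shares(sample_queries, ["gochi"], ["cupertino"])
print(shares)
```

Words that survive the removal step, such as "menu" or "coupons", are candidate attributes of the restaurant concept.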

Usage Study: Aggregation Value
59% clicked on at least one other URL
35% clicked on at least two other URLs
Small manual evaluation indicates pages are often about the same business.

Usage Study: Concepts vs. Browsing
42% of homepage visits are from search engine
– Immediately following URL
  11.5% location
  9% menu
  1% coupons
10.5% of user trails contain more than one distinct instance of the restaurant concept

Extraction
Create new records from the web
– Information extraction
– Linking
– Analysis
  Meta-data tagging (cuisine type)

Domain-centric vs. Site-centric Extraction
Site-centric extraction
– Wrappers for page structure
– Probabilistic models (CRF)
Domain-centric extraction
– Fields of interest
– Statistical properties (single zip code, etc.)
– Structure components (lists, link relationships)
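As a rough illustration of the "single zip code" statistical property, the sketch below flags a page as a likely single-instance page when it mentions exactly one distinct zip code. Everything here is an assumption for illustration: the function names, the regular expression (a US-style zip code preceded by a two-letter state abbreviation, so that street numbers are not mistaken for zip codes), and the made-up page texts. It is not the extractor described in the paper.

```python
import re
from typing import Set

# Assumed pattern: two-letter state abbreviation, whitespace, 5-digit zip (optional +4).
ZIP_AFTER_STATE = re.compile(r"\b[A-Z]{2}\s+(\d{5})(?:-\d{4})?\b")


def distinct_zip_codes(text: str) -> Set[str]:
    """Return the set of distinct 5-digit zip codes found after a state abbreviation."""
    return set(ZIP_AFTER_STATE.findall(text))


def looks_like_single_instance_page(text: str) -> bool:
    """Heuristic: a page about one restaurant tends to mention exactly one zip code."""
    return len(distinct_zip_codes(text)) == 1


# Made-up page texts for illustration only.
detail_page = "Example Bistro, 12345 Main St, Springfield, IL 62701. Call for reservations."
listing_page = "Example Bistro, Springfield, IL 62701. Other Cafe, Chicago, IL 60601."
print(looks_like_single_instance_page(detail_page))
print(looks_like_single_instance_page(listing_page))
```

The appeal of a property like this is that it depends on the domain (restaurants have one address) rather than on the layout of any particular site, so no per-site wrapper is needed.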

Domain-centric Extraction
Aggregator mining
– Learn from extracted knowledge (similar menus)
Matching
– Text is “about” a record (restaurant review)

Application: Aggregation

Application: Session Optimization
User understanding
– Historical modeling
– Session modeling
Content understanding
Example: Birks
– Birks and Mayors (luxury jewelers) vs. Birk’s Steakhouse

Application: Browse Optimization
Alternatives: (Restaurants)
– Similar type of cuisine
– Similar location
– Similar quality
Augmentations: (Camera)
– Batteries
– Memory cards
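A minimal sketch of how alternatives might be looked up once records exist is shown below. The record layout (plain dictionaries with `id`, `cuisine`, and `location` keys), the function name `alternatives`, and the sample restaurants are hypothetical. Similarity is reduced to exact equality on an attribute, which is a simplification, and "similar quality" is not modeled at all.

```python
from typing import Dict, List, Sequence

Record = Dict[str, str]


def alternatives(target: Record,
                 candidates: Sequence[Record],
                 keys: Sequence[str] = ("cuisine", "location")) -> List[Record]:
    """Return candidates that share at least one of `keys` with the target."""
    result = []
    for rec in candidates:
        if rec.get("id") == target.get("id"):
            continue  # skip the target itself
        # A key only counts when the candidate has it and its value matches.
        if any(k in rec and rec.get(k) == target.get(k) for k in keys):
            result.append(rec)
    return result


# Made-up records for illustration only.
restaurants = [
    {"id": "r1", "cuisine": "japanese", "location": "cupertino"},
    {"id": "r2", "cuisine": "japanese", "location": "san jose"},
    {"id": "r3", "cuisine": "italian", "location": "cupertino"},
    {"id": "r4", "cuisine": "mexican", "location": "san jose"},
]
suggested = alternatives(restaurants[0], restaurants)
print([r["id"] for r in suggested])
```

Augmentations such as batteries for a camera would need a different relation (items that complement the target rather than substitute for it), which this sketch does not attempt.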

Concept Search Result Pages – shows multiple records
Concept Pages – information about an instance
Article Pages – a piece of authored text

Advertising
Increase in targeted advertisements
Target concepts rather than keywords

Challenges
Transfer learning
– Transfer extractor knowledge
Tracking uncertainty
– Accuracy issues
– “Web of concepts is not a one time affair”
  Wrapper problems
  Concept updates
Relevance Measures
– User satisfaction

Related Work
Information Extraction/Integration Systems
Dataspace Systems
Semantic Web

Future Work
Enrich representation model
– Path storage to data
– Provenance, versions, uncertainty
– Hierarchical relationships (containment or inheritance)
Ranking of disparate sources