A Web of Concepts Dalvi, et al. Presented by Andrew Zitzelberger.


1 A Web of Concepts Dalvi, et al. Presented by Andrew Zitzelberger

2 Vision Transform hyperlinked bags of words into a semantically rich aggregate view of information on the web.

3 Concept Things of interest – Searching for information – Accomplishing a task (reservations, etc.)

4 Instances Record of a concept – Restaurant Gochi (19980 Homestead Rd Cupertino CA) – Academia? Publications, research institutions

5 Instance Representation Loosely-structured record (lrec) – Attribute-key, value pairs – Unique id field Entity matching problem – Metadata Attribute list
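
As an illustration of the lrec idea on this slide, here is a minimal Python sketch, not the authors' implementation. The class name LRec, its fields, and the id string are hypothetical; the only assumptions carried over from the slide are a unique id, free-form attribute-key/value pairs, and an attribute list as metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LRec:
    """Hypothetical loosely-structured record: a unique id, the concept
    it belongs to, and attribute-key/value pairs with no fixed schema."""
    id: str
    concept: str
    attributes: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, key: str, value: str) -> None:
        # No schema is enforced; a key may accumulate several values.
        self.attributes.setdefault(key, []).append(value)

    def attribute_list(self) -> List[str]:
        # Metadata: the attribute keys currently present on this record.
        return sorted(self.attributes)


# The id string below is made up for illustration.
gochi = LRec(id="restaurant:gochi-cupertino", concept="restaurant")
gochi.add("name", "Gochi")
gochi.add("address", "19980 Homestead Rd Cupertino CA")
```

Because the id is the only fixed field, deciding whether two such records describe the same real-world instance is left to a separate entity-matching step.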

6 Domain Set of related concepts – Academic community domain = {publications, people, conferences}

7 Usage Study Instance vs. Concept Search yelp.com – Month of queries resulting in a click (restaurants) – 59% specific business URL – 19% search URL either specific business or group – 11% specific group URL

8 Usage Study Concept Attribute Search Remove restaurant name and location information from query Co-occurring words: – Menu (3%), coupons (1.8%), online, weekly specials, locations (1.5%) – Nutrition, to go, delivery, careers, cod

9 Usage Study Aggregation Value – 59% clicked on at least one other URL – 35% clicked on at least two other URLs – Small manual evaluation indicates pages are often about the same business.

10 Usage Study Concepts vs. Browsing – 42% of homepage visits are from a search engine – Immediately following URL: 11.5% location, 9% menu, 1% coupons – 10.5% of user trails contain more than one distinct instance of the restaurant concept

11 Extraction Create new records from the web – Information extraction – Linking – Analysis Meta-data tagging (cuisine type)

12 Domain-centric vs. Site-centric Extraction Site-centric extraction – Wrappers for page structure – Probabilistic models (CRF) Domain-centric extraction – Fields of interest – Statistical properties (single zip code, etc.) – Structure components (lists, link relationships)
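
The "single zip code" statistical property can be sketched roughly as follows. This is an assumed heuristic for illustration, not the extractor described in the paper: the function names, the regular expression (US-style zip codes preceded by a two-letter state abbreviation), and the sample strings are all hypothetical.

```python
import re

# Assumption: a zip code is five digits (optionally ZIP+4) that follow a
# two-letter state abbreviation, so street numbers are not counted.
ZIP_RE = re.compile(r"\b[A-Z]{2},?\s+(\d{5})(?:-\d{4})?\b")


def distinct_zip_codes(text: str) -> set:
    """Return the set of five-digit zip codes mentioned in the text."""
    return {m.group(1) for m in ZIP_RE.finditer(text)}


def looks_like_single_instance_page(text: str) -> bool:
    """Heuristic: a page about one restaurant mentions exactly one zip code."""
    return len(distinct_zip_codes(text)) == 1


# Illustrative page texts, not taken from real crawled pages.
detail_page = "Gochi, 19980 Homestead Rd, Cupertino, CA 95014. Open for dinner."
listing_page = ("A Cafe, 1 Main St, San Jose, CA 95112. "
                "B Grill, 22 Oak Ave, Cupertino, CA 95014.")
```

A property like this is independent of any one site's page layout, which is the contrast the slide draws with site-centric wrappers.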

13 Domain-centric Extraction Aggregator mining – Learn from extracted knowledge (similar menus) Matching – Text is “about” a record (restaurant review)

14 Application Aggregation

15 Application Session Optimization User understanding – Historical modeling – Session modeling Content understanding Example: Birks – Birks and Mayors (luxury jewelers) vs. Birk’s Steakhouse

16 Application Browse Optimization Alternatives: (Restaurants) – Similar type of cuisine – Similar location – Similar quality Augmentations: (Camera) – Batteries – Memory cards
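
One way to picture the "alternatives" idea is to rank other records by how many attributes they share with the one being viewed. The sketch below is a toy under stated assumptions: the attribute keys (cuisine, city), the record ids, and the scoring rule are hypothetical and are not taken from the paper.

```python
def alternatives(target, records, keys=("cuisine", "city")):
    """Return ids of other records that share at least one of the given
    attributes with the target, most shared attributes first."""
    scored = []
    for rec in records:
        if rec["id"] == target["id"]:
            continue  # a record is not an alternative to itself
        shared = sum(1 for k in keys if k in rec and rec.get(k) == target.get(k))
        if shared:
            scored.append((shared, rec["id"]))
    # Sort by descending overlap, then by id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rec_id for _, rec_id in scored]


# Made-up records for illustration only.
records = [
    {"id": "r1", "cuisine": "japanese", "city": "Cupertino"},
    {"id": "r2", "cuisine": "japanese", "city": "Cupertino"},
    {"id": "r3", "cuisine": "japanese", "city": "Sunnyvale"},
    {"id": "r4", "cuisine": "italian", "city": "San Jose"},
]
suggested = alternatives(records[0], records)
```

Augmentations (batteries or memory cards for a camera) would need explicit cross-concept links rather than attribute overlap, so they are not covered by this sketch.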

17 Concept Search Result Pages – shows multiple records Concept Pages – information about an instance Article Pages – a piece of authored text

18 Advertising Increase in targeted advertisements Target concepts rather than keywords

19 Challenges Transfer learning – Transfer extractor knowledge Tracking uncertainty – Accuracy issues – “Web of concepts is not a one time affair” Wrapper problems Concept updates Relevance Measures – User satisfaction

20 Related Work Information Extraction/Integration Systems Dataspace Systems Semantic Web

21 Future Work Enrich representation model – Path storage to data – Provenance, versions, uncertainty – Hierarchical relationships (containment or inheritance) Ranking of disparate sources

