1
Partitioning Search-Engine Returned Citations for Proper-Noun Queries
Reema Al-Kamha
2
The Problem
Search engines return too many citations
  Example: “Bonnie Lake”
  Google returns around 800 citations
  Citations ranked best first
  Many refer to the same object
  Can we partition by same object?
Proper Noun Queries
  Discard citations not of the right kind
  Partition the rest by same object
  Retain the best-first ranking
3
“Bonnie Lake” Query to Google
4
The Interface
5
“Bonnie Lake” Query Result
6
Thesis Statement
Proper-noun Query
  Classify
  Partition
Questions
  Can we build a system to do this?
  How well will it work?
7
Methods
Classification
  Group 1: those of the chosen kind
  Group 2: those not of the chosen kind
Partition
  Three facets
    Attributes
    Links
    Page Similarity
  Sub-facets for each facet
  Confidence Matrix for each sub-facet
  (Weighted) Mean for each facet
  Final Confidence Matrix
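The "(Weighted) Mean" step can be illustrated with a small sketch. The slides do not give the weights or the exact combination rule, so the function name `weighted_mean_matrix`, the weights, and the 2 x 2 example values below are all hypothetical; the sketch only shows a cell-by-cell weighted mean over equally sized confidence matrices.

```python
def weighted_mean_matrix(matrices, weights=None):
    """Combine equally sized square matrices cell by cell with a weighted mean."""
    if not matrices:
        raise ValueError("need at least one matrix")
    if weights is None:
        # With no weights given, this reduces to the plain mean.
        weights = [1.0] * len(matrices)
    if len(weights) != len(matrices):
        raise ValueError("one weight per matrix is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    n = len(matrices[0])
    return [
        [
            sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical sub-facet matrices for two citations (values are made up).
links = [[1.0, 0.5], [0.5, 1.0]]
similarity = [[1.0, 0.0], [0.0, 1.0]]
combined = weighted_mean_matrix([links, similarity], weights=[1.0, 3.0])
```

The same function could plausibly be applied twice: once to merge sub-facet matrices into a facet matrix, and once to merge the facet matrices into a final matrix.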
8
Attributes
Attribute(s) (One-to-One)
  Latitude and longitude
Single Attribute (Functional Determination)
  Province with a lake’s name
Multiple Attributes (Functional Determination)
  Campground name and highway with a lake’s name
Attributes (Nonfunctional Determination)
  Country with a lake’s name
Distinguishing Attribute
  State for a lake
9
Links
Returned citations that link together
Returned citations that have a common URL prefix: common domain, common server, and common path.
  www.cs.byu.edu/info/dwembley.html
  www.cs.byu.edu/info/directory.php
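The common-URL-prefix check can be sketched as follows. This is an illustration under stated assumptions, not the system's actual rule: the helper names `url_parts` and `common_prefix_levels` are hypothetical, the "domain" is naively taken to be the last two labels of the host name, and the "path" is taken to be the directory part of the URL path.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into (domain, server, directory path)."""
    # urlsplit only recognises the host when the URL has a "//" separator,
    # so add one for bare addresses such as "www.cs.byu.edu/info/x.html".
    if "//" not in url:
        url = "//" + url
    split = urlsplit(url)
    server = (split.hostname or "").lower()
    # Naive assumption: the domain is the last two labels of the host name.
    domain = ".".join(server.split(".")[-2:])
    # Directory path: everything before the final "/" of the path.
    directory = split.path.rsplit("/", 1)[0]
    return domain, server, directory


def common_prefix_levels(url_a, url_b):
    """Report which prefix levels two URLs share."""
    domain_a, server_a, dir_a = url_parts(url_a)
    domain_b, server_b, dir_b = url_parts(url_b)
    same_domain = domain_a == domain_b
    same_server = same_domain and server_a == server_b
    same_path = same_server and dir_a == dir_b
    return {"domain": same_domain, "server": same_server, "path": same_path}


# The two example URLs from the slide share all three levels.
levels = common_prefix_levels(
    "www.cs.byu.edu/info/dwembley.html",
    "www.cs.byu.edu/info/directory.php",
)
```

How each shared level maps to a confidence value is not stated in the slides, so the sketch stops at reporting the levels.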
10
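Confidence Matrix for Returned Citations that Link Together
(entries in slide order; empty cells omitted)
Citations: 1 2 3 4 5 6 7 8
Row 1: 1 .50 .89 .50
Row 2: 1
Row 3: 1
Row 4: 1
Row 5: 1
Row 6: 1
Row 7: 1
Row 8: 1
14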
11
Page Similarity
Similarity between each pair of returned citations
Similarity between two citation-referenced documents
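The slides do not say which similarity measure is used. As one common possibility, the sketch below computes cosine similarity over term counts; the measure itself, the tokenizer, and the function names `term_counts` and `cosine_similarity` are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Count lowercase alphanumeric terms in a text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-count vectors of two texts."""
    counts_a, counts_b = term_counts(text_a), term_counts(text_b)
    # Counter returns 0 for missing terms, so unshared terms add nothing.
    dot = sum(counts_a[term] * counts_b[term] for term in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

The function can be applied either to the citation snippets returned by the search engine or to the full text of the referenced documents, which corresponds to the two kinds of similarity listed on the slide.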
12
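Confidence Matrix for Similarity between two Citation-Referenced Documents
(entries in slide order; empty cells omitted)
Citations: 1 2 3 4 5 6 7 8
Row 1: 1 0 0 1 0 0 0 0
Row 2: .00 1 .22 .00 .36 .01 .00 .41
Row 3: .00 1 .99 .00
Row 4: 1 0 0 1 0 0 0 0
Row 5: 0 .99 0 1 .00
Row 6: .33 .00 .29 .00 .22 1 .00 .56
Row 7: .00 .01 .00 .01 .00 1 .99
Row 8: .00 .99 .00 1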
13
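Modified Confidence Matrix for Similarity between two Citation-Referenced Documents
(entries in slide order; empty cells omitted)
Citations: 1 2 3 4 5 6 7 8
Row 1: 1 .00 1 0 .17 .00
Row 2: 1 .11 .00 .18 .01 .00 .21
Row 3: 1 .00 1.00 .15 .01 .00
Row 4: 1 0
Row 5: 1 .11 .01 .50
Row 6: 1 .00 .08
Row 7: 1 .50
Row 8: 1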
14
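Final Matrix
(entries in slide order; empty cells omitted)
Citations: 1 2 3 4 5 6 7 8
Row 1: 1 .25 .95 .25 .34 .25
Row 2: 1 .30 .25 .34 .26 .25 .36
Row 3: 1 .25 .74 .36 .26 .25
Row 4: 1
Row 5: 1 .30 .26 .50
Row 6: 1 .25 .29
Row 7: 1 .50
Row 8: 1
1,4  3,5  5,8  7,8
{3,5,7,8} {6} {2} {1,4}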
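The pairs 1,4 3,5 5,8 7,8 and the groups {3,5,7,8}, {6}, {2}, {1,4} on this slide are consistent with grouping citations by the transitive closure of the paired citations. The slide does not state how the pairs are selected from the matrix, so the sketch below takes them as given; the function name `partition_from_pairs` and the union-find approach are illustrative assumptions.

```python
def partition_from_pairs(items, pairs):
    """Group items so that every listed pair ends up in the same group."""
    parent = {item: item for item in items}

    def find(x):
        # Follow parent links to the group representative, shortening
        # the chain along the way (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for item in items:
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())


citations = range(1, 9)
pairs = [(1, 4), (3, 5), (5, 8), (7, 8)]  # pairs listed on the slide
groups = partition_from_pairs(citations, pairs)
# groups == [[1, 4], [2], [3, 5, 7, 8], [6]]
```

Citations 3 and 7 are never paired directly; they land in the same group because 3 pairs with 5, 5 with 8, and 8 with 7.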
15
Measurements
Classification (Precision and Recall)
Number of Partitions (Precision and Recall)
Each Partition (Precision and Recall)
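For the classification measurement, precision and recall follow their standard definitions: precision is the fraction of selected citations that are correct, and recall is the fraction of correct citations that were selected. The sketch below shows this for sets of citation numbers; the function name and the example sets are hypothetical, and the slides do not specify how the measures are adapted to the partition counts or to individual partitions.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) for a predicted set against a reference set."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    # Guard against empty sets to avoid dividing by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: citations the classifier kept vs. those of the right kind.
kept = {1, 3, 4, 5}
right_kind = {1, 4, 5, 7, 8}
precision, recall = precision_recall(kept, right_kind)
# precision == 0.75 (3 of 4 kept are correct), recall == 0.6 (3 of 5 found)
```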
16
Contribution
Solve one type of object-identity problem
Provide an additional tool for search engine queries