Nearest Neighbor Searching Under Uncertainty

Presentation transcript:

Nearest Neighbor Searching Under Uncertainty. Wuzhou Zhang, supervised by Pankaj K. Agarwal, Department of Computer Science, Duke University

Nearest Neighbor Searching (NNS) Applications: pattern recognition, data compression, statistical classification, clustering, databases, information retrieval, computer vision, etc. http://en.wikipedia.org/wiki/Nearest_neighbor_search

Nearest Neighbor Searching Under Uncertainty: discrete pdf, continuous pdf

Nearest Neighbor In Expectation _________

Bisector In Case Of Gaussian For Gaussian distribution, bisector is a line! Hard to get explicit formula! Figure: http://www.cs.utah.edu/~hal/courses/2009S_AI/Walkthrough/KalmanFilters/

Squared Distance Function: the bisector is simple and beautiful! In the case of a discrete pdf, the bisector is also a line! In both cases, compute the Voronoi diagram and solve the problem optimally! However, the squared distance is not a metric!
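Why the bisector is a line under squared distance can be checked numerically. The sketch below assumes 2D points and a discrete pdf given as (locations, weights); the function names are illustrative and not from the slides. It relies on the standard identity E[||q - P||^2] = ||q - mean||^2 + variance, so the bisector of two uncertain points is the set where two such expressions agree, which is linear in q.

```python
def mean_and_variance(points, weights):
    """Weighted mean and total variance (trace of covariance) of a discrete pdf."""
    mx = sum(w * x for (x, _), w in zip(points, weights))
    my = sum(w * y for (_, y), w in zip(points, weights))
    var = sum(w * ((x - mx) ** 2 + (y - my) ** 2)
              for (x, y), w in zip(points, weights))
    return (mx, my), var

def expected_sq_dist(q, points, weights):
    """Brute-force expected squared Euclidean distance from q."""
    return sum(w * ((q[0] - x) ** 2 + (q[1] - y) ** 2)
               for (x, y), w in zip(points, weights))

def expected_sq_dist_fast(q, mean, var):
    """Same quantity via ||q - mean||^2 + variance."""
    return (q[0] - mean[0]) ** 2 + (q[1] - mean[1]) ** 2 + var

def bisector(mean_i, var_i, mean_j, var_j):
    """Coefficients (a, b, c) of the bisector line a*x + b*y = c."""
    a = 2 * (mean_j[0] - mean_i[0])
    b = 2 * (mean_j[1] - mean_i[1])
    c = (mean_j[0] ** 2 + mean_j[1] ** 2 + var_j) - (mean_i[0] ** 2 + mean_i[1] ** 2 + var_i)
    return a, b, c

# Two hypothetical uncertain points.
P1 = ([(0, 0), (2, 0)], [0.5, 0.5])
P2 = ([(4, 1), (4, 3), (7, 2)], [0.25, 0.25, 0.5])
```

Because the bisector depends only on the mean and variance of each pdf, the expected nearest neighbor under squared distance reduces to a weighted (power-diagram style) Voronoi problem on the means.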

Sampling Continuous Distributions: Sometimes working on continuous distributions is hard... Lower bounds on other metrics and distributions are also possible... Let's focus on discrete pdfs then...
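One way to move from a continuous pdf to a discrete one is plain Monte Carlo sampling. The slide does not say which sampling scheme is used, so the following is only a sketch under assumptions: an isotropic 2D Gaussian, k independent samples, and uniform weights 1/k. The function name and parameters are hypothetical.

```python
import random

def sample_gaussian_as_discrete(mean, sigma, k, seed=0):
    """Approximate an isotropic 2D Gaussian by k equally weighted sample points."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    points = [(rng.gauss(mean[0], sigma), rng.gauss(mean[1], sigma))
              for _ in range(k)]
    weights = [1.0 / k] * k
    return points, weights

points, weights = sample_gaussian_as_discrete(mean=(3.0, -1.0), sigma=1.0, k=2000)
```

The resulting (points, weights) pair can then be fed to any algorithm that expects a discrete pdf; the quality of the approximation depends on k.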

Expected Nearest Neighbor In L1 Metric (Manhattan metric)

Expected Nearest Neighbor In L1 Metric (cont.) Source: Range Searching on Uncertain Data [P. K. Agarwal et al. 2009]
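As a baseline for the expected nearest neighbor in the L1 metric, the query can be answered by brute force: compute the expected Manhattan distance from the query to every uncertain point and take the minimum. This is a linear-scan sketch for discrete pdfs, not the data structure from the slides; the names are illustrative.

```python
def expected_l1(q, points, weights):
    """Expected Manhattan distance from q to a discrete uncertain point."""
    return sum(w * (abs(q[0] - x) + abs(q[1] - y))
               for (x, y), w in zip(points, weights))

def expected_nn_l1(q, uncertain_points):
    """Index of the uncertain point with the smallest expected L1 distance to q."""
    return min(range(len(uncertain_points)),
               key=lambda i: expected_l1(q, *uncertain_points[i]))

# Hypothetical input: each uncertain point is (locations, probabilities).
data = [
    ([(0, 0), (2, 0)], [0.5, 0.5]),
    ([(10, 10)], [1.0]),
]
```

A scan like this costs time proportional to the total number of locations per query, which is the cost an index structure aims to avoid.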

Geometric Reduction

Building Block: Half-Space Intersection and Convex Hulls. Upper hulls correspond to lower envelopes; an example in 2D. Source: pages 252-253, Computational Geometry: Algorithms and Applications, 3rd Edition [Mark de Berg et al.]
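The correspondence between upper hulls and lower envelopes can be checked on a small example. The sketch assumes the duality used in de Berg et al., where a point (px, py) maps to the line y = px*x - py; under it, the line that is lowest at a given x always comes from a vertex of the upper hull. The helper names are illustrative.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); negative means a right turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    """Upper convex hull, left to right (monotone chain)."""
    top = {}
    for x, y in points:
        # For equal x only the highest point can lie on the upper hull.
        if x not in top or y > top[x]:
            top[x] = y
    hull = []
    for p in sorted(top.items()):
        # Pop while the last two hull points and p do not make a right turn.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

def lowest_dual_line(points, a):
    """Point whose dual line y = px*x - py is lowest at x = a (brute force)."""
    return min(points, key=lambda p: p[0] * a - p[1])

pts = [(0, 0), (1, 3), (2, 1), (3, 4), (4, 0), (2, -2)]
hull = upper_hull(pts)
```

Sweeping x from left to right, the lowest dual line visits the upper-hull vertices from right to left, which is why a hull computation yields the lower envelope.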

Segment-tree Based Data Structures for Expected-NN In L1 Metric

Segment-tree Based Data Structures for Expected-NN In L1 Metric (cont.)

Segment-tree Based Data Structures for Expected-NN In L1 Metric (cont.) Summary of the result: size of data structure, preprocessing time, query time

Approximate L2 Metric It’s a metric when P is centrally symmetric!
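The role of central symmetry can be illustrated with a polygonal distance function. The sketch below assumes P is a regular m-gon circumscribing the unit circle and defines the distance as the gauge of P, that is, the maximum projection onto the outward edge normals; this construction and the names are assumptions, not taken from the slides. For even m the polygon is centrally symmetric and the distance is symmetric; for odd m it is not. In both cases the gauge lies between cos(pi/m) times the Euclidean length and the Euclidean length.

```python
import math

def polygon_normals(m):
    """Outward unit edge normals of a regular m-gon circumscribing the unit circle."""
    return [(math.cos(2 * math.pi * i / m), math.sin(2 * math.pi * i / m))
            for i in range(m)]

def gauge(u, normals):
    """Smallest scale t such that u lies in t * P, for P = {x : <x, n> <= 1 for all n}."""
    return max(u[0] * nx + u[1] * ny for nx, ny in normals)

def polygonal_distance(p, q, normals):
    """Distance from p to q induced by the polygon P."""
    return gauge((q[0] - p[0], q[1] - p[1]), normals)

octagon = polygon_normals(8)   # centrally symmetric
triangle = polygon_normals(3)  # not centrally symmetric
```

Increasing m tightens the approximation of the Euclidean distance, at the price of more edge directions to handle.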

Approximate L2 Metric (cont.) More complex!

Future Work: Approximate the expected NN in the L2 metric. Study the complexity of the expected Voronoi diagram. Study the probability case. Work harder in the near future!

Questions? Thanks!
Main References:
[1] Pankaj K. Agarwal, Siu-Wing Cheng, Yufei Tao, Ke Yi: Indexing uncertain data. PODS 2009: 137-146
[2] Pankaj K. Agarwal, Lars Arge, Jeff Erickson: Indexing Moving Points. J. Comput. Syst. Sci. 66(1): 207-243 (2003)