Lecture 11: Nearest Neighbor Search

Presentation on theme: "Lecture 11: Nearest Neighbor Search" — Presentation transcript:

1 Lecture 11: Nearest Neighbor Search

2 Plan
Distinguished Lecture: Quantum Computing, Oct 19, 11:30am, Davis Aud in CEPSR
Nearest Neighbor Search
Scribe?

3 Sketching
W: ℝ^d → short bit-strings
Given W(x) and W(y), we can distinguish between:
Close: ||x − y|| ≤ r
Far: ||x − y|| > cr
with high success probability: only δ = 1/n³ failure probability.
For the ℓ₂ and ℓ₁ norms: O(ε⁻² · log n) bits suffice.
Decision rule: is ||W(x) − W(y)|| ≤ t ?
Yes: close, ||x − y|| ≤ r
No: far, ||x − y|| > (1 + ε)r

4 NNS: approaches
Sketch W uses k = O(ε⁻² · log n) bits.
1: Linear scan++
Precompute W(p) for every p ∈ D
Given q, compute W(q)
For each p ∈ D, estimate the distance using W(q), W(p)
2: Exhaustive storage++
For each possible σ ∈ {0,1}^k, compute A[σ] = a point p ∈ D s.t. ||W(p) − σ||₁ < t
On query q, output A[W(q)]
Space: 2^k = n^{O(1/ε²)}
Can we get near-linear space and sub-linear query time?
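A minimal sketch of the exhaustive-storage++ idea, with toy sizes so all 2^k sketch values can be enumerated; `W` is a hypothetical bit-sampling sketch and `d`, `k`, `t` are illustrative:

```python
import itertools
import random

rng = random.Random(0)
d, k, t = 16, 6, 2                      # toy sizes; t = sketch-distance threshold
idx = [rng.randrange(d) for _ in range(k)]
W = lambda p: tuple(p[j] for j in idx)  # hypothetical k-bit sketch

D = [tuple(rng.randrange(2) for _ in range(d)) for _ in range(5)]

# Exhaustive storage++: precompute one answer per possible sketch value sigma.
A = {}
for sigma in itertools.product([0, 1], repeat=k):
    for p in D:
        if sum(a != b for a, b in zip(W(p), sigma)) < t:
            A[sigma] = p
            break

q = D[0]            # query equal to a dataset point, so A[W(q)] must exist
print(A[W(q)])      # a point whose sketch is within t of W(q)
```

The 2^k-sized table is exactly why the space blows up to n^{O(1/ε²)} for realistic k.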

5 Locality Sensitive Hashing
[Indyk-Motwani'98] A random hash function h on ℝ^d satisfying:
for a close pair (when ||q − p|| ≤ r): Pr[h(q) = h(p)] ≥ P₁, which is "not-so-small"
for a far pair (when ||q − p′|| > cr): Pr[h(q) = h(p′)] ≤ P₂, which is "small"
Use several hash tables: n^ρ of them, where ρ = log(1/P₁) / log(1/P₂)

6 LSH for Hamming space
The hash function g is usually a concatenation of "primitive" functions: g(p) = ⟨h₁(p), h₂(p), …, h_k(p)⟩
Fact 1: ρ_g = ρ_h
Example: Hamming space {0,1}^d
h(p) = p_j, i.e., choose the j-th bit for a random j
g(p) chooses k bits at random
Pr[h(p) = h(q)] = 1 − Ham(p, q)/d
P₁ = 1 − r/d ≈ e^{−r/d}
P₂ = 1 − cr/d ≈ e^{−cr/d}
ρ = log(1/P₁) / log(1/P₂) ≈ (r/d) / (cr/d) = 1/c
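The values of P₁, P₂ and ρ can be checked empirically. A sketch assuming the bit-sampling h above, with illustrative d, r, c:

```python
import math
import random

rng = random.Random(42)
d = 200
r, c = 20, 2                  # close: distance <= r, far: distance > c*r

def collide_prob(dist, trials=20000):
    """Empirical Pr[h(p) = h(q)] for h(p) = p_j (random j), Ham(p, q) = dist."""
    p = [0] * d
    q = [0] * d
    for j in range(dist):
        q[j] = 1
    hits = 0
    for _ in range(trials):
        j = rng.randrange(d)
        hits += p[j] == q[j]
    return hits / trials

P1 = collide_prob(r)          # about 1 - r/d  = 0.9
P2 = collide_prob(c * r)      # about 1 - cr/d = 0.8
rho = math.log(1 / P1) / math.log(1 / P2)
print(P1, P2, rho)            # rho should be close to 1/c = 0.5
```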

7 Full algorithm
The data structure is just L = n^ρ hash tables:
Each hash table uses a fresh random function g_i(p) = ⟨h_{i,1}(p), …, h_{i,k}(p)⟩
Hash all dataset points into each table
Query:
Check for collisions in each of the hash tables until we encounter a point within distance cr
Guarantees:
Space: O(nL) = O(n^{1+ρ}), plus the space to store the points
Query time: O(L · (k + d)) = O(n^ρ · d) (in expectation)
At least 50% probability of success
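The full algorithm for Hamming space can be sketched compactly; the class name `HammingLSH` and all parameter values below are hypothetical:

```python
import random

class HammingLSH:
    """L hash tables over {0,1}^d, each keyed by k random coordinates."""

    def __init__(self, d, k, L, seed=0):
        rng = random.Random(seed)
        self.keys = [[rng.randrange(d) for _ in range(k)] for _ in range(L)]
        self.tables = [{} for _ in range(L)]

    def _g(self, i, p):
        return tuple(p[j] for j in self.keys[i])

    def insert(self, p):
        for i, table in enumerate(self.tables):
            table.setdefault(self._g(i, p), []).append(p)

    def query(self, q, cr):
        """Return some point within distance cr of q, or None."""
        for i, table in enumerate(self.tables):
            for p in table.get(self._g(i, q), []):
                if sum(a != b for a, b in zip(p, q)) <= cr:
                    return p
        return None

rng = random.Random(1)
d = 100
data = [tuple(rng.randrange(2) for _ in range(d)) for _ in range(50)]
index = HammingLSH(d, k=10, L=20)
for p in data:
    index.insert(p)

q = list(data[0])
q[0] ^= 1                    # flip one bit, so a 1-near neighbor exists
print(index.query(tuple(q), cr=5))
```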

8 Choice of parameters k, L?
L hash tables with g(p) = ⟨h₁(p), …, h_k(p)⟩
Pr[collision of a far pair] = P₂^k
Pr[collision of a close pair] = P₁^k
Success probability for one hash table: P₁^k, so L = O(1/P₁^k) tables suffice
Set k s.t. P₂^k = 1/n. Since P₁ = P₂^ρ, we get P₁^k = (P₂^k)^ρ = (1/n)^ρ, hence L = O(n^ρ)
Runtime as a function of P₁, P₂: O( (1/P₁^k) · (timeToHash + n · P₂^k) )
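Numerically, the parameter choice looks like this; `lsh_params` is a hypothetical helper, and P₁ = 0.9, P₂ = 0.8 are illustrative values:

```python
import math

def lsh_params(n, P1, P2):
    """Choose k s.t. P2^k = 1/n; then L = 1/P1^k = n^rho tables suffice."""
    k = math.ceil(math.log(n) / math.log(1 / P2))
    rho = math.log(1 / P1) / math.log(1 / P2)
    L = math.ceil(1 / P1 ** k)
    return k, L, rho

k, L, rho = lsh_params(n=10**6, P1=0.9, P2=0.8)
print(k, L, rho)   # rho is about 0.47, so L is about n^0.47
```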

9 Analysis: correctness
Let p* be an r-near neighbor of q (if none exists, the algorithm can output anything)
The algorithm fails when the near neighbor p* is in none of the searched buckets g₁(q), g₂(q), …, g_L(q)
Probability of failure:
Probability that q and p* do not collide in one hash table: ≤ 1 − P₁^k
Probability that they collide in none of the L hash tables: at most (1 − P₁^k)^L = (1 − 1/n^ρ)^{n^ρ} ≤ 1/e
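A quick numeric check of the final inequality, (1 − 1/L)^L ≤ 1/e with L = n^ρ tables:

```python
import math

# (1 - 1/L)^L increases toward 1/e but never exceeds it.
for L in [10, 100, 1000, 10**6]:
    print(L, (1 - 1 / L) ** L, "<", 1 / math.e)
```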

10 Analysis: runtime
The runtime is dominated by:
Hash function evaluations: O(L · k) time
Distance computations to points in buckets
Distance computations:
We only care about far points, at distance > cr
In one hash table, the probability that a far point collides with q is at most P₂^k = 1/n
Expected number of far points in the bucket: n · (1/n) = 1
Over L hash tables, the expected number of far points is L
Total: O(L · k) + O(L · d) = O(n^ρ · (log n + d)) in expectation
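The "about one expected far point per table" claim can be checked empirically; the parameters below are chosen (hypothetically) so that P₂^k ≈ 1/n:

```python
import random

rng = random.Random(0)
d, k, n = 400, 10, 1000
far_dist = 200                 # P2 = 1 - 200/400 = 0.5, so P2^k = 2**-10 ~ 1/n

q = tuple(rng.randrange(2) for _ in range(d))
far_points = []                # n far points, each at distance far_dist from q
for _ in range(n):
    p = list(q)
    for j in rng.sample(range(d), far_dist):
        p[j] ^= 1
    far_points.append(tuple(p))

idx = rng.sample(range(d), k)  # one table's hash g = k random coordinates
g = lambda p: tuple(p[j] for j in idx)
collisions = sum(g(p) == g(q) for p in far_points)
print(collisions)              # expected about n * P2^k, i.e., about 1
```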

11 LSH in practice
Can choose any parameters L, k:
Correct as long as (1 − P₁^k)^L ≤ 0.1
Larger k: fewer false positives, but more tables L
Smaller k: fewer tables L, but correctness is not guaranteed
Performance is a trade-off between the number of tables and false positives, and will depend on dataset "quality"
If we want exact NNS, what is c?

12 LSH Algorithms

                  Space      Time    Exponent    c = 2     Reference
Hamming space     n^{1+ρ}    n^ρ     ρ = 1/c     ρ = 1/2   [IM'98]
                                     ρ ≥ 1/c               [MNP'06, OWZ'11]
Euclidean space   n^{1+ρ}    n^ρ     ρ = 1/c     ρ = 1/2   [IM'98, DIIM'04]
                                     ρ ≈ 1/c²    ρ = 1/4   [AI'06]
                                     ρ ≥ 1/c²              [MNP'06, OWZ'11]

The table does not include: the O(nd) additive space, and an O(d · log n) factor in the query time.

13 LSH Zoo (ℓ₁)
Hamming distance [IM'98]: h picks a random coordinate (or several)
ℓ₁ (Manhattan) distance [AI'06]: h maps a point to its cell in a randomly shifted grid
Jaccard similarity between sets: J(A, B) = |A ∩ B| / |A ∪ B|
Min-wise hashing [Bro'97]: pick a random permutation π on the universe; h(A) = min_{a∈A} π(a)
Claim: Pr[collision] = J(A, B)
Example: "to be or not to be" vs. "to sketch or not to sketch" give the word sets A = {be, not, or, to} and B = {not, or, to, sketch}; under the permutation π = (be, to, sketch, or, not), h(A) = be and h(B) = to, i.e., no collision.
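The min-hash claim Pr[collision] = J(A, B) can be checked on the example sets above; `minhash` is a hypothetical helper:

```python
import random

def minhash(A, pi):
    """h(A) = the element of A that appears first in the permutation pi."""
    return min(A, key=pi.index)

universe = ["be", "not", "or", "to", "sketch"]
A = {"be", "not", "or", "to"}
B = {"not", "or", "to", "sketch"}
J = len(A & B) / len(A | B)            # Jaccard similarity = 3/5

rng = random.Random(0)
trials = 20000
hits = 0
for _ in range(trials):
    pi = universe[:]
    rng.shuffle(pi)                    # a fresh random permutation each time
    hits += minhash(A, pi) == minhash(B, pi)
print(hits / trials, J)                # the two numbers should be close
```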

14 LSH for Euclidean distance
[Datar-Immorlica-Indyk-Mirrokni'04] LSH function h(p): pick a random line ℓ, project the point onto ℓ, and quantize:
h(p) = ⌊ p · ℓ / w + b ⌋
ℓ is a random Gaussian vector
b is random in [0, 1]
w is a parameter (e.g., 4)
Claim: ρ = 1/c
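A minimal sketch of this hash function; `make_h` and all dimensions/perturbation sizes are illustrative:

```python
import math
import random

rng = random.Random(0)
d, w = 50, 4.0

def make_h():
    """h(p) = floor(p . ell / w + b): random projection, then quantization."""
    ell = [rng.gauss(0, 1) for _ in range(d)]   # random Gaussian direction
    b = rng.random()                            # random offset in [0, 1]
    return lambda p: math.floor(sum(pi * li for pi, li in zip(p, ell)) / w + b)

p = [rng.gauss(0, 1) for _ in range(d)]
close = [x + 0.01 * rng.gauss(0, 1) for x in p]   # tiny perturbation of p
far = [x + 2.0 * rng.gauss(0, 1) for x in p]      # large perturbation of p

trials = 2000
hc = hf = 0
for _ in range(trials):
    h = make_h()
    hc += h(p) == h(close)
    h = make_h()
    hf += h(p) == h(far)
print(hc / trials, hf / trials)   # close pairs collide far more often
```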

