2
Data Depth
Jason Burrowes-Jones
3
Presentation Outline
–Background Review
–What is known
–Project Objectives
–Present Work and Results
–Future Goals
4
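Background
Background Review
Depth of X: the number of data points in the smallest half space containing X, i.e. the half space containing X that holds the fewest data points.
Median: a point of maximum depth (not necessarily a data point)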
5
Background
Why the interest in the median?
–ROBUSTNESS, i.e. the median resists the effects of polluted data.
–Gives a sense of the data from the centre outwards.
Importance of Data Depth
–Eliminating outliers
–Location estimate
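A small one-dimensional illustration of the robustness claim, offered as a sketch only. The numbers are invented for the example and are not from the talk; it compares the median with the mean when one observation is replaced by a gross outlier.

```python
from statistics import mean, median

# Hypothetical data, chosen only for illustration.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
polluted = [1.0, 2.0, 3.0, 4.0, 500.0]   # one observation replaced by an outlier

median_shift = median(polluted) - median(clean)   # the median does not move
mean_shift = mean(polluted) - mean(clean)         # the mean is dragged towards the outlier

print("median shift:", median_shift)
print("mean shift:", mean_shift)
```

The median stays at 3 while the mean moves from 3 to 102, which is the behaviour the depth-based medians in the following slides try to carry over to the plane.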
6
Background
Generalization of Depth in $R^2$
1. Tukey “Depth”
–Proposed by John Tukey in 1974
–Inspired by the 1-dimensional case
–$d(x)$ = minimum number of data points $a_i$ in any half space containing $x$
7
Background
Finding the Tukey Depth of X
–Rotate a line through X by 180 degrees.
–Keep count of the data points on both sides of the line.
–The depth is the smallest count seen as the line is rotated (the red line in this case).
Cost: $O(n \log n)$
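The rotating-line idea can be sketched as follows. This is an illustrative implementation, not the speaker's code: the function name `tukey_depth`, the closed-half-plane convention, and the tolerance `eps` are assumptions made here. It sorts the directions from X to the data points, then evaluates the half-plane count once in every gap between consecutive critical directions, using binary search so the total cost stays at $O(n \log n)$.

```python
import math
from bisect import bisect_left, bisect_right

def tukey_depth(x, points, eps=1e-9):
    """Fewest data points in any closed half plane whose boundary passes through x."""
    two_pi = 2 * math.pi
    at_x = 0                      # points equal to x lie in every such half plane
    angles = []
    for px, py in points:
        dx, dy = px - x[0], py - x[1]
        if dx == 0 and dy == 0:
            at_x += 1
        else:
            angles.append(math.atan2(dy, dx) % two_pi)
    if not angles:
        return at_x
    angles.sort()
    ext = angles + [a + two_pi for a in angles]   # circle unrolled once
    # The count can only change when the rotating line passes a data point.
    crit = sorted({a % two_pi for a in angles} |
                  {(a + math.pi) % two_pi for a in angles})
    best = len(angles)
    for j, c in enumerate(crit):
        nxt = crit[j + 1] if j + 1 < len(crit) else crit[0] + two_pi
        if nxt - c <= eps:        # skip floating-point near-duplicates
            continue
        s = ((c + nxt) / 2) % two_pi              # a direction strictly inside the gap
        inside = bisect_right(ext, s + math.pi) - bisect_left(ext, s)
        best = min(best, inside)
    return best + at_x

corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(tukey_depth((0.5, 0.5), corners))
```

For the four corners of the unit square, the centre has depth 2, a corner has depth 1, and a point far outside the square has depth 0.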
8
Background
Depth of the Tukey median, $\mu$
$C_n$: cost to compute the Tukey median
9
Background
2. Simplicial Depth in $R^2$
–Proposed by Regina Liu in 1989
–The simplicial depth of X is the number of triangles, with data points as vertices, that contain X
–The median is a point of maximal depth
–There is always a point such that:
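The definition can be checked directly by brute force. The sketch below is an assumption-laden illustration rather than anything from the talk: the helper names are invented here, triangles are treated as closed (boundary counts as inside), and the data are assumed to be in general position (no three points collinear). It runs in $O(n^3)$ time, so it is only useful as a reference for small inputs.

```python
from itertools import combinations

def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_triangle(x, a, b, c):
    """True if x lies in the closed triangle abc (assumes a, b, c not collinear)."""
    d = (orient(a, b, x), orient(b, c, x), orient(c, a, x))
    return all(v >= 0 for v in d) or all(v <= 0 for v in d)

def simplicial_depth_bruteforce(x, points):
    """Number of triangles with vertices among `points` that contain x."""
    return sum(in_triangle(x, a, b, c) for a, b, c in combinations(points, 3))

corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(simplicial_depth_bruteforce((0.4, 0.3), corners))
```

With the four corners of the unit square as data, the point (0.4, 0.3) lies in two of the four possible triangles.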
10
Background
Finding the Simplicial Depth of X
Lemma 1
–Given points A, B, C and a reference point X, let A′ be any point on the ray starting at X and going through A. Then X lies in ΔABC if and only if X lies in ΔA′BC.
11
Background
Lemma 2
–Let A, B and C be points on a unit circle centered at the origin, and let A* be antipodal to A. Then ΔABC contains the origin if and only if A* is on the short arc joining B and C.
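Lemma 2 turns the containment test into a comparison of angles. A minimal sketch, assuming the three points are given by their angles in radians and are in general position (no two of them antipodal or equal); the function names are made up for this example.

```python
import math

def on_short_arc(angle, b, c):
    """True if direction `angle` lies strictly inside the shorter arc between b and c."""
    two_pi = 2 * math.pi
    span = (c - b) % two_pi           # counter-clockwise arc length from b to c
    if span > math.pi:                # the short arc runs from c to b instead
        b, span = c, two_pi - span
    return 0 < (angle - b) % two_pi < span

def contains_origin(alpha, beta, gamma):
    """Lemma 2: the triangle contains the origin iff A* = alpha + pi is on the short arc BC."""
    return on_short_arc(alpha + math.pi, beta, gamma)

deg = math.radians
print(contains_origin(deg(0), deg(120), deg(240)))   # vertices spread around the circle
print(contains_origin(deg(0), deg(30), deg(60)))     # vertices bunched in one half plane
```

The first triangle surrounds the origin and the second does not, which matches the geometric picture: three points miss the origin exactly when they fit inside an open half circle.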
12
Background
Algorithm
Sort points in radial order ($\theta_1, \ldots, \theta_n$)
–Upper half: ($\theta_1, \ldots, \theta_t$)
–Lower half: ($\theta_{t+1}, \ldots, \theta_n$)
for i = 1 to t
  Pick the diameter D through $\theta_i$
  Count triangles in upper half
for i = t+1 to n
  Do the same for lower half
Depth = sum / 3 (each triangle containing x is counted once for each of its three vertices)
After the radial sort, computing d(x) takes $O(n)$ time; the sort itself costs $O(n \log n)$.
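A runnable sketch built on the same radial sort. Note that it swaps in a different bookkeeping from the per-diameter count above: it counts the triangles that miss x and subtracts them from the total. A triangle misses x exactly when its three vertices fit in an open half plane through x, and each such triangle has a unique clockwise-most vertex, so the number of misses is $\sum_i \binom{h_i}{2}$, where $h_i$ is the number of points strictly within the half circle counter-clockwise of $\theta_i$. The function name and the general-position assumption (x is not a data point, and no two data points are collinear with x) are choices made for this example.

```python
import math
from bisect import bisect_left, bisect_right

def simplicial_depth(x, points):
    """Triangles of data points containing x, assuming general position."""
    two_pi = 2 * math.pi
    # Radial sort of the data points around x.
    angles = sorted(math.atan2(py - x[1], px - x[0]) % two_pi for px, py in points)
    n = len(angles)
    ext = angles + [a + two_pi for a in angles]   # circle unrolled once
    missing = 0
    for a in angles:
        # h = points strictly inside the half circle counter-clockwise of a.
        h = bisect_left(ext, a + math.pi) - bisect_right(ext, a)
        missing += h * (h - 1) // 2               # pairs that form a missing triangle with a
    return math.comb(n, 3) - missing

pentagon = [(math.cos(2 * math.pi * k / 5), math.sin(2 * math.pi * k / 5)) for k in range(5)]
print(simplicial_depth((0, 0), pentagon))
```

For a regular pentagon, 5 of the 10 triangles contain the centre. The binary searches make this $O(n \log n)$ overall; replacing them with a two-pointer sweep over the sorted angles would give the linear post-sort cost mentioned above.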
16
Background
Computation of simplicial median:
Depth of simplicial median:
17
Background
What is known?
Computing a centre point (Jadhav, Mukhopadhyay).
–Tukey depth
–$O(n)$
Computing a high Tukey depth point in the plane (S. Langerman, W. Steiger).
–Pruning technique
–$O(n \log^2 n)$
18
What is known
Computing a High Tukey Depth Point in the Plane
Definition
Fact
19
What is known
What is a witness half space?
–A witness half space defines the depth of a point P.
–It is the half space containing P with the fewest data points.
–If the witness half space h contains k of the n data points, with the other n−k outside it, then d(P) = k.
20
What is known
Algorithm Deep(S, A*)
A ← A*, P* ← any point of A*
While A is not empty
1. Find a point Q
2. Compute the depth of Q and let h be the witness half space for Q in S.
3. If d(Q) is greater than d(P*) then P* = Q
4. Prune
21
Project Objectives
Efficient Algorithm
–Simplicial Median
22
Present Work and Results
Find a point P of high simplicial depth.
23
Proposal
Take a random sample of size m.
Find the simplicial median of this sample.
–This point would have high depth in the source data.
Use the pruning technique to determine whether there is a data point of higher depth.
Target cost: $n (\log n)^c$
Working on details
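The sampling step of the proposal might look like the sketch below. Everything here is an assumption made for illustration: the names `depth` and `sample_candidate`, the fixed seed, the brute-force depth routine, and in particular the shortcut of searching only the sample points themselves, even though the true simplicial median need not be a data point. The pruning step is not attempted.

```python
import random
from itertools import combinations

def _orient(a, b, c):
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def depth(x, points):
    """Brute-force simplicial depth: closed triangles of data points containing x."""
    total = 0
    for a, b, c in combinations(points, 3):
        d = (_orient(a, b, x), _orient(b, c, x), _orient(c, a, x))
        if all(v >= 0 for v in d) or all(v <= 0 for v in d):
            total += 1
    return total

def sample_candidate(points, m, seed=0):
    """Deepest sample point (within the sample) and its depth in the full data."""
    rng = random.Random(seed)
    sample = rng.sample(points, m)
    # Simplification: only sample points are considered as median candidates.
    best = max(sample, key=lambda p: depth(p, sample))
    return best, depth(best, points)

# Hypothetical data: 40 points drawn from a standard normal distribution.
data_rng = random.Random(1)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
candidate, candidate_depth = sample_candidate(data, m=10)
print(candidate, candidate_depth)
```

Because the candidate is itself a data point and triangles are closed, every triangle that uses it as a vertex is counted, so its depth in the full data is at least $\binom{39}{2} = 741$ here. How close the candidate gets to the true maximum is exactly the open question the proposal leaves to the pruning step.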
24
Future Goals
Adapt the pruning technique to the simplicial median
Write a paper
–Present at a conference
25
Background
Acknowledgements
DIMACS REU Sponsors
William Steiger
Evil computer scientists and mathematicians