Data Depth, by Jason Burrowes-Jones. Presentation outline: Background Review, What is known, Project Objectives, Present Work and Results, Future Goals.

2 Data Depth Jason Burrowes-Jones

3 Presentation Outline
–Background Review
–What is known
–Project Objectives
–Present Work and Results
–Future Goals

4 Background Review
Depth of X: the number of data points in the smallest half space containing X.
Median: a point of maximum depth (not necessarily a data point).

5 Background
Why the interest in the median?
–Robustness, i.e. the median resists the effects of polluted data.
–Gives a sense of the data from the centre outwards.
Importance of Data Depth
–Eliminating outliers
–Location estimate

6 Background
1. Tukey “Depth”
Generalization of depth in R^2, inspired by the 1-dimensional case.
d(X) = min # of data points a_i in any half space containing X
Proposed by John Tukey in 1974.
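The 1-dimensional case that inspires this definition can be sketched in a few lines. This is a minimal illustration, assuming closed half-lines (a data point equal to x counts on both sides); the function name `depth_1d` and the sample data are hypothetical and not from the slides.

```python
def depth_1d(x, data):
    """Depth of x on the line: the smaller of the two closed half-line counts."""
    left = sum(1 for a in data if a <= x)   # points in (-inf, x]
    right = sum(1 for a in data if a >= x)  # points in [x, +inf)
    return min(left, right)


# Illustrative data with one polluted value (100).
data = [1, 2, 3, 4, 100]

# The deepest data point is the ordinary median; the outlier has depth 1.
deepest = max(data, key=lambda a: depth_1d(a, data))
print(deepest, depth_1d(deepest, data), depth_1d(100, data))
```

The outlier barely affects which point is deepest, which is the robustness property the earlier slide refers to.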

7 Background
Finding the Tukey Depth of X
–Rotate a line through X by 180 degrees.
–Keep count of the data points on both sides of the line.
–The depth is the smallest count seen as the line is rotated (the red line in this case).
Cost: O(n log n)
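The rotating-line idea can be sketched as follows. Note that this is a simpler quadratic-time version, not the O(n log n) sweep the slide refers to: it only tries the lines through X and each data point, and considers a slight rotation either way so that points on the line fall to one side. The name `tukey_depth` and the example points are assumptions for illustration.

```python
def tukey_depth(x, points):
    """Tukey (halfspace) depth of x in the plane, in O(n^2) time.

    Counts points in the closed halfplane through x that holds the fewest.
    """
    px, py = x
    coincident = sum(1 for q in points if (q[0], q[1]) == (px, py))
    others = [(qx - px, qy - py) for qx, qy in points if (qx, qy) != (px, py)]
    if not others:
        return coincident
    best = len(others)
    for ux, uy in others:  # candidate line: through x in direction u
        left = right = ahead = behind = 0
        for vx, vy in others:
            cross = ux * vy - uy * vx
            if cross > 0:
                left += 1
            elif cross < 0:
                right += 1
            elif ux * vx + uy * vy > 0:
                ahead += 1   # on the line, same side of x as u
            else:
                behind += 1  # on the line, opposite side of x
        # A tiny rotation sends 'ahead' and 'behind' to opposite sides.
        best = min(best, min(left, right) + min(ahead, behind))
    return coincident + best


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(tukey_depth((1, 1), square), tukey_depth((0, 0), square), tukey_depth((5, 5), square))
```

Using cross products instead of angles keeps the arithmetic exact for integer coordinates; the faster version would sort the points by angle around X and slide a half-turn window.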

8 Background
Depth of the Tukey median, μ:
C_n: cost to compute the Tukey median

9 Background
2. Simplicial Depth in R^2
Proposed by Regina Liu in 1989.
The simplicial depth of X is the number of triangles, with data points as vertices, that contain X.
The median is a point of maximal depth.
Always a point such that:
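Read literally, the definition gives a cubic-time procedure: test every triple of data points. The sketch below does exactly that, assuming closed triangles (a point on an edge counts as contained); the helper names `orient`, `in_triangle` and `simplicial_depth_brute` are illustrative.

```python
from itertools import combinations


def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_triangle(x, a, b, c):
    """True if x lies in the closed triangle abc."""
    d = (orient(a, b, x), orient(b, c, x), orient(c, a, x))
    return not (min(d) < 0 < max(d))  # x is outside iff the signs are mixed


def simplicial_depth_brute(x, points):
    """Number of triangles on the data points that contain x, in O(n^3) time."""
    return sum(1 for a, b, c in combinations(points, 3) if in_triangle(x, a, b, c))


pts = [(0, 0), (4, 0), (0, 4), (4, 4)]
print(simplicial_depth_brute((1, 2), pts), simplicial_depth_brute((10, 3), pts))
```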

10 Background
Finding the Simplicial Depth of X
Lemma 1
–Given points A, B, C and a reference point X, let A′ be any point on the ray starting at X and going through A. Then:
[Figure: triangle ABC, the reference point X, and A′ on the ray from X through A]

11 Background
Lemma 2
–Given points A, B and C on a unit circle centered at the origin, let A* be antipodal to A. Then ΔABC contains the origin if and only if A* is on the short arc joining B and C.
[Figure: A, B, C and A* on the unit circle]
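Lemma 2 can be checked numerically. The sketch below is a sanity check rather than a proof, assuming random points in general position on the unit circle; the function names and the seed are hypothetical.

```python
import math
import random


def cross(u, v):
    return u[0] * v[1] - u[1] * v[0]


def contains_origin(a, b, c):
    """True if the origin is strictly inside triangle abc."""
    d1, d2, d3 = cross(a, b), cross(b, c), cross(c, a)
    return (d1 > 0 and d2 > 0 and d3 > 0) or (d1 < 0 and d2 < 0 and d3 < 0)


def on_short_arc(p, b, c):
    """True if p lies strictly inside the cone spanned by b and c (angle < 180)."""
    s = cross(b, c)
    return cross(b, p) * s > 0 and cross(p, c) * s > 0


def lemma2_holds(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = [(math.cos(t), math.sin(t))
                   for t in (rng.uniform(0, 2 * math.pi) for _ in range(3))]
        a_star = (-a[0], -a[1])  # antipodal point of A
        if contains_origin(a, b, c) != on_short_arc(a_star, b, c):
            return False
    return True


print(lemma2_holds())
```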

12 Background
Algorithm
Sort points in radial order (θ_1, …, θ_n)
–Upper half: (θ_1, …, θ_t)
–Lower half: (θ_t+1, …, θ_n)
for i = 1 to t
–Pick the diameter D through θ_i
–Count triangles in upper half
[Figure: points θ_1, θ_2, θ_3, …, θ_t in radial order]


15 Algorithm
Sort points in radial order (θ_1, …, θ_n)
–Upper half: (θ_1, …, θ_t)
–Lower half: (θ_t+1, …, θ_n)
for i = 1 to t
–Pick the diameter D through θ_i
–Count triangles in upper half
for i = t+1 to n
–Do the same for lower half
Depth = sum / 3
Once the points are sorted, computing d(x) can be done in O(n).
[Figure: reference point x with points θ_1, θ_2, θ_3, …, θ_t in radial order]
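A radial-sort computation of the simplicial depth can be sketched as below. This is not the slide's exact bookkeeping: instead of counting containing triangles per diameter and dividing by 3, it counts the triangles that miss x (all three vertices within a half turn of each other) and subtracts from the total. It assumes general position: x is not a data point and no two data points are collinear with x. It uses binary search, so the counting step is O(n log n) rather than O(n); a two-pointer sweep would bring it down to linear time after sorting. The name `simplicial_depth` is illustrative.

```python
import math
from bisect import bisect_left


def simplicial_depth(x, points):
    """Simplicial depth of x via a radial sort (general position assumed)."""
    angles = sorted(math.atan2(qy - x[1], qx - x[0]) for qx, qy in points)
    n = len(angles)
    # Append a second copy shifted by a full turn so windows can wrap around.
    doubled = angles + [a + 2 * math.pi for a in angles]
    missing = 0
    for i, a in enumerate(angles):
        # h = number of points strictly within the half turn counterclockwise of a.
        h = bisect_left(doubled, a + math.pi) - (i + 1)
        # Triangles that miss x and whose clockwise-most vertex is point i.
        missing += h * (h - 1) // 2
    return math.comb(n, 3) - missing


pts = [(0, 0), (4, 0), (0, 4), (4, 4)]
print(simplicial_depth((1, 2), pts), simplicial_depth((10, 3), pts))
```

Each triangle that misses x has exactly one clockwise-most vertex, so it is counted once in `missing`.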

16 Background Computation of simplicial median: Depth of simplicial median:

17 Background
What is known?
Computing a centre point (Jadhav, Mukhopadhyay).
–Tukey depth
–O(n)
Computing a high Tukey depth point in the plane (S. Langerman, W. Steiger).
–Pruning technique
–O(n (log n)^2)

18 What is known Computing a High Tukey Depth Point in the Plane Definition Fact

19 What is known
What is a witness half space?
–A witness halfspace defines the depth of a point P.
–It is the halfspace with the fewest number of data points.
–d(P) = k
[Figure: witness halfspace h through P, with k points on one side and n−k points on the other]

20 What is known
Algorithm Deep(S, A*)
A ← A*, P* ← any point of A*
While A is not empty:
1. Find a point Q
2. Compute the depth of Q and let h be the witness halfspace for Q in S.
3. If d(Q) is greater than d(P*) then P* ← Q.
4. Prune

21 Project Objectives Efficient Algorithm –Simplicial Median

22 Present Work and Results Find a point P of high simplicial depth.

23 Proposal
Take a random sample of size m.
Find the simplicial median of this sample.
–This point would have high depth in the source data.
Use the pruning technique to determine whether there is a data point of higher depth.
Cost: n (log n)^c
Working on details
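The sampling step of the proposal can be sketched as below, under loud simplifications: the candidate is restricted to the sampled data points (the true simplicial median of the sample may lie elsewhere in the plane), depth is computed by brute force, and the pruning step is omitted because the slides do not specify it. All names, the sample size and the synthetic data are assumptions for illustration.

```python
import random
from itertools import combinations


def orient(a, b, c):
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_triangle(x, a, b, c):
    d = (orient(a, b, x), orient(b, c, x), orient(c, a, x))
    return not (min(d) < 0 < max(d))  # closed triangle test


def depth(x, pts):
    """Brute-force simplicial depth of x with respect to pts."""
    return sum(1 for a, b, c in combinations(pts, 3) if in_triangle(x, a, b, c))


def candidate_from_sample(points, m, seed=0):
    """Deepest sampled point, with depth measured inside the sample only."""
    rng = random.Random(seed)
    sample = rng.sample(points, m)
    return max(sample, key=lambda p: depth(p, sample))


# Synthetic data, purely illustrative.
rng = random.Random(1)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
cand = candidate_from_sample(data, m=15)
print(cand, depth(cand, data))
```

The printed depth of the candidate in the full data set is what the pruning step would then try to improve on.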

24 Future Goals
Adapt the pruning technique to the simplicial median.
Write a paper
–Present at a conference

25 Acknowledgements
–DIMACS REU
–Sponsors
–William Steiger
–Evil computer scientists and mathematicians

