An Effective Coreset Compression Algorithm for Large Scale Sensor Networks. Dan Feldman, Andrew Sugaya, Daniela Rus, MIT.

1 An Effective Coreset Compression Algorithm for Large Scale Sensor Networks. Dan Feldman, Andrew Sugaya, Daniela Rus, MIT

2 =Data

3 How much data?

5 1 GPS Packet = 100 bytes (latitude, longitude, time)

6 1 GPS Packet = 100 bytes every 10 seconds

7 ~40 Mb / hour or ~1 Gb / day

8 per device

9 ~300 million smart phones sold in 2010 http://mobithinking.com/mobile-marketing-tools/latest-mobile-stats

10 For 100 million devices

11 ~ 100 petabytes per day For 100 million devices

12 ~ 100 thousand terabytes per day

14 2 terabytes each

15 x50000 / day
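The scaling on slides 10 to 15 can be checked with a few lines of arithmetic. This is only a sketch: it takes the per-device figure of about 1 GB per day from slides 7 and 8 as given (reading "Gb" as gigabytes) and assumes decimal units, neither of which the slides state explicitly.

```python
# Back-of-the-envelope check of the fleet-wide data volume.
# Assumption (not stated on the slides): decimal units, 1 GB = 10**9 bytes.
GB = 10**9
TB = 10**12
PB = 10**15

bytes_per_device_per_day = 1 * GB  # "~1 Gb / day per device", read as gigabytes
devices = 100_000_000              # "100 million devices" (slide 10)
drive_capacity = 2 * TB            # "2 terabytes each" (slide 14)

total_per_day = bytes_per_device_per_day * devices

print(total_per_day // PB, "PB per day")
print(total_per_day // TB, "TB per day")
print(total_per_day // drive_capacity, "two-terabyte drives per day")
```

Under these assumptions the three printed values line up with slides 11, 12 and 15.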

16 A lot of data.

17 GPS-points Data
   iPhones can collect high-frequency GPS traces
   GPS-point = (latitude, longitude, time)
   latitude   longitude   time
   1.295783   103.7816    8:44:57
   1.295785   103.7816    8:44:59
   1.295782   103.7816    8:45:00
   1.295782   103.7816    8:45:01
   1.29579    103.7817    8:45:04
   1.295802   103.7817    8:45:05
   1.295915   103.7818    8:45:08
   1.29598    103.7819    8:45:09
   1.296015   103.7819    8:45:10
   1.296057   103.782     8:45:11
   …

18 Example

19 3-D Visualization

20 Challenges
   Storing data on an iPhone is expensive
   Transmitting data is expensive
   Raw data is hard to interpret
   Data is dynamic and streams in real time

21 Key Insight: Identify Critical Points
   Approximate the n points by k << n semantically meaningful connected segments
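To make the idea of replacing n points with k connected segments concrete, here is a minimal sketch of a greedy top-down splitting heuristic in the spirit of Douglas-Peucker line simplification. It is a stand-in chosen for illustration, not the coreset construction from the talk, and it carries no optimality guarantee. The function names and the 2-D tuple representation are assumptions.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from 2-D point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def farthest_point(points, i, j):
    """Return (distance, index) of the interior point farthest from segment i-j."""
    best_d, best_idx = 0.0, None
    for m in range(i + 1, j):
        d = point_segment_distance(points[m], points[i], points[j])
        if d > best_d:
            best_d, best_idx = d, m
    return best_d, best_idx

def greedy_k_segments(points, k):
    """Approximate an ordered path (len >= 2) by at most k connected segments.

    Hypothetical helper: repeatedly splits the segment with the largest
    deviation at its farthest point until k segments exist.
    """
    breakpoints = [0, len(points) - 1]
    while len(breakpoints) - 1 < k:
        worst = None  # (distance, index) of the best split candidate
        for i, j in zip(breakpoints, breakpoints[1:]):
            d, m = farthest_point(points, i, j)
            if m is not None and (worst is None or d > worst[0]):
                worst = (d, m)
        if worst is None:  # every point already lies on the polyline
            break
        breakpoints.append(worst[1])
        breakpoints.sort()
    return [points[i] for i in breakpoints]

# An L-shaped path of 7 points is captured by k = 2 segments.
path = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
print(greedy_k_segments(path, 2))
```

The retained breakpoints play the role of the "critical points" on the slide: the corner of the path is kept, while the points along each straight run are dropped.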

22 Our Approach
   Central Expy, Singapore
   Ayer Rajah Expy, Singapore
   Chin Swee Rd, Singapore
   261 Outram Rd, Singapore 169057
   1 St Andrew's Rd, Singapore 178957
   390A Havelock Rd, Singapore 169664
   5A Raffles Ave, Singapore 039801
   7 Raffles Blvd, Singapore 039595
   N Buona Vista Rd, Singapore
   5 Lower Kent Ridge Rd, Singapore
   4 Medical Dr, Singapore 117594
   20 Leonie Hill, Singapore
   113 Devonshire Rd, Singapore 239878
   121 Devonshire Rd, Singapore 239882
   15 Grange Rd, Singapore
   27 Grange Rd, Singapore 239700
   Natl Youth Council, Singapore
   25K Paterson Rd, Singapore 238517
   321 Orchard Rd, Singapore 238866
   220 Orchard Rd, Singapore 238852
   time      latitude   longitude
   8:44:57   1.295783   103.7816
   8:44:59   1.295785   103.7816
   8:45:00   1.295782   103.7816
   8:45:01   1.295782   103.7816
   8:45:04   1.29579    103.7817
   8:45:05   1.295802   103.7817
   8:45:08   1.295915   103.7818
   8:45:09   1.29598    103.7819
   8:45:10   1.296015   103.7819
   8:45:11   1.296057   103.782
   …

23 Solution overview
   Semantically compress data points
     – Use coresets
   Fit lines to the semantic points
     – Use splines on coreset
   Reverse geocode to get directions

31 Problem Statement
   Input: a set P of n data points in R^d and an integer k
   Output: an optimal k-spline for P, which provides semantic compression of the large data set P
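One way to write the problem statement formally is sketched below. The slide does not name the cost that "optimal" refers to; the sum of squared Euclidean distances used here is an assumption made purely for illustration, as are the symbols \mathcal{S}_k and cost. A k-spline is read as a chain of k connected segments, following slide 21.

```latex
% Assumption: the cost is the sum of squared Euclidean distances.
% The slide itself does not specify the error measure.
% \mathcal{S}_k denotes the set of all k-splines (chains of k connected
% segments) in R^d, and s ranges over the points lying on the spline S.
\[
  S^{*} \in \arg\min_{S \in \mathcal{S}_k} \operatorname{cost}(P, S),
  \qquad
  \operatorname{cost}(P, S) = \sum_{p \in P} \min_{s \in S} \lVert p - s \rVert_2^{2}.
\]
```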

32 Related Work

34 Our Main Compression Theorem Example application

35 Streaming and Parallel Computation

36 Previous Work for streaming

37 Streaming Compression using merge & reduce (figure: points p1, …, p16)
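The merge & reduce scheme named on this slide can be sketched generically as follows. The class name, the `leaf_size` parameter and `reduce_stub` are hypothetical; in particular `reduce_stub` just keeps evenly spaced points and stands in for a real coreset construction, which the transcript does not spell out.

```python
def reduce_stub(points, size):
    """Placeholder reduce step (size >= 2): keep `size` evenly spaced points.

    A real coreset construction would go here; this stub only preserves
    the first and last point and an even spread in between.
    """
    if len(points) <= size:
        return list(points)
    n = len(points)
    return [points[i * (n - 1) // (size - 1)] for i in range(size)]

class MergeReduceStream:
    """Keeps at most one summary per level, like the digits of a binary counter."""

    def __init__(self, leaf_size, reduce_fn=reduce_stub):
        self.leaf_size = leaf_size
        self.reduce_fn = reduce_fn
        self.buffer = []  # raw points not yet summarized
        self.levels = []  # levels[i] is a summary or None

    def add(self, point):
        self.buffer.append(point)
        if len(self.buffer) == self.leaf_size:
            self._push(self.buffer)
            self.buffer = []

    def _push(self, summary):
        level = 0
        # While a summary already sits at this level: merge the two,
        # reduce the union back to leaf_size, and carry it one level up.
        while level < len(self.levels) and self.levels[level] is not None:
            merged = self.levels[level] + summary  # older points first
            summary = self.reduce_fn(merged, self.leaf_size)
            self.levels[level] = None
            level += 1
        if level == len(self.levels):
            self.levels.append(None)
        self.levels[level] = summary

    def summary(self):
        """Current compressed view of the stream, in arrival order."""
        out = []
        for s in reversed(self.levels):  # higher levels hold older data
            if s is not None:
                out.extend(s)
        out.extend(self.buffer)
        return out

# Stream 16 points (standing in for p1..p16) through leaves of size 4.
stream = MergeReduceStream(leaf_size=4)
for p in range(1, 17):
    stream.add(p)
print(stream.summary())
```

Because each level holds at most one summary of `leaf_size` points, the memory in use grows with the number of levels, roughly log2(n / leaf_size), rather than with n.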

38 Our Main Streaming Theorem

39 Parallel computation (figure: points p1, …, p16)

40 Summary (repeats the address list and GPS-point table from slide 22)

85 5000 points 300 points

86 Running time

87 Space

88 Tested Data sets
   Name                         No. of Users   Time Extent   Data Size ~   Source
   Subject in Singapore         1              2 Days        300k          Probe device and iPhone application
   Taxi-Cabs in San-Francisco   500            4 Months      300MB         Public data ("CRAWDAD")
   Taxi-Cabs in Boston          25             4 Years       15GB          MIT

89 The Experiment

91 Experiments: Subject in Singapore (plot labels: Compression Ratio, Error Ratio)

92 Experiments: 500 San-Francisco Taxi-cabs

93 Website (panels: Coreset Display, Data Display)
   Visualization of Result of Algorithm - A Coreset

94 Contribution
   Semantic compression of data from sensors
   Line simplification using
     – One pass over data
     – Logarithmic space (for massive data sets)
     – Linear time
     – Provably bounded error

