1
Location Clustering
Peter Kamm, Marcel Flores
2
The Data Set
Sessions
  Contains a collection of connections over the course of a week
  User ID, start time, stop time, tower ID
  25 million lines!
3
...a little more
A tower location mapping
  Tower ID, longitude, latitude, zip code
  Allows us to map to a real-world location
Data set is not complete
  There are many towers we do not have a location for
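A minimal sketch of how the two files could be combined, assuming sessions and tower locations have already been parsed into records. The field names, the `Session` and `Location` types, and the sample values are hypothetical; the slides only name the columns. Because the location mapping is incomplete, an unmapped tower yields `None` instead of an error.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Session(NamedTuple):
    user_id: int
    start: int      # assumed encoding: seconds since the start of the week
    stop: int
    tower_id: int


class Location(NamedTuple):
    longitude: float
    latitude: float
    zip_code: str


def locate_sessions(
    sessions: List[Session], tower_locations: Dict[int, Location]
) -> List[Tuple[Session, Optional[Location]]]:
    """Pair each session with its tower's location, or None if the tower is unmapped."""
    return [(s, tower_locations.get(s.tower_id)) for s in sessions]


def coverage(sessions: List[Session], tower_locations: Dict[int, Location]) -> float:
    """Fraction of sessions whose tower has a known location."""
    if not sessions:
        return 0.0
    mapped = sum(1 for s in sessions if s.tower_id in tower_locations)
    return mapped / len(sessions)


# Made-up placeholder data, not taken from the real data set.
example_locations = {1: Location(0.0, 0.0, "00000")}
example_sessions = [Session(10, 0, 60, 1), Session(11, 30, 90, 2)]
example_pairs = locate_sessions(example_sessions, example_locations)
```

Reporting the coverage fraction up front makes it clear how much of the later location analysis rests on a partial mapping.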
4
Applications
Load balancing on the cell-phone networks themselves
Social networking
  Integrate online social networks with the real world
  Accounts for mobility and usage patterns
5
Analysis
See which locations are active at what times
  Where do people congregate?
  How strongly do they congregate?
Does the location affect their usage?
  Connection duration
How does this map out into the physical world?
6
Day and Night Hotspots
Now uses a proper quantitative metric
  Looks at the ratio of day to night sessions (or night to day, whichever is larger)
  Rejects locations with <100 day or night sessions
  Gives us a number >1 to rank the strength of a location
Daytime is defined as 4am to 4pm
Day has more “very strong” hotspots
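The metric described on this slide can be sketched as follows. This is an illustration, not the authors' code: it assumes each session is classified by the hour in which it starts, that the daytime window includes 4am and excludes 4pm, and that sessions arrive as hypothetical `(tower_id, start_hour)` pairs.

```python
from collections import Counter

DAY_START_HOUR = 4     # from the slide: daytime is 4am to 4pm
DAY_END_HOUR = 16      # assumption: 4pm itself counts as night
MIN_SESSIONS = 100     # from the slide: reject <100 day or night sessions


def is_daytime(hour):
    return DAY_START_HOUR <= hour < DAY_END_HOUR


def hotspot_ratios(sessions):
    """Map tower_id -> ("day" | "night", ratio), where ratio is larger / smaller count.

    `sessions` is an iterable of (tower_id, start_hour) pairs.
    Towers with fewer than MIN_SESSIONS day or night sessions are skipped.
    """
    day, night = Counter(), Counter()
    for tower_id, hour in sessions:
        (day if is_daytime(hour) else night)[tower_id] += 1

    result = {}
    for tower_id in day.keys() | night.keys():
        d, n = day[tower_id], night[tower_id]
        if d < MIN_SESSIONS or n < MIN_SESSIONS:
            continue
        # Always divide the larger count by the smaller one.
        result[tower_id] = ("day", d / n) if d >= n else ("night", n / d)
    return result
```

With 10,384 daytime and 211 nighttime sessions, for instance, a tower would be labeled a day hotspot with a ratio of about 49.2. The minimum-session filter keeps a tower with a handful of sessions from producing an extreme ratio.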
7
Day and Night Ranks

Top Day Hotspots
Tower ID   Day Hits   Night Hits   Ratio
157        10384      211          49.213
66         13873      339          40.923
1246       3492       146          23.918
136        10255      1386         7.399

Top Night Hotspots
Tower ID   Day Hits   Night Hits   Ratio
6445       167        937          5.611
2414       397        2018         5.083
10316      122        593          4.861
8910       116        563          4.853
8
Strength Distribution
Day - 4,479 total
Night - 10,812 total
9
Day and Night Plots
10
Day/Night Durations
11
Day Avg Durations
12
Durations
Day/night hotspots tend to exhibit similar patterns of usage
Longest connections during morning/evening commute
Urban towers get longer connections in mornings, residential neighborhoods get longer connections in evenings
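One way to obtain duration curves like those summarized here is to average connection length per hour of day. The sketch below assumes timestamps are integer seconds counted from midnight of the first day and that a session belongs to the hour in which it starts; neither detail is stated in the slides.

```python
from collections import defaultdict


def average_duration_by_hour(sessions):
    """Return {hour_of_day: mean duration in seconds}, ordered by hour.

    `sessions` is an iterable of (start_seconds, stop_seconds) pairs.
    """
    totals = defaultdict(lambda: [0, 0])    # hour -> [total duration, session count]
    for start, stop in sessions:
        hour = (start // 3600) % 24         # fold every day of the week onto 0..23
        totals[hour][0] += stop - start
        totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}
```

Running this separately on the sessions of a day hotspot and a night hotspot would give the per-tower curves to compare.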
13
Physical Locations
Have to be done by hand, so the sample is smaller
Incomplete: we do not have locations for all towers
For the highest-ranked locations
  Sadly, the top 4 shown previously are not in the location data set!
  In fact, none of the high-ratio day or night spots appear (until down to a ratio of <2)!
14
Some Locations…
Tower 79 - night tower, 1.255 ratio
  Located in Englewood
  Residential South Chicago
  Not a very strong ratio
15
Tracing a User
Turns out, the data set was (maybe) rich enough to provide information at a per-user level!
Followed the first 5000 users in the data set, ranked them based on activity
Considered the busiest (by hand)
Compared to the day/night ratio of each location
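The user-selection step could look roughly like this. Two assumptions are made that the slides do not spell out: "first 5000 users" is read as the first 5000 distinct user IDs in order of appearance, and "activity" is read as the number of sessions. The function name and the `(user_id, tower_id)` input shape are hypothetical.

```python
from collections import Counter


def rank_first_users(sessions, limit=5000):
    """Return [(user_id, session_count), ...], busiest first.

    Only the first `limit` distinct users encountered are tracked;
    sessions from any later user are ignored.
    """
    tracked = set()
    counts = Counter()
    for user_id, _tower_id in sessions:
        if user_id not in tracked:
            if len(tracked) >= limit:
                continue                # a new user beyond the cutoff
            tracked.add(user_id)
        counts[user_id] += 1
    return counts.most_common()
```

The head of the returned list gives the candidates to inspect by hand.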
16
Tracing a User: Results
User 1:
  Busiest at tower 24 (20,729)
    Night tower with a 2.339 ratio
    But the user accounts for over 99% of the tower traffic!
  2nd busiest at tower 1197 (3,660)
    Night tower with a 1.528 ratio
    Again accounts for 99% of traffic!
17
Tracing a User: Results
User 5:
  Busiest at tower 258 (7,449)
    Night tower with a 1.711 ratio (75% of traffic!)
    No location data
  2nd busiest at tower 309 (5,773)
    Night tower with a 1.765 ratio (only 60%…)
    Residential, Longview, Washington
18
Tracing a User: Results
Had to go to user 113 to get a more reasonable user
  Busiest at tower 100 (1,602)
    Night tower with a 1.207 ratio
    Not an unreasonable amount of traffic
    Solon, Iowa
  2nd busiest at tower 5045 (602)
    Night tower with a 2.004 ratio
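The "share of tower traffic" figures quoted in these results can be computed along the following lines. This is a sketch that assumes traffic is measured in session counts and that sessions are available as hypothetical `(user_id, tower_id)` pairs.

```python
from collections import Counter


def user_share_by_tower(sessions, user_id):
    """Return {tower_id: (user_sessions, share_of_tower_traffic)} for one user.

    Towers are ordered from the user's busiest to least busy.
    """
    tower_totals = Counter()    # all sessions per tower
    user_totals = Counter()     # this user's sessions per tower
    for uid, tower_id in sessions:
        tower_totals[tower_id] += 1
        if uid == user_id:
            user_totals[tower_id] += 1
    return {
        tower_id: (count, count / tower_totals[tower_id])
        for tower_id, count in user_totals.most_common()
    }
```

A share close to 1.0 means the tower's day/night label mostly reflects that one user's habits.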
19
Single-user Durations
20
Hotspots
Looking at individual user traces, certain users seemed to use the busiest towers the most
So are the busiest towers really seeing a lot of users, or a few very busy users?
Analyzed the number of unique users that a tower sees in a day
21
Unique User Data
Count how many users a specific tower sees over the duration
Allows us to give an alternate ranking of the tower traffic
Easily ignore points where a single user accounts for the majority of a tower's traffic
Actual data is forthcoming…
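A possible shape for this alternate ranking is sketched below. It assumes "majority" means strictly more than half of a tower's sessions; the `max_single_user_share` parameter and the `(user_id, tower_id)` input shape are hypothetical.

```python
from collections import Counter, defaultdict


def unique_user_ranking(sessions, max_single_user_share=0.5):
    """Rank towers by number of distinct users, most users first.

    Towers where one user exceeds `max_single_user_share` of all
    sessions are dropped before ranking.
    """
    per_tower = defaultdict(Counter)        # tower_id -> sessions per user
    for user_id, tower_id in sessions:
        per_tower[tower_id][user_id] += 1

    ranking = []
    for tower_id, users in per_tower.items():
        total = sum(users.values())
        if max(users.values()) / total > max_single_user_share:
            continue                        # dominated by a single user
        ranking.append((tower_id, len(users)))
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking
```

Filtering before ranking means a tower that is busy only because of one heavy user never appears as a hotspot.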