1 TRAFFIC PREDICTION THROUGH DATA MINING IN LAS VEGAS Benjamin Pecheux bpecheux@vizuri.com

2 Overview
Motivation
RTCSNV - FAST
Approach
Data and Infrastructure
Data and Infrastructure Assessment
Visualizations
Prediction Test

3 Motivation
Most traffic monitoring and management systems display only current traffic status
Existing traffic prediction tools use simple trending algorithms or complex simulation-in-the-loop models
Traffic Management Centers (TMCs) want to be proactive rather than reactive, to improve traffic conditions and reduce delays
4 FAST Freeway and Arterial System of Transportation (FAST) in Las Vegas, NV

5 FAST FAST Dashboard http://bugatti.nvfast.org/

6 Approach
7 Large-scale data analysis using RTC FAST data
Reviewed FAST data collection architecture
Obtained FAST historical traffic data
Evaluated data produced by FAST in correlation with the data collection architecture
Assessed the potential of using the collected data for traffic prediction
Developed a test prediction using the best data available

8 Tools
Cloud infrastructure: Google Cloud
Relational database: Google Cloud SQL
Cloud instances: Google Compute
NoSQL database: Google BigQuery
Storage: Google Cloud Storage
Statistics and visualization: R, ggplot2, ggmap
Development: RStudio
9 Data and Infrastructure

10 Traffic Sensor Data
FAST archived data for all of 2013, extracted from FAST dashboard database backups
Data is archived at a 15 min time interval
499 road sensors:
  Wavetronix radar sensor: 208 devices
  ISS radar sensor: 121 devices
  Unknown: 170 devices
Total of 35,164,505 records for 2013:
  Wavetronix radar sensor: 19,401,481 records
  ISS radar sensor: 9,787,761 records
  Unknown: 5,975,263 records
11 FAST Data Collection Infrastructure
An unknown device type is present in the dataset, possibly loop detectors

12 Location of Various Sensors
Green – Wavetronix radar sensors; serial connection to McCain 170 controllers, Ethernet cable to RuggedCom RS9000, fiber daisy chain to hub, fiber to TMC
Red – Wavetronix radar sensors; wireless Ethernet connection to McCain 170, same path to TMC as green
Blue – ISS radar sensors; same path to TMC as red
13 FAST Archive Data Fields

Field           Data Type   Field           Data Type
IpmId           integer     Volume6         integer
DateTimeStamp   datetime    Occupancy       integer
Path            integer     Speed           integer
RoadIndex       integer     Poll_Count      integer
RoadwayID       integer     Failure         integer
SegmentID       integer     RoadType        characters
Lane            integer     Location        characters
DeviceID        integer     Polling_Period  integer
Volume          integer     Invalid         integer
Volume1         integer     DetectorID      characters
Volume2         integer     DayOfWeek       integer
Volume3         integer     DateValue       datetime
Volume4         integer     HourIdx         integer
Volume5         integer     Holiday         integer
14 Construction, Road Weather and Incident Data
Construction data: valuable for distinguishing "bad sensor" data from "good sensor" data with disrupted traffic; available from NDOT but not aggregated
Road weather data: only 31 days available (all of 2013 is needed)
Incident data: available from FAST but not useful for prediction at this point, as incidents occurred somewhat randomly
15 Data and Infrastructure Assessment

16 Visualizing 35 Million FAST Records from 2013
35+ million records in 2013
30 columns per record => ~1 billion data points
Pie charts and histograms don't work anymore
New types of graphs are required: color schemes; trending, flattening, aggregation or binning techniques
17 Treemap

18–21 Treemap (progressive build)
Connections: serial from sensor to controller; wireless Ethernet from sensor to controller; unknown from sensor to TMC; fiber from controller to TMC
Sensor types: Wavetronix radar sensor, Image Sensing Systems sensor, unknown sensor
One box represents one sensor; the size of each box represents its number of records
22 FAST Dataset Treemap (499 sensors)

23 FAST Sensor Locations (331 Geocoded)

24 Completeness of FAST Data
Completeness is expressed as a sensor generating data for all 12 months of the year 2013
Good or bad data:
  Good data is defined as data for every month with less than 20% of that month being invalid
  Bad data is defined as everything else
Count the records archived (15 min interval) for every month of 2013 for each of the 499 FAST sensors
Construction data can be used to better understand missing data
25 Completeness of FAST Data: Example 1

26 Completeness of FAST Data: Example 2

27 Completeness of FAST Data: Example 3

28 Completeness of FAST Data: Example 4
29 Completeness of FAST Sensor Data Across 2013

30 Final Completeness of FAST Sensor Data (288/499)
31 Completeness of Data Across 2013: Before Step 1

32 Completeness of Data Across 2013: After Step 1

33 Completeness of Data Across 2013: What Remains
34 Completeness Filter Outcome
7,280,003 records removed; 27,884,502 records remain

Secondary Network  Primary Network  Device                   Total Records  Records Passed  Percent  Remaining Sensors
Fiber              Serial/Ethernet  Wavetronix radar sensor  16,407,332     15,072,547      91.86%   141
Fiber              Wireless         ISS radar sensor          9,787,761      7,300,042      74.58%    73
Fiber              Wireless         Wavetronix radar sensor   2,994,149      2,420,765      80.85%    19
Unknown                                                       5,975,263      3,091,148      51.73%    55
Total                                                        35,164,505     27,884,502              288
35 FAST Data Quality of 15-Minute Aggregation
Assessed the quality of each FAST sensor found to have complete data in 2013
FAST sensor data are polled every minute for 15 minutes, then averaged and recorded
For the average to be reliable, there must be a limited number of errors among the 15 data pulls
Based on FAST guidance, records can be trusted if they have 10 or more successful pulls out of 15 attempts (at least 66% successful)
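The 10-of-15 rule is a simple predicate. Note the assumption here that a per-record count of successful one-minute pulls is available (the archive's Poll_Count field plausibly carries this, but that mapping is not confirmed by the slides):

```python
# Sketch of the FAST quality rule: a 15-minute record is trusted only if at
# least 10 of its 15 one-minute polls succeeded (>= 66%). Plain-Python
# illustration; the original analysis used R.
MIN_GOOD_POLLS = 10

def record_is_trusted(poll_count):
    # poll_count: number of successful one-minute pulls behind this record
    return poll_count >= MIN_GOOD_POLLS

def trusted_share(poll_counts):
    """Fraction of a sensor's records that pass the rule."""
    counts = list(poll_counts)
    return sum(record_is_trusted(p) for p in counts) / len(counts)
```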
36 FAST Sensor Data Quality Across 15 min Intervals

37 FAST Sensor Data Quality Across 15 min Intervals (264/499)

38 Quality of Data Across 2013: Before Step 2

39 Quality of Data Across 2013: After Step 2

40 Quality of Data Across 2013: What Remains
41 Quality Filter Outcome
1,341,366 records removed; 26,543,136 records remain; 24 sensors removed

Secondary Network  Primary Network  Device                   Total Records  Records Passed  Percent  Remaining Sensors
Fiber              Serial/Ethernet  Wavetronix radar sensor  16,407,332     14,937,943      91.04%   138
Fiber              Wireless         ISS radar sensor          9,787,761      7,136,576      72.91%    71
Fiber              Wireless         Wavetronix radar sensor   2,994,149      2,420,765      80.85%    19
Unknown                                                       5,975,263      2,047,852      34.27%    36
Total                                                        35,164,505     26,543,136              264
42 FAST Data Traffic Flow Usability
Flow as a basis for traffic prediction: can we calculate traffic flow using the data collected by the "good" sensors we have isolated?
We observed that some records show Speed > 0 and Volume = 0, or Speed = 0 and Volume > 0
Assess how much of the "good" datasets these rows represent
Select FAST sensors with less than 20% flow error for all of 2013 and for each 15 min interval
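The inconsistency check above can be sketched as follows. This is an illustrative plain-Python version under the assumption that each record reduces to a (speed, volume) pair, with Speed and Volume as in the archive schema:

```python
# Sketch of the flow-usability check: a record is a "flow error" when speed
# and volume contradict each other (one is zero while the other is positive).
# A sensor is kept only if fewer than 20% of its records are flow errors.

def is_flow_error(speed, volume):
    return (speed > 0 and volume == 0) or (speed == 0 and volume > 0)

def sensor_is_usable(records, max_error_rate=0.20):
    """records: iterable of (speed, volume) pairs for one sensor."""
    records = list(records)
    errors = sum(is_flow_error(s, v) for s, v in records)
    return errors / len(records) < max_error_rate
```

A record with both fields zero (no traffic) is consistent and is not counted as an error.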
43 Usable (Flow) FAST Sensor Data for 2013

44 Usable (Flow) FAST Sensor Data for 2013 (242/499)

45 Usable Sensors for 2013: Before Step 3

46 Usable Sensors for 2013: After Step 3

47 Usable Sensors for 2013: What Remains
48 Usability Filter Outcome
1,838,859 records removed; 24,704,277 records remain; 22 sensors removed

Secondary Network  Primary Network  Device                   Total Records  Records Passed  Percent  Remaining Sensors
Fiber              Serial/Ethernet  Wavetronix radar sensor  16,407,332     14,062,011      85.71%   127
Fiber              Wireless         ISS radar sensor          9,787,761      6,679,735      68.25%    66
Fiber              Wireless         Wavetronix radar sensor   2,994,149      2,126,525      71.02%    16
Unknown                                                       5,975,263      1,836,006      30.73%    33
Total                                                        35,164,505     24,704,277              242
49 Failure Map: All Sensors that Failed Data Analysis

50 Failure Map: Fiber – Wireless – Wavetronix radar sensor

51 Failure Map: Fiber – Wireless – ISS radar sensor

52 Failure Treemap

53 FAST Sensor Data: Top 10 Road Sensors for Prediction (Prediction Model Candidates)

54 Top 10 Sensors to Consider for Prediction
55 Visualizations

56 Additional Visualizations
Additional visualizations allow for further review of the behavior of each of the 499 FAST sensors
Two types of graphs were generated: hexagonal binning plots and calendar heatmaps
57 Hexagonal Binning Plot

58 Sensor 428.3.344 Hexagonal Binning Plot

59 Calendar Heatmap

60 Roadway ID 437 Calendar Heatmap
61 Prediction Test

62 Prediction Limitations
FAST data is aggregated at 15-minute intervals
Traffic is random (stochastic) and cannot be predicted precisely; that would require inline traffic simulation with origin and destination data
Two possible framings:
1. Classification example: estimate the future phase (three-phase traffic theory) – free flow, synchronized flow, and wide moving jam
2. Regression example: estimate future speed
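The classification framing labels each interval with a phase from three-phase traffic theory. A minimal sketch is below; the speed thresholds are purely illustrative assumptions, since the presentation does not state the cut-offs actually used:

```python
# Sketch of labeling a 15-minute record with a traffic phase
# (three-phase traffic theory). Thresholds are hypothetical.
FREE_FLOW_MIN_SPEED = 50   # mph - assumed, not from the presentation
JAM_MAX_SPEED = 20         # mph - assumed, not from the presentation

def traffic_phase(speed):
    if speed >= FREE_FLOW_MIN_SPEED:
        return "free flow"
    if speed <= JAM_MAX_SPEED:
        return "wide moving jam"
    return "synchronized flow"
```

In practice the labeling would likely also use volume/occupancy, not speed alone.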
63 Prediction Algorithms
K-Nearest Neighbor: simplest prediction method; sensitive to locality and noise
Neural network: emulates biological neuron networks; complex, difficult to implement and tune
Random Forest: thousands of decision trees; simple, brute force; handles noisy datasets and locality
64 Random Trees or Random Forest Algorithm (predictors, records, outcomes)

65 Random Trees or Random Forest Algorithm
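The random forest idea on these slides can be sketched as a toy: many trees (here depth-1 stumps for brevity), each trained on a bootstrap sample of the records, voting on the outcome. This is an illustration only; real work would use a library such as R's randomForest, matching the presentation's R toolchain:

```python
# Toy random-forest sketch: bootstrap-sampled decision stumps + majority vote.
import random
from collections import Counter

def train_stump(rows):
    """rows: list of (features_dict, label). Returns a depth-1 'tree':
    the single (feature, threshold) split that best separates the labels."""
    best = None
    for feat in rows[0][0]:
        for thresh in {r[0][feat] for r in rows}:
            left = [lab for f, lab in rows if f[feat] <= thresh]
            right = [lab for f, lab in rows if f[feat] > thresh]
            if not left or not right:
                continue
            # score = rows correctly predicted by each side's majority label
            score = max(Counter(left).values()) + max(Counter(right).values())
            if best is None or score > best[0]:
                best = (score, feat, thresh,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # degenerate sample: no split possible
        majority = Counter(lab for _, lab in rows).most_common(1)[0][0]
        return lambda f: majority
    _, feat, thresh, left_lab, right_lab = best
    return lambda f: left_lab if f[feat] <= thresh else right_lab

def random_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)
    # each tree sees a bootstrap sample (drawn with replacement) of the rows
    trees = [train_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]
    def predict(features):
        votes = Counter(tree(features) for tree in trees)
        return votes.most_common(1)[0][0]
    return predict
```

A full random forest also draws a random subset of predictors per split and grows deep trees; the stump keeps the sketch short.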
66 FAST Sensor Selection: Road Sensor 350.1.158

67 FAST Sensor 350.1.158: Location

68 FAST Sensor 350.1.158: Hexagonal Binning Plot

69 FAST Sensor 350.1.158: Calendar Heatmap

70 FAST Sensor 350.1.158: Classification of Sensor Data

71 FAST Sensor 350.1.158: Classification of Sensor Data – Free Flow, Synchronized Flow, Wide Moving Jam
72 Ancillary Data
Predictors so far: year, month, day and time; 2013 traffic data classified by traffic phase
Not enough to create a good prediction model; more datasets are needed to add more possible predictors:
  Weather data (a predictor almost always included in other prediction work)
  Construction schedule
  Las Vegas event attendance data
  Hotel and casino occupancy data
73 Weather Data
Road Weather Information System (RWIS): not enough data
KLAS NOAA archive: mean temperature, wind speed

74 Weather Data
Data obtained from KLAS:

Field                         Type        Field                        Type
PST                           date        Min Sea Level Pressure (In)  float
Max Temperature (F)           integer     Max Visibility (Miles)       integer
Mean Temperature (F)          integer     Mean Visibility (Miles)      integer
Min Temperature (F)           integer     Min Visibility (Miles)       integer
Max Dew Point (F)             integer     Max Wind Speed (MPH)         integer
Mean Dew Point (F)            characters  Mean Wind Speed (MPH)        integer
Min Dew Point (F)             characters  Max Gust Speed (MPH)         integer
Max Humidity                  integer     Precipitation (In)           float
Mean Humidity                 integer     Cloud Cover                  integer
Min Humidity                  integer     Events                       characters
Max Sea Level Pressure (In)   float       Wind Dir. (Degrees)          integer
Mean Sea Level Pressure (In)  float
75 Las Vegas Events Data
Las Vegas Convention Calendar: event, venue, start and end date, attendance

Date      Venue                       Occupancy
12/28/12  Hard Rock Hotel & Casino    190
12/29/12  Hard Rock Hotel & Casino    190
12/30/12  Hard Rock Hotel & Casino    190
12/31/12  Hard Rock Hotel & Casino    190
1/1/13    Hard Rock Hotel & Casino    190
12/31/12  Bellagio                    31
1/1/13    Bellagio                    31
1/2/13    MGM Grand Hotel and Casino  650
1/3/13    MGM Grand Hotel and Casino  650
1/4/13    MGM Grand Hotel and Casino  650
1/5/13    MGM Grand Hotel and Casino  650
1/3/13    Caesars Palace              252
1/4/13    Caesars Palace              252
1/3/13    Tropicana Las Vegas         300
1/4/13    Tropicana Las Vegas         300
1/5/13    Tropicana Las Vegas         300
76 Predictors and Outcome
For each 15 minute interval of 2013:
Predictors: 94 predictors combining date, time, weather, and event attendance
Outcome: traffic phase

77 Predictor Selection
Out of 94 predictors, 20 have non-zero variance for sensor 350.1.158:
Date and time: Month, Day, Hour, Minute (15-minute interval), DayOfWeek
Weather data: mean temperature in Fahrenheit, mean wind speed in MPH
Events data: Bellagio, Caesars Palace, Tropicana Las Vegas, Mandalay Bay Resort And Casino, Bally's Las Vegas, Circus Circus Hotel Casino and Theme Park, Monte Carlo Resort and Casino, Venetian Resort Hotel Casino, ARIA Resort And Casino, Cosmopolitan of Las Vegas, South Point Hotel Casino And Spa, Westin Las Vegas Hotel Casino And Spa, MEET Las Vegas
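Dropping constant predictors is a standard step (in an R workflow this is typically done with caret's nearZeroVar). A minimal plain-Python sketch of the zero-variance case, assuming the predictors are held column-wise:

```python
# Sketch: keep only predictors whose values vary. A constant column carries
# no information for the model. (Hypothetical column-wise data layout.)

def nonzero_variance_columns(table):
    """table: dict {column_name: list of values}.
    Returns names of columns with more than one distinct value."""
    return [name for name, values in table.items() if len(set(values)) > 1]
```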
78 Prediction Model Training
Resampling strategies considered: two-way partition, cross-validation, bootstrapping
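The three resampling strategies listed above can be sketched as index generators (plain Python; an R workflow would use caret's createDataPartition and trainControl, but the splitting logic is the same):

```python
# Sketches of the three training strategies named on the slide.
import random

def two_way_partition(n, train_frac=0.8, seed=0):
    """Shuffle indices 0..n-1 and split into train/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_frac)
    return idx[:cut], idx[cut:]

def k_folds(n, k=5):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    for i in range(k):
        test = idx[i::k]                          # every k-th index is one fold
        train = [j for j in idx if j not in set(test)]
        yield train, test

def bootstrap_sample(n, seed=0):
    """Sample n indices with replacement; unsampled indices form the test set."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    test = [j for j in range(n) if j not in set(train)]
    return train, test
```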
79 Prediction Model Test Results

80 Three-Day Prediction Test

81 Possible Uses of Prediction

82 Possible Uses of Prediction

83 What's Next?

84 Questions?