1
Kentucky Transportation Cabinet
Big Data Environment
Monitoring road activities with a real-time snow and ice information management system using Spark and Hadoop
Vineet Kumar, Clay Thompson
2
The Kentucky Transportation Cabinet (KYTC) is an executive branch agency responsible for overseeing the development and maintenance of a safe, efficient multi-modal transportation system throughout the Commonwealth. The Cabinet manages more than 27,000 miles of highways, including roughly 20,500 miles of secondary roads, 3,600 miles of primary roads, and more than 1,400 interstate and parkway miles. KYTC also provides direction for 230 licensed airports and heliports and oversees all motor vehicle and driver's licensure for more than three million drivers in the Commonwealth.
Business request from the Division of Maintenance for data support related to the Snow/Ice Program.
Clay Thompson: Branch Manager, Enterprise Data Branch.
Vineet Kumar: Enterprise Data Architect, currently focusing on the data warehouse and big data environment.
4
Snow Removal Level of Service (LOS) and Annual Cost
27,000 miles of state-maintained roads
Annual cost:
  $45 Million for a “normal” winter
  $78 Million for
  $65 Million
  $43 Million for
Route Designation    Level of Service (LOS)
Interstates          1 hour response time
Priority A           2 hour turn around time
Priority B           4 hour turn around time
Priority C           8 hour turn around time after A’s and B’s are clear
5
Project Goals and Challenges
Reduce waste
Increase Level of Service
  Roadway or lane blocking, construction activities, weather observations (air and road temperature)
Provide accurate record keeping
  Snow removal vehicle location and calculations (salt rate, liquid rate, etc.)
Perform detailed historic analytics
Move towards dynamic routing
Create predictive models
6
AVL(Automated Vehicle Location) Equipment/Feed
120 trucks currently equipped (mixture of state and contract trucks); 1,400 when fully equipped
10-second time interval – 80K records
Data fields: Speed, Direction, Latitude, Longitude, Date/Time, Treatment Rate (Salt), Air/Pavement Temp, Plow Status
Traffic alerts and jams from Waze (crowdsourced data)
HERE (sensors across Kentucky) – traffic speed data
Mesonet stations across Kentucky (rainfall, wind speed, air temperature, etc.)
7
Problem / Solution
Incoming data: Lat = 38.5…, Long = -84.6…, Dir = 350°, Speed = 30mph, Salt = x tons, Temp = 32°, …, Road ID = ?, Road Name = ?, Mile Point = ?, Priority = ?, Road Type = ?, Cost = ?
Figure labels around the incoming point: Northbound 18m, Unknown 30m, Southbound 15m
Snapped AVL data: Lat = 38.5…, Long = -84.6…, Dir = 350°, Speed = 45mph, Salt = x tons, Temp = 32°, Road ID = 1234, Road Name = I-75, Mile Point = , Priority = A, Road Type = I, Cost = $2.00, …
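One plausible reading of this figure is that the nearest segment (Southbound, 15m) is not chosen because the truck's heading of 350° points roughly north, so the point snaps to the Northbound segment at 18m. The sketch below illustrates that rule in Python. It is not the presenters' implementation: the segment bearings (0° and 180°), the 45° tolerance, and the names `Candidate`, `heading_diff`, and `snap` are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    name: str
    distance_m: float
    bearing_deg: Optional[float]  # None when the segment direction is unknown


def heading_diff(a: float, b: float) -> float:
    """Smallest absolute angle in degrees between two compass headings."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def snap(heading_deg: float, candidates: Sequence[Candidate],
         tolerance_deg: float = 45.0) -> Candidate:
    """Pick the nearest candidate whose bearing agrees with the heading.

    Falls back to the nearest candidate overall when none agrees.
    """
    aligned = [
        c for c in candidates
        if c.bearing_deg is not None
        and heading_diff(heading_deg, c.bearing_deg) <= tolerance_deg
    ]
    pool = aligned or list(candidates)
    return min(pool, key=lambda c: c.distance_m)


# Distances come from the figure; bearings are assumed for illustration.
candidates = [
    Candidate("Northbound", 18.0, 0.0),
    Candidate("Unknown", 30.0, None),
    Candidate("Southbound", 15.0, 180.0),
]
print(snap(350.0, candidates).name)  # Northbound
```

Once a segment is chosen, its attributes (Road ID, Road Name, Priority, Road Type) can be copied onto the AVL record, which is what the "Snapped AVL Data" box shows.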
8
Mile Point Calculation
[Figure: a road path from (x1,y1) to (x2,y2) split into consecutive distances d1, d2, d3]
Total = d1 + d2 + d3, where d = distance between two points
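The sum on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the road is given as an ordered list of vertices; the function name `path_length` and the default planar distance (`math.dist`, suitable only for projected coordinates) are assumptions, and any other point-to-point distance function can be passed in instead.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def path_length(vertices: Sequence[Point],
                dist: Callable[[Point, Point], float] = math.dist) -> float:
    """Total = d1 + d2 + ... over consecutive pairs of vertices."""
    return sum(dist(a, b) for a, b in zip(vertices, vertices[1:]))


# Three vertices give two legs: 5.0 + 6.0
print(path_length([(0, 0), (3, 4), (3, 10)]))  # 11.0
```

Measuring from the start of a route to the snapped position of a truck in this way gives a running distance along the road, which is the idea behind a mile point.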
9
Mile Point Calculation(Vincenty's formulae)
Iterative method to calculate the distance between two points on the surface of a spheroid (an ellipsoid of revolution), developed by Thaddeus Vincenty in 1975.
Given the coordinates of the two points (Φ1, L1) and (Φ2, L2), find the ellipsoidal distance s.
Iteratively evaluate the update equations (shown as an image on the slide) until λ converges: loop until the change in λ is below 10^−12, with a limit of 20 iterations.
Source :
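Since the equations themselves did not survive in this transcript, the following Python sketch shows the standard inverse form of Vincenty's method as commonly published. It is not the presenters' code. The WGS-84 ellipsoid parameters are an assumption (the slide does not name an ellipsoid), the function name `vincenty_distance` is illustrative, and the defaults `tol=1e-12` and `max_iter=20` follow one reading of the loop condition on the slide.

```python
import math

# WGS-84 ellipsoid (assumed; the slide does not name the ellipsoid).
A_AXIS = 6378137.0                  # semi-major axis, metres
FLATTENING = 1 / 298.257223563
B_AXIS = (1 - FLATTENING) * A_AXIS  # semi-minor axis, metres


def vincenty_distance(lat1, lon1, lat2, lon2, tol=1e-12, max_iter=20):
    """Ellipsoidal distance s in metres between two points given in degrees."""
    f = FLATTENING
    u1 = math.atan((1 - f) * math.tan(math.radians(lat1)))  # reduced latitudes
    u2 = math.atan((1 - f) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)

    lam = big_l  # first approximation of lambda
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0:
            return 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1 - sin_alpha ** 2
        # On an equatorial line cos_sq_alpha is 0 and this term is taken as 0.
        cos_2sm = (cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha
                   if cos_sq_alpha != 0 else 0.0)
        c = f / 16 * cos_sq_alpha * (4 + f * (4 - 3 * cos_sq_alpha))
        lam_new = big_l + (1 - c) * f * sin_alpha * (
            sigma + c * sin_sigma * (
                cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        raise ValueError("Vincenty iteration did not converge")

    u_sq = cos_sq_alpha * (A_AXIS ** 2 - B_AXIS ** 2) / B_AXIS ** 2
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (cos_2sm + big_b / 4 * (
        cos_sigma * (-1 + 2 * cos_2sm ** 2)
        - big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2)
        * (-3 + 4 * cos_2sm ** 2)))
    return B_AXIS * big_a * (sigma - delta_sigma)
```

A call such as `vincenty_distance(38.0, -84.6, 39.0, -84.6)` returns the length in metres of one degree of latitude at roughly Kentucky's position. The iteration can fail to converge for nearly antipodal points, which is why the sketch raises an error when the iteration limit is reached; that case should not arise for neighbouring road vertices.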
10
AVL Processing
84,000 AVL records: 120 trucks every 10 seconds
70,000+ road segments: 28,000 miles of state-maintained roads divided into segments of ½ mile or less
Feeds: HERE Traffic, Waze, KYTC Snow Plows, Twitter / Social Media, KYMesonet Air Temps, KYTC RWIS Pavement Temps, CoCoRaHS Precipitation
Process time: ~35 seconds (takes 40+ minutes in SQL Server/Oracle)
11
Data Feeds
HERE Traffic Sensors
Waze Alerts
Waze Jams
Twitter
KYMesonet – air temperatures, wind speed
KY RWIS – pavement temperatures
CoCoRaHS – precipitation
12
Chris.Lambert@ky.gov | http://transportation.ky.gov/realtime
13
Time-line to date
Fall/Winter 2014 – Project begins
  Implemented ESRI GeoEvent Processor.
  Stored data in SQL Server / ESRI SDE.
Summer 2015 – Reevaluate system
  Jobs to load data into SQL Server were complex, and loading was not consistent.
  Too much data for SQL Server to store, and backups were not completing on time.
  SQL Server was taking 40 min to process 80K+ records.
  Implemented Cloudera Hadoop Distribution (CDH 5.4) with ESRI GIS Tools for Hadoop.
  Store data in Hadoop.
  Process every 10 min using Hive (spatial data queries) and post summary results back to SQL Server.
  Dynamic dashboards using Solr Search.
14
Time-line to date
Winter 2015 – Converted Hive queries to Spark and moved to production
  Moved processing from Hive queries to Spark: Spark takes 35 seconds vs. 6-7 min with Hive.
  Changed the job schedule from every 10 min to every 2 min.
  More feeds added to the Hadoop environment; currently 9 feeds are loaded and processed using Spark and Sqoop.
15
Kentucky Hadoop Cluster Hardware Environment
Nodes (CDH 5.4):
  Name Nodes: 2 (“Managers”)
  Data Nodes: 5 (“Processors”)
  Gateway Servers: 2
  Cloudera Manager: 1
Storage: 60TB usable (180TB raw); data replicated 3x as failover/backup.
Hadoop Cloudera Distribution
Investment: ~$200,000 ($150K hardware, $50K software licensing with 24x7 support)
16
[Architecture diagram; labels: ArcGIS Server, GeoEvent Processor, AGOL, SDE/SQL, 2 min]
18
Snow/Ice Roadway Advisory Map
19
[Map legend: Treatment; Treatment Decay by Time and Decay Value (Last Hour, 1-2 Hours ago)]
20
Benefits and Impact
Reduce waste of materials
  Are we over-applying, especially when the temperature is very low or high (above 40°F)?
  Material dispersal calculations (salt and liquid) within 10 sec
  Route optimization (time and fuel)
  Travel time information
Increase Level of Service (LOS)
  Public map and information availability
  Automated Vehicle Location within 10 sec
  Notifications to trucks and emergency staff
21
Benefits and Impact
Analytics
  Division of Planning data processing jobs used to take 3-4 days, and had to be restarted if a technical issue arose. The same job now takes around 2 hours at most.
  Near real-time information with Solr Search dashboards
  How does treatment affect the life of pavement?
  Environmental impact: too much salt near bridges
  Are we meeting Level of Service?
  Predictive analytics and data storage for longer periods
22
Lessons Learned
Start with QuickStart VMs and free download versions.
Start with a small cluster and expand later.
Get support (a must): great online articles and knowledge base site. Create a service request and your technical issues will be addressed.
Some scripting expertise is needed (Python, Scala or Java).
In our case, it took 15 months from development to production (including selection of the Cloudera Hadoop distribution).
23
Thank You!!!