T-Share: A Large-Scale Dynamic Taxi Ridesharing Service
Shuo Ma, Yu Zheng, Ouri Wolfson
Microsoft Research Asia; University of Illinois at Chicago
This is a joint work with…
Overview: a taxi-sharing service that accepts user queries subject to capacity and time constraints, and minimizes the increase in travel distance for each query.
Background
Taxi-sharing is of great social and environmental importance:
- Serves more demand (peak hours vs. off-peak hours)
- Reduces energy consumption and air-pollutant emissions
- Could save passengers' taxi fares while increasing the income of taxi drivers
Why sharing:
- Taxis are a transportation mode between public and private transportation, providing door-to-door commuting services.
- During peak hours, however, simply increasing the number… does not work.
- Taxi-sharing can increase the capacity of the taxi system without adding new taxis.
Background: Challenges
- Dynamic
  - Dynamic queries: submitted anytime and anywhere, by lazy users
  - Dynamic taxis
- Real-time query processing
- Large-scale: millions of users and tens of thousands of taxis
Wide range of applications
- Private vehicles
- Logistics industry, for transporting goods
Value
Government
- Save 800 million liters of gasoline per year
  - Supporting 1M cars for 10 months
  - Worth about 1 billion USD
- 1.64 billion kg of CO2 emissions
Passengers
- Serving rate increased by 300%
- Save 42% of expenses on average
Taxi drivers
- Profit increased by 16% on average
Problem Definition
Query: $Q = \langle Q.o, Q.d, Q.wp, Q.wd \rangle$
- Origin and destination: $Q.o$ and $Q.d$
- Time window for pickup: $Q.wp = (Q.wp.e, Q.wp.l)$
- Time window for delivery: $Q.wd = (Q.wd.e, Q.wd.l)$
Problem: given a fixed number of taxis traveling on a road network and a stream of queries, we aim to serve each query $Q$ in the stream by dispatching the taxi that satisfies $Q$ with the minimum increase in travel distance.
Key terms: query, satisfy, problem (optimized for each query, with the minimum increase in travel distance).
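The query tuple above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class name `Query`, the integer node ids, the use of seconds for the windows, and the helper `satisfies` are all assumptions, and the capacity constraint is left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Query:
    o: int                    # origin node id (hypothetical integer id)
    d: int                    # destination node id (hypothetical integer id)
    wp: Tuple[float, float]   # pickup window (wp.e, wp.l), assumed in seconds
    wd: Tuple[float, float]   # delivery window (wd.e, wd.l), assumed in seconds


def satisfies(q: Query, pickup_time: float, dropoff_time: float) -> bool:
    """Assumed reading of 'satisfy': both events fall inside their windows
    and the pickup does not come after the drop-off."""
    return (q.wp[0] <= pickup_time <= q.wp[1]
            and q.wd[0] <= dropoff_time <= q.wd[1]
            and pickup_time <= dropoff_time)
```

For example, a query with a pickup window of (0, 300) is not satisfied by a taxi that can only pick up at time 301.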
Architecture Update (status)
Spatio-Temporal Index
- Grid-based approximation
- Select an anchor node in each grid cell
Spatio-Temporal Index
For each grid cell $g$:
- Spatially-ordered grid cell list $g.l_d$ (spatial closeness)
- Temporally-ordered grid cell list $g.l_t$ (temporal closeness)
- Taxi list $g.l_v$, sorted by arrival time
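The three per-cell lists can be sketched as follows. This is a minimal illustration under stated assumptions: `GridCell`, `build_index`, and `add_taxi` are hypothetical names, the anchor-to-anchor matrices `dist` and `time` are assumed to be precomputed, and the lists are assumed to be ordered by distance and travel time from the other cell to this one.

```python
import bisect
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GridCell:
    l_d: List[int] = field(default_factory=list)  # other cell ids, spatially closest first
    l_t: List[int] = field(default_factory=list)  # other cell ids, temporally closest first
    l_v: List[Tuple[float, str]] = field(default_factory=list)  # (arrival time, taxi id)


def build_index(dist: List[List[float]], time: List[List[float]]) -> Dict[int, GridCell]:
    """dist[j][i] and time[j][i] are anchor-to-anchor values from cell j to cell i."""
    n = len(dist)
    index = {}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        index[i] = GridCell(
            l_d=sorted(others, key=lambda j: dist[j][i]),
            l_t=sorted(others, key=lambda j: time[j][i]),
        )
    return index


def add_taxi(cell: GridCell, arrival_time: float, taxi_id: str) -> None:
    """Keep the taxi list sorted by arrival time on every insert."""
    bisect.insort(cell.l_v, (arrival_time, taxi_id))
```

Keeping `l_v` sorted on insert means the earliest-arriving taxis can be read off the front of the list without a full scan.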
Taxi Searching Update (status)
Taxi Searching: Single-Side Taxi Search
- Example: $Q.o$ is located in $g_7$
- Condition: $t_{i7} + t_{cur} \le Q.wp.l$
- Merge taxi lists
- Grid cells labeled in the figure: $g_3$, $g_5$, $g_9$
Problem
- Many candidate taxis
- The scheduling process is heavy
Dual-Side Taxi Searching
- Origin side: $Q.o$ in $g_7$, with $t_{i7} + t_{cur} \le Q.wp.l$
- Destination side: $Q.d$ in $g_2$, with $t_{cur} + t_{j2} \le Q.wd.l$
- Grid cells labeled in the figure: $g_1$, $g_2$, $g_3$, $g_5$, $g_6$, $g_7$, $g_9$
Reduce candidate taxis, allow first fit search
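The two-sided filter can be sketched as a set intersection. This is a deliberate simplification and an assumption, not the slides' algorithm: it builds both candidate sets in full and intersects them, whereas an incremental search could stop early. The names `cells_within`, `taxis_in`, `dual_side_candidates`, and the `taxis_by_cell` mapping (cell id to the ids of taxis scheduled to pass through it) are hypothetical.

```python
from typing import Dict, List, Set


def cells_within(target: int, travel_time: List[List[float]],
                 t_cur: float, late_bound: float) -> List[int]:
    """Cells g_i with t_cur + t_(i,target) <= late_bound (the target cell included)."""
    return [i for i in range(len(travel_time))
            if t_cur + travel_time[i][target] <= late_bound]


def taxis_in(cells: List[int], taxis_by_cell: Dict[int, Set[str]]) -> Set[str]:
    """Merge the taxi lists of the selected cells."""
    found: Set[str] = set()
    for c in cells:
        found |= taxis_by_cell.get(c, set())
    return found


def dual_side_candidates(o_cell: int, d_cell: int,
                         travel_time: List[List[float]],
                         taxis_by_cell: Dict[int, Set[str]],
                         t_cur: float, wp_late: float, wd_late: float) -> Set[str]:
    origin_side = taxis_in(cells_within(o_cell, travel_time, t_cur, wp_late), taxis_by_cell)
    dest_side = taxis_in(cells_within(d_cell, travel_time, t_cur, wd_late), taxis_by_cell)
    # Only taxis found on both sides go on to the scheduling step.
    return origin_side & dest_side
```

The intersection is what shrinks the candidate set: a taxi near the origin but with no planned presence near the destination is dropped before any schedule is computed.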
Scheduling Module Calculate schedule for each candidate taxi
Scheduling Module
- Feasibility check in two steps: first insert $Q.o$, then $Q.d$
- Do not change the order of an existing schedule
- Minimize the increase of travel distance
Given a schedule $V.s$ composed of $n$ points:
- $n+1$ positions to insert $Q.o$
- $n-i+1$ positions to insert $Q.d$, where $i$ is the position at which $Q.o$ was inserted
- $O(n^2)$ possible ways of insertion
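The insertion count can be checked with a short enumeration. This is a sketch under assumptions: `insertions`, `route_length`, and `best_insertion` are hypothetical names, `dist` is any caller-supplied distance function, and the feasibility check on time windows is omitted here.

```python
from typing import Callable, Iterator, List, Sequence


def insertions(schedule: Sequence, o, d) -> Iterator[List]:
    """Yield every schedule that adds o and then d without reordering existing points."""
    n = len(schedule)
    for i in range(n + 1):          # n + 1 positions for the origin
        for j in range(i, n + 1):   # n - i + 1 positions for the destination
            yield list(schedule[:i]) + [o] + list(schedule[i:j]) + [d] + list(schedule[j:])


def route_length(points: Sequence, dist: Callable) -> float:
    """Total length of a route that visits the points in order."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


def best_insertion(schedule: Sequence, o, d, dist: Callable) -> List:
    """Pick the candidate schedule with the smallest total length."""
    return min(insertions(schedule, o, d), key=lambda s: route_length(s, dist))
```

Summing $n-i+1$ over $i = 0, \dots, n$ gives $(n+1)(n+2)/2$ candidates, which is the $O(n^2)$ figure above; a schedule of two points yields six.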
Scheduling Module: Feasibility Check (using $Q.o$ as an example)
- Delay from inserting $Q.o$ between $Q_2.o$ and $Q_1.d$: $t_d = (Q_2.o \to Q.o) + (Q.o \to Q_1.d) + t_w - (Q_2.o \to Q_1.d)$
- $t_w$: the time spent waiting for the passenger
- $(Q.o)_{st} = Q.wp.l - a_p$
- $(Q.d)_{st} = Q.wd.l - a_d$
- If $t_d \ge \min\{(Q_1.d)_{st}, (Q_2.d)_{st}\}$, the insertion fails
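The check above can be sketched in a few lines. The function names are hypothetical, and two readings are assumptions: that the `st` values are slack times (late bound minus planned arrival time), and that `a_p` and `a_d` are the planned arrival times at a pickup and a delivery point. The failure condition follows the slide as written, so a delay equal to the smallest slack is rejected.

```python
from typing import List, Sequence


def insertion_delay(t_prev_to_new: float, t_new_to_next: float,
                    t_prev_to_next: float, t_wait: float = 0.0) -> float:
    """t_d: extra time caused by detouring through the new point."""
    return t_prev_to_new + t_new_to_next + t_wait - t_prev_to_next


def slack_times(late_bounds: Sequence[float], arrivals: Sequence[float]) -> List[float]:
    """Assumed slack per scheduled point: late bound minus planned arrival."""
    return [late - arrival for late, arrival in zip(late_bounds, arrivals)]


def insertion_feasible(t_d: float, slacks_after: Sequence[float]) -> bool:
    """Fail when t_d >= the minimum slack of the points after the insertion."""
    return not slacks_after or t_d < min(slacks_after)
```

Because only the minimum slack matters, the test is a single comparison per insertion position once the slack values are known.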
Scheduling Module: Lazy Shortest Path Calculation
Find a lower bound of the travel time between two points $O$ and $D$: $t_{OD} \ge t_{ij} - (c_i \to O) - (D \to c_j)$
Derivation:
1. $(c_i \to O) + t_{OD} \ge (c_i \to D)$
2. $(c_i \to D) + (D \to c_j) \ge t_{ij}$, so $(c_i \to D) \ge t_{ij} - (D \to c_j)$
3. $(c_i \to O) + t_{OD} \ge t_{ij} - (D \to c_j)$
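The bound can be used to skip exact shortest-path calls, roughly as sketched below. This assumes that $c_i$ and $c_j$ are the anchor nodes of the cells containing $O$ and $D$ and that $t_{ij}$ is a precomputed anchor-to-anchor travel time; `lower_bound`, `can_reach_in_time`, and the `exact_time` callback are hypothetical names.

```python
from typing import Callable


def lower_bound(t_ij: float, t_ci_to_O: float, t_D_to_cj: float) -> float:
    """t_OD >= t_ij - (c_i -> O) - (D -> c_j), clamped at zero."""
    return max(0.0, t_ij - t_ci_to_O - t_D_to_cj)


def can_reach_in_time(depart: float, deadline: float,
                      t_ij: float, t_ci_to_O: float, t_D_to_cj: float,
                      exact_time: Callable[[], float]) -> bool:
    """Call the expensive exact computation only when the bound cannot rule it out."""
    if depart + lower_bound(t_ij, t_ci_to_O, t_D_to_cj) > deadline:
        return False  # even the optimistic estimate misses the deadline
    return depart + exact_time() <= deadline
```

Passing the exact computation as a callback keeps it lazy: when the lower bound already violates the deadline, the shortest-path routine is never invoked.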
Pricing Scheme
- The taxi fare per mile is higher for multiple passengers than for a single passenger
- The taxi fare of shared distances is evenly split among the riding passengers
$Fare = p \left( d_1 + \sum_{m=2}^{c} \frac{(\alpha + 1) \cdot d_m}{m} \right)$
$Total\_Profit = p \left( D_n + (1 + \alpha) \cdot D_r \right)$
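A small numeric sketch of the two formulas follows. The symbol meanings are assumptions, since the slide does not define them: $p$ is taken as the base price per unit distance, $\alpha$ as the surcharge rate for shared rides, $d_m$ as the distance one passenger rides with $m$ passengers aboard, $c$ as the maximum occupancy, and $D_n$ and $D_r$ as the taxi's non-shared and shared distances. The function names are hypothetical.

```python
from typing import Dict


def passenger_fare(p: float, alpha: float, d: Dict[int, float]) -> float:
    """d[m] is the distance this passenger rode with m passengers aboard."""
    shared = sum((alpha + 1) * dist / m for m, dist in d.items() if m >= 2)
    return p * (d.get(1, 0.0) + shared)


def total_profit(p: float, alpha: float, D_n: float, D_r: float) -> float:
    """Driver-side total: base rate on non-shared distance, raised rate on shared."""
    return p * (D_n + (1 + alpha) * D_r)


# One passenger rides 3 units alone and 4 units with one co-rider.
fare = passenger_fare(p=2.0, alpha=0.5, d={1: 3.0, 2: 4.0})
```

Under these assumptions the two formulas agree on the shared leg: two passengers sharing 4 units each pay `2.0 * 1.5 * 4 / 2 = 6.0`, which sums to the `p * (1 + alpha) * D_r = 12.0` the driver collects for that leg.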
Evaluation Settings
- A trajectory dataset generated by over 33,000 taxis in Beijing over 3 months
- Built an experimental platform based on the data
Big data
- 400 million kilometres
- 790 million points
- 20 million trips (46% occupied)
Evaluation: Experimental Platform
- Learn the distribution of queries on the road network over the time of day from the data
- Assume the arrival of queries follows a Poisson distribution
- Learn the transition probability between different road segments
- Figure labels: number of queries on $r_i$; road segments $r_1$, $r_2$; probabilities $p_{i1}$, $p_{i2}$
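The Poisson assumption can be illustrated by drawing exponential inter-arrival gaps. This is only a sketch: a single constant rate for the whole network is an assumption made for brevity, whereas the platform described above learns the query distribution per road segment and time of day. `poisson_arrivals` is a hypothetical name.

```python
import random
from typing import List


def poisson_arrivals(rate_per_sec: float, start: float, end: float,
                     rng: random.Random) -> List[float]:
    """Arrival times of a Poisson process on [start, end), built from exponential gaps."""
    t = start
    arrivals = []
    while True:
        t += rng.expovariate(rate_per_sec)
        if t >= end:
            return arrivals
        arrivals.append(t)


# 15 queries per second over 1,800 seconds gives about 27,000 queries on average.
queries = poisson_arrivals(15.0, 0.0, 1800.0, random.Random(0))
```

The count varies from run to run around the expected value; seeding the generator makes a simulation reproducible.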
Settings of experimental platform
Definition: Value
- The start time of simulation: 9 am
- The end time of simulation: 9:30 am
- The number of taxis: 2,980
- The pickup window size: 5 minutes
- The length of a time bin: (no value given)
- The number of time bins in a frame: 12
- Number of queries: 27,000
Evaluation: Baselines
- No ridesharing
- Single-side and First-fit Ridesharing (SF)
- Single-side and Best-fit Ridesharing (SB)
- Dual-side and First-fit Ridesharing (DF)
- Dual-side and Best-fit Ridesharing (DB)
Results Effectiveness
Results Efficiency
Conclusion
- Win-win-win scenario
- Candidate taxi selection based on a spatio-temporal index
  - Dual-side search saves 50% of the computational load
  - Has similar effectiveness to single-side search
- Taxi scheduling based on feasibility checks
  - Lazy shortest path computation saves 83% of the computational load
- Serves 720k queries per hour on a single machine
Future work
- Consider more constraints, such as monetary constraints
- Dynamic time estimation
- Other factors, such as social trust and credit
Thanks! Yu Zheng yuzheng@microsoft.com Homepage