ML Approaches – Conceptual Stuff Nitin Kohli DS W210 – Capstone Project.


1 ML Approaches – Conceptual Stuff Nitin Kohli DS W210 – Capstone Project

2 Sequence Matching – Marathon Runner Analogy
Imagine you are watching a runner run a marathon. During the marathon, the runner reaches various checkpoints and their time is recorded at each one. For instance, if there are 26 checkpoints in the race and we know the runner’s time at the first 3 checkpoints, we can use this information to estimate the time at the 4th checkpoint, the 5th checkpoint, and so on. In general, we can infer the time that particular runner needs to complete the remainder of the race.

3 Sequence Matching
We apply this analogy to trains within the BART system. A train departs a “starting” station at a particular time and checks in at checkpoints (train stops) along the way. This gives us a partial sequence of arrival times for the train. To deduce the remaining times, we match the incomplete sequence against complete historical sequences and use the matches to predict the next arrival time.
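
The matching step above can be sketched as a nearest-neighbour lookup. This is a minimal sketch, not the project's actual code: the function name `predict_next_arrival`, the squared-distance measure, the averaging of the `k` closest runs, and the sample history are all assumptions made for illustration.

```python
from statistics import mean


def predict_next_arrival(partial, history, k=3):
    """Predict the next checkpoint time from the k closest historical runs.

    partial: minutes since departure at the stops observed so far.
    history: complete historical runs, each a list of minutes since departure.
    """
    n = len(partial)
    # Only runs that extend past the observed prefix can supply a next value.
    candidates = [run for run in history if len(run) > n]
    if not candidates:
        raise ValueError("no historical run is longer than the partial sequence")

    def distance(run):
        # Squared Euclidean distance over the observed prefix (an assumption).
        return sum((a - b) ** 2 for a, b in zip(partial, run))

    nearest = sorted(candidates, key=distance)[:k]
    # Average the next-stop time across the nearest runs.
    return mean(run[n] for run in nearest)


# Made-up historical runs, in minutes since departure.
history = [
    [0, 4, 9, 13, 18],
    [0, 5, 10, 15, 20],
    [0, 4, 8, 12, 16],
    [0, 7, 14, 21, 28],
]
print(predict_next_arrival([0, 4, 9], history, k=2))  # 12.5
```

With `k=2` the two closest runs are the first and third, so the prediction is the average of 13 and 12.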

4 Lag Time Analysis
Unlike the marathon runner in the previous analogy, a train does not continue immediately once it arrives at a station. It pauses to allow passengers to get on and off before continuing. Thus, we need to supplement the arrival time from the sequence matching by accounting for the lag time at a given station. This is done using a Ridge Regression with features such as (but not limited to):
- Length of the train
- Which stop the train is at
- Time of the arrival (estimated)
- Whether the arrival is in the AM or PM
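
As a rough illustration of the lag model, the sketch below fits a ridge regression by solving the regularized normal equations in plain Python. The helper names (`solve`, `fit_ridge`, `predict_lag`), the choice to leave the intercept unpenalized, the feature encoding, and the training rows are assumptions for illustration; the slides do not say how the project's model was implemented or tuned.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ridge(rows, targets, alpha=1.0):
    """Return weights minimizing ||Xw - y||^2 + alpha * ||w[1:]||^2."""
    xs = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    d = len(xs[0])
    # Normal equations: (X^T X + alpha * D) w = X^T y, intercept unpenalized.
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(d)] for i in range(d)]
    for i in range(1, d):
        xtx[i][i] += alpha
    xty = [sum(x[i] * y for x, y in zip(xs, targets)) for i in range(d)]
    return solve(xtx, xty)


def predict_lag(weights, row):
    """Apply fitted weights (intercept first) to one feature row."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


# Made-up rows: (cars, stop index, arrival hour, is_pm) -> lag in seconds.
rows = [(10, 1, 8, 0), (8, 2, 9, 0), (10, 3, 17, 1), (6, 4, 18, 1), (9, 2, 13, 1)]
lags = [45.0, 38.0, 52.0, 30.0, 40.0]
weights = fit_ridge(rows, lags, alpha=1.0)
print(round(predict_lag(weights, (9, 3, 17, 1)), 1))
```

The penalty term `alpha` shrinks the feature weights toward zero, which helps when features such as arrival hour and the AM/PM flag are strongly correlated.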

5 Summary: Sequence Matching -> Lag Time Prediction -> Updated Departure Time -> Repeat
[Diagram labels: Sequence Matching Model, Lag Model]

6 Tech Summary of System Level Prediction
1. The user enters various information.
2. We first need to tell the user when the train will arrive at the station selected for departure.
- This means we need to query the MySQL db to find the most recent trains heading in the direction of the user.
- Then, we use the previous train stops as input to perform a sequence match.
- Once we have a sequence match, we can predict from the matched sequences.
- This gives us a predicted arrival time at the next stop.
- But at each stop, the train will wait some time before departing from that station.
- This is where the lag_times model comes in: the current stop, length of the train, etc. are used to predict how long the train will wait at a given station.
3. Repeat this process until we have predictions for both the departure station and the arrival station.
4. Output these values back to the user in the UI.
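
The repeat loop in steps 2 and 3 might look roughly like the sketch below. It assumes the two models are passed in as plain callables and that the database query has already produced the observed arrival times; `predict_trip`, `next_arrival`, and `lag_at` are hypothetical names, and how the project actually feeds the updated departure time back into the next match is not specified in the slides.

```python
def predict_trip(observed, stops_ahead, next_arrival, lag_at):
    """Alternate arrival and lag predictions for `stops_ahead` further stops.

    observed: arrival times (minutes) at the stops already visited.
    next_arrival: callable(list_of_times) -> predicted next arrival time.
    lag_at: callable(stop_index) -> predicted wait at that stop, in minutes.
    Returns a list of (stop_index, arrival, departure) tuples.
    """
    times = list(observed)
    schedule = []
    for _ in range(stops_ahead):
        arrival = next_arrival(times)       # sequence-matching step
        stop = len(times)                   # 0-based index of the new stop
        departure = arrival + lag_at(stop)  # lag model step
        schedule.append((stop, arrival, departure))
        times.append(arrival)               # feed the prediction back in
    return schedule


# Stand-in models: a fixed 4-minute hop and a fixed half-minute wait.
plan = predict_trip([0, 4, 9], 2, lambda t: t[-1] + 4, lambda s: 0.5)
print(plan)  # [(3, 13, 13.5), (4, 17, 17.5)]
```

Passing the models in as callables keeps the loop independent of how each model is trained or stored.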

7 ML Approaches – “Mathy” Stuff Nitin Kohli DS W210 – Capstone Project

8 The following slides contain (for the most part) all the math used to construct the system level prediction. They include the first model, which was iterated on to get the more accurate second model.

9 Conceptual Framework: k-Nearest Sequences
In the picture on the right, note that there are 5 distinct paths. Within each path, trains can run in either direction, so there are 10 directional paths. For each directional path, we denote the stops using {1,2,…,n}. For example, on the orange line from Richmond to Fremont:
- 1 refers to Richmond
- 2 refers to El Cerrito del Norte
- 3 refers to El Cerrito Plaza, etc.
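
One way to hold this indexing in code is a dictionary keyed by line and direction. This is only a sketch: `directional_paths` and the "forward"/"reverse" labels are hypothetical, and the stop list contains just the three stops named on the slide rather than the full line.

```python
def directional_paths(lines):
    """Map each (line, direction) pair to a {stop_name: position} dict."""
    paths = {}
    for name, stops in lines.items():
        # Positions start at 1, matching the {1,2,...,n} convention.
        forward = {stop: i for i, stop in enumerate(stops, start=1)}
        backward = {stop: i for i, stop in enumerate(reversed(stops), start=1)}
        paths[(name, "forward")] = forward
        paths[(name, "reverse")] = backward
    return paths


# Only the three stops named on the slide; the real line continues to Fremont.
lines = {"orange": ["Richmond", "El Cerrito del Norte", "El Cerrito Plaza"]}
paths = directional_paths(lines)
print(paths[("orange", "forward")]["El Cerrito del Norte"])  # 2
```

Each physical line yields two entries, which is how 5 paths become 10 directional paths.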

10 Conceptual Framework Continued … … …

11 … … … … … … …

12 Approach 1: Complete the whole sequence

13 Approach 1: Empirical Results
(unlabeled)  0.126  0.159  0.161  0.16   0.158
Predicted    6:32   6:40   6:42   6:44   6:45
Actuals      6:32   6:40   6:41   6:43   6:44
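
For a quick read on these five predictions, the gap between predicted and actual times can be computed directly. The slide does not say which error metric the project used, so mean absolute error here is an assumption, and `to_minutes` is a helper introduced only for this example.

```python
def to_minutes(clock):
    """Convert an 'H:MM' string to minutes past midnight."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


predicted = ["6:32", "6:40", "6:42", "6:44", "6:45"]
actuals = ["6:32", "6:40", "6:41", "6:43", "6:44"]

# Signed error in minutes at each stop (positive means predicted too late).
errors = [to_minutes(p) - to_minutes(a) for p, a in zip(predicted, actuals)]
mae = sum(abs(e) for e in errors) / len(errors)
print(errors)  # [0, 0, 1, 1, 1]
print(mae)     # 0.6
```

The first two stops are predicted exactly, and the remaining three are each one minute late, which fits the idea that errors accumulate when a whole sequence is completed in one pass.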

14 Approach 2: A Dynamic Probabilistic System

15

16 Solution: Invoke the Weak Law of Large Numbers
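
The derivation on this slide is not captured in the transcript. For reference, the standard statement of the weak law is given below, assuming independent, identically distributed random variables $X_1, X_2, \dots$ with finite mean $\mu$; how the project maps these onto matched sequences is not stated here.

```latex
\[
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\lim_{n\to\infty} \Pr\left( \left| \bar{X}_n - \mu \right| > \varepsilon \right) = 0
\quad \text{for every } \varepsilon > 0 .
\]
```

In words, the sample average of enough observations is close to the true mean with high probability.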

17 Dynamic Probabilistic System Algorithm

