Capstone Project

NYC Taxi DataSet
The data is stored in CSV format, organized by year and month. In each file, each row represents a single taxi trip. Table 1 below gives a small sample of this data. There are several entries per second for four years. The raw trip data takes up about 116 GB in text CSV format.
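Because the raw files are far larger than memory, one reasonable way to work with them is to stream each monthly file row by row. The sketch below is only an illustration: the column names (`pickup_datetime` and so on) and the sample rows are assumptions, not taken from the real files.

```python
import csv
import io
from collections import Counter


def count_trips_per_day(lines):
    """Stream trip rows and count trips per pickup date.

    `lines` is any iterable of CSV lines (an open file works), so the
    whole file never has to be held in memory.
    """
    reader = csv.DictReader(lines)
    counts = Counter()
    for row in reader:
        # Assumed column name; the date is the part before the first space.
        day = row["pickup_datetime"].split(" ")[0]
        counts[day] += 1
    return counts


# Tiny in-memory stand-in for one monthly file (all values are made up).
sample = io.StringIO(
    "medallion,hack_license,pickup_datetime,trip_distance\n"
    "A1,D1,01-01-2013 00:05:00,1.2\n"
    "A2,D2,01-01-2013 00:07:30,3.4\n"
    "A1,D1,01-02-2013 09:15:00,0.8\n"
)
daily = count_trips_per_day(sample)
print(daily)
```

For a real file, the same function could be called as `count_trips_per_day(open(path, newline=""))`, where `path` points at one of the monthly CSV files.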

The data is organized as follows:
Medallion: car ID.
Hack license: driver ID.
Vendor id.
Rate_code: taximeter rate.
Store_and_fwd_flag: unknown attribute.
Pickup datetime: start time of the trip, mm-dd-yyyy hh24:mm:ss EDT.
Dropoff datetime: end time of the trip, mm-dd-yyyy hh24:mm:ss EDT.
Passenger count: number of passengers on the trip; the default value is one.
Trip time in secs: trip time measured by the taximeter, in seconds.
Trip distance: trip distance measured by the taximeter, in miles.
Pickup longitude and pickup latitude: GPS coordinates at the start of the trip.
Dropoff longitude and dropoff latitude: GPS coordinates at the end of the trip.
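The trip attributes above map naturally onto a typed record. The following is a minimal sketch, assuming snake_case column names and reading the timestamp layout "mm-dd-yyyy hh24:mm:ss" as month-day-year followed by a 24-hour time. Both the names and the sample row are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed interpretation of the "mm-dd-yyyy hh24:mm:ss" layout.
TIME_FORMAT = "%m-%d-%Y %H:%M:%S"


@dataclass
class Trip:
    medallion: str
    hack_license: str
    pickup_datetime: datetime
    dropoff_datetime: datetime
    passenger_count: int
    trip_time_in_secs: int
    trip_distance: float  # miles, as measured by the taximeter

    def average_speed_mph(self):
        """Average speed in miles per hour, or None for a zero-length trip time."""
        if self.trip_time_in_secs <= 0:
            return None
        return self.trip_distance * 3600 / self.trip_time_in_secs


def parse_trip(row):
    """Convert one CSV row (a dict of strings) into a Trip."""
    return Trip(
        medallion=row["medallion"],
        hack_license=row["hack_license"],
        pickup_datetime=datetime.strptime(row["pickup_datetime"], TIME_FORMAT),
        dropoff_datetime=datetime.strptime(row["dropoff_datetime"], TIME_FORMAT),
        # The slides give one passenger as the default value.
        passenger_count=int(row["passenger_count"] or 1),
        trip_time_in_secs=int(row["trip_time_in_secs"]),
        trip_distance=float(row["trip_distance"]),
    )


# Made-up row for illustration.
sample_row = {
    "medallion": "A1",
    "hack_license": "D1",
    "pickup_datetime": "01-01-2013 13:05:00",
    "dropoff_datetime": "01-01-2013 13:15:00",
    "passenger_count": "",
    "trip_time_in_secs": "600",
    "trip_distance": "2.0",
}
trip = parse_trip(sample_row)
print(trip.average_speed_mph())
```

Deriving a quantity such as average speed from the taximeter fields is one simple way to spot implausible records.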

Fare data is also available from [...]. A sample of the fare data is shown in Table 2 below. This dataset contains the following attributes:
Medallion: car ID.
Hack license: driver ID.
Vendor id.
Pickup datetime: start time of the trip, mm-dd-yyyy hh24:mm:ss EDT.
Fare amount: the meter fare, in USD. It should include the Newark surcharge.
Surcharge: extra fees, such as rush hour and overnight surcharges, in USD.
Mta tax: Metropolitan commuter transportation mobility tax, in USD.
Tip amount: tip amount, in USD.
Tolls amount: total price paid for tolls, summed across all tolls for the trip, in USD.
Total amount: all charges that are presented to the passenger at time of fare payment (includes tip for non-cash trips), in USD.
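The fare records share their first four attributes with the trip records, so the two data sets can plausibly be joined on them. The sketch below assumes that key, assumes snake_case column names, and assumes that the total amount equals the sum of the five listed components. None of this is confirmed by the slides, and the rows are made up.

```python
from decimal import Decimal

# Assumed join key: the attributes listed for both trip and fare data.
KEY = ("medallion", "hack_license", "vendor_id", "pickup_datetime")
COMPONENTS = ("fare_amount", "surcharge", "mta_tax", "tip_amount", "tolls_amount")


def record_key(row):
    """Build the join key for a trip or fare row."""
    return tuple(row[name] for name in KEY)


def join_trips_with_fares(trips, fares):
    """Attach the matching fare row to each trip; unmatched trips are dropped."""
    fares_by_key = {record_key(fare): fare for fare in fares}
    joined = []
    for trip in trips:
        fare = fares_by_key.get(record_key(trip))
        if fare is not None:
            joined.append({**trip, **fare})
    return joined


def total_is_consistent(fare):
    """Check whether total_amount equals the sum of its components (assumption)."""
    expected = sum((Decimal(fare[name]) for name in COMPONENTS), Decimal("0"))
    return expected == Decimal(fare["total_amount"])


base = {"medallion": "A1", "hack_license": "D1", "vendor_id": "V1",
        "pickup_datetime": "01-01-2013 13:05:00"}
trips = [{**base, "trip_distance": "2.0"}]
fares = [
    {**base, "fare_amount": "6.50", "surcharge": "0.50", "mta_tax": "0.50",
     "tip_amount": "1.50", "tolls_amount": "0.00", "total_amount": "9.00"},
    {**base, "medallion": "A2", "fare_amount": "6.50", "surcharge": "0.50",
     "mta_tax": "0.50", "tip_amount": "1.50", "tolls_amount": "0.00",
     "total_amount": "8.00"},
]
joined = join_trips_with_fares(trips, fares)
print(len(joined), [total_is_consistent(f) for f in fares])
```

`Decimal` is used instead of `float` so that sums of cent amounts compare exactly.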


Trajectory Data Query Model
Existing query models for trajectory data focus on searching for trajectories or trips with respect to a given range or point, e.g. “find all objects within a given area (or at a given point) sometime during a given time interval” or “find the k-closest objects with respect to a given point at a given time interval”.

The coordinate-based queries:
Point Queries: e.g. find the location of a specific object between 1:00pm and 1:30pm.
Region Queries: e.g. find all trajectories or trips that passed through region R between 1:00pm and 1:30pm.
K-Nearest Neighbor Queries: e.g. find all trajectories or trips within 500m of a gas station between 1:00pm and 1:30pm.
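As a rough illustration of coordinate-based queries, the sketch below runs a region query and a k-nearest-neighbor query over pickup points with a linear scan. The tuple layout, the bounding-box region, and the sample points are assumptions. A real system would more likely use a spatial index.

```python
import heapq
import math
from datetime import datetime


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def in_window(t, start, end):
    return start <= t <= end


def region_query(pickups, bbox, start, end):
    """IDs of pickups inside bbox = (min_lon, min_lat, max_lon, max_lat) during the window."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        pid for pid, t, lon, lat in pickups
        if in_window(t, start, end)
        and min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]


def knn_query(pickups, point, k, start, end):
    """IDs of the k pickups closest to point = (lon, lat) during the window."""
    candidates = [p for p in pickups if in_window(p[1], start, end)]
    nearest = heapq.nsmallest(
        k, candidates, key=lambda p: haversine_m(point[0], point[1], p[2], p[3])
    )
    return [p[0] for p in nearest]


# Made-up pickups: (id, time, longitude, latitude).
pickups = [
    ("p1", datetime(2013, 1, 1, 13, 5), -73.9857, 40.7582),
    ("p2", datetime(2013, 1, 1, 13, 20), -73.9800, 40.7600),
    ("p3", datetime(2013, 1, 1, 13, 10), -73.9000, 40.7000),
    ("p4", datetime(2013, 1, 1, 15, 0), -73.9856, 40.7581),  # outside the time window
]
start, end = datetime(2013, 1, 1, 13, 0), datetime(2013, 1, 1, 13, 30)
in_region = region_query(pickups, (-73.99, 40.75, -73.97, 40.77), start, end)
closest = knn_query(pickups, (-73.9855, 40.7580), 2, start, end)
print(in_region, closest)
```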

The trajectory-based queries:
Topological Queries: e.g. “When did vehicle X enter street Y most recently?”
Navigational Queries: e.g. “What is the current speed of vehicle X?”
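Trajectory-based queries need an ordered sequence of positions per vehicle rather than isolated points. The sketch below is a simplified illustration under several assumptions: a track is a time-sorted list of (seconds, longitude, latitude) samples, "current speed" is taken between the two latest samples, and a bounding box stands in for street Y, which in practice would require map matching.

```python
import math


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def current_speed_mps(track):
    """Navigational query: speed in m/s between the two most recent samples."""
    if len(track) < 2:
        return None
    (t0, lon0, lat0), (t1, lon1, lat1) = track[-2], track[-1]
    if t1 <= t0:
        return None
    return haversine_m(lon0, lat0, lon1, lat1) / (t1 - t0)


def last_entry_time(track, inside):
    """Topological query: time of the most recent outside-to-inside transition."""
    entry = None
    for prev, cur in zip(track, track[1:]):
        if not inside(prev[1], prev[2]) and inside(cur[1], cur[2]):
            entry = cur[0]
    return entry


def in_box(lon, lat):
    """Hypothetical stand-in for 'street Y': a small bounding box."""
    return -73.99 <= lon <= -73.98 and 40.75 <= lat <= 40.76


# Made-up track for vehicle X: (seconds, longitude, latitude).
track = [
    (0, -74.000, 40.700),
    (10, -73.985, 40.758),
    (20, -74.000, 40.700),
    (30, -73.985, 40.758),
    (40, -73.985, 40.759),
]
speed = current_speed_mps(track)
entered_at = last_entry_time(track, in_box)
print(speed, entered_at)
```

Note that the taxi data described earlier records only pickup and dropoff positions, so a dense track like this one would have to come from another source.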

A Study of New York City Taxi Trips
From: Visual Exploration of Big Spatio-Temporal Urban Data: A Study of New York City Taxi Trips. Nivan Ferreira, Jorge Poco, Huy T. Vo, Juliana Freire, and Claudio T. Silva.

For NYC DataSet: 2013 : – 2013: NYC TaxiVis Paper:

Questions