## Context

This dataset was created to help Kaggle users in the [New York City Taxi Trip Duration][1] competition. New features were generated using the [Wolfram Mathematica][2] system. Hopefully, this dataset will help both novice and experienced researchers on their path to mastering data. All sources can be found [here][3].

## Content

The dataset combines features from the initial competition data with features generated via the Wolfram Mathematica computational system. The features fall into the following groups:

- Initial features (extracted from the initial data),
- Calendar features (season, day name, and day period),
- Weather features (information about temperature, snow, and rain),
- Travel features (geo distance, plus estimated driving distance and time).

The dataset contains the following columns:

- **id** - a unique identifier for each trip,
- **vendorId** - a code indicating the provider associated with the trip record,
- **passengerCount** - the number of passengers in the vehicle (driver-entered value),
- **year**,
- **month**,
- **day**,
- **hour**,
- **minute**,
- **second**,
- **season**,
- **dayName**,
- **dayPeriod** - day period, e.g. late night, morning, etc.,
- **temperature**,
- **rain**,
- **snow**,
- **startLatitude**,
- **startLongitude**,
- **endLatitude**,
- **endLongitude**,
- **flag** - indicates whether the trip record was held in vehicle memory before being sent to the vendor because the vehicle did not have a connection to the server (Y = store and forward; N = not a store and forward trip),
- **drivingDistance** - driving distance, estimated via the Wolfram Mathematica system,
- **drivingTime** - driving time, estimated via the Wolfram Mathematica system,
- **geoDistance** - distance between the starting and ending points,
- **tripDuration** - duration of the trip in seconds (a value of -1 indicates test rows).
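
The `-1` marker in **tripDuration** suggests one way to separate training rows from test rows. Below is a minimal sketch using only the Python standard library. It assumes the data is available as CSV with the column names listed above; the sample rows, the reduced column set, and the `split_train_test` helper are made up for illustration and are not part of the dataset or its sources.

```python
import csv
import io

# Hypothetical sample: column names follow the description above,
# but the rows and the reduced column set are invented for illustration.
SAMPLE_CSV = """id,vendorId,passengerCount,geoDistance,tripDuration
a1,1,1,1.5,455
a2,2,3,2.7,-1
a3,1,2,0.9,663
"""


def split_train_test(rows):
    """Split rows into (train, test); test rows carry tripDuration == -1."""
    train, test = [], []
    for row in rows:
        # Convert via float() so both "-1" and "-1.0" match the marker.
        if float(row["tripDuration"]) == -1:
            test.append(row)
        else:
            train.append(row)
    return train, test


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
train_rows, test_rows = split_train_test(rows)
print(len(train_rows), len(test_rows))  # prints "2 1"
```

To try this on the real data, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the downloaded file; the actual file name is not stated in this description.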

[1]: https://www.kaggle.com/c/nyc-taxi-trip-duration
[2]: http://www.wolfram.com/mathematica/
[3]: https://github.com/wol4aravio/Kaggle.Mathematica.NewYorkCityTaxiTripDuration