A convergence of research developments is driving the ongoing wave of Data Science. Unimaginable volumes of data are generated every day by a plethora of logistics providers.
Most navigation applications today offer an Expected Time of Arrival (ETA) as an appealing add-on feature. Many on-demand logistics companies like Uber, Ola, and Jugnoo display their predicted ETA for an enhanced customer experience. Predicting an accurate estimated time of arrival is also imperative for logistics providers that want to differentiate themselves from the also-rans.
Diversification of ETA prediction
With 90% of the world’s logistics traffic moving by ocean, vessel arrival certainty is crucial for integrated decision making and overall supply chain performance. Because a wide range of operations and resources is involved in a container terminal, predicting the precise arrival of an ocean vessel is essential for effective cross-functional productivity. Critical features for strengthening the prediction include sensor-generated feeds, social media, previous port delays, live weather trends and forecasts, trade-lane congestion, port activity, vessel dwell times and many more. The benefits are manifold: waiting time is shortened and the owner’s cost is reduced through asset optimisation.
Another less common application is ETA prediction for aircraft. Predicting aircraft arrival while en route is a challenging endeavor. With factors ranging from environmental conditions to flight level and cruise airspeed, accurately predicting a flight’s punctuality becomes increasingly onerous. A few factors that also affect arrival time, such as Air Traffic Control (ATC) reroutes, congestion and holding patterns, are outside the operator’s control. Feature generation, such as the average winds encountered by flights during a particular month, and variable selection both contribute to prediction accuracy. Aircraft ETA predictions also rely on aircraft performance models and trajectory models. An enhanced ETA is used to streamline operations across various facets of the airline’s blocks.
Among routing engine service providers, Google Maps provides a comparatively precise ETA while crunching huge amounts of data. Former Google engineer Richard Russell explained on Quora how Google Maps determines its ETA: “Like in similar products, Google maps ETAs are based on a variety of things, depending on the data available in a particular area. These things range from official speed limits and recommended speeds, likely speeds derived from road types, historical average speed data over certain time periods (sometimes just averages, sometimes at particular times of day), actual travel times from previous users, and real-time traffic information. They mix data from whichever sources they have, and come up with the best prediction they can make.”
Engineering features for Machine Learning
There are numerous supply chain cases where relying on the Google Maps ETA alone would not be a good fit. Delivery time relies on more than just the length of the drop-off leg (pickup location to drop location); deliveries that involve waiting time need that supplementary time calculated for a precise prediction. For instance, Jugnoo Fatafat and food delivery services like Uber Eats also need to predict food preparation time in order to display the total delivery time to customers.
In the sphere of Machine Learning, data pre-processing and feature engineering are paramount. The figure above demonstrates a linear relationship between distance covered and total time taken for the task. It is also clear that a few rides took much longer to deliver despite covering only a few kilometers. This noise in the data needs to be removed, and other outliers that were incorrectly measured or entered are dropped, to improve prediction. Engineering new features that reflect the underlying problem improves predictive models.
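As a rough illustration of this kind of cleaning, the sketch below drops trips whose pace (minutes per kilometer) sits far from the median pace. The median absolute deviation rule, the function name filter_pace_outliers and the threshold k are assumptions chosen for this example, not the method used in the work described here.

```python
from statistics import median


def filter_pace_outliers(trips, k=5.0):
    """Drop trips whose pace (minutes per km) is far from the median pace.

    trips: list of (distance_km, duration_min) tuples (hypothetical layout).
    k: how many median absolute deviations (MAD) a pace may differ from
       the median before the trip is treated as an outlier (assumed value).
    """
    # Records with non-positive distance or duration are treated as
    # measurement or data-entry errors and removed first.
    valid = [(d, t) for d, t in trips if d > 0 and t > 0]
    if not valid:
        return []

    paces = [t / d for d, t in valid]
    med = median(paces)
    mad = median(abs(p - med) for p in paces)
    if mad == 0:
        # All paces are effectively identical; nothing to filter.
        return valid

    return [trip for trip, p in zip(valid, paces) if abs(p - med) <= k * mad]


trips = [
    (2.0, 8.0), (5.0, 20.0), (10.0, 41.0), (3.0, 12.5),
    (1.5, 60.0),   # short ride that took an hour
    (0.0, 15.0),   # zero distance, likely an entry error
    (4.0, 15.0),
]
cleaned = filter_pace_outliers(trips)
print(cleaned)
```

A median-based rule is used here because a handful of extreme rides would distort a mean and standard deviation; a real pipeline would tune the threshold against labeled examples.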
Additional variables are derived, such as vehicle type (bike or car), agent type, day of the week, hour of the day, month and year, statistical transformations (logarithmic transformation and cyclic projections of hour and day), pickup and dropoff location clusters (to capture neighborhood behavior) and average speed at a particular hour. External data sources including weather (temperature, hourly precipitation), public holidays and events (concerts or sporting events) also improve the predictive model.
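The cyclic projections and logarithmic transformation mentioned above can be sketched as follows. Hour and weekday are mapped onto a circle with sine and cosine so that 23:00 and 00:00 end up close together. The function names and the returned field names are hypothetical, chosen only for illustration.

```python
import math
from datetime import datetime


def time_features(pickup_time: datetime) -> dict:
    """Derive calendar features, with cyclic projections of hour and weekday."""
    hour = pickup_time.hour
    dow = pickup_time.weekday()  # Monday = 0 ... Sunday = 6
    return {
        "hour_sin": math.sin(2 * math.pi * hour / 24),
        "hour_cos": math.cos(2 * math.pi * hour / 24),
        "dow_sin": math.sin(2 * math.pi * dow / 7),
        "dow_cos": math.cos(2 * math.pi * dow / 7),
        "month": pickup_time.month,
        "year": pickup_time.year,
    }


def log_distance(distance_km: float) -> float:
    """Logarithmic transformation; log1p keeps a zero distance at zero."""
    return math.log1p(distance_km)


def hour_gap(a: dict, b: dict) -> float:
    """Euclidean distance between two timestamps in the (sin, cos) hour plane."""
    return math.hypot(a["hour_sin"] - b["hour_sin"], a["hour_cos"] - b["hour_cos"])


late = time_features(datetime(2019, 3, 4, 23, 0))
midnight = time_features(datetime(2019, 3, 5, 0, 0))
noon = time_features(datetime(2019, 3, 5, 12, 0))

# 23:00 is much closer to 00:00 than 12:00 is, which raw hour values
# (23 versus 0) would not show.
print(hour_gap(late, midnight), hour_gap(noon, midnight))
```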
Building Data Science Workflow
Consider a food delivery service provider: predicting the food delivery ETA needs a lot of calculation at the backend. When a customer places an order, the restaurant acknowledges it and prepares the items. Food preparation time varies with the complexity of the order, the number of items ordered and how busy the restaurant is at that time. Then the delivery partner gets to the restaurant, parks, picks up the items and walks back to the car or bike to drive to the customer’s location. After reaching the customer’s location, the partner finds a parking space and finally delivers to the end customer. So the ML model needs to calculate the end-to-end delivery time, taking into consideration the amount of time taken at each stage.
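One way to combine per-stage estimates into an end-to-end figure is sketched below. It assumes, purely for illustration, that food preparation and the partner’s trip to the restaurant overlap, so the handoff happens when the slower of the two finishes. The class, field and function names are hypothetical, and the stage values would come from separate models or historical averages.

```python
from dataclasses import dataclass


@dataclass
class StageEstimates:
    """Per-stage time estimates in minutes (hypothetical field names)."""
    prep_min: float                   # restaurant prepares the order
    travel_to_restaurant_min: float   # partner rides to the restaurant
    park_and_walk_in_min: float       # partner parks and walks in
    walk_back_min: float              # partner walks back to the vehicle
    travel_to_customer_min: float     # ride to the customer's location
    park_and_deliver_min: float       # find parking and hand over the order


def total_delivery_minutes(s: StageEstimates) -> float:
    # Assumption: preparation runs in parallel with the partner's arrival,
    # so the handoff waits for whichever takes longer.
    partner_at_counter = s.travel_to_restaurant_min + s.park_and_walk_in_min
    handoff = max(s.prep_min, partner_at_counter)
    return (handoff + s.walk_back_min
            + s.travel_to_customer_min + s.park_and_deliver_min)


busy_kitchen = StageEstimates(20, 8, 3, 2, 12, 4)
quick_kitchen = StageEstimates(5, 8, 3, 2, 12, 4)
print(total_delivery_minutes(busy_kitchen))
print(total_delivery_minutes(quick_kitchen))
```

If the stages are strictly sequential in a given operation, the max would simply be replaced by a sum.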
Novel algorithms are developed to solve these business problems at scale for seamless user experiences. The machine learning workflow is generic across use cases, regardless of the business problem, so that it is scalable and reproducible. The workflow includes six steps:
1) Data pre-processing and feature engineering
2) Training models
4) Production deployment
5) Make predictions
6) Monitoring and optimization
A combination of linear regression, random forest, long short-term memory (LSTM) and ensembling techniques is deployed for optimal prediction results. KATO leveraged techniques ranging from linear to deep learning models and productionized them with TensorFlow.
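A minimal sketch of one common ensembling technique, a weighted average of per-model predictions, is shown below. Weighting each model by the inverse of its validation error is an assumption made for this example and is not necessarily the scheme KATO used; the model names, error values and predictions are made up.

```python
def inverse_error_weights(val_mae: dict) -> dict:
    """Give each model a weight proportional to 1 / validation MAE."""
    inverse = {name: 1.0 / mae for name, mae in val_mae.items()}
    total = sum(inverse.values())
    return {name: w / total for name, w in inverse.items()}


def ensemble_eta(predictions: dict, weights: dict) -> float:
    """Blend per-model ETA predictions (minutes) into a single estimate."""
    return sum(weights[name] * eta for name, eta in predictions.items())


# Illustrative numbers only: validation mean absolute error in minutes.
val_mae = {"linear": 4.0, "random_forest": 2.0, "lstm": 2.0}
weights = inverse_error_weights(val_mae)

# ETA in minutes predicted by each model for a single order.
predictions = {"linear": 30.0, "random_forest": 25.0, "lstm": 27.0}
blended = ensemble_eta(predictions, weights)
print(weights, blended)
```

Models with a lower validation error pull the blended ETA toward their own prediction, while a weaker model still contributes a little.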
Even with highly sophisticated machine learning algorithms, predicted arrival times are not always accurate, as a few situations are inherently unpredictable. Predicting a road crash or a sudden rerouting is difficult. Improving ETA accuracy is a continuous, challenging problem with many blind spots.
To take one case, for Jugnoo, Estimated Time of Arrival (ETA) prediction improved customer experience and reduced order/ride cancellations by 7%. Internal and external data sources are inputs to the machine learning models.
KATO is actively pushing the boundaries of Data Science and reimagining the variety and complexity of problems that can be solved. We have demonstrated our expertise by solving some of the toughest problems that industries and organizations face today. KATO has helped numerous clients use machine learning to solve their business problems. Early adopters of artificial intelligence are now reaping a range of benefits. KATO provides promising AI solutions to companies new to the space.