Navigating Real-World Challenges in Time Series Forecasting
Explore how to unlock business value by applying time series forecasting approaches to real-life data instead of keeping them at an academic research level.
September 27, 2023
When it comes to predicting the future from past events, or from information that is already known about the future, time series forecasting has established itself as a go-to strategy. Whether it relies on simple statistical approaches or state-of-the-art deep learning models, it requires a thorough understanding of time-dependent factors, both from the past and, where available, from the future. These methods have seen great success in academic research. But how do we move beyond the comfort of the datasets that I refer to as “Toy Datasets”? Real-life business data holds far more complexity and heterogeneity, which we need to overcome in order to create business value.
Tuning the Ear to Understand the Melody of Data
The key to mastering time series forecasting lies in learning the rhythm that drives the data. Benchmark datasets often provide a simplification of this rhythm, showcasing clear seasonal patterns and trends without strong influences from external factors. From air passenger numbers to Boston house prices, typical benchmark datasets in academia are united by simple underlying problem dynamics.
However, when operations move from this controlled environment into the chaos of real-world data, traditional methods can falter, giving rise to a wide array of challenges.
The World of Spikes, Cold Starts, and Intermittent Periods
In the business landscape, data spikes are a common occurrence. Promotions and sales days can cause drastic jumps in demand that easily throw off a prediction model, with negative consequences for business planning. Companies also have to grapple with the so-called “cold start problem”: a newly launched product has no historical data yet, but accurate predictions for it are still required to steer the business effectively.
There Is No Prediction Model Yet to Serve Them All!
Real-world datasets can exceed millions of predicted items and exhibit all of the complexities described above, so the modelling approach has to match the data. No single model covers all of these complexities best on its own. For low-volume time series with little variance and no external drivers, statistical models empirically perform well. For high-volume products whose high variance is driven by external factors, machine learning or deep learning models are best practice.
Our tech stack at paretos allows us to build ensembles from these approaches effectively and therefore provide optimal results for a wide range of problem characteristics.
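To make the idea of matching models to data characteristics more concrete, here is a minimal sketch of how a series could be routed to a model family. It is an illustration only, not the paretos implementation: the function name `choose_model_family`, the thresholds, and the use of the coefficient of variation as the variance measure are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def choose_model_family(history, volume_threshold=100.0, cv_threshold=0.5,
                        has_external_drivers=False):
    """Pick a model family from simple series characteristics.

    All thresholds are hypothetical and would need tuning per dataset.
    """
    if not history:
        # No history at all: the cold start case needs separate treatment.
        return "cold_start"

    avg = mean(history)
    # Coefficient of variation: spread relative to the average level.
    cv = pstdev(history) / avg if avg > 0 else 0.0

    low_volume = avg < volume_threshold
    low_variance = cv < cv_threshold

    if low_volume and low_variance and not has_external_drivers:
        return "statistical"
    # Assumption: everything else goes to machine learning / deep learning.
    return "machine_learning"


steady_item = choose_model_family([10, 11, 9, 10])
promo_item = choose_model_family([500, 2000, 300, 1500], has_external_drivers=True)
new_item = choose_model_family([])
print(steady_item, promo_item, new_item)
```

In practice the routing would probably feed an ensemble rather than a hard either-or decision, but even a rule this simple makes the segmentation explicit and testable.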
A Guiding Light Amidst the Chaos: Running Baselines
But how do we know whether a prediction model's performance is actually good, bad, superb or just mediocre?
The first and most crucial step in time series forecasting is running baselines. They provide a lot of context for the performance of the final prediction model and are essential for steering the machine learning training iterations.
Analysing model performance by data cluster (for example, clustered by volume, variance or forecasting error) further helps in honing the predictions. This becomes essential in datasets with more than ~100k predicted items, where the decision maker needs to be guided straight to the most important items in order to review them quickly and effectively.
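As a rough illustration of what running baselines can look like, the sketch below compares two common baselines, a naive last-value forecast and a seasonal naive forecast, against a model forecast using the mean absolute error. The choice of these two baselines, the metric, and all numbers (including `model_forecast`) are assumptions made up for the example.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the last full season of observations."""
    if season_length > len(history):
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute deviation between actuals and predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical data: two seasons of history, one season of held-out actuals.
history = [10, 20, 30, 40, 12, 22, 32, 42]
actual = [14, 24, 34, 44]
model_forecast = [13, 25, 33, 45]  # stand-in for the output of a trained model

mae_naive = mean_absolute_error(actual, naive_forecast(history, 4))
mae_seasonal = mean_absolute_error(actual, seasonal_naive_forecast(history, 4, 4))
mae_model = mean_absolute_error(actual, model_forecast)

print("naive:", mae_naive)
print("seasonal naive:", mae_seasonal)
print("model:", mae_model)
```

A model only earns its complexity if it beats the strongest baseline. Here the seasonal naive forecast is already far better than the plain naive one, so it is the number the model has to be measured against.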
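A cluster-based error analysis of this kind could be sketched as follows. The volume boundaries, the item fields (`id`, `avg_volume`, `abs_error`) and the ranking by absolute error are hypothetical choices for the example; a real setup would likely also cluster by variance and weight errors by business impact.

```python
from collections import defaultdict


def volume_cluster(avg_volume, low=10.0, high=100.0):
    """Assign an item to a volume cluster (boundaries are assumptions)."""
    if avg_volume < low:
        return "low"
    if avg_volume < high:
        return "medium"
    return "high"


def error_by_cluster(items):
    """Mean absolute forecasting error per volume cluster."""
    grouped = defaultdict(list)
    for item in items:
        grouped[volume_cluster(item["avg_volume"])].append(item["abs_error"])
    return {cluster: sum(errs) / len(errs) for cluster, errs in grouped.items()}


def review_queue(items, top_n=3):
    """Items with the largest absolute error first, for manual review."""
    return sorted(items, key=lambda it: it["abs_error"], reverse=True)[:top_n]


# Hypothetical per-item evaluation results.
items = [
    {"id": "A", "avg_volume": 5, "abs_error": 2},
    {"id": "B", "avg_volume": 8, "abs_error": 4},
    {"id": "C", "avg_volume": 50, "abs_error": 10},
    {"id": "D", "avg_volume": 500, "abs_error": 120},
    {"id": "E", "avg_volume": 800, "abs_error": 40},
]

cluster_errors = error_by_cluster(items)
top_items = [it["id"] for it in review_queue(items, top_n=2)]
print(cluster_errors)
print(top_items)
```

The per-cluster view shows where the model struggles as a group, while the review queue points the decision maker to the individual items that deserve attention first.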
As a physicist, I always seek well-suited explanations and modelling approaches for complex mechanisms. Actively working with various SMEs and international companies in the field of forecasting and demand planning has enabled me to generate maximal business value with the help of state-of-the-art machine learning approaches.
Paretos is the leading AI-based Decision Intelligence software for optimized decision-making processes. With paretos, businesses can identify, plan, execute, and track business potentials through automated processes across the entire organization.