What is Statistical Forecasting?
Executive Summary
- A statistical forecast is a distinct way of creating a forecast.
- We cover what it is and why it is called statistical.
Introduction
"What is statistical forecasting?" is a common question.
Most statistical forecasts, in developed countries at least, are generated by systems.
Statistical forecasting is the most commonly automated way of creating a forecast and serves as the baseline forecast in most cases. However, even the majority of people who work in statistical forecasting do not know why these forecasts are called statistical.
We cover this as well as why this mode of creating a forecast is so important.
What is Statistical Forecasting?
Let us review some of the foundational statistical forecasting methods.
- A two- or three-period moving average.
- A level (a moving average over many periods).
- A trend, with a specific percentage increase per month or year.
- A seasonal, with a repeating factor.
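As a rough illustration of the level and trend methods in the list above, here is a minimal Python sketch. The function names are hypothetical, and the trend version assumes the percentage increase is applied to the most recent period, which is only one of several ways a trend model can be specified.

```python
def level_forecast(history):
    """Level: the average of every period in the history."""
    if not history:
        raise ValueError("history must contain at least one period")
    return sum(history) / len(history)


def trend_forecast(history, pct_increase):
    """Trend: the most recent period grown by a fixed percentage.

    Assumption: growth is applied to the last observed value.
    """
    if not history:
        raise ValueError("history must contain at least one period")
    return history[-1] * (1 + pct_increase / 100)


# Illustrative monthly sales values
sales = [50, 50, 70, 60]
level = level_forecast(sales)      # average of all four periods
trend = trend_forecast(sales, 5)   # last period plus an assumed 5 percent
```

Both functions read only the single sales history of the item, which is what makes them univariate.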
Each of the methods listed above is based on taking sampled parts of the univariate sales history, which is where the term statistical comes from. Let us jump to the definition of statistics to see why.
Statistics is the discipline that concerns the collection, organization, analysis, interpretation and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as “all people living in a country” or “every atom composing a crystal”. When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. – Wikipedia
Therefore, it is the sampling that is performed by statistics that gives statistical forecasting its name.
Let us take two simple examples.
Example #1: A 3 Period Moving Average
Three Period Moving Average Forecast
Line Item | January Sales | February Sales | March Sales | April Sales | May Forecast |
---|---|---|---|---|---|
Demand History | 50 | 50 | 70 | 60 | |
Forecast | | | | | 60 |
Using the three-period moving average, Feb, Mar, and Apr are averaged to arrive at the forecast for May: (50 + 70 + 60) / 3 = 60. Statistically speaking, Feb, Mar, and Apr are considered the most representative sample for what will occur in May. This means that Jan and all of the sales history in the periods prior to Jan (which together with the sample make up the overall population) are removed from the sample.
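The calculation in this example can be sketched in a few lines of Python. The function name is hypothetical, and this is only a minimal illustration of the technique, not how any particular forecasting system implements it.

```python
def moving_average_forecast(history, periods=3):
    """Forecast the next period as the mean of the last `periods` values."""
    if len(history) < periods:
        raise ValueError("history is shorter than the averaging window")
    sample = history[-periods:]  # the periods kept in the sample
    return sum(sample) / periods


# Demand history from the table: Jan, Feb, Mar, Apr
sales = [50, 50, 70, 60]
may_forecast = moving_average_forecast(sales, periods=3)
print(may_forecast)  # 60.0
```

The slice `history[-periods:]` is the sample. Everything before it, here January, is left out of the calculation.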
In areas of statistics like running a survey, only a sample is taken. This is because it would be too expensive and time-consuming to poll the entire population. However, statistical forecasting is different. It has access to far more data points than it actually uses. One can create a forecast where all of the periods are used, but some are weighted more than others. In practice, though, nearly all statistical forecasts leave some periods out altogether.
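To make the distinction between weighting periods and excluding them concrete, here is a hedged sketch of a weighted average forecast. The function name and the weights are made up for illustration. Excluding a period is the same as giving it a weight of zero.

```python
def weighted_forecast(history, weights):
    """Forecast the next period as a weighted average of the history.

    A weight of zero removes that period from the sample entirely.
    """
    if len(history) != len(weights):
        raise ValueError("history and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(h * w for h, w in zip(history, weights)) / total_weight


sales = [50, 50, 70, 60]  # Jan, Feb, Mar, Apr

# All periods used, with recent periods weighted more (illustrative weights)
all_periods = weighted_forecast(sales, [1, 2, 3, 4])

# Three-period moving average: Jan is given zero weight, the rest are equal
three_period = weighted_forecast(sales, [0, 1, 1, 1])
```

With these particular numbers both variants happen to produce 60.0, but in general the two approaches give different forecasts.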
This is because historical error measurement (hopefully, that is) has shown that the sample used is the most predictive or representative of the future values.
Example #2: A Repeating Seasonal (Quarterly)
In the following forecast model (it is a model rather than a method because everything is specified), the same color-coding scheme applies as in the previous example. Red values are removed from the sample, and black values are used as part of the sample.
Repeating Seasonal
Line Item | January Sales | February Sales | March Sales | April Sales | May Forecast |
---|---|---|---|---|---|
Demand History | 50 | 50 | 70 | 60 | |
Forecast | | | | | 50 |
This forecast merely repeats the sales history from three periods before. This would be an extremely seasonal pattern repeating on a quarterly basis. Seasonal patterns can have any repetition pattern. That is, the pattern can repeat over a year, over two years, and so on. In this case, the repetition pattern is very short. While it seems simple, I have applied this exact forecast model at several companies.
Again, the black value shows the periods (or, in this case, the period) used as part of the sample. The sample is very small in this case, being just a single period.
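The repeating seasonal model can be sketched the same way. The function name is hypothetical, and `season_length=3` reflects the quarterly repetition used in this example.

```python
def seasonal_repeat_forecast(history, season_length=3):
    """Forecast the next period by repeating the value one season earlier."""
    if len(history) < season_length:
        raise ValueError("history is shorter than one season")
    # The sample is a single period: the one `season_length` periods back.
    return history[-season_length]


# Demand history from the table: Jan, Feb, Mar, Apr
sales = [50, 50, 70, 60]
may_forecast = seasonal_repeat_forecast(sales, season_length=3)
print(may_forecast)  # 50 (the February value)
```

Setting `season_length=12` on monthly data would give a yearly repetition pattern instead.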
One can go through this same exercise with all statistical forecasting methods and color code which periods are being used for the forecast (the sample) and which are removed (the unused population), and the same principle applies. A best-fit algorithm, which is an automated procedure, does a large amount of calculation and involves some complex mathematics to accomplish the task. But in general terms, it is merely selecting the statistical forecasting model that shows the lowest error when looking historically at the data.
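A heavily simplified sketch of the best-fit idea follows. It is not how any specific best-fit product works. It assumes mean absolute error as the error measure, a small hand-picked set of candidate models, and a made-up sales history. All names are hypothetical.

```python
def moving_average(history, periods):
    return sum(history[-periods:]) / periods


def seasonal_repeat(history, season_length):
    return history[-season_length]


# Candidate models: each takes the history so far and returns one forecast.
CANDIDATES = {
    "2-period moving average": lambda h: moving_average(h, 2),
    "3-period moving average": lambda h: moving_average(h, 3),
    "quarterly repeat": lambda h: seasonal_repeat(h, 3),
}


def backtest_mae(model, history, warmup=3):
    """Mean absolute error of one-step-ahead forecasts over the history."""
    if len(history) <= warmup:
        raise ValueError("history must be longer than the warmup")
    errors = []
    for t in range(warmup, len(history)):
        forecast = model(history[:t])  # only data before period t is visible
        errors.append(abs(history[t] - forecast))
    return sum(errors) / len(errors)


def best_fit(history, candidates=CANDIDATES, warmup=3):
    """Return the candidate with the lowest historical error, plus all scores."""
    scores = {name: backtest_mae(m, history, warmup) for name, m in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Made-up history with a pattern that repeats every three periods
history = [50, 50, 70, 50, 50, 70, 50, 50, 70]
best, scores = best_fit(history)
print(best)  # quarterly repeat
```

Because the made-up history repeats exactly every three periods, the quarterly repeat model has zero historical error and is selected.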
Statistical forecasting is so powerful because it allows the creation of any number of forecasts automatically. Compared to judgment methods, collaborative methods, or machine learning algorithms (which use multiple data streams rather than the single data stream of statistical forecasting), statistical forecasting adds high value relative to the effort required to create the forecast.
Conclusion
Statistical forecasting receives its name from the sampling nature of the approach. Some periods of the sales history are included to create the forecast, and some periods are excluded. Statistics is an entire field of study, which includes many different mathematical techniques for testing a hypothesis. Calling statistical forecasting statistical overstates the complexity of what is actually occurring, and a more accurate term might be univariate sampling forecasting, but the time to name the forecasting approach has passed.
Because of the high forecast creation efficiency of statistical forecasting methods, each forecast that is mixed with the statistical forecast must be evaluated for how much value it adds over the statistical forecast. This comes down to forecast error measurement. It is the position of Brightwork Research & Analysis that the number one factor holding back both the evaluation of non-statistical forecasting methods versus statistical ones (as we cover in the article Forecast Error Myth #3: Sales And Marketing Have their Forecast Error Measured) and the better assignment of forecast models to product locations is the insufficient ability of companies to perform effective forecast error measurement and analysis.
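The comparison described above can be sketched as follows. This is a generic illustration using mean absolute error. It is not the calculation used by any particular application, and the product locations, numbers, and function names are all made up.

```python
def mean_absolute_error(actuals, forecasts):
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("actuals and forecasts must be non-empty and equal length")
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def value_added(actuals, baseline, other):
    """Error reduction of `other` versus the baseline statistical forecast.

    Positive: the other forecast was more accurate than the baseline.
    Negative: the other forecast made accuracy worse.
    """
    return mean_absolute_error(actuals, baseline) - mean_absolute_error(actuals, other)


# Made-up data keyed by (product, location)
records = {
    ("Product A", "Location 1"): {
        "actuals": [100, 120, 110],
        "statistical": [105, 110, 115],
        "adjusted": [100, 118, 112],
    },
    ("Product B", "Location 1"): {
        "actuals": [40, 35, 50],
        "statistical": [42, 38, 47],
        "adjusted": [60, 55, 30],
    },
}

# Sort product locations from largest accuracy loss to largest accuracy gain
results = sorted(
    (
        (key, value_added(r["actuals"], r["statistical"], r["adjusted"]))
        for key, r in records.items()
    ),
    key=lambda item: item[1],
)
for key, gain in results:
    print(key, round(gain, 2))
```

In this made-up data, the adjusted forecast improves on the statistical baseline for Product A and degrades it for Product B.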
A More Straightforward Approach to Forecast Error Calculation
Having observed ineffective and non-comparative forecast error measurements at so many companies, we developed the Brightwork Explorer, in part, to have a purpose-built application that can measure any forecast and compare one forecast against another.
The application has a very straightforward file format into which your company’s data can be entered, and the forecast error calculation is exceptionally straightforward. Any forecast can be measured against the baseline statistical forecast, and the product location combinations can then be sorted to show which product locations lost or gained forecast accuracy from other forecasts.
This is the fastest and most accurate way of measuring multiple forecasts that we have seen.