How to Understand What is an Outlier in Forecasting

Executive Summary

  • Outliers are easily identified in applications, but the question is how to deal with them.
  • We cover the questions to ask when deciding whether outliers should be removed.

Introduction

The question of what an outlier is, and when outliers should be removed, is always a topic of keen interest on forecasting projects, which is why it is helpful to have a specific outlier definition. In this article, you will learn what an outlier is and the common approaches for dealing with outliers.

What is an Outlier?

An outlier is a data point in the history that diverges from the other data points. An outlier can either be overly high or overly low compared to the other data points in the time series.

[Figure: unforecastable demand history]

In this time series, the obvious outlier would be for period 7. Outliers are easy to identify either graphically or through calculation.

What is an Outlier in Forecasting Software?

Almost all supply chain forecasting applications have a way of configuring outlier detection.

  • The outlier notification is normally set as the number of standard deviations away from the mean beyond which a data point counts as an outlier.
  • Some forecasting applications auto-remove the outlier that is above the defined number of standard deviations.
  • Other forecasting applications will only identify the outlier for the user, but not remove it for creating the forecast.
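
The standard deviation rule described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the logic of any particular forecasting application: the function name `flag_outliers`, the demand numbers, and the choice of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

def flag_outliers(history, threshold):
    """Return the 1-based periods whose value lies more than `threshold`
    standard deviations away from the mean of the history."""
    center = mean(history)
    spread = pstdev(history)  # population standard deviation (an assumption)
    if spread == 0:
        return []  # a completely flat history has no outliers
    return [
        period
        for period, value in enumerate(history, start=1)
        if abs(value - center) / spread > threshold
    ]

# Hypothetical demand history; period 7 is the unusual one.
demand = [100, 102, 98, 101, 99, 103, 400, 97, 100, 100]
flagged = flag_outliers(demand, threshold=2.0)
print(flagged)
```

Note that on a short history a single extreme value inflates the standard deviation itself. With a threshold of 3.0 this same series flags nothing, which is one reason the threshold setting deserves testing rather than a default.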

Outlier Remedy

Outlier Removal

Outlier removal is a controversial topic that should be even more controversial than it is, particularly considering that it is a major technique for falsifying forecasts. Outlier removal is the removal of historical data points that are at variance with the other historical data points.

Most statistical demand planning applications have a field for outlier identification or removal.

One example can be seen on Smoothie’s Model Options screen.

  • Typically, people on the project will recommend the removal of outliers from the previous demand history.
  • This issue confuses many people, but the rule on outlier removal is relatively simple: if the outlier in question has a high probability of repeating in the future, the outlier should not be removed.
  • Automating outlier removal is more difficult than is often initially estimated. One way to approach it is the TRIMMEAN function in Excel, which excludes extreme values in a controllable fashion. The forecast accuracy should be tested before and after the function is used.
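
For readers working outside Excel, the following is a minimal Python sketch of the same idea. The `trimmean` function here is modeled on the documented behavior of Excel's TRIMMEAN (the number of excluded points is rounded down so that the same count is dropped from the top and the bottom). The demand history, the holdout periods, and the use of a flat mean forecast are hypothetical and chosen only to show the before and after accuracy test.

```python
import math

def trimmean(values, percent):
    """Mean of `values` after dropping a share of the points, split evenly
    between the lowest and highest values (modeled on Excel's TRIMMEAN)."""
    if not 0 <= percent < 1:
        raise ValueError("percent must be in [0, 1)")
    # Points dropped from EACH end; rounding down keeps the trim symmetric.
    k = math.floor(len(values) * percent / 2)
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def mean_abs_error(forecast, actuals):
    """Average absolute error of a flat forecast against later actuals."""
    return sum(abs(a - forecast) for a in actuals) / len(actuals)

history = [100, 102, 98, 101, 99, 103, 400, 97, 100, 100]  # hypothetical
holdout = [101, 99, 102, 98]                               # hypothetical

plain_forecast = sum(history) / len(history)   # mean of the full history
trimmed_forecast = trimmean(history, 0.2)      # drops one point per end

error_before = mean_abs_error(plain_forecast, holdout)
error_after = mean_abs_error(trimmed_forecast, holdout)
print(error_before, error_after)
```

In this made-up case the trimmed forecast has the lower error because the spike does not recur in the holdout periods. If the spike did recur, the comparison could go the other way, which is the point of testing accuracy before and after.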

The Issues with Outlier Removal by the Army Corps of Engineers Prior to Hurricane Katrina

Outlier removal was a primary reason for the outcome of Hurricane Katrina. The Army Corps of Engineers simply removed storms much worse than Katrina from the record when it performed the initial planning. This was done deliberately in order to arrive at the standard of levee construction it was interested in building (and it then added further incompetence by not building to that standard).

Removal of outliers involves considerably less intrigue in supply chain forecasting than in finance, medicine, or food additive testing. But this is only because the supply chain organization uses the forecast internally.

  • Forecasts that are consumed internally tend to have less bias than those produced for external consumption. This is because there is less incentive to produce a forecast that will be positively received by the forecast’s customers.
  • It is a general rule that internally used forecasts tend to be more accurate than forecasts that are produced for external consumption or are sold.
  • However, while internally consumed forecasts are “better” and more reliable than externally consumed forecasts, there are still many problems with the accuracy measurements that the forecasting department reports to the rest of the company.

Outlier Management in Demand Planning Systems

Outliers can be easily tagged by the system and can be removed in a way that does not alter the actual demand history loaded into the model. Instead, the corrected values are stored in a separate row as another measure, the adjusted history. The forecasting system then forecasts using the adjusted history, and in that way, the actual history and adjusted history are kept separate.
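
The separation between actual and adjusted history can be illustrated with a small data structure. This is a hedged sketch rather than how any demand planning system stores its data: the class name `DemandSeries`, its fields, and the choice to fill a tagged period with the mean of the untagged periods are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class DemandSeries:
    actual: list                              # loaded history, never modified
    tagged: set = field(default_factory=set)  # 1-based outlier periods
    adjusted: list = field(init=False)        # separate row used to forecast

    def __post_init__(self):
        self.adjusted = list(self.actual)     # start as a copy of the actuals

    def tag_outlier(self, period):
        """Tag a period and replace its value in the adjusted row only."""
        self.tagged.add(period)
        normal = [v for p, v in enumerate(self.actual, start=1)
                  if p not in self.tagged]
        fill = mean(normal)                   # replacement rule is an assumption
        for p in self.tagged:
            self.adjusted[p - 1] = fill

# Hypothetical demand history with a spike in period 7.
series = DemandSeries(actual=[100, 102, 98, 101, 99, 103, 400, 97, 100, 100])
series.tag_outlier(7)
print(series.actual[6], series.adjusted[6])
```

Because the actual row is never overwritten, a planner can later untag the period or choose a different replacement value without reloading the demand history.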

Outlier Management in Demand Works Smoothie

In Demand Works Smoothie, outliers can be identified based upon the number of standard deviations away from the mean. The higher the number of standard deviations, the higher the tolerance for outliers. However, Smoothie does not remove outliers based upon this selection; rather, it merely identifies them for the planner.

Outlier Management in JDA DM

JDA DM also will identify outliers in the interface. Exactly how this appears is shown on the following page:

In JDA DM, the yellow stars in the interface identify the outliers. Outliers can be observed as high or low points, but the identification with a graphical element (a star in JDA’s case) is a good practice so that the planner can see when their outlier threshold is exceeded. The outlier threshold is set below:

Determining Whether Outliers Should be Removed

Outlier identification is the easy part; outlier removal is where the real work begins. Outlier removal requires planners with domain expertise to make the decision as to whether the outlier should or should not be included in the demand history for forecasting. This requirement for judgment based upon domain expertise is one reason why it is not a good practice to automatically remove outliers based simply upon their distance from a mean value. Historical data periods can be far from the mean and yet still be valid data points to use for creating a forecast.
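
A small example shows why distance from the mean is not enough on its own. In the sketch below, a hypothetical product sells 100 units in most months and 300 units every December. A standard deviation rule flags both Decembers, yet the spike clearly repeats, so removing it would damage the forecast. The function name, the data, and the simple "same month every year" check are assumptions for illustration only.

```python
from statistics import mean, pstdev

def flag_outliers(history, threshold):
    """Return the 1-based periods whose value lies more than `threshold`
    standard deviations away from the mean of the history."""
    center = mean(history)
    spread = pstdev(history)
    if spread == 0:
        return []
    return [
        period
        for period, value in enumerate(history, start=1)
        if abs(value - center) / spread > threshold
    ]

# Two hypothetical years of monthly demand with a December peak each year.
monthly = ([100] * 11 + [300]) * 2

flagged = flag_outliers(monthly, threshold=3.0)

# Crude recurrence check: do all flagged periods fall in the same month?
months_hit = {(period - 1) % 12 for period in flagged}
recurs_every_year = len(flagged) > 1 and len(months_hit) == 1
print(flagged, recurs_every_year)
```

A check like this can support the planner's decision, but it does not replace it. Whether a flagged period will repeat is still a judgment that depends on knowing what caused it.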

Conclusion

The central premise of outlier removal is that one-time events should be removed from the demand history so that the forecast is not biased by events that will not be repeated.

The determination of what is and what is not a one-time event is often a sea of disagreement, even among different individuals with the same amount of domain expertise.
