How to Understand SAP’s Strange Predictive Analytics for ML

Executive Summary

  • SAP states that they have advanced predictive analytics for machine learning.
  • In this article, we review the predictive analytics that SAP offers.

Introduction

Before HANA and machine learning, SAP used to market something called predictive analytics. This was out-of-the-box forecasting that SAP stated would work in a nearly automated fashion. SAP later migrated the same promises from predictive analytics to HANA and machine learning. You will learn about SAP’s predictive analytics claims and our analysis of them, how SAP tried to sell public algorithms to its customers, and how SAP falsely claims to IT buyers that using HANA will speed up the overall process.

Our References for This Article

If you want to see our references for this article and other related Brightwork articles, see this link.

Notice of Lack of Financial Bias: We have no financial ties to SAP or any other entity mentioned in this article.

  • This is published by a research entity, not some lowbrow entity that is part of the SAP ecosystem. 
  • Second, no one paid for this article to be written, and it is not pretending to inform you while being rigged to sell you software or consulting services. Unlike nearly every other article you will find from Google on this topic, it has had no input from any company's marketing or sales department. As you are reading this article, consider how rare this is. The vast majority of information on the Internet on SAP is provided by SAP, which is filled with false claims, and by sleazy consulting companies and SAP consultants who will tell any lie for personal benefit. Furthermore, SAP pays off all IT analysts, who have the same concern for accuracy as SAP. Not one of these entities will disclose their pro-SAP financial bias to their readers.

Anomaly Detection with Principal Component Analysis

Here is the example SAP provides.

“A railway operator uses sensors in locomotives. Four motors each have four temperature sensors. If the motors are working correctly, all 16 sensors send data about a synchronous increase or decrease of temperature. PCA notes when this behavior changes. You would use this algorithm to monitor this behavior and to detect if sensors send temperature data that differ from other sensors, which might indicate that a motor is damaged and needs to be maintained.”

We struggle to think of circumstances under which this ML algorithm would make sense to use.
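
For readers who have not seen the technique, here is a minimal Python sketch of PCA-based anomaly detection on simulated data shaped like SAP’s locomotive example. The temperature trend, the noise level, the injected fault on sensor 5, and the 99th-percentile threshold are all our assumptions for illustration; nothing here is taken from SAP’s implementation. It uses only the freely available NumPy library.

```python
import numpy as np

def fit_pca(reference, n_components=1):
    """Learn the mean and top principal directions from known-good readings."""
    mean = reference.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]

def anomaly_scores(readings, mean, components):
    """Distance of each sample from the subspace spanned by the components."""
    centered = readings - mean
    reconstructed = centered @ components.T @ components
    return np.linalg.norm(centered - reconstructed, axis=1)

# Hypothetical data: 16 sensors that rise and fall together, plus small noise.
rng = np.random.default_rng(0)
n_samples, n_sensors = 500, 16
trend = 60 + 10 * np.sin(np.linspace(0, 8 * np.pi, n_samples))
readings = trend[:, None] + rng.normal(0, 0.3, size=(n_samples, n_sensors))

# Assumed fault: sensor 5 reads 8 degrees too high in the last 50 samples.
readings[-50:, 5] += 8.0

# Fit on a window assumed to be healthy, then score every sample.
mean, components = fit_pca(readings[:400])
scores = anomaly_scores(readings, mean, components)
threshold = np.percentile(scores[:400], 99)  # illustrative cutoff
flagged = np.where(scores > threshold)[0]
print("flagged sample indices:", flagged)
```

The first principal component captures the synchronous movement of all 16 sensors, so a sensor that departs from the group leaves a large residual. Whether that is worth paying for is a separate question from whether it can be computed.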

Distance-Based Failure Analysis Using Earth Mover’s Distance

“An airplane contains electric devices that have batteries inside. These electric devices are equipped with at least two sensors that send data. Sensor A sends data about measurements of electric current, sensor B sends data about voltage measurements. An electric device could also have a sensor C that sends data about temperature measurements. The data sent by the three sensors not only depends on the electric device itself, but also on other factors that affect the electric device and its batteries. These factors could be the weather conditions at heights of several kilometers, how often the device is used in the cockpit, under which conditions the pilot uses the device, and so on. It is therefore normal that data sent from the three sensors might vary around a certain mean score. The data from each sensor can be visualized in a one-dimensional histogram. For multidimensional visualizations, scatterplots are used. This visualization is like a fingerprint of each battery in the airplane. To compare the sensor data of different batteries without looking at and comparing each visualization, a distance measure for probability distributions is needed. One of these measures is the Wasserstein metric, or EMD. It can be used to measure deviation from a known good reference fingerprint of a battery, or to measure differences between several batteries of the same type, for example.”

The same applies to this algorithm: we struggle to see under what circumstances it would make sense to use.
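
To make the quoted description concrete, the following is a minimal sketch of the one-dimensional earth mover’s (Wasserstein-1) distance between two sensor samples. It assumes equal sample sizes, in which case the distance reduces to the mean absolute difference between the sorted values. The battery voltages are simulated and hypothetical. For the general case, SciPy provides `scipy.stats.wasserstein_distance`.

```python
import numpy as np

def emd_1d(a, b):
    """Earth mover's distance between two equal-sized 1-D samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # With equal sizes, optimal transport pairs the i-th smallest values.
    return float(np.mean(np.abs(a - b)))

# Hypothetical voltage readings (volts) for three batteries.
rng = np.random.default_rng(1)
reference = rng.normal(3.7, 0.05, 1000)  # assumed known-good fingerprint
healthy = rng.normal(3.7, 0.05, 1000)    # same distribution as reference
degraded = rng.normal(3.5, 0.12, 1000)   # lower and noisier voltage

d_healthy = emd_1d(reference, healthy)
d_degraded = emd_1d(reference, degraded)
print(f"healthy vs reference:  {d_healthy:.4f}")
print(f"degraded vs reference: {d_degraded:.4f}")
```

A battery whose distance from the reference fingerprint is much larger than that of its peers would be the one to inspect; where to set that cutoff is a judgment the algorithm does not make for you.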

Remaining Useful Life Prediction Using Weibull

“The Weibull algorithm can be used to calculate the expected remaining useful life (RUL) of an asset, and to calculate the probability of failure of an asset.”

This would be used to determine the failure rate of, say, a service part.
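
As an illustration of the arithmetic involved, here is a minimal sketch using the standard two-parameter Weibull survival function R(t) = exp(-(t/scale)^shape). The shape and scale values are hypothetical; in practice they would be fitted from failure records. The mean remaining life is computed by simple numerical integration, which is an assumption of this sketch and not a description of SAP’s method.

```python
import math

def reliability(t, shape, scale):
    """Weibull survival function: probability the part is still working at time t."""
    return math.exp(-((t / scale) ** shape))

def conditional_failure_probability(age, horizon, shape, scale):
    """P(fails within `horizon` more hours | it has survived to `age`)."""
    return 1.0 - reliability(age + horizon, shape, scale) / reliability(age, shape, scale)

def mean_remaining_life(age, shape, scale, steps=20000, horizon_factor=20.0):
    """Expected remaining life: integral of R(t) from age onward, divided by R(age).

    Uses the trapezoid rule up to age + horizon_factor * scale, which is
    adequate for shape >= 1; heavy-tailed cases would need a longer horizon.
    """
    upper = age + horizon_factor * scale
    h = (upper - age) / steps
    total = 0.5 * (reliability(age, shape, scale) + reliability(upper, shape, scale))
    for i in range(1, steps):
        total += reliability(age + i * h, shape, scale)
    return total * h / reliability(age, shape, scale)

# Hypothetical service part: shape > 1 means wear-out failures, scale in hours.
shape, scale = 2.5, 1000.0
p_fail = conditional_failure_probability(age=800, horizon=200, shape=shape, scale=scale)
rul_new = mean_remaining_life(0, shape, scale)
rul_aged = mean_remaining_life(800, shape, scale)
print(f"P(fail in next 200 h | survived 800 h) = {p_fail:.3f}")
print(f"mean remaining life, new part:  {rul_new:.0f} h")
print(f"mean remaining life, 800 h old: {rul_aged:.0f} h")
```

With a shape parameter above 1, the expected remaining life shrinks as the part ages, which is the behavior one wants for planning service parts.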

Anomaly Detection Using Multivariate Autoregression

“An example might be the changes in the outflow temperature of a system, which after a while is also reflected in the inflow temperature of a downstream system. MAR can handle different kinds of sensor values, and autonomously ranks their influence on each other. The algorithm can therefore handle noisy or random signals.”

This is another strange choice for a machine learning algorithm.
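
For reference, the following is a minimal sketch of what anomaly detection with multivariate autoregression can look like. It assumes the simplest form, a first-order vector autoregression x[t] = c + A x[t-1] fitted by least squares, and flags time steps whose prediction error is unusually large. The two simulated signals (an upstream outflow temperature and a downstream inflow temperature that follows it one step later), the injected fault, and the threshold are all hypothetical. SAP does not document its implementation in the passage quoted, so this should not be read as a reproduction of it.

```python
import numpy as np

def fit_var1(series):
    """Least-squares fit of x[t] = c + A @ x[t-1]; returns (c, A)."""
    past = np.hstack([np.ones((len(series) - 1, 1)), series[:-1]])
    coef, *_ = np.linalg.lstsq(past, series[1:], rcond=None)
    return coef[0], coef[1:].T

def residual_norms(series, c, A):
    """Size of the one-step-ahead prediction error at each time step t >= 1."""
    predicted = c + series[:-1] @ A.T
    return np.linalg.norm(series[1:] - predicted, axis=1)

# Hypothetical signals: column 0 = upstream outflow, column 1 = downstream inflow.
rng = np.random.default_rng(2)
T = 600
x = np.zeros((T, 2))
for t in range(1, T):
    x[t, 0] = 0.9 * x[t - 1, 0] + rng.normal(0, 0.2)
    x[t, 1] = 0.8 * x[t - 1, 0] + 0.1 * x[t - 1, 1] + rng.normal(0, 0.2)

# Assumed fault: the inflow reading jumps by 3 degrees from t = 550 onward.
x[550:, 1] += 3.0

# Fit on a window assumed to be healthy, then score the whole series.
c, A = fit_var1(x[:500])
errors = residual_norms(x, c, A)
threshold = np.percentile(residual_norms(x[:500], c, A), 99)  # illustrative cutoff
flagged_times = np.where(errors > threshold)[0] + 1  # errors[i] belongs to time i + 1
print("estimated A:\n", np.round(A, 2))
print("flagged time steps:", flagged_times)
```

The off-diagonal entry of the fitted matrix recovers the influence of the outflow on the later inflow, which is the "ranking of influence" the quoted passage refers to.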

Strange ML Algorithms

SAP’s choice of ML algorithms is quite weird. Common ML algorithms include the following:

  1. Support Vector Machines
  2. Learning Vector Quantization
  3. Naive Bayes
  4. Classification and Regression Trees
  5. Linear Discriminant Analysis
  6. Logistic Regression
  7. Linear Regression

Interestingly, we do not see the ML algorithms that SAP selected in common use.

ASUG on SAP Predictive Analytics

“So how does one actually buy Predictive Analysis? There are, it turns out, several different methods, and I’ll do my best to make it clear:

1. It’s bundled into SAP Lumira, within the Lumira code base, Gadalla says. But there’s a catch: You have to buy a key code to “light it up,” he says.

2. But if you buy Predictive Analysis, you get Lumira for free. (Got it?)

3. Predictive Analysis is embedded within about 20 different HANA apps (such as CRM, SCM, Partner Relationship Management and Liquidity Forecaster).

4. Predictive Analysis is also bundled with a HANA box, “where I’m using HANA as a server, where I’m doing my analysis,” Gadalla says.

Current pricing on SAP Predictive Analysis is $20,000 per seat with a minimum purchase of 5 seats, according to Gadalla.

No discussion of a new SAP product would be complete without that requisite HANA mention, of course. Gadalla reports that HANA went head to head with top competitors in doing clustering analysis, and he claims that competitors’ wares took 40 hours to do the job whereas HANA took just 2 minutes. “It was the same exact equation and same exact data,” he says.”

First, all of the algorithms used by SAP are in the public domain. SAP had nothing to do with creating them. So why are customers being charged to use them?

Secondly, the HANA database will do nearly nothing to help anyone run predictive analytics faster. We run predictive analytics ML on a 9-year-old laptop using MySQL, and a run usually takes less than 20 minutes. During that time, we can switch to another task. Moreover, the time in ML goes into analyzing the output, not into running the algorithm. It can take hours to analyze the output, and during that time, we can run another ML algorithm. The runtime has virtually nothing to do with how quickly the overall process of analysis occurs. We could upgrade the computer or even access a high-performance server from AWS, which is faster than anything SAP offers; however, it makes so little difference that we don’t even bother.

Do SAP’s Statements on Predictive Analytics Make Any Sense?

SAP is lying to customers concerning its ML.

  1. The ML algorithms selected by SAP don’t make much sense for what customers would typically use them for. The use case examples show how odd it would be to use them. Only the Weibull algorithm appears usable (for predicting service parts failure).
  2. The database does not matter much for running algorithms, so tying their performance to HANA is inaccurate. The comparison of 40 hours to 2 minutes is stupid; that did not happen. And HANA is designed only around optimizing analytics.
  3. The ML algorithms are not SAP’s; they are public domain. Secondly, the algorithms that SAP selected don’t appear to be ones that customers would want to use. Anyone can run these ML algorithms on data in their system without paying SAP anything, and HANA is irrelevant to running ML algorithms. But why would any customer limit themselves to SAP’s incorrectly selected algorithms? Anyone can run a public domain algorithm on any data set without SAP being involved in any way.
  4. SAP is lying along four different dimensions at once and trying to take advantage of customers’ lack of knowledge of ML.

Conclusion

This article by SAP receives a 1 out of 10 for accuracy. Years after these claims were made, none of them came true.