How Real is the SAP Machine Learning and Data Science Story?

What this Article Covers

  • How SAP Has Coopted the Term Machine Learning
  • Fact Checking on SAP’s Predictive Analytics Claims
  • SAP’s Extraordinary History of Puffery
  • Repeating Claims Made in the Past That Never Occurred
  • Business Objects and Lumira
  • What Happened with HANA Predictive Analytics?
  • Using Machine Learning
  • The Types of Machine Learning Algorithms
  • Classification and Clustering
  • Percent Regression, Percent N/A
  • How New is the Bottle of Machine Learning

Introduction

The term machine learning has been growing as a term to use and is now migrating into software vendors who appear to be intent on using it to sell software.

As we can see from Google Trend, the interest in the term Machine Learning is growing. It is now four times more popular as a searched term than it was just a few years ago. 

SAP has a lengthy history of co-opting things it has nothing to do with. Marketplaces, collaboration, inventory optimization, good UI design, databases, IoT, HTML5, the list goes on and on. SAP can immediately partner with a company, and claim capabilities merely through an association of a partnership. 

How SAP Has Coopted the Term Machine Learning

SAP has co-opted the term machine learning with fantastic speed. And as they have done so, they have applied their normal marketing flourish, which means that the way SAP represents machine learning has little to do with how machine learning actually works.

I am a long time SAP consultant and now SAP researcher, and it is difficult to recollect SAP using the term machine learning in any of that time…..until very recently that is. However, as soon as SAP using the term, it has claimed to be the world experts in machine learning. In fact, in one video for its Leonardo product, it claimed to be “the only company” with the machine learning to help its customers.

This is a typical video for Leonardo, SAP’s IoT and ML solution. 

The term machine learning has begun to appear on SAP employees with startling speed. People with no background of any kind in forecasting or statistics, and from utterly unrelated backgrounds ranging from CRM to Hybris to Fiori, now have machine learning attached to their titles on LinkedIn.

Here is one example…

Digital transformation lead at SAP(Hybris, S/4Hana, Logistics, Machine learning)

What an interesting combination of skills!…..ML has nothing to do with Hybris or S/4HANA or logistics. Apparently, ML is now just something to add to the new sexy term.

This is categorized as ML, but the approach applied is ARIMA, which is a selection of options which include Exponential Smoothing, Seasonal Trend Decomposition, Trigonomeric Box Cox, Holt-Winters. 

Someone should tell SAP that none of these methods is classified as a method of ML. They are traditional univariate statistical forecasting methods. ARIMA is primarily used as a univariate statistical approach (although it can be used for multivariate). But the video shows quite clearly that univariate analysis is being performed. ML is always multivariate.

This video proposes ML as a way to automate intelligent decision making in a company. It sounds like artificial intelligence and makes it sound like ML works like Skynet from the movie the Terminator. But ML algorithms don’t work like this at all. They are highly specific and run by data scientists or analysts and they must be set up and will address a narrow area. In the scenario outlined by the video, the company that purchased this software would not need people really outside of performing physical work, as SAP “Skynet” would be doing all the thinking. 

Repeating Claims Made in the Past That Never Occurred

Something to observe is how close SAP’s claims are in machine learning to its earlier claims made about analytics. Notice this video which was published in 2013.

`

This video was 4.5 years ago, and at that time the term was “analytics.” When visiting SAP accounts, none of the things described in this video are apparent.

Fact Checking on SAP’s Predictive Analytics Claims

The claims appear to be almost identical, but instead of using the terms predictive analytics and predictive algorithms, SAP merely substituted the term machine learning for the older terms, and now makes the same claims but with a different term. SAP’s analytics solutions keep permutating and dying off.

Interest, as tracked by Google Trends in Business Objects, has declined by a factor of 7.5 since before the SAP acquisition. For a while, Business Objects was going to be ported to most of SAP’s accounts after the acquisition, but now Business Objects is in a steep decline and is barely discussed on SAP projects.

Then SAP Lumira was going to be the “Tableau killer,” but it has disappeared mostly from the market, although interest in the product is still there, although it is declining. 

What Happened with HANA Predictive Analytics?

  • The analytics that came with HANA never panned out, and companies that use HANA for analytics primarily use it with the BW, which is as difficult to work with as it ever was before HANA was introduced.
  • The idea was that companies would perform analytics in S/4HANA with Embedded Analytics as we covered in the article The Future of SAP Embedded Analytics, Embedded Analytics never really developed, and so few customers are live on S/4HANA that at this point it is not a relevant discussion point.
  • In one part of the video, a hand swipes through various offerings in a “library” represented by cards. On the cards are linear regression, K nearest neighbor, K-means, C4.5 decision tree and ABC classification. K nearest neighbor is an ML classification algorithm. K-means is a clustering ML algorithm, C4.5 decision tree is an ML classification algorithm, ABC classification is classifying product based upon revenues and has nothing to do with with the other items on the list.

Therefore, SAP stated that it had ML algorithms in 2013, but for whatever reason called them predictive analytics.

In this way, ML can be seen as a way for SAP to rebrand what was predictive analytics. Something that it already tried and was not successful in doing. The screens in this video look fantastic, but again, I have never seen anything that looks remotely like this on any of my SAP clients. Furthermore, the video is highly confusing because it discusses using predictive analytics with HANA. However, HANA is just the database. This would be like saying one is performing predictive analytics with Oracle 12c or IBM DB2.

SAP has a way of explaining things so that they are maximally confusing and the boundaries between what the different items do are blurred. By listening to SAP you know less about a subject than before you started. SAP’s statement regarding analytics does not describe the analytics application that sits on the database. Actually, this would be the same as saying one is running ERP with Oracle 12c or IBM DB2. Analytics and ERP systems are the application. You work in the application, the application sits on the database.

What tool is being recommended to work with HANA, it seems like a tool called Predictive Analytics? That is what are the screens shown in the video? Does that product exist? It is 4.5 years since this video was published, and where is the SAP product Predictive Analytics?

SAP’s Extraordinary History of Puffery

To appreciate SAP’s latest attempt to co-opt a term, one must review SAP’s history of co-opting other areas.

Previously, with HANA, SAP pretended that they were better at databases than Oracle or IBM or Microsoft. That assertion has had 7 years to be evaluated, and it has turned out to be untrue, which we have covered in articles like What is the Actual Performance of HANA?, and HANA as a Mismatch for ERP and S/4HANA.

Now SAP is trying to propose that with virtually zero history in even linear regression, they are the best company for machine learning.

However, other large vendors like SAS have been deep into the types of mathematical programming that is the basis for machine learning since their inception, with SAS owning a major development language for statistics. With HANA, SAP proposed it was better than Oracle in databases. Now with machine learning, SAP is saying that it is better with statistics and algorithms than SAS.

Using Machine Learning

Machine Learning is an incredibly misleading term that seems the less ethical the software vendor, the more they are willing to proposing magical improvements from the usage of the technologies that underpin the term.

SAP as one of the least “reality bound” vendors that we analyze, so naturally, they have recently come out of nowhere to tout all their great machine learning conquests. We expect machine learning to be SAP’s new girlfriend for a while until something else topical arises, and then SAP’s marketing department with tire of its current “girlfriend,” and move on to the next.

Machine learning simply describes a series of mathematical approaches to data analysis and prediction that apply to multivariate data sets. Most forecasting today in business is univariate…. so for example sales history. Machine learning means the use of multivariate data sets. So sales history + weather or + product family or + economic factors, etc..

The way in which machine learning is really at least in part old wine in new bottles is the analysis of how many of the machine learning algorithms for the programming language R, are categorized as the following:

Classification and Clustering

Classification simply places observations into buckets, and this is performed by the classifier algorithm. If the classification were known, then it would be assigned already as a field. So for instance.

ML Classification Table

 Group 1Group 2
Product ABCYesNo
Product XYZNoYes
Product RFQYesNo

But the intent is to run the algorithm such that the new classification can be found, and for forecasting purposes, be used to perform forecasting. The desire is to find a classification that has predictive power. Interestingly, classification is often performed with logistics regression, which is regression with the dependent variable is not numeric, but is a category (So Blue, Green, Red, etc..)

The way ML algorithms are often classified is by being..

  1. Classification or Clustering, and then
  2. Regression or not regression.

That is they are often both classification and regression or some other combination.

However, as a classification is a type of regression, we can see the thread of ML coming back to regression.

Clustering is similar to classification, but very much is attached to the graphical presentation of data, hence the ability of the algorithm to separate out a scatter plot into clusters. Clustering is far less common in the business world than it is in the sciences.

Of the most common ML algorithms, 1/2 are a type of regression. 

Percent Regression, Percent N/A

Machine learning is a branch of data science. And surveys into the algorithms used by data scientists show that regression is the most common method used to find relationships among data scientists. Other methods are used, but it should be remembered that data science is still a growing field, and the most advanced data science activity is not in mainline businesses. They are in the sciences and in high finance. So the most advanced data science methods are applied in specialized areas of the economy.

It’s time for a cork check on the hype around machine learning! Many of these ML methods have been around for a long time. 

How New is the Bottle of Machine Learning

Hardware and software keep improving, so in essence progress is continual. I was just checking a functionality recently in software that was written around 18 years ago. It required a detailed 3.5 pages of code with various if statements to process what is processed today with a single command. But with ML there seems to be a clear intent to explain it as something completely new. Yet, regression has been around for a very long time. Regression is available in SAP DP, which is a supply chain forecasting application, for example, but is not very much used because it is normally too complex to be implemented in a business setting that normally has short timeline expectations for improvement and where more complex methods take more training time, which usually is also in short supply. This is jarring to people that do not work on forecast projects, but it is my observation from working on many of them and having been on forecasting projects since the mid-1990s.

Sometimes the topic of environments that use SAS for forecasting come up. SAS is a complex high investment type of forecasting product (primarily that is, they have a more standard forecasting product, but it’s not their bread and butter). However, it is important to recognize that SAS is infrequently outside of just a few industries. For example, SAS is very popular in finance and insurance, where the budgets are far bigger than is normally the cases, and there is much more time to concentrate on producing complex forecasts for a smaller number of forecasted items.

Conclusion

Machine learning has been around for a while, although it has just recently been picked up as a marketing term for some vendors, and this push is making ML become far higher profile than it was before. This does not demonstrate success with ML on projects, it does indicate that marketing departments at companies are publishing it, and expressing it as something important to customers.

Through SAP’s history, they have had precisely zero to do with machine learning, and all that SAP is doing it putting open source algorithms into their software (perhaps that is) and declaring that they are all about machine learning. This will be true for a while, but will most likely change when a new “girlfriend” comes along, and SAP switches to whatever happens to be trendy at that time. When that happens, we can expect, as with predictive analytics, for the same claims to be recycled once again and applied in this case to XYZ rather than to machine learning or to predictive analytics. With no one fact checking SAP, SAP is free to repeat this strategy out into the future.

SAP has no real interest in making any of these things work on customers. Their interest is in co-option in order to create something enticing to make sales and to pump up the stock price by tricking Wall Street. If SAP had a real interest in making things work, they would go back and reinvest in making the things that they made previous claims that they failed to meet. But they don’t do that. SAP is at this point in their lifecycle primarily a marketing organization that simply jumps from hot topic to hot topic. That is the luxury a company that has such thorough control over the media output that is published about them, and the consulting advice that their partner consulting firms communicate to customers.

But there is no reason to listen to SAP on machine learning, and there is no reason to wait for them to add machine learning items from the public domain, and then claim they are something SAP came up with, to “Leonardo.” The algorithms are available to use without SAP, and without waiting for SAP to move from pretending to have something to do with machine learning to figuring out machine learning.

References

https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

https://en.wikipedia.org/wiki/Statistical_classification

https://en.wikipedia.org/wiki/Logistic_regression

https://en.wikipedia.org/wiki/Cluster_analysis