The Reality of Machine Learning Versus the Hype of ML

Executive Summary

  • Appreciating how ML is used.
  • Why so many vendors are currently selling the illusion of automated machine learning.

Introduction

A recent article about SAP and machine learning drew comments from readers with insightful views into the reality of ML projects. This article elaborates on and curates those comments, because we did not want them to live only in the LinkedIn comment section.

Our References for This Article

If you want to see our references for this article and other related Brightwork articles, see this link.

Appreciating How ML Is Used

ML is the subject of a great deal of vendor puffery at the moment. However, Ahmed Azmi of the Dubai Technology Entrepreneur Centre brings up the following points about the reality of using ML.

I give my customers 2 questions to ask ML vendor reps.

  1. “How does your company use ML internally?” The correct answer for SW/HW vendors is “we don’t.” End of call.
  2. If they claim to have the ML internal competencies, the ask is a free pilot.

This also ends the ML pitch because 99% of customers couldn’t integrate their application data silos for the past 50 years to build viable ML model training and test sets that produce usable predictions with commercial grade confidence scores.

Ahmed raises a point that is overlooked in the vendor hype around ML: does the data even exist within companies to perform ML? ML requires multiple data streams, and it is an inherently multivariable approach to pattern recognition.

His comment about bringing silos together to obtain multivariate data is right on target.
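To make the silo problem concrete, below is a minimal sketch (in Python with pandas) of what assembling a multivariate training table actually involves. The data sources, table names, and columns are invented for illustration; they are not from any real SAP or customer system.

```python
# Minimal sketch: joining two hypothetical, siloed extracts into one
# multivariate training table. All names and numbers are invented.
import pandas as pd

# Silo 1: weekly sales history (e.g., an ERP extract)
sales = pd.DataFrame({
    "week": ["2023-01", "2023-02", "2023-03", "2023-04"],
    "product": ["A", "A", "A", "A"],
    "units_sold": [120, 95, 210, 180],
})

# Silo 2: promotion records kept in a separate marketing tool
promotions = pd.DataFrame({
    "week": ["2023-03", "2023-04"],
    "product": ["A", "A"],
    "promo_discount_pct": [15, 10],
})

# The integration step the vendor pitch glosses over: both silos must share
# a clean join key and grain. Weeks with no promotion become zero discount.
training = (
    sales.merge(promotions, on=["week", "product"], how="left")
         .fillna({"promo_discount_pct": 0})
)
print(training)
```

Even this toy example assumes both silos agree on a week and product key, which is precisely the data quality and integration work Ahmed says most companies have never done.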

The Story of Causal Factor Inclusion in MCA Solutions

There was a vendor called MCA Solutions. They created an application for service parts planning and are now owned by PTC.

The story, as explained to me by several of their consultants, was that customers would rate the application based on how many causal variables it had room for in causal forecasting. So development added space for up to eight, as I recall. This allowed the application to be sold into accounts because it met the “RFP specification.”

However, once the implementation began, it was often a challenge to find that the company had maintained even a single causal variable. If they had maintained any, it was most often only one.

Is machine learning currently, or about to become, a broadly applied set of techniques? The claim seems odd, about as strange as claiming that a large percentage of the population is taking an interest in quantum physics.

The Elite in ML Versus Everyone Else

A common strategy is to make it appear as if ML is available to everyone. This is a little like saying that quantum physics is for everyone.

Ahmed addresses this in the following quote.

SAS works with Visa, Walmart, and F500 because ML is about expensive skills, tedious data quality, prep, and integration NOT software. The algorithms, as you said, have existed for years. The reason AWS and Google are now commercializing ML is (A) They’ve used it internally for years to drive retail, search, and Ad core businesses. (B) they hired most of the PhDs and threw them at the problem for the past 10 years so customers don’t have to. Everyone else’s ML comes down to a million $ professional services contract + a million $ server + a million $ ETL licenses and user training.

Ask for a pilot and watch the matrix explode!

This comment on AWS and Google suggests that ML may become sufficiently specialized that it is primarily done by technology firms that have figured out how to use it themselves. SAP never figured out how to implement ML internally. ML is instead something SAP has co-opted: something it wants to sell, not something it has demonstrated competency in.

Ahmed’s comment highlights this exact point.

How hungry are companies to put in the work to apply authentic machine learning? Conventional forecasting and ML-based forecasting carry very different levels of commitment and effort.

The True Appetite for Self Implemented ML?

Brightwork has been writing for years that most companies greatly underinvest in forecasting. However, when we talk about underinvestment, we mean in approaches far simpler than ML.

At a client a few engagements ago, I was questioned about why I was testing things that “the company already knew were true.” When I asked for a yes or no answer as to whether the item had actually been tested, I received a technobabble response from the on-site SAP DP resource. When I completed the test, it turned out that what the company thought it “knew” was incorrect. In the area of forecasting, companies hold all manner of untested beliefs. The appetite for testing, even testing at a far lower level than the complicated and time-consuming testing required for machine learning (including data acquisition and cleansing), is often not there.
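As a rough illustration of the kind of low-level testing described above, the sketch below compares an untested forecasting belief against a simple alternative on a holdout period. The demand history, the “belief,” and the smoothing parameter are all made up for the example.

```python
# Minimal sketch: testing a forecasting belief on a holdout, with invented data.
import numpy as np

history = np.array([100, 105, 98, 110, 120, 115, 130, 125, 140, 135, 150, 145])
train, holdout = history[:9], history[9:]

# The untested belief: "a 3-period moving average is good enough"
moving_avg_forecast = np.full(len(holdout), train[-3:].mean())

# A simple alternative: exponential smoothing with an arbitrarily chosen alpha
alpha, level = 0.3, float(train[0])
for x in train[1:]:
    level = alpha * x + (1 - alpha) * level
ses_forecast = np.full(len(holdout), level)

def mape(actual, forecast):
    """Mean absolute percentage error of a forecast over the holdout period."""
    return np.mean(np.abs((actual - forecast) / actual)) * 100

print("Moving average MAPE:       ", round(mape(holdout, moving_avg_forecast), 1))
print("Exponential smoothing MAPE:", round(mape(holdout, ses_forecast), 1))
```

A test this small takes minutes. The point is that even this level of verification is frequently skipped, let alone the far heavier data acquisition and cleansing that ML requires.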

In every case, when it comes to performing testing in forecasting, I bring a scientific approach and a concern for what is true in the data to the client, not the other way around. That indicates a limited appetite for ML once you get into the nitty-gritty and past the presentation of ML as a Skynet surrogate. Are there elite entities like Google and Facebook with unlimited amounts of money that will fund ML studies?

Yes.

But to one of Ahmed’s other points, that cannot be generalized the way SAP is generalizing ML.

Is it possible that this increases the appetite for detailed analytical projects? Sure. But if these projects are based upon unrealistic expectations, which is the approach SAP and several other vendors are taking, the result will be disappointed customers. Secondly, for this to happen, companies will need to address the type of people they have in management. Management in most companies is not particularly high on patience. Most managers are type A individuals, and type A is focused on short cycles of learning and “getting things done.” Any in-depth analysis, which would include machine learning, requires long periods of learning. This is not what is being presented to executive decision-makers. What is being offered is the idea that a highly complex pattern recognition project is a short cycle of learning because the AI/ML is just “baked into” their applications.

What Qualifies as Machine Learning?

Working with ML algorithms, it is striking how much work we have to do for what is supposedly a self-managing system. The following video from SAP illustrates how SAP presents ML to prospects. Considering the amount of work it takes, we have to ask: is the machine doing the learning, or is the analyst/data scientist?

Ahmed addresses this issue with the following comment.

What do we mean when we say ML? My definition is simple. ML is a system that learns from data. Google Search is ML because it gives better results with more search data. A CRM is ML when it gives more accurate lead conversion scores with more sales data fed into the system by sales reps.

So ML is a system that learns and gets better using data in a supervised or unsupervised setup. For this to happen, you need skill sets very few companies can afford. The infrastructure is also very sophisticated because learning happens at certain thresholds when very large labeled data sets become available for model training and testing. In short, it’s expensive and hard work.

On top of all this, if the data you need is fragmented into silos and you have no access to it because it’s non-standard formats owned exclusively by proprietary application stacks, good luck with that.
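Taking Ahmed’s definition literally, a system only counts as ML if its predictions measurably improve as more data is fed in. Below is a minimal sketch of that check, using scikit-learn’s bundled digits dataset purely for illustration; the training set sizes are arbitrary.

```python
# Minimal sketch: does the model actually get better with more training data?
# Uses scikit-learn's bundled digits dataset purely for illustration.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for n in (100, 400, len(X_train)):
    model = LogisticRegression(max_iter=2000)
    model.fit(X_train[:n], y_train[:n])
    accuracy = model.score(X_test, y_test)
    print(f"trained on {n:4d} examples: holdout accuracy {accuracy:.3f}")
```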

Ahmed is getting into the heart of an important related topic. As noted, it is difficult to see how a number of these ML algorithms qualify as ML. If all of these things were ML, why do analysts have to keep running different algorithms sequentially and figuring out which ones to run when the previous one does not work? The people who started using the term ML knew what it meant. But many of the people now repeating the term seem to think it is some high-end AI.

To Ahmed’s point, the method applied has to adjust to the information. SAP and other vendors pitch ML as a robot that does the work on its own, not because that is accurate given current ML approaches, but because companies believe they can save money on people if “the system will automatically do it.”
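To illustrate where the “learning” effort actually sits, here is a minimal sketch of the sequential, analyst-driven algorithm selection described above. It uses scikit-learn’s bundled breast cancer dataset; the candidate models are arbitrary choices, not anyone’s recommended set.

```python
# Minimal sketch: the analyst, not the machine, decides which algorithms to
# try, compares them, and chooses when to stop. Data is a bundled toy set.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=5000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
}

# Run each candidate with 5-fold cross-validation and compare the results.
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

None of this is automatic: choosing the candidates, judging the scores, and deciding what to try next is human work, which is exactly what the vendor demos obscure.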

Selling the Illusion of Automated Machine Learning

This video was removed from YouTube by SAP.

SAP’s ML explanations are pure fantasy, particularly given SAP’s history and likely future in the area. But they are not designed to be accurate; they are designed to sell software. Does SAP care if these things come true? One would think yes, but actually, no. Either way, it is virtually guaranteed that all of the claims that SAP is making about ML will be pushed forward and ascribed to a new sexy term when ML becomes “played out.” In this way, SAP can continually be seen as leading-edge without ever actually having to fulfill any leading-edge expectations.

This is what SAP is doing, overselling ML as a silver bullet.

It is precisely how analytics was oversold.

This gets to a related topic: where are all of these great analytics we keep hearing are transforming the planet? We don’t seem to see them. That is related to SAP, but it generalizes beyond SAP. Tableau and others have also oversold analytics, and it has been a bubble for some time now.

Conclusion

In these comments, Ahmed gets to what is real about ML.