How Many IBM and Other AI Projects Will Fail Due to a Lack of Data?

Executive Summary

  • Vendors and consulting firms have been aggressively selling AI in forecasting software and AI projects.
  • Customers are finding something curious about these ongoing projects.


We are now some way into the AI/ML bubble. What are AI projects finding, to their dismay? A lack of data for running AI/ML.

Our References for This Article

If you want to see our references for this article and other related Brightwork articles, see this link.


Quotes from IBM on AI Projects

“Many ambitious artificial intelligence-backed projects never come to fruition due in large part to issues with data collection and cleaning, according to Arvind Krishna, PhD, IBM’s senior vice president of cloud and cognitive software.

During an interview with The Wall Street Journal earlier this month, Dr. Krishna noted that a common reason projects using IBM Watson AI often unravel is that companies are unprepared for the amount of time and money they must spend just collecting and preparing data. Those unglamorous yet crucial tasks, he said, make up approximately 80 percent of an entire project.

This quote is problematic on multiple dimensions.

Breaking the Watson Quote from IBM’s Overall AI Projects

Watson has been a failed product for IBM. It is AI directed at health care, which is still essentially non-functional after billions of dollars and over a decade of investment. However, this article is not about Watson (we have quotes about IBM Watson’s problems in the references). This quotation is about AI writ large. But, curiously, IBM ran into its own data problems with Watson, as the following quote describes.

“The employees said there was never clear agreement, for example, on how to merge data gathered by the three companies into a unified format that could be used by Watson. That made it more difficult to deliver insights to help hospitals target medical services to specific patients, cut costs, and improve the quality of care.

With this acquisition, IBM will be one of the world’s leading health data, analytics and insights companies, and the only one that can deliver the unique cognitive capabilities of the Watson platform,” Deborah DiSanzo, general manager for IBM Watson Health, said in a statement following the Truven acquisition.

But the deals presented the difficult task of harmonizing all that data – housed in different formats, and focused on different aspects of patient care – into a model that could be digested by Watson, a challenge that is not unique to IBM.” – STAT

Perhaps IBM is not the company to rely upon for “spiffing” up your data for your AI project. As is now quite clear, they were not able to figure out how to do it for their internal project, for which they had more resources than any one individual IBM project will likely ever match. IBM Watson is a specific healthcare-focused AI solution. However, IBM also appears to apply the Watson name to AI unrelated to that particular product, which is, of course, confusing.

Let us review this portion of the quote from Dr. Krishna.


“often unravel is that companies are unprepared for the amount of time and money they must spend just collecting and preparing data.”

When IBM sold the project, did they explain the level of effort this would take? This quote makes it sound like someone else, with whom IBM does not communicate, is selling AI projects that IBM consulting then has to work. Is Dr. Krishna aware that his own IBM sales team is communicating with these same customers before the IBM AI project begins?

Dr. Krishna goes on…

“You run out of patience along the way, because you spend your first year just collecting and cleansing the data,” he said. “And you say, ‘Hey, wait a moment, where’s the AI? I’m not getting the benefit.’ And you kind of bail on it.”

Questions Related to this IBM Quotation

How accurate are the IBM machine learning quotations?

Item Number | Question Area | Question
1 | Setting Customer Expectation | Was the data effort explained by IBM to customers? Has IBM ever oversold the benefits of AI and undersold the work effort required to get the data into a state where it can be used by AI algorithms?
2 | How Long Until Data Begins to Be Usable? | Does the data availability appear after the first year, or is this just the starting point?
3 | What is the Efficacy of the ML Algorithms? | What about IBM AI projects that are sold on a promise of AI providing great improvements in forecasting accuracy, where the improvements do not appear after the algorithms are run, and it turns out the entire premise of the project was flawed?
4 | Forecasting AI Project Benefits | If the data is not close to being ready to run AI/ML algorithms, on what basis is IBM forecasting AI benefits to specific customers?

The question of underselling the data effort and overselling the benefits of AI is all-important because IBM has routinely oversold its Watson solution, as the following quotations attest.

“But it also earned ill will and skepticism by boasting of Watson’s abilities. “They came in with marketing first, product second, and got everybody excited,”” –  Robert Wachter, chair of the department of medicine at the University of California, San Francisco


“Robert Burns, a professor of health care management at the University of Pennsylvania’s Wharton School, said the complexity of integrating mis-matched data sets has vexed hospitals and other health care entities for decades. It is folly, he said, for IBM, or any company outside the industry, to suggest the problem can quickly be solved to cure terminal diseases or dramatically improve health care delivery.” – STAT

And, of course, this is in no way limited to IBM. It is difficult to find a consulting company in IT that is not making outrageous claims around AI. Let us review several.

Getting Your AI From Wipro

Wipro, a firm not known for forecasting, is now your one-stop-shop for AI. 

Getting Your AI From Infosys

Infosys is another AI expert. So many AI experts to choose from among Cap Gemini’s video. That man later married that robot. 

Getting Your AI From Capgemini

This video from Capgemini is filled with inaccuracies, but if it does not “jack you up on AI,” it is unclear if anything will.

As with Wipro and Infosys, Capgemini is a non-entity in the forecasting space, but that does not stop them from producing a killer video.

IBM’s AI Projects Tend to Fizzle Out?

Still, Dr. Krishna maintained that the fairly common occurrence of halted AI projects is “the nature of any early technology.” Even as so many fizzle out, IBM still has about 20,000 more ongoing AI projects, a number that he deemed indicative of overall success.”

There is a severe problem with Dr. Krishna’s statement here: AI is not new. Is Dr. Krishna unaware of this fact?

AI has failed to produce results in at least two separate historical AI bubbles (one in the 1960s and early 1970s, and one in the 1980s), each of them followed by an “AI winter.” Many of the people working in data science/AI are not even aware of these previous bubbles. And how far back AI goes surprises most people we discuss this topic with.

“Many of them predicted that a machine as intelligent as a human being would exist in no more than a generation and they were given millions of dollars to make this vision come true.

Eventually, it became obvious that they had grossly underestimated the difficulty of the project. In 1973, in response to the criticism from James Lighthill and ongoing pressure from congress, the U.S. and British Governments stopped funding undirected research into artificial intelligence, and the difficult years that followed would later be known as an “AI winter“.” – Wikipedia

For those of you who have not tried SodaStream, you really should. It not only can add fizzle to new drinks, but it can give that “sparkling quality” to drinks that have gone flat. The problem? As of yet, there is no SodaStream for AI projects. 

Let us review a portion of the quote from Dr. Krishna.

“Even as so many fizzle out, IBM still has about 20,000 more ongoing AI projects, a number that he deemed indicative of overall success.”

And when questioned about IBM’s success in AI, he responded defensively with the following quotation.

““I think 20,000 is not slow,” he said. “I think 20,000 projects is, what I would call, successful.””

This brings up the following questions:

  • How does IBM have 20,000 ongoing AI projects?
  • Successful for whom: the customer or IBM?

IBM certainly sees this as a success, but IBM only cares about billing hours on projects. By this definition, even AI projects where hours are billed but that do not work are considered successful by the consulting company. However, IBM clients do not measure success by IBM’s metric. That is, customers that invest in AI measure the benefit by how much AI improves the accuracy of their various predictions.

The idea that IBM would have so many AI projects ongoing, and that there would be so little published about the benefits of AI received by companies is odd.

Another question is, why is IBM placing data science resources on-site and billing for them if the data is mostly unavailable and if it may take a year or more to develop the data? Would IBM sell an automobile service plan to a customer that has yet to purchase an automobile? It seems like a simple matter to ask what data the client has that can be used. Without this, IBM has no idea if their client can benefit from an AI project.

The AI Project Preparedness Matrix

This topic of data availability brings up the question of how common it is for companies that engage in AI projects to have the necessary items to pull such projects off successfully.

To evaluate this, below are the individual estimates of the author and three other experienced resources in forecasting and ML/AI.

Issues with Dr. Krishna/IBM's Quotes

Accuracy measurement of Dr. Krishna's statements.

Item Number | Issue | Description
1 | Misrepresentation of IBM Watson | IBM Watson is not a successful product. In fact, Watson has failed quite heavily and left a litany of dissatisfied customers that IBM does not acknowledge. IBM failed at its own internal data integration project, leading in part to Watson's downfall.
2 | Confusion or Commingling of Watson with IBM AI | Watson is not the same as IBM AI, or an IBM AI project.
3 | AI's Development | AI is not new. This leads to the natural question of why Dr. Krishna would state that it is new. Do Dr. Krishna and IBM sales mislead prospects by repeating that AI is new in order to minimize and deflect from AI's true history?
4 | Responsibility for Setting Sales Expectations | Dr. Krishna describes a scenario where IBM has no responsibility for explaining the effort in investing in data development to IBM's AI customers. It is difficult to believe that IBM properly apprises customers of these difficulties. Therefore, it fits with Dr. Krishna's incentives to state that "customers don't seem aware," when IBM puts informing them secondary to selling AI projects.
5 | Measuring AI Success | Dr. Krishna seems to measure AI success by how many IBM AI projects are ongoing, rather than how successful those projects are at delivering benefits.

The Otherworldly Claims of AI

Consulting firms are making significant and unsubstantiated claims around AI. Consulting firms with no background in either AI or forecasting are making world-changing claims about their AI capabilities, and the claims appear to be uniform.

  • AI is being proposed to defeat other methods in an almost universal manner, all without evidence. This is true.
  • AI is becoming homogenized to improve just about everything. AI’s benefits are claimed to be so universal, that in short order, it will be challenging to declare what is not an improved outcome of applying AI.
  • Many companies that eventually do assemble their multivariate data will find that, in a high percentage of cases, the AI/ML is not able to show a benefit versus far more straightforward and less expensive forecasting techniques.

Dr. Krishna states the following.

“In the world of IT in general, about 50% of projects run either late, over budget or get halted. I’m going to guess that AI is not dramatically different.”

Not all IT projects have the same success rate, which is something else that Dr. Krishna should know. AI projects are likely to fare worse than the IT average because they are so firmly based upon false claims. The AI Project Preparedness Matrix above indicates that most of the AI projects that are sold are sold into companies that can’t complete them successfully.

Who Are the AI Poll Contributors?

  1. Shaun Snapp: Shaun is the article author and an experienced forecasting consultant, and the author of four books on forecasting.
  2. Ahmed Azmi: Ahmed has many years of experience in the AI/ML space.
  3. Steve Morlidge: Steve is a long-term forecasting consultant, the author of forecasting journal publications, and the author of several books on forecasting.
  4. Anonymous: The anonymous entry is someone from a software vendor with many years of industry forecasting experience and publications in the forecasting literature.