How to Understand Why In Memory Computing is a Myth

Executive Summary

  • Non-in-memory computing makes no sense, and therefore the term “in-memory computing” makes no sense.
  • We review how AWS describes HANA’s in-memory nature. Placing 100% of a database into memory is not a good thing.
  • We cover the long history of database memory optimization.

Introduction to In Memory Computing

SAP has been one of the major proponents of something called “in memory computing.” Hasso Plattner has written four books on the topic. In this article, you will learn how memory is actually used by databases.

Hasso Plattner has been pushing the importance of in-memory computing for a number of years. Hasso Plattner’s books aren’t books in the traditional sense. They are sales material for SAP. The books we have read by Hasso Plattner uniformly contain exaggerations as to the benefits one can expect from “in memory computing.”

However, there have been some inaccuracies concerning the specific topic of memory management with HANA.

What is Non In Memory Computing?

All computing occurs in memory; there is no form of computing performed without memory, because the performance would be unacceptable. Computing has been using more and more memory, as anyone who purchases a computer can see for themselves. While at one time a personal computer might ship with 4 GB of memory (or RAM), 16 GB is now quite common on new computers.

The Problem with the Term In Memory Computing

SAP took a shortcut when they used the phrase “in memory” computing. The computer I am typing on has loaded the program into memory. So the term “in-memory computing” is a meaningless term.

Instead, what makes HANA different is that it requires more of the database to be loaded into memory. And HANA is the only database I cover that works that way. With this in mind, the term should have been

“more database in memory computing.”

There is a debate as to how many tables are loaded into memory. It appears that the large tables and the column-oriented tables are not always loaded, which is the opposite of what SAP has said about HANA. The reason for this debate is that SAP has provided contradictory information on this topic.

That term is accurate. SAP’s term may roll off the tongue better, but it has the unfortunate consequence of being inaccurate, and it cannot be argued that it is correct.

Here is a quote from AWS’s guide on SAP HANA, which tends to be more accurate than anything SAP says about HANA.

“Storage Configuration for SAP HANA: SAP HANA stores and processes all or most of its data in memory, and provides protection against data loss by saving the data in persistent storage locations. To achieve optimal performance, the storage solution used for SAP HANA data and log volumes should meet SAP’s storage KPI.”
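The mechanism the AWS guide describes, serving data from memory while protecting against loss by writing to persistent storage, is a very old pattern. Here is a minimal sketch in Python of a hypothetical toy key-value store with a write-ahead log; this is illustrative only and not HANA’s actual persistence design:

```python
import json, os, tempfile

class TinyKV:
    """In-memory key-value store that survives restarts by appending
    every write to a log file before applying it in memory (a
    write-ahead log, the classic way in-memory engines avoid data loss)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        if os.path.exists(log_path):           # recovery: replay the log
            with open(log_path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        with open(self.log_path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())               # durable before acknowledging
        self.data[key] = value                 # then update in memory

    def get(self, key):
        return self.data.get(key)              # reads are served from memory

path = os.path.join(tempfile.mkdtemp(), "wal.log")
db = TinyKV(path)
db.put("balance", 100)
db.put("balance", 175)
db = TinyKV(path)                              # simulate a crash and restart
print(db.get("balance"))                       # 175: recovered from the log
```

This is why the log volume’s storage performance matters so much in HANA sizing guides: every committed write has to reach persistent storage, even though reads never touch the disk.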

However, interestingly, the following statement by AWS on HANA’s sizing is incorrect.

“Before you begin deployment, please consult the SAP documentation listed in this section to determine memory sizing for your needs. This evaluation will help you choose Amazon EC2 instances during deployment. (Note that the links in this section require SAP support portal credentials.)”

Yet it is likely not feasible for AWS to observe that SAP’s sizing documentation will cause the customer to undersize the database, so that the customer purchases HANA licenses under false pretenses and then has to go back and purchase more HANA licenses after the decision to go with HANA has already been made.

Bullet Based Guns?

Calling HANA “in-memory computing” is the same as saying “bullet-based shooting” when discussing firearms.

Let us ask the question: How would one shoot a firearm without using a bullet?

If someone were to say their gun was better than your gun (which is, in essence, what SAP does regarding its in-memory computing), and the reason they gave was that it uses “with-bullet shooting technology,” you would be justified in asking what they are smoking. A gun is a bullet-based technology.

How to Use a Term to Create Confusion Automatically

This has also led to a great deal of confusion about how computers use memory among those who don’t spend their days focusing on these types of issues. And this is not exclusive to SAP. Oracle now uses the term in-memory computing, as do many IT entities, as can be seen in the following screenshot taken from Oracle’s website.

Is 100% of the Database Placed into Memory a Good Thing?

The question, however, is whether it is a good or necessary thing. And it is difficult to see how it is.

It means that with S/4HANA, even though only a small fraction of the tables are part of any given query or transaction, the entire database of tables is in memory at all times.

Now, let us consider the implications of this for a moment. Just think how many tables SAP’s applications have, and how many are in use at any one time.

Why do tables not involved in the present activity, even tables that are very rarely accessed, need to be in memory at all times?

The Long History of Database Memory Optimization

People should be aware that IBM, Oracle, and Microsoft all have specialists who focus on something called memory optimization.

Microsoft has documents on this topic at this link.

OutSystems, a PaaS development environment that connects exclusively to SQL Server, has its own page on memory optimization for the database, which you can see at this link.

The specialists who work in this area figure out how to program the database to have the right tables in memory to meet the demands of the system, and there has been quite a lot of work in this area for quite a long time. Outside of SAP, there is little dispute that this is the logical way to design the relationship between the database and the hardware’s memory.
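The central mechanism behind this kind of optimization is the buffer pool: the database keeps the most frequently used pages in memory and evicts cold ones. Here is a toy LRU sketch in Python; it is a simplified illustration only, as real buffer managers are far more sophisticated:

```python
from collections import OrderedDict

class BufferPool:
    """A toy LRU buffer pool: keeps the most recently used pages in
    memory and evicts the least recently used page when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page_id -> page data
        self.hits = 0
        self.misses = 0

    def read(self, page_id, load_from_disk):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as recently used
            self.hits += 1
            return self.pages[page_id]
        self.misses += 1
        data = load_from_disk(page_id)       # slow path: fetch from storage
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict least recently used
        return data

pool = BufferPool(capacity=2)
disk = {1: "customers", 2: "orders", 3: "audit_log"}
for page in [1, 2, 1, 1, 2]:    # a "hot" working set of two pages
    pool.read(page, disk.get)
pool.read(3, disk.get)          # a rarely used page evicts a hot one
print(pool.hits, pool.misses)   # prints: 3 3
```

The point of the illustration is that a hot working set stays in memory on its own; nothing requires the rarely accessed pages to be resident at all times.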

Conclusion

In summary, if a person says “in-memory computing,” the response should be “can we be more specific?” Clear thinking requires the use of accurate terms as a logical beginning point.

SAP’s assertion that the entire database must be loaded into memory is unproven. A statement cannot be accepted if it both has no meaning and if what it actually means (the entire database in memory) is unproven.

Brightwork Disclosure

Financial Bias Disclosure

This article and no other article on the Brightwork website is paid for by a software vendor, including Oracle and SAP. Brightwork does offer competitive intelligence work to vendors as part of its business, but no published research or articles are written with any financial consideration. As part of Brightwork’s commitment to publishing independent, unbiased research, the company’s business model is driven by consulting services; no paid media placements are accepted.

HANA & S/4HANA Question Box

  • Have Questions About S/4HANA & HANA?

    It is difficult for most companies to make improvements in S/4HANA and HANA without outside advice. And it is close to impossible to get honest S/4HANA and HANA advice from large consulting companies. We offer remote unbiased multi-dimension S/4HANA and HANA support.

    This article is free, but we do not answer questions for free. Filling out this form is for those who have a budget. If that describes you, just fill out the form below and we’ll be in touch ASAP.

References

I cover how to interpret risk for IT projects in the following book.

The Risk Estimation Book

 

Rethinking Enterprise Software Risk: Controlling the Main Risk Factors on IT Projects

Better Managing Software Risk

Software implementation is risky business, and success is not a certainty. But you can reduce risk with the strategies in this book. Undertaking software selection and implementation without approximating the project’s risk is a poor way to make decisions about either projects or software. But that’s the way many companies do business, even though 50 percent of IT implementations are deemed failures.

Finding What Works and What Doesn’t

In this book, you will review the strategies commonly used by most companies for mitigating software project risk, learn why these plans don’t work, and then acquire practical and realistic strategies that will help you maximize success on your software implementation.

Chapters

Chapter 1: Introduction
Chapter 2: Enterprise Software Risk Management
Chapter 3: The Basics of Enterprise Software Risk Management
Chapter 4: Understanding the Enterprise Software Market
Chapter 5: Software Sell-ability versus Implementability
Chapter 6: Selecting the Right IT Consultant
Chapter 7: How to Use the Reports of Analysts Like Gartner
Chapter 8: How to Interpret Vendor-Provided Information to Reduce Project Risk
Chapter 9: Evaluating Implementation Preparedness
Chapter 10: Using TCO for Decision Making
Chapter 11: The Software Decisions’ Risk Component Model

Risk Estimation and Calculation

See our free project risk estimators, which are available per application. They provide a method of risk analysis that is not available from other sources.


https://www.ibm.com/blogs/research/2017/10/ibm-scientists-demonstrate-memory-computing-1-million-devices-applications-ai/

https://docs.microsoft.com/en-us/azure/sql-database/sql-database-in-memory

https://s3.amazonaws.com/quickstart-reference/sap/hana/latest/doc/SAP+HANA+Quick+Start.pdf

 

Did Hasso Plattner and His PhD Students Invent HANA?

Executive Summary

  • Hasso Plattner and SAP have put forth a false backstory for how HANA was developed.
  • We cover whether removing aggregates from HANA is the breakthrough it is proposed to be, where TREX and P*Time came from, the importance of “zero response time” from a database, and the Seoul National University connection.

Introduction

The story of HANA’s development looks more suspicious the closer one looks at it. Luckily for SAP, few people do. You will learn the most probable story for HANA’s origin, and how that story was changed by SAP to glorify Hasso Plattner and to help SAP make false innovation claims.

SAP’s Official Story on the Origin of HANA

Hasso Plattner has been widely credited with inventing HANA. The following quotation from Quora covers the common understanding of this.

“I think I am late to answer this, but I completely agree with Anuj. Vishal was the marketing guy or the idea guy but Hasso designed HANA. Both of them are geniuses and at SAP, Vishal will be always missed.” (Quora)

And here is the explanation of its genesis from Wikipedia.

“The first major demonstration of the platform was in 2008: teams from SAP SE, the Hasso Plattner Institute and Stanford University demonstrated an application architecture for real-time analytics and aggregation. Former SAP SE executive, Vishal Sikka, mentioned this architecture as “Hasso’s New Architecture”.”

In this article, we will analyze how HANA was invented and who invented it.

The Explanation of Who Invented HANA

Hasso Plattner wrote four books on HANA. In one of them, The In-Memory Revolution: How SAP HANA Enables Business of the Future, he explains the genesis of HANA.

“Its fall 2006: I, Hasso Plattner, am a professor for computer science at the HPI in Potsdam Germany. My chair has the focus on enterprise system architecture, and I have to find a new research area for my PhD candidates. Yes, they have to find the topic for their dissertation themselves, but I wanted to guide them towards something I was really familiar with, a concept for a new Enterprise Resource Planning system. All my professional life I have worked on such systems, and I ask myself, what would they look like if we could start from scratch?”

On a side note, it is interesting that Ph.D. candidates are being directed to work on something beneficial for SAP. This is a strange university, as it seems to be a research outfit for SAP rather than a university.

SAP Exploiting Cheap Ph.D. Student Labor?

That is, how much will the Ph.D. candidates be paid to work on this?

Let’s see the next quote from Hasso on HANA.

“Ever since we have been building such systems, first at IBM, and now at SAP, they were based on the idea what we know exactly what the users want to know. In order to answer their questions in a reasonable time frame, we maintained aggregated data in real time – meaning that whenever we recorded a business transaction we updated all impacted totals. Therefore, the system was ready to give answers to any foreseeable question, thus labeled a real time system. The new idea I come up with is to drop those totals completely, and to just compress the transaction data per day while keeping all additional data intact. It has not much , one piece of paper, but it is a start.

With this I went to my team of PhD candidates and educated them about data structures and data volumes in typical ERP systems. After a lengthy session on the whiteboard, one student asked me what the compression rate might be from the transaction data to the compression data. I did a calculation for a fictitious financial system, and after a while, came up with the answer.

The student wasn’t the least bit impressed, and said, “From an academic point of view, this compression rate is not very impressive.” My new idea, all that I had, was shattered. I took the eraser, wiped out the aggregates in the drawing on the whiteboard and replied, “Okay, no aggregates anymore.”

Why is the objective here to move to such compressed data? Storage is inexpensive.

Hasso Plattner’s Obsession with Compression

I have read many of Hasso’s writings on this topic of compression. What I can’t determine is whether he has a serious mental block on this topic and is fixated on data compression, or whether he genuinely believes data compression is a big deal. If I had to bet, I would bet that he knows it is not something worth focusing on, but that he promotes it because column-oriented databases like HANA happen to compress better, and he can trick senior executives into thinking this is an important advantage. Throughout HANA’s existence, SAP has been careful to bring the HANA message to the people highest in companies who know the least about databases. Hasso cannot debate and win against people who know databases.
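For what it is worth, the reason column stores compress well is mundane. A column holds many repeated, low-cardinality values, so simple schemes like run-length encoding collapse them. A minimal illustration in Python, with hypothetical data:

```python
def run_length_encode(column):
    """Compress a column by replacing runs of a repeated value with
    (value, run_length) pairs -- the kind of simple encoding that makes
    column-oriented storage compress so well."""
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((value, 1))              # start a new run
    return runs

# A column from a fact table: low-cardinality values repeat constantly.
country = ["DE"] * 4 + ["US"] * 3 + ["DE"] * 2
print(run_length_encode(country))
# [('DE', 4), ('US', 3), ('DE', 2)]
```

Nine stored values become three pairs. This is a property of columnar layout generally, not anything unique to HANA, which is exactly the point.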

Furthermore, Hasso has greatly overstated the compression that HANA is capable of (and which was subsequently repeated through SAP’s passive surrogate network) as is covered in the article Why John Appleby was So Wrong About his HANA Predictions.

This overall conversation is strange, and it implies that Hasso made major changes to the design based on the input of a single Ph.D. candidate. And Hasso’s statement that “my new idea, all that I had, was shattered” seems a bit melodramatic.

Is Removing Aggregates a Breakthrough?

“This was the breakthrough I was looking for. No one had ever built a financial system without materialized aggregates, whether updated in real time or through batch updates.”

Is “breakthrough” the correct description of what had occurred? Aggregates are precomputed tables that allow fast retrieval of information. They take up space, but they are quite useful. Without them, the database must recalculate everything it needs, on the fly, every time a request is made.
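The trade-off can be shown in a few lines. A hypothetical ledger sketch in Python: maintaining a materialized total costs one extra write per transaction but makes a balance query a lookup, while dropping the aggregate makes every balance query a re-scan of the line items:

```python
from collections import defaultdict

transactions = [
    ("cost_center_a", 100), ("cost_center_b", 250),
    ("cost_center_a", 50), ("cost_center_a", 75),
]

# With a materialized aggregate: totals are maintained as each
# transaction is recorded, so a balance query is a single lookup.
totals = defaultdict(int)
for center, amount in transactions:
    totals[center] += amount          # one extra write per transaction

print(totals["cost_center_a"])        # O(1) read, prints 225

# Without aggregates (the approach Hasso describes): every balance
# query re-scans the line items and sums them on the fly.
balance = sum(a for c, a in transactions if c == "cost_center_a")
print(balance)                        # O(n) read, prints 225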

Also, what is the benefit of dispensing with aggregates? The database will be smaller, but unless the database is priced by its size, this is hardly an issue. And as it happens, HANA did end up being priced per GB/TB. There is an important reason why no one had built a financial system without aggregates: there is close to no benefit to doing so. Every year storage becomes less expensive.

Is This the Proper Research Question?

“Back on the offensive, I asked, “What if we assume the database always has zero response time, what would an ERP system look like? This was a proper research question, and the academic work could begin.”

First, why is this the proper research question? Who in SAP’s client base was asking for a zero-response-time database for ERP? I ask because I know that SAP customers have been asking for a lot of things: better customer support, better maturity before releasing products, lower costs, and so on. Why weren’t those things the proper research question? Second, who cares? ERP systems don’t require zero response time. ERP systems record transactions, and they already do it quite quickly. ERP systems can run into issues when it comes to processing, for instance, when performing a procedure like MRP.

But MRP can be sped up by adding memory and CPU capacity to the machine, or by doing things like removing invalid product-location combinations that are run through the system. Database performance is a bottleneck for analytics, but not for ERP systems. This is not to say that it can’t be. Furthermore, even if it were true, there is no evidence that HANA is faster than the alternatives, and this is particularly true for HANA with ERP, as is covered in the article What is the Actual Performance of SAP HANA. On the contrary, HANA is not only slower than the alternatives, it is also less stable and the highest-maintenance database among the options.

The Importance of a Zero Response Time Database?

“But how about some experimental work? Shouldn’t we check for a database that could come close to this ideal? Was this, in the end, possible at all? This is the wonderful part of doing research at a university. At SAP, ideas such as zero response time database would not have been widely accepted. At a university you can dream, at least for a while. As long as you can produce meaningful papers, things are basically alright.”

Perhaps, but in industry, a zero-response-time database would have been less accepted because the benefits versus the costs aren’t there, particularly for ERP systems. Ph.D. students will be more willing to work on things that tend to be less practical. In some cases, this can be a good thing. But in this case, Hasso led his students on a wild goose chase, because databases are already quite fast enough to support what companies want to do.

“We asked SAP whether we could have access to the technologies behind their databases; TREX, an in memory database with columnar storage, P*Time, an in memory database with row storage, and MaxDB, SAP’s relational database. My PhD candidates started playing with these systems, and it became clear in a very short time that building a new ERP system was, by far, not as interesting as building a new database. In the end, they were all computer scientists — accounting, sales ,purchasing, or human resource management are more than the scope of a business school student.”

Is This a Believable Story?

So HANA exists because Hasso’s Ph.D. students found it more interesting to create a new database? It is too bad I was not invited to this little soiree, because I could have communicated to Hasso and his Ph.D. students that this was a poor use of time. SAP has many problems with its customers, and database speed is not even in the top ten list of problems.

“The compromise was that we build a database prototype from scratch, and all the application scenarios with which we were going to verify the concept of a super fast database had to match those of real enterprise systems.”

Hasso set his Ph.D. candidates upon building a new ERP system? Is that wise? ERP systems are massive combinations of functionality whose development should probably be undertaken by SAP development. This is not a good subject for a dissertation.

If Ph.D. candidates are developing a new ERP system for Hasso Plattner, this seems quite exploitative (were they getting stock options?), and Ph.D. candidates aren’t the right people to do it.

Where Did TREX and P*Time Come From?

Hasso states that they “asked SAP” if they could have access to TREX and P*TIME. First, it is unlikely that anyone at SAP would deny Hasso this. It is well known that McDermott is a puppet and that Hasso still runs SAP. Imagine Larry Ellison asking to use an Oracle database for a research project and that request being rejected by Mark Hurd. Would that conceivably happen?

But secondly, there is some information missing in the story presented by Hasso regarding the origin of P*TIME.

Now let us look at the timeline of HANA and its component technologies.

Why does Hasso describe P*TIME as an SAP product without bringing up the point that SAP had only recently acquired it, in 2005?

Also, the timing looks peculiar. Hasso tasked his Ph.D. students with developing HANA less than a year after SAP purchased a technology critical to HANA. Was Hasso not aware that he was going to use P*TIME for something like this when SAP acquired it in 2005?

Let us look into from whom P*TIME was acquired.

Sang Kyun Cha, the Seoul National University Professor

P*TIME was purchased from Professor Sang Kyun Cha of Seoul National University. Let us review Sang Kyun Cha’s bio on his webpage.

“SAP HANA – My third generation in-memory database engine (SAP HANA – Wikipedia, Article (Korean))

P*TIME – Founded Transact In Memory, Inc. in Silicon Valley in 2002 (also its wholly-owned subsidiary TIM System in Korea in 2000) to fund the development of the next-generation in-memory DBMS and has led it to the successful acquisition by SAP (the #1 global business software company) in 2005. SAP transformed TIM System to SAP Labs Korea and made an official announcement in March 2008.

Sang Kyun Cha is a professor, an innovator, and an entrepreneur. He worked on three generations of commercialized in-memory database technology since he joined Seoul National University in 1992. In 2000, he founded Transact In Memory, Inc. with his vision of developing an enterprise in-memory database system called P*TIME (Parallel* Transact-In-Memory Engine). The company was quietly acquired by SAP in late 2005.

By early 2006, Prof. Cha’s team completed P*TIME development with an innovative OLTP scalability architecture. With SAP’s in-house column store TREX, P*TIME served as a corner stone of developing SAP HANA, the first distributed enterprise in-memory database system enabling real-time analytics over transactionally integrated row and column stores. Today, SAP and many other companies run ERP, CRM, business warehouse on HANA. By SAP’s request, Prof. Cha led SAP’s Korean HANA development.”

Where is Hasso’s recounting of what Prof. Cha was doing at that time? Hasso proposes that HANA was wholly original, yet he would have approved, and perhaps driven, the purchase of P*TIME.

Conclusion

The story of HANA’s origins as presented by Hasso appears fishy. It has the look of a story that minimizes the column-oriented database work that pre-existed Hasso’s entrance onto the scene, and that, through a very small “addition,” allows Hasso to take credit for the work of others.

Hasso seems to imply that TREX and P*TIME were just sort of “sitting around” until he and his Ph.D. candidates used them to create HANA. The story presented by Hasso in his book The In-Memory Revolution seems to minimize the technologies they relied upon and to propose that the idea came out of nowhere, as a lightning bolt of creativity from Hasso Plattner. This is the way in which people who are worth a lot of money (and Hasso is worth north of $20 billion) can manufacture history to position themselves as the inventors of things. A perfect example of this is Thomas Edison, who according to Nikola Tesla was a terrible scientist. And who initially credited Thomas Edison with his inventions? It was Thomas Edison himself!

How Hasso Plattner Has Channeled Thomas Edison

A person seeking to do this follows an important pattern: they do not credit work done before their own. Thomas Edison was famous for doing this exact thing. Column-oriented databases have been in existence as long as relational databases. Hasso asserts that because he came up with the idea of not using aggregates, which is simply swapping a pre-calculated table read for a recalculation, he created something entirely new. However, upon analysis, I can’t see anything of substance added by Hasso. And the one idea Hasso seems to have come up with, dropping aggregates, is not a good idea. If you approach an area with a well-developed body of work and functional commercial products (many column-oriented databases were being sold at this time) and you add one minor idea, and that idea is not even good, you did not invent anything. It’s not innovation, it’s claim jumping.

What is more apparent is that Hasso pushed the idea of column-oriented databases. But that is promotion. Like Thomas Edison, Hasso is a great promoter. But a major contributor to databases? That, Hasso and his Ph.D. students did not accomplish.

The Cover Story

The story presented by Hasso in his book The In-Memory Revolution minimizes the technologies that he and his students relied upon and proposes that the idea came out of nowhere. After HANA was introduced, more inaccurate stories were told by Hasso about the supposed benefits of HANA, and those benefits, particularly vis-à-vis the competition, have been debunked by Brightwork (see the articles on this site covering which of SAP’s claims about HANA’s superiority have been debunked).

It seems that not only were the proposed benefits of HANA exaggerated and inaccurate, but the origin of HANA was also engineered to make it appear as if HANA was primarily Hasso’s idea. And this has filtered through to become the accepted view, commonly repeated by both IT media entities and SAP consulting companies, who aggressively bow to anything SAP asserts. However, how can something that already existed before HANA was developed be an original idea?

This is a constant problem with SAP: the company poses as having innovated things that it never did.

Interesting Questions on HANA’s Development

From this case study, some interesting questions arise.

  • Why did SAP purchase some of the primary components of HANA less than a year before Hasso “invented” HANA?
  • Why was Professor Cha asked to lead the Korean development of HANA?
  • On Professor Cha’s website, he states that his company was “quietly” purchased by SAP. Why was the purchase “quiet”? Was this because SAP planned to take more credit for HANA being internally developed than was warranted?
  • Why is it that we don’t even hear about Professor Cha in the story told by Hasso about HANA? Was Hasso unfamiliar with Professor Cha’s work, that is, unfamiliar with the work of a professor whose company SAP purchased?
  • Did any of the IT media look into this story, or did they simply accept the storyline that HANA and the important component technologies were the inventions of Hasso Plattner?

References

http://kdb.snu.ac.kr/chask/

[P*TIME](http://dl.acm.org/citation.cfm?id=1316778)

[The In-Memory Revolution: How SAP HANA Enables Business of the Future:Amazon:Books](https://www.amazon.com/Memory-Revolution-Enables-Business-Future/dp/3319166727)

[Is Vishal Sikka really the father of HANA? – Quora](https://www.quora.com/Is-Vishal-Sikka-really-the-father-of-HANA)

[TREX search engine – Wikipedia](https://en.wikipedia.org/wiki/TREX_search_engine)

http://sites.computer.org/debull/A12mar/hana.pdf