Did Hasso Plattner and His PhD Students Invent HANA?

Table of Contents: Select a Link to be Taken to That Section

Executive Summary

Hasso Plattner and SAP have put forth a false backstory for how HANA was developed.
We cover whether removing aggregates from HANA is the breakthrough it is proposed to be, where TREX and P*Time came from, the importance of “zero response time” from a database, and the University of Korea connection.

Introduction

The story of HANA’s development looks more suspicious the closer one looks at it. Luckily for SAP, few people do. You will learn the most probable story for HANA’s origin and how it was changed by SAP to glorify Hasso Plattner and help SAP make false innovation claims.

Our References for This Article

If you want to see our references for this article and other related Brightwork articles, see this link.

Notice of Lack of Financial Bias: We have no financial ties to SAP or any other entity mentioned in this article.

This is published by a research entity, not some lowbrow entity that is part of the SAP ecosystem.
Second, no one paid for this article to be written, and it is not pretending to inform you while being rigged to sell you software or consulting services. Unlike nearly every other article you will find from Google on this topic, it has had no input from any company's marketing or sales department. As you are reading this article, consider how rare this is. The vast majority of information on the Internet on SAP is provided by SAP, which is filled with false claims and sleazy consulting companies and SAP consultants who will tell any lie for personal benefit. Furthermore, SAP pays off all IT analysts -- who have the same concern for accuracy as SAP. Not one of these entities will disclose their pro-SAP financial bias to their readers.

SAP’s Official Story on the Origin of HANA

Hasso Plattner has been widely credited with inventing HANA. The following quotation from Quora covers the common understanding of this.

“I think I am late to answer this, but I completely agree with Anuj. Vishal was the marketing guy or the idea guy but Hasso designed HANA. Both of them are geniuses and at SAP, Vishal will be always missed.” – Quora

And here is the explanation of its genesis from Wikipedia.

“The first major demonstration of the platform was in 2008: teams from SAP SE, the Hasso Plattner Institute and Stanford University demonstrated an application architecture for real-time analytics and aggregation. Former SAP SE executive, Vishal Sikka, mentioned this architecture as “Hasso’s New Architecture”.”

This article will analyze how HANA was invented and who invented it.

The Explanation Who Invented HANA

Hasso wrote four books on HANA. In one of the books, The In-Memory Revolution: How SAP HANA Enables Business of the Future. In this book, Hasso Plattner explains the genesis of HANA.

“Its fall 2006: I, Hasso Plattner, am a professor for computer science at the HPI in Potsdam Germany. My chair has the focus on enterprise system architecture, and I have to find a new research area for my PhD candidates. Yes, they have to find the topic for their dissertation themselves, but I wanted to guide them towards something I was really familiar with, a concept for a new Enterprise Resource Planning system. All my professional life I have worked on such systems, and I ask myself, what would they look like if we could start from scratch?”

On a side note, it is interesting that Ph.D. candidates are directed to work on something beneficial for SAP. So, this is a strange university as it seems to be a research outfit for SAP rather than a university. How much will the Ph.D. candidates be paid to work on this?

SAP Exploiting Cheap Ph.D. Student Labor?

How much will the Ph.D. candidates be paid to work on this?

Let’s see the next quote from Hasso on HANA.

“Ever since we have been building such systems, first at IBM, and now at SAP, they were based on the idea what we know exactly what the users want to know. In order to answer their questions in a reasonable time frame, we maintained aggregated data in real time – meaning that whenever we recorded a business transaction we updated all impacted totals. Therefore, the system was ready to give answers to any foreseeable question, thus labeled a real time system. The new idea I come up with is to drop those totals completely, and to just compress the transaction data per day while keeping all additional data intact. It has not much , one piece of paper, but it is a start.

With this I went to my team of PhD candidates and educated them about data structures and data volumes in typical ERP systems. After a lengthy session on the whiteboard, one student asked me what the compression rate might be from the transaction data to the compression data. I did a calculation for a fictitious financial system, and after a while, came up with the answer.

The student wasn’t the least bit impressed, and said, “From an academic point of view, this compression rate is not very impressive.” My new idea, all that I had, was shattered. I took the eraser, wiped out the aggregates in the drawing on the whiteboard and replied, “Okay, no aggregates anymore.”

Why is the objective here to move to such compressed data? Storage is inexpensive.

Hasso Plattner’s Obsession with Compression

I have read much of Hasso’s writing on this topic of compression. I can’t determine if he has a severe mental block on this topic and is fixated on data compression or if he believes data compression is a big deal. If I had to bet, I would bet he knows it is not worth focusing on. But he does so because it so happens that column-oriented databases like HANA compress better and that he can trick senior executives into thinking it is a significant advantage. Throughout HANA’s existence, SAP has been careful to bring the HANA message to the highest people in companies who know the least about databases. Hasso cannot debate and win against people who know databases.

Furthermore, Hasso has dramatically overstated the compression that HANA is capable of (and subsequently repeated through SAP’s passive surrogate network), as is covered in the article Why John Appleby was So Wrong About his HANA Predictions.

This overall conversation is strange, and it implies that Hasso is making significant changes to the design based on a single Ph.D. candidate’s input. When told about his response, Hasso says that “my new idea — was shattered,” which seems a bit melodramatic.

Is Removing Aggregates a Breakthrough?

“This was the breakthrough I was looking for. No one had ever built a financial system without materialized aggregates, whether updated in real time or through batch updates.”

Is “breakthrough” the accurate description of what had occurred? Aggregates are precomputed tables that allow fast retrieval of information. They take up space, but they are quite helpful. Without them, the database must calculate everything it uses on the fly whenever the request is made.

Also, what is the benefit of dispensing with aggregates? The database will be smaller, but this is hardly an issue unless you price it per its size. And as it happens, HANA did become priced per GB/TB. There is an important reason why no one has built a financial system without aggregates. There is close to no benefit to doing so. Every year, storage becomes less expensive.

Is This the Proper Research Question?

“Back on the offensive, I asked, “What if we assume the database always has zero response time, what would an ERP system look like? This was a proper research question, and the academic work could begin.”

First, why is this the proper research question? Who in SAP’s client base asked for a zero response time database for ERP? I ask because I know that SAP customers have been asking for many things, better customer support, better maturity before releasing products, lower costs, etc..Why weren’t those things the proper research question? Second, who cares? ERP systems don’t require zero response time. ERP systems record transactions, and they do it quite quickly already. ERP systems can run into issues when processing—for instance, performing a procedure like MRP.

But MRP can be sped by adding memory and CPU capacity to the machine or by removing the invalid product location combinations run through the system. Database performance is a bottleneck for analytics, but not ERP systems. This is not to say that it can’t be. Furthermore, even if it were true, there is no evidence that HANA is faster than alternatives. This is particularly true for HANA with ERP, as is covered in the article What is the Actual Performance of SAP HANA. On the contrary, HANA is slower than the alternatives, but it is less stable and has the highest maintenance database of the other options.

What is the Importance of a Zero Response Time database?

“But how about some experimental work? Shouldn’t we check for a database that could come close to this ideal? Was this, in the end, possible at all? This is the wonderful part of doing research at a university. At SAP, ideas such as zero response time database would not have been widely accepted. At a university you can dream, at least for a while. As long as you can produce meaningful papers, things are basically alright.”

Perhaps, but a zero response database would have been less accepted in the industry because the benefits versus the costs aren’t there, particularly for ERP systems. Ph.D. students will be more willing to work on things that tend to be less practical. In some cases, this can be a good thing. But in this case, Hasso leads his students on a wild goose chase because databases are already quite sufficiently fast to support what companies want to do.

“We asked SAP whether we could have access to the technologies behind their databases; TREX, an in memory database with columnar storage, P*Time, an in memory database with row storage, and MaxDB, SAP’s relational database. My PhD candidates started playing with these systems, and it became clear in a very short time that building a new ERP system was, by far, not as interesting as building a new database. In the end, they were all computer scientists — accounting, sales ,purchasing, or human resource management are more than the scope of a business school student.”

Is This a Believable Story?

So, does HANA exist because Hasso’s Ph.D. students found it more attractive to create a new database? It is too bad I was not invited to this little soiree because I could have communicated to Hasso and Hasso’s Ph. D.s that this is a poor use of time. SAP has many problems with its customers. And database speed is not even in the top ten list of problems.

“The compromise was that we build a database prototype from scratch, and all the application scenarios with which we were going to verify the concept of a super fast database had to match those of real enterprise systems.”

Has he set upon his Ph.D. candidates to build a new ERP system? Is that wise? ERP systems are massive combinations of functionality that should probably be undertaken by SAP development. This is not a good subject for a dissertation.

If Ph.D. candidates are developing a new ERP system for Hasso Plattner, this seems quite exploitive (where they get stock options), and some Ph.D. candidates aren’t the right people to do it.

Where Did TREX and P*Time Come From?

Hasso states they “asked SAP” to access TREX and P*TIME. First, it is unlikely that anyone at SAP would deny Hasso this. It is well-known that McDermott is a puppet, and Hasso still runs SAP. Imagine Larry Ellison asking to use an Oracle database for a research project and the request being rejected by Mark Hurd. Would that conceivably happen?

Secondly, some information is missing in Hasso’s story regarding the origin of P*TIME.

Now, let us look at the timeline of HANA and its component technologies.

Why does Hasso describe P*TIME as an SAP product without bringing up the point that it was recently acquired by SAP in 2005?

Also, this timing looks peculiar. Hasso tasks his Ph.D. students with developing HANA less than a year after purchasing a critical technology to HANA. Hasso was unaware he would use P*TIME for something like this when he acquired it in 2005.

Let us look into where P*TIME was acquired.

Sang Kyun Cha, the University of Korea Professor

P*TIME was purchased from Professor Sang Kyun Cha of the University of Korea. Let us review Sang Kyun Cha’s bio on his webpage.

“SAP HANA – My third generation in-memory database engine (SAP HANA – Wikipedia, Article (Korean))

P*TIME – Founded Transact In Memory, Inc. in Silicon Valley in 2002 (also its wholly-owned subsidiary TIM System in Korea in 2000) to fund the development of the next-generation in-memory DBMS and has led it to the successful acquisition by SAP (the #1 global business software company) in 2005. SAP transformed TIM System to SAP Labs Korea and made an official announcement in March 2008.

Sang Kyun Cha is a professor, an innovator, and an entrepreneur. He worked on three generations of commercialized in-memory database technology since he joined Seoul National University in 1992. In 2000, he founded Transact In Memory, Inc. with his vision of developing an enterprise in-memory database system called P*TIME (Parallel* Transact-In-Memory Engine). The company was quietly acquired by SAP in late 2005.

By SAP’s request, Prof. Cha led SAP’s Korean HANA development.

By early 2006, Prof. Cha’s team completed P*TIME development with an innovative OLTP scalability architecture. With SAP’s in-house column store TREX, P*TIME served as a corner stone of developing SAP HANA, the first distributed enterprise in-memory database system enabling real-time analytics over transactionally integrated row and column stores. Today, SAP and many other companies run ERP, CRM, business warehouse on HANA. By SAP’s request, Prof. Cha led SAP’s Korean HANA development.”

Where is Hasso’s recounting of what Prof Cha was doing then? Hasso proposes that HANA was wholly original, but he would have approved and perhaps driven the purchase of P*TIME.

Go to top

Conclusion

As presented by Hasso, the story of HANA’s origins appears fishy. It looks like a story that minimizes the pre-existing inputs before Hasso even entered the scene with column-oriented databases. A tiny “addition” allows Hasso to take credit for others’ work.

Hasso seems to imply that TREX and P*TIME were just “sitting around” until he and his Ph.D. candidates used them to create HANA. The story presented by Hasso in his book The In-Memory Revolution seems to minimize the technologies they relied upon and to propose the idea in a way that came out of nowhere and was a lightning bolt of creativity developed by Hasso Plattner. And this is how people who are worth a lot of money (and Hasso is worth north of $20 billion) can manufacture history to position themselves as the inventors of things. A perfect example of this is Thomas Edison. Thomas Edison was a terrible scientist, according to Nikola Tesla. And who initially credited Thomas Edison with this invention? It was Thomas Edison himself!

How Hasso Plattner Has Channeled Thomas Edison

A person seeking to do this follows a vital pattern. They do not attribute work before their own, and Thomas Edison was famous for doing this. Column-oriented databases have been in existence as long as relational databases. Hasso asserts that because he came up with the idea of not using aggregates, simply a switch of a pre-calculated table read for recalculation, he created something entirely new. However, upon analysis, I can’t see anything of substance added by Hasso. And the one idea Hasso seems to have come up with — dropping aggregates is not a good idea. If you approach an area with a well-developed body of work and functional commercial products (many column-oriented databases were being sold then) and add one minor idea that is not even a good idea, you did not invent anything. It’s not innovation. It’s claim jumping.

What is more apparent is that Hasso pushed the idea of column-oriented databases — but that is the promotion. Like Thomas Edison, Hasso is a great promoter. But a significant contributor to databases? Hasso and his Ph. D.s did not accomplish this.

The Cover Story

The story presented by Hasso in his book The In-Memory Revolution minimizes the technologies they relied upon and proposes the idea in a way that came out of nowhere. After HANA was introduced, Hasso told more inaccurate stories about the supposed benefits of HANA. Those benefits — particularly vis a vis Brightwork has debunked the competition. (see the articles on this site for which areas of SAP’s claims about HANA’s superiority have been debunked)

It seems that not only were the proposed benefits of HANA exaggerated and inaccurate, but now the origin of HANA was engineered to make it appear as if HANA was primarily Hasso’s idea. And this has filtered through to become the accepted view. This is commonly proposed by IT media entities and SAP consulting companies who aggressively bow to anything that SAP asserts. However, how can something that existed before HANA was developed be an original idea?

This is a constant problem with SAP, where they seem to have innovated things they never did.

Interesting Questions on HANA’s Development

Some interesting questions arise from this case study.

Why did SAP purchase some of HANA’s primary components less than a year before Hasso “invented” HANA?
Why was Professor Cha asked to lead the Korean development of HANA?
On Professor Cha’s website, he states that his company was “quietly” purchased by SAP. Why was the purchase “quiet?” Was this because SAP planned to take more credit for HANA being internally developed than it was?
Why don’t we even hear about Professor Cha in Hasso’s story about HANA? Was Hasso unfamiliar with Professor Cha’s work with a professor who developed a company that SAP purchased?
Did any IT media look into this story, or did they accept the storyline that HANA and the essential component technologies were Hasso Plattner’s inventions?