Last Updated on March 21, 2021 by Shaun Snapp
- Hasso Plattner and SAP have put forth a false backstory for how HANA was developed.
- We cover whether removing aggregates from HANA is the breakthrough it is proposed to be, where TREX and P*Time came from, the importance of “zero response time” from a database, and the University of Korea connection.
The story of HANA’s development looks more suspicious the closer one looks at it. Luckily for SAP, few people do. You will learn the most probable story for HANA’s origin and how it was changed by SAP to glorify Hasso Plattner and help SAP make false innovation claims.
Our References for This Article
If you want to see our references for this article and other related Brightwork articles, see this link.
Lack of Financial Bias Notice: The vast majority of content available on the Internet about SAP is marketing fiddle-faddle published by SAP, SAP partners, or media entities paid by SAP to run their marketing on the media website. Each one of these entities tries to hide its financial bias from readers. The article below is very different.
- First, it is published by a research entity.
- Second, no one paid for this article to be written, and it is not pretending to inform you while being rigged to sell you software or consulting services. Unlike nearly every other article you will find from Google on this topic, it has had no input from any company's marketing or sales department.
SAP’s Official Story on the Origin of HANA
Hasso Plattner has been widely credited with inventing HANA. The following quotation from Quora covers the common understanding of this.
“I think I am late to answer this, but I completely agree with Anuj. Vishal was the marketing guy or the idea guy but Hasso designed HANA. Both of them are geniuses and at SAP, Vishal will be always missed.” – Quora
And here is the explanation of its genesis from Wikipedia.
“The first major demonstration of the platform was in 2008: teams from SAP SE, the Hasso Plattner Institute and Stanford University demonstrated an application architecture for real-time analytics and aggregation. Former SAP SE executive, Vishal Sikka, mentioned this architecture as “Hasso’s New Architecture”.”
In this article, we will analyze how HANA was invented and who invented it.
The Explanation Who Invented HANA
Hasso wrote four books on HANA. In one of the books, The In-Memory Revolution: How SAP HANA Enables Business of the Future. In this book, Hasso Plattner explains the genesis of HANA.
“Its fall 2006: I, Hasso Plattner, am a professor for computer science at the HPI in Potsdam Germany. My chair has the focus on enterprise system architecture, and I have to find a new research area for my PhD candidates. Yes, they have to find the topic for their dissertation themselves, but I wanted to guide them towards something I was really familiar with, a concept for a new Enterprise Resource Planning system. All my professional life I have worked on such systems, and I ask myself, what would they look like if we could start from scratch?”
On a side note, it is interesting that Ph.D. candidates are directed to work on something beneficial for SAP. So this is a strange university as it seems to be a research outfit for SAP rather than a university. That is, how much will the Ph.D. candidates be paid to work on this?
SAP Exploiting Cheap Ph.D. Student Labor?
That is, how much will the Ph.D. candidates be paid to work on this?
Let’s see the next quote from Hasso on HANA.
“Ever since we have been building such systems, first at IBM, and now at SAP, they were based on the idea what we know exactly what the users want to know. In order to answer their questions in a reasonable time frame, we maintained aggregated data in real time – meaning that whenever we recorded a business transaction we updated all impacted totals. Therefore, the system was ready to give answers to any foreseeable question, thus labeled a real time system. The new idea I come up with is to drop those totals completely, and to just compress the transaction data per day while keeping all additional data intact. It has not much , one piece of paper, but it is a start.
With this I went to my team of PhD candidates and educated them about data structures and data volumes in typical ERP systems. After a lengthy session on the whiteboard, one student asked me what the compression rate might be from the transaction data to the compression data. I did a calculation for a fictitious financial system, and after a while, came up with the answer.
The student wasn’t the least bit impressed, and said, “From an academic point of view, this compression rate is not very impressive.” My new idea, all that I had, was shattered. I took the eraser, wiped out the aggregates in the drawing on the whiteboard and replied, “Okay, no aggregates anymore.”
Why is the objective here to move to such compressed data? Storage is inexpensive.
Hasso Plattner’s Obsession with Compression
I have read much of Hasso’s writing on this topic of compression. I can’t determine if he has a serious mental block on this topic and is fixated on data compression or if he believes data compression is a big deal. If I had to bet, I would bet that he knows it is not worth focusing on. But that he does so because it so happens that column-oriented databases like HANA compress better and that he can trick senior executives into thinking it is an important advantage. Throughout HANA’s existence, SAP has been careful to bring the HANA message to the people highest in companies that know the least about databases. Hasso cannot debate and win against people that know databases.
Furthermore, Hasso has dramatically overstated the compression that HANA is capable of (and subsequently repeated through SAP’s passive surrogate network), as is covered in the article Why John Appleby was So Wrong About his HANA Predictions.
This overall conversation is strange, and it implies that Hasso is making major changes to the design based on a single Ph.D. candidate’s input. And when told about his response, Hasso says that “my new idea — was shattered” seems a bit melodramatic.
Is Removing Aggregates a Breakthrough?
“This was the breakthrough I was looking for. No one had ever built a financial system without materialized aggregates, whether updated in real time or through batch updates.”
Is “breakthrough” the accurate description of was had occurred? Aggregates are precomputed tables, which allow fast retrieval of information. They take up space, but they are quite useful. Without them, the database must calculate everything that it uses on the fly every time the request is made.
Also, what is the benefit of dispensing with aggregates? The database will be smaller, but unless you price the database per its size, this is hardly an issue. And as it happens, HANA did end up becoming priced per GB/TB. There is an important reason why no one had built a financial system without aggregates. There is close to no benefit to doing so. Every year storage becomes less expensive.
Is This the Proper Research Question?
“Back on the offensive, I asked, “What if we assume the database always has zero response time, what would an ERP system look like? This was a proper research question, and the academic work could begin.”
First, why is this the proper research question? Who out in SAP’s client base was asking for a zero response time database for ERP? I ask because I know that SAP customers have been asking for many things, better customer support, better maturity before releasing products, lower costs, etc..Why weren’t those things the proper research question? Second, who cares? ERP systems don’t require zero response time. ERP systems record transactions and they do it quite quickly already. ERP systems can run into issues when processing—for instance, performing a procedure like MRP.
But MRP can be sped by adding memory and CPU capacity to the machine or by doing things like removing the invalid product location combinations run through the system. Database performance is a bottleneck for analytics, but not ERP systems. This is not to say that it can’t be. Furthermore, even if it were true, there is no evidence that HANA is faster than alternatives. This is particularly true for HANA with ERP, as is covered in the article What is the Actual Performance of SAP HANA. On the contrary, HANA is slower than the alternatives, but it is less stable and the highest maintenance database of the other options.
The Importance of a Zero Response Time Database?
“But how about some experimental work? Shouldn’t we check for a database that could come close to this ideal? Was this, in the end, possible at all? This is the wonderful part of doing research at a university. At SAP, ideas such as zero response time database would not have been widely accepted. At a university you can dream, at least for a while. As long as you can produce meaningful papers, things are basically alright.”
Perhaps, but in the industry, a zero response database would have been less accepted because the benefits versus the costs aren’t there, particularly for ERP systems. Ph.D. students will be more willing to work on things that tend to be less practical. In some cases, this can be a good thing. But in this case, Hasso leads his students on a wild goose chase because databases are already quite sufficiently fast to support what companies want to do.
“We asked SAP whether we could have access to the technologies behind their databases; TREX, an in memory database with columnar storage, P*Time, an in memory database with row storage, and MaxDB, SAP’s relational database. My PhD candidates started playing with these systems, and it became clear in a very short time that building a new ERP system was, by far, not as interesting as building a new database. In the end, they were all computer scientists — accounting, sales ,purchasing, or human resource management are more than the scope of a business school student.”
Is This a Believable Story?
So HANA exists because Hasso’s Ph.D. students found it more interesting to create a new database? It is too bad I was not invited to this little soiree because I could have communicated to Hasso and Hasso’s Ph.D.s that this is a poor use of time. SAP has many problems with its customers. And database speed is not even in the top ten list of problems.
“The compromise was that we build a database prototype from scratch, and all the application scenarios with which we were going to verify the concept of a super fast database had to match those of real enterprise systems.”
Hasso set upon his Ph.D. candidates to build a new ERP system? Is that wise? ERP systems are massive combinations of functionality that should probably be undertaken by SAP development. This is not a good subject for a dissertation.
If Ph.D. candidates are developing a new ERP system for Hasso Plattner, this seems quite exploitive (were they getting stock options), and some Ph.D. candidates aren’t the right people to do it.
Where Did TREX and P*Time Come From?
Hasso states that they “asked SAP” if they could access TREX and P*TIME. First, that is unlikely that anyone at SAP would deny Hasso this. It is well-known that McDermott is a puppet, and Hasso still runs SAP. Imagine Larry Ellison asking to use an Oracle database for a research project and request being rejected by Mark Hurd. Would that conceivably happen?
But secondly, there is some information missing in Hasso’s story regarding the origin of P*TIME.
Now let us look at the timeline of HANA and its component technologies.
Why does Hasso describe P*TIME as SAP products without bringing up the point that it was recently acquired by SAP in 2005?
Also, this timing looks peculiar. Hasso tasks his Ph.D. students with developing HANA less than a year after purchasing a critical technology to HANA. Hasso was not aware he would use P*TIME for something like this when he acquired it in 2005?
Let us look into where P*TIME was acquired.
Sang Kyun Cha, the University of Korea Professor
P*TIME was purchased from professor Sang Kyun Cha of the University of Korea. Let us review Sang Kyun Cha’s bio on his webpage.
“SAP HANA – My third generation in-memory database engine (SAP HANA – Wikipedia, Article (Korean))
P*TIME – Founded Transact In Memory, Inc. in Silicon Valley in 2002 (also its wholly-owned subsidiary TIM System in Korea in 2000) to fund the development of the next-generation in-memory DBMS and has led it to the successful acquisition by SAP (the #1 global business software company) in 2005. SAP transformed TIM System to SAP Labs Korea and made an official announcement in March 2008.
Sang Kyun Cha is a professor, an innovator, and an entrepreneur. He worked on three generations of commercialized in-memory database technology since he joined Seoul National University in 1992. In 2000, he founded Transact In Memory, Inc. with his vision of developing an enterprise in-memory database system called P*TIME (Parallel* Transact-In-Memory Engine). The company was quietly acquired by SAP in late 2005.
By SAP’s request, Prof. Cha led SAP’s Korean HANA development.
By early 2006, Prof. Cha’s team completed P*TIME development with an innovative OLTP scalability architecture. With SAP’s in-house column store TREX, P*TIME served as a corner stone of developing SAP HANA, the first distributed enterprise in-memory database system enabling real-time analytics over transactionally integrated row and column stores. Today, SAP and many other companies run ERP, CRM, business warehouse on HANA. By SAP’s request, Prof. Cha led SAP’s Korean HANA development.”
Where is Hasso’s recounting of what Prof Cha was doing at that time? Hasso is proposing that HANA was wholly original, but he would have approved and perhaps driven the purchase of P*TIME.
The story of HANA’s origins, as presented by Hasso, appears fishy. It has the look of a story that minimizes the pre-existing inputs before Hasso even entered the scene with column-oriented databases. Through a tiny “addition,” allows Hasso to take credit for others’ work.
Hasso seems to imply that TREX and P*TIME were just sorts of “sitting around” until he and his Ph.D. candidates used them to create HANA. The story presented by Hasso in his book The In-Memory Revolution seems to minimize the technologies that they relied upon and to propose the idea in a way that came out of nowhere and was a lightning bolt of creativity developed by Hasso Plattner. And this is how people who are worth a lot of money (and Hasso is worth north of $20 billion) can manufacture history to position themselves as the inventors of things. A perfect example of this is Thomas Edison. Thomas Edison was a terrible scientist, according to Nikola Tesla. And who initially credited Thomas Edison with this invention? It was Thomas Edison himself!
How Hasso Plattner Has Channeled Thomas Edison
A person seeking to do this follows a vital pattern. They do not attribute work done before their own, and Thomas Edison was famous for doing this exact thing. Column oriented databases have been in existence as long as relational databases. Hasso asserts that because he came up with the idea of not using aggregates — which is simply a switch of a pre-calculated table read for recalculation, he created something entirely new. However, upon analysis, I can’t see anything of substance added by Hasso. And the one idea Hasso seems to have come up with — dropping aggregates is not a good idea. If you approach an area with a well-developed body of work and functional commercial products (many column-oriented databases were being sold at this time) and add one minor idea that is not even a good idea, you did not invent anything. It’s not innovation. It’s claim jumping.
What is more apparent that Hasso pushed the idea of column oriented databases — but that is the promotion. Like Thomas Edison, Hasso is a great promoter. But a major contributor to databases? Hasso and his Ph.D.’s did not accomplish this.
The Cover Story
The story presented by Hasso in his book The In-Memory Revolution minimizes the technologies that they relied upon and proposes the idea in a way that came out of nowhere. After HANA was introduced, more inaccurate stories were told by Hasso about the supposed benefits of HANA. Those benefits — particularly vis a vis the competition have been debunked by Brightwork. (see the articles on this site for which areas of SAP’s claims about HANA’s superiority have been debunked)
It seems that not only were the proposed benefits of HANA exaggerated and inaccurate but now the origin of HANA was engineered to make it appear as if HANA was primarily Hasso’s idea. And this has filtered through to become the accepted view. This is commonly proposed by IT media entities and SAP consulting companies who aggressively bow to anything that SAP asserts. However, how can something that had already been around before HANA was developed be an original idea?
This is a constant problem with SAP, where they pose to have innovated things that they never did.
Interesting Questions on HANA’s Development
From this case study, some interesting questions arise.
- Why did SAP purchase some of HANA’s primary components less than a year before Hasso “invented” HANA?
- Why was Professor Cha asked to lead the Korean development of HANA?
- On Professor Cha’s website, he states that his company was “quietly” purchased by SAP. Why was the purchase “quiet?” Was this because SAP planned to take more credit for HANA being internally developed than it was?
- Why is it that we don’t even hear about Professor Cha in Hasso’s story about HANA? Was Hasso unfamiliar with Professor Cha’s work unfamiliar with a professor who developed a company that SAP purchased?
- Did any IT media look into this story, or did they accept the storyline that HANA and the essential component technologies were Hasso Plattner’s inventions?