Why SAP HANA Is a Fast Database for Analytics

Executive Summary

  • What the HANA database is at the most basic level.
  • Why improvements in storage hardware and other changes in software speed up database queries.
  • How data access and data swapping for analytics work in HANA.
  • Whether SAP’s claims regarding compression make sense and what compression has to do with the data type being stored.

Introduction

SAP has made many claims about HANA’s performance, but the majority of people who discuss HANA do not appear to understand the logic of the performance claims or whether they are true. In this article, you will learn the truth about HANA’s capabilities.

Understanding SAP’s Marketing Around HANA

In a previous article titled Has SAP’s Relentless HANA Push Paid Off?, I brought up the fact that SAP has redirected its marketing efforts to focus to a very significant degree on SAP HANA. However, that leaves some basic questions:

  • What is SAP HANA?
  • What are the actual technology underpinnings of SAP HANA?
  • What is it that makes SAP HANA go so fast?

In this article, you will learn what SAP HANA is. This will be explained in a way that should be accessible to anyone, from executives to users to anyone else interested in understanding the parts of HANA that are real.

What Is the SAP HANA Database at the Most Basic Level?

SAP HANA is the name SAP uses for its offering that combines software and hardware to increase speed. It is principally based upon leveraging the performance improvements and cost reductions in random access memory and solid-state memory, and then changing the data management software that interacts with this hardware.

SAP HANA must be simplified for decision-makers to be able to move beyond the marketing hype and simplistic platitudes in their determinations of how and when to use HANA.

The first thing to understand is that SAP HANA is simply the branding of some technologies. These are not technologies on which SAP has a monopoly. The two principal technologies are the following:

  1. Storing a Database in Memory (in RAM or SSD) Versus on Disk
  2. The Columnar Database

I should also point out that SAP is a relative newcomer to databases. They have had some small database projects like SAPDB/MaxDB, and they made a quite large acquisition of Sybase back in 2010. However, for almost all of SAP’s history, their software has resided on the databases of other software vendors. This is a long way of saying that there is not a lot about databases that SAP knows that other software vendors do not also know. SAP is simply the most important and aggressive marketer of this approach.

SAP HANA Database Overview: Improvements to Storage Hardware and the Corresponding Changes to HANA Database Queries

As is explained quite nicely by Professor Sam Madden at MIT, data queries change when a database is moved from spinning hard disks to either random access memory or solid-state memory.

How Data Access Works

  • When a row-oriented table is stored on a spinning disk, even if a query requires only three fields within a 10-field table, the software must read all ten fields.
  • If the table is 1,000,000 records and 100 megabytes, then the entire table must be read to complete the query.

And in fact, this is not even the worst part, because queries often involve multiple tables.

A query that pulls three fields distributed across three tables must read every one of those tables to completion.

This is a function of how the data is accessed on a spinning disk.

However, this is not the case when data is stored by column in random access memory or solid-state memory. Here, a query that uses only three fields is required to read only those three fields, regardless of the number of fields in the table.
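The difference in how much data a query touches can be illustrated with a minimal Python sketch. It assumes the comparison is modeled purely as a row layout versus a column layout held in ordinary Python lists; it does not model disk or memory hardware, and the field names f0 to f9 and the row count are hypothetical.

```python
# Hypothetical 10-field table; field names f0..f9 are illustrative only.
NUM_FIELDS = 10
NUM_ROWS = 1_000

# Row-oriented layout: each record keeps all ten fields together.
rows = [tuple(r * NUM_FIELDS + f for f in range(NUM_FIELDS)) for r in range(NUM_ROWS)]

# Column-oriented layout: one list per field.
columns = {f"f{f}": [row[f] for row in rows] for f in range(NUM_FIELDS)}


def scan_row_store(rows, wanted):
    """Return the projected records and the number of field values read."""
    values_read = 0
    result = []
    for row in rows:
        values_read += len(row)  # the whole record is fetched, wanted or not
        result.append(tuple(row[i] for i in wanted))
    return result, values_read


def scan_column_store(columns, wanted):
    """Return the projected records and the number of field values read."""
    values_read = 0
    selected = []
    for i in wanted:
        col = columns[f"f{i}"]
        values_read += len(col)  # only the requested columns are fetched
        selected.append(col)
    return list(zip(*selected)), values_read


wanted = [0, 3, 7]  # a query that needs three of the ten fields
row_result, row_cost = scan_row_store(rows, wanted)
col_result, col_cost = scan_column_store(columns, wanted)
print(row_cost, col_cost)
```

Both scans return the same records, but under these assumptions the row layout reads every value in the table while the column layout reads only the three requested columns.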

The Elimination of Swapping

Traditional database systems stored on a disk spend lots of time in an activity known as swapping. Swapping is where data is read into memory from the disk. It is processed, written back to disk, purged, and a new cycle is begun.

In-memory databases remove this swapping because all of the data that is manipulated is loaded into memory. Disks within a HANA implementation are not used for primary processing, but for offline data backup.
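The effect of swapping can be sketched with a small simulation. This is an illustration only, assuming a least-recently-used buffer and a hypothetical workload of three full passes over a 100-page table; it is not a description of how HANA or any particular database manages memory.

```python
from collections import OrderedDict


def count_disk_reads(page_requests, buffer_pages):
    """Count simulated disk reads for an LRU buffer holding `buffer_pages` pages."""
    buffer = OrderedDict()
    disk_reads = 0
    for page in page_requests:
        if page in buffer:
            buffer.move_to_end(page)  # mark as most recently used
            continue
        disk_reads += 1  # page is not in memory, so it is fetched from disk
        buffer[page] = True
        if len(buffer) > buffer_pages:
            buffer.popitem(last=False)  # evict the least recently used page
    return disk_reads


# Hypothetical workload: three full passes over a 100-page table.
requests = list(range(100)) * 3

small_buffer_reads = count_disk_reads(requests, buffer_pages=10)
in_memory_reads = count_disk_reads(requests, buffer_pages=100)
print(small_buffer_reads, in_memory_reads)
```

With a buffer that holds only a tenth of the table, every request in this workload goes back to disk. When the buffer holds the whole table, the only disk reads are the initial load.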

SAP HANA is often justified by performance. It is important to consider that performance does not correlate directly with business benefit. SAP wants companies to make this error as it puts them in the driver’s seat.

SAP is not interested in the sticky questions related to actual business benefits of SAP HANA, because then the story begins to be much less impressive.

Compression

Why Is Compression So Effective in the HANA Database?

Column-based databases like the SAP HANA database have a significant advantage when it comes to data compression. This is because once data is placed into columns, it can be compressed very easily.

This is because there are often so many duplicates in any particular column.

A good example of this is a column that contains the color of a product.

  • If the column has 1000 records…
  • and there are five colors that are possible…
  • then, of course, most of the fields are duplicates…
  • so 200 white, 300 blue, 250 red, etc.…

This means less data redundancy to begin with, in addition to the compression, which comes from having fewer unique values to store.
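One common way to exploit this kind of duplication is dictionary encoding, sketched below in Python. This is a generic illustration, not a description of HANA's internal format. The counts for white, blue and red come from the example above; the counts for green and black, and the one-byte code size, are assumptions made so the example totals 1,000 records.

```python
# Counts for white, blue and red come from the example in the text;
# green and black are assumed values chosen so the total is 1,000.
color_counts = {"white": 200, "blue": 300, "red": 250, "green": 150, "black": 100}
column = [color for color, n in color_counts.items() for _ in range(n)]


def dictionary_encode(values):
    """Replace each value with a small integer code into a list of unique values."""
    dictionary = []
    index = {}
    codes = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[c] for c in codes]


dictionary, codes = dictionary_encode(column)

raw_bytes = sum(len(v.encode("utf-8")) for v in column)
# Assumption: one byte per code, which is enough for up to 256 distinct values.
encoded_bytes = sum(len(v.encode("utf-8")) for v in dictionary) + len(codes)
print(raw_bytes, encoded_bytes)
```

Under these assumptions the encoded column takes roughly a quarter of the raw size, and decoding restores the original values exactly.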

The Importance of the Commonality of the Data Type

This compression is possible because all of the data in a column is the same data type. And in many cases, the compression is very significant. It is common to be able to compress columnar databases in the 80% and up range. This process of moving from a standard “row oriented” relational database to a columnar one (where every table is one column of data) is called vertical partitioning.

This is because the normal relational table is broken into columns or partitioned.
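Breaking a row-oriented table into columns can be sketched as follows. The product table, its field names and its values are hypothetical and used only for illustration.

```python
# Hypothetical product table; the field names and values are illustrative only.
rows = [
    {"product_id": 1, "color": "white", "price": 9.99},
    {"product_id": 2, "color": "blue", "price": 14.50},
    {"product_id": 3, "color": "white", "price": 9.99},
]


def rows_to_columns(rows):
    """Split a list of records into one list of values per field."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def columns_to_rows(columns):
    """Reassemble the records from the per-field lists."""
    fields = list(columns)
    return [dict(zip(fields, values)) for values in zip(*columns.values())]


columns = rows_to_columns(rows)

# Each column holds values of a single type, which is what makes it easy to compress.
column_types = {field: {type(v).__name__ for v in values} for field, values in columns.items()}
print(columns["color"])
print(column_types)
```

No information is lost in the split: the original records can be rebuilt from the columns, and each column now contains a single data type with its duplicates sitting next to each other.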

The Problems with Finding Use Cases for SAP HANA Technology

It is also well known that it is difficult to come up with use cases for HANA, and that is a problem for closing a HANA sale. So once we get past the presentations about HANA’s capabilities, there are real issues with customer interest and adoption.

This should be acknowledged when discussing the continuation of the HANA marketing strategy.

Who Is Measuring the Benefits of SAP HANA Technology?

HANA is slated to be the infrastructure of all SAP applications eventually. Let me first address one of the most common implementations of HANA.

This is porting SAP’s BI/BW onto HANA, which is considered one of the most straightforward implementations of HANA that can be performed. However, at companies where I have seen this accomplished, while the reports do run faster, the major bottleneck, which is the backlog of reports that have yet to be created, does not change.

Noting this is the difference between simply observing a performance improvement versus understanding the overall benefits of implementing a technology.

SAP Applications on Top of SAP HANA

In other cases where HANA is proposed, such as Simple Finance (where FI/CO is ported to HANA with a new user interface called Fiori), being able to quickly process finance transactions has not been a constraint in ERP systems for many years. In this case, there is an extra complexity, because the front end of Finance is different from that of ECC. It does operate more efficiently than the SAPGUI. But that is not HANA. That is, it is not the infrastructure change-out that is the major differentiating factor.

This is another common problem with HANA: the descriptions of what it improves often morph into discussions of other new SAP products that are not in fact HANA.

Therefore a discussion that starts off with SAP HANA technology ends up as a discussion of some other technology.

SAP HANA Warning!

It is about as easy to get incorrect information on SAP HANA as it is to get it on Big Data.

This issue with the excessive hyperbole on SAP HANA is a serious problem for understanding what it can do and how it should be used. This article was developed to help cut through the hyperbole on HANA and provide a basis on which to analyze SAP HANA statements.

However, this is of course only one dimension of understanding SAP HANA. None of the consulting companies will touch this issue, as they have served primarily as sales arms of SAP since, well, since they started partnering with SAP. The vast majority of analysts either have a conflict of interest in bringing this up or don’t understand databases well enough to know what part of the SAP HANA story is real and what part is smoke. One perfect example of the inaccuracy that flows through the HANA explanations is the following:

“Relational databases typically use row-based data storage. However Column-based storage is more suitable for many business applications. SAP HANA supports both row-based and column-based storage, and is particularly optimized for column-based storage.” – SAP HANA Tutorial

SAP and their consulting network continue to present all other databases as “traditional.” However, Oracle, DB2 and SQL Server all have column stores. And because each of these companies is better at databases than SAP (a newbie to DBs), the evidence indicates that Oracle, DB2 and SQL Server are all better than HANA at even the column/analytics processing. However, SAP is not updating the information it first began distributing back when these other vendors were further back than SAP on column-oriented processing. SAP wants to freeze all of their competing database vendors back in 2011. Here is another quote that continues the inaccuracy that only SAP has columnar storage.

“Can we just increase the memory of the traditional database (like Oracle) to 1 TB and get similar performance?

NO. You might have performance gains due to more memory available for your current Oracle/Microsoft/Teradata database, but HANA is not just a database with bigger RAM. It is a combination of a lot of hardware and software technologies. The way data is stored and processed by the In-Memory data base is the true differentiator. Having that data available in RAM is just the icing on the cake.” – SAP HANA Tutorial

But in fact, the other databases do optimize memory. The curious thing is that SAP does not seem to have the same capabilities to optimize memory, so it has to brute force the solution with very large hardware specs. Benchmarks by a vendor shared with me illustrate that HANA does not properly address the hardware it has. So a lot of the hardware ends up being wasted.

HANA as a Major Marketing Tentpole

SAP HANA has been a marketing tentpole of SAP for over 4.5 years. Even so, knowledge of SAP HANA remains fragile. Secondly, few people have implemented a HANA system, and shockingly few have implemented any SAP HANA once one gets past the most common implementation, which is porting SAP BW to SAP HANA. SAP has seen little return on its SAP HANA investment, but SAP HANA is still rising as a topic of interest, perhaps not among those that work in close collaboration with SAP, but overall. This was verified by web metrics and was a surprise.

There are a lot of interesting storylines to cover on SAP HANA, and we will cover as much as we have the time and the information and understanding to cover.

Conclusion

This article was designed as a SAP HANA technology overview. Columnar databases have speed advantages. However, they are not universal benefits.

Compression is an advantage of column-based databases, and this has been true since the column-based database was invented. However, column-based databases represent only a small fraction of the overall database market. Why? Well, there is much more to database design than compression. SAP will most often bring up a positive aspect of HANA, or of a column-based database, but leave out the negatives.

HANA is presented by SAP as a universally advantageous combination of database design and faster memory/storage. The less one knows about databases, the more credible this seems. The article Is SAP HANA a Major Advantage for ERP Systems? explains why SAP HANA’s speed benefits do not hold true for this type of application.

Financial Disclosure

Neither this article nor any other article on the Brightwork website is paid for by a software vendor, including Oracle, SAP or their competitors. As part of our commitment to publishing independent, unbiased research, no paid media placements, commissions or incentives of any nature are allowed.

References

https://www.youtube.com/watch?v=mRvkikVuojU

https://www.datasciencecentral.com/profiles/blogs/row-vs-columnar-vs-nosql-databases

https://gigaom.com/2010/05/12/analysis-why-sap-bought-sybase-for-5-8-billion/

https://searchdatamanagement.techtarget.com/definition/columnar-database

https://docs.aws.amazon.com/redshift/latest/dg/c_columnar_storage_disk_mem_mgmnt.html

https://docs.aws.amazon.com/redshift/latest/dg/t_Creating_tables.html

https://www.oracle.com/technetwork/database/in-memory/overview/twp-oracle-database-in-memory-2245633.html

https://www.forbes.com/sites/oracle/2015/12/18/oracle-challenges-sap-on-in-memory-database-claims/#4206580e177f

https://saphanatutorial.com/sap-hana-online-courses/

It should be noted that spinning disks are still used in HANA installations, but they are now primarily relegated to backup and archival, so their reduced speed does not interfere with the processing time of queries.

Compression is its own area of specialty within database design. Database administrators can choose from different approaches or compression algorithms.

Regarding hardware, overall hardware advances have been benefiting all computer users for some time, and the rightful claimants to these benefits are the hardware manufacturers, not the software vendors. This is a typical progression in computer technology.
