- SAP has been proposing that all companies should use a single database type and that they should buy HANA.
- AWS’s CTO explains the benefits of the multi-base approach.
Introduction to Multibase
What is often left out of the analysis of database advice from commercial software vendors is how biased it and self-centered it is. Commercial database vendors don’t provide any information to a customer that is not in some way designed to get the customer to invest more deeply in the vendor’s commercial products. As bad as Oracle’s “advice” to companies has been, Oracle at least has respected, although highly self-centered, knowledge of databases. SAP’s slightly insane advice to their customers has been far worse, and far more self-centered. For years SAP has been telling customers that they need to perform multiple
For years SAP has been telling customers that they need to perform various types of database processing from a single database. This is wholly false but has not stopped either SAP or their partner network for saying its true. We have covered in detail how SAP’s proposals about HANA have ended up being proven incorrect in article ranging from What is HANA’s Actual Performance?, A Study into HANA’s TCO, to How Accurate Was Bloor on Oracle In-Memory.
See our references for this article and related articles at this link.
In this article, we will expand into a topic that shows how wrong SAP is. This perspective we will address is not brought forward by either SAP, Oracle, IBM, or Microsoft, but by the entity, providing thought leadership on the future of how databases being used…which is AWS.
Verner Vogels on Multiple Database Types
In an excellent article by Verner Vogels, who is the CTO of AWS. Let us begin with how he starts the article.
“A common question that I get is why do we offer so many database products? The answer for me is simple: Developers want their applications to be well architected and scale effectively. To do this, they need to be able to use multiple databases and data models within the same application.”
Notice the last part of this paragraph, where Verner describes using “multiple databases and data models within the same application.” Wait, what was that? We all know that applications have a single database, right? How does a single application use multiple databases? What is Verner talking about?
Well, it turns out Verner is describing software development that is different than the monolithic environment. Verner goes on to say this…
“developers are now building highly distributed applications using a multitude of purpose-built databases.”
That is the application that we think is one way of developing, but this is giving way to distributed applications that can access multiple databases. It is an unusual way of thinking about applications for those of us who came up under the monolithic model.
The Limitations of the Relational Database
Verner goes on to describe the limitations of the relational database.
“For decades because the only database choice was a relational database, no matter the shape or function of the data in the application, the data was modeled as relational. Is a relational database purpose-built for a denormalized schema and to enforce referential integrity in the database? Absolutely, but the key point here is that not all application data models or use cases match the relational model.”
This we have seen in the rapid growth of databases like MongoDB and Hadoop that specialize in either unstructured data or data with lower levels of normalization. Verner describes how Amazon ran into the limitations of using the relational database.
“We found that about 70 percent of our operations were key-value lookups, where only a primary key was used, and a single row would be returned. With no need for referential integrity and transactions, we realized these access patterns could be better served by a different type of database (emphasis added). This ultimately led to DynamoDB, a nonrelational database service built to scale out beyond the limits of relational databases.”
Let us remember, AWS has a very fast growing relational database service in RDS. However, they also have fast-growing non-relational databases like DynamoDB.
The Different Database Types According to Verner
Below we have provided a synopsis of the different database types, their intended usage, and the database that reflects them by Verner.
- Relational: Web and Mobile Applications, Enterprise Applications, Online Gaming (e.g., MySQL)
- Key Value: Gaming, Ad Tech, IoT (DynamoDB)
- Document: When data is to be presented as a JSON document (DynamoDB)
- Graph: For applications that work with highly connected datasets (Amazon Neptune)
- In Memory: Financial Services, Ecommerce, Web, Mobile Applications (Elasticache)
- Search: Real-time visualizations and analytics generated by indexing, aggregating, and searching semi-structured logs and metrics. (Elastisearch Service)
And actually, it is a bit more complicated than even Verner is letting on. This is because some databases that AWS releases or releases access to end up being used differently than first intended. This is described in a comment on Verner’s article.
“It turns out that your products are so good that people do end up using them for a different purpose. Take Amazon Redshift. I remember when Amazon Redshift was launched, a question came from the audience if you can use Redshift as an OLTP database, even though it’s OLAP. Turns out using Redshift in an OLTP scenario is one of the major use cases, to build analytical applications. We are one of those use cases, we’ve built an analytical app on top of Redshift. The OLTP use case stretches Redshift once you start putting a serious number of users on it. Even with the best WLM configuration.
To solve for that, we’ve used a combination of Amazon RDS, Amazon Redshift and dblink plus Lambda and Elasticsearch. Detailed write-up on how we did it here:”
The Multi-Application Nature of Solutions Distributed by AWS
The multi-application nature of solutions is explained as follows by Verner.
“Though to a customer, the Expedia website looks like a single application, behind the scenes Expedia.com is composed of many components, each with a specific function. By breaking an application such as Expedia.com into multiple components that have specific jobs (such as microservices, containers, and AWS Lambda functions), developers can be more productive by increasing scale and performance, reducing operations, increasing deployment agility, and enabling different components to evolve independently. When building applications, developers can pair each use case with the database that best suits the need.”
But what are packaged solutions offering? Well, monolithic applications that are the exact opposite of this. And as SAP is a perfect example of a monolithic application provider, SAP wants customers to use a single database. Further, they want customers to use “their” single database as in HANA. Which, according to SAP, can do all the processing as well as all the different database types described by Verner above? The one problem being, HANA can’t.
The AWS Customers Using Multibase Offerings
- Airbnb: DynamoDB, ElastiCache, MySQL
- Capital One: RDS, Redshift, DynamoDB
- Expedia: Aurora, Redshift, ElastiCache, Aurora MySQL
- Zynga: DynamoDB, ElastiCache, Aurora
- Johnson and Johnson: RDS, DynamoDB, Redshift
Verner goes on to say.
“purpose-built databases for key-value, document, graph, in-memory, and search uses cases can help you optimize for functionality, performance, and scale and—more importantly—your customers’ experience. Build on.”
The Problem with SAP and Oracle Cloud and Leveraging the Multibase Approach
SAP and Oracle have been touting their cloud. However, with SAP and Oracle, the cloud is only a pathway to lead to SAP and Oracle’s products. This is as true of databases. SAP and Oracle are closed systems. They dabble in connecting to non-SAP and non-Oracle, but only to co-opt an area so they can access markets. AWS and Google Cloud are quite different. Notice the variety of databases available at Google Cloud.
There are over 94 databases out at Google Cloud, and far more out at AWS. These databases can be brought up and tested very quickly. Selecting one of the databases brings up the configuration screen. Furthermore, the number of database and database types is only increasing with AWS and Google Cloud.
Right after this is launched, one can bring up a different database type (say NoSQL or Graphic) and immediately begin testing. Under the on-premises model, this would not be possible. Instead of testing, one would go through a sales process, and a commitment would be made, the customer would be stuck with (and feel the need to defend) whatever purchase had been made. We have entered a period of multi-base capabilities, and AWS and Google Cloud are the leaders in offering these options. This will transform or is transforming how databases are utilized. And the more open source databases are accessed, the worse commercial databases look by contrast.
Packaged solutions ruled the day for decades. After the 1980s, custom coded solutions were for losers. They were to be replaced by “fantastic ERP” systems that would make your dreams come true. And who agreed to this? Well vendors and consulting companies with packaged software and packaged services to sell. Consulting companies became partners with packaged software companies, parroting everything they said, without evidence. Even to the point where almost no one in IT is also aware that packaged ERP systems have a negative ROI as we cover in the book The Real Story on ERP. ERP proved to be a false God and delivering both a negative ROI (but a positive ROI for vendors and consulting firms) while saddling companies with systems that put the ERP vendors in the driver’s seat of account control to extract more and more out of their “customers.”
Now, as I read about distributed applications accessing multiple databases, are we entering a period where the pendulum switches to custom coding again. Under the SAP or Oracle paradigm, you accepted the databases that were “approved” by SAP and Oracle. All competition was driven out of the process. Oracle applications worked with the Oracle database. SAP finally decided to introduce HANA to push the Oracle DB out of their accounts. SAP now thinks that all SAP applications should sit on a SAP HANA database.
Verner is describing a combination of components that are selected and stitched together. Most of these databases are open source. And one can choose from a wide variety offered by AWS. This is inherently contradictory to packaged applications, because the packaged application uses one DB, and works in a particular and defined way.
While this is little-discussed, AWS/GCP can be viewed as opposed to packaged applications. Sure, leveraging AWS/GCP will start with the migration of packaged applications, but once companies get a taste of freedom, it will begin breaking down the rules enforced by the packaged software vendors. And who will tell this story? Will it be Gartner? No. Gartner receives 1/3 of it is a multi-billion dollar per year revenues from packaged software vendors, and it is doubtful that AWS or GCP will pay Gartner to sing their praises the way that the packaged software vendors have. Gartner presents SAP Cloud, Oracle Cloud, AWS, and GCP as if they offer the same thing, but that AWS is simply “ahead” of SAP Cloud and Oracle Cloud. Gartner has no interest in educating their customers as to the reality of AWS and Google Cloud, as it cuts against their corrupt revenue model.