Last Updated on March 24, 2021 by Shaun Snapp
- AWS and Google Cloud provide an amazing number of database options to customers.
- What this means for on-premises environments with more restricted options.
Options Galore with AWS and Google Cloud
AWS does offer some internally developed products (the Aurora database being one), and Google Cloud offers both the Spanner database and the Go programming language. Still, the vast majority of AWS’s and Google Cloud’s revenue comes from providing services for items they did not have a hand in developing. Kubernetes is an open-source project initiated by Google, which has helped increase the popularity of Google Cloud; however, Kubernetes is not controlled by Google.
The items that AWS and Google Cloud offer include both open-source and commercially licensed products. This “universal” revenue stream places AWS and Google Cloud in a far less biased position than a software vendor that is trying to sell its own licenses first (which leads directly to the most profitable thing SAP and Oracle sell, which is support), with the cloud a distant second. Both Oracle and SAP have tiny cloud businesses compared to AWS or Google Cloud.
AWS explained this choice in the following quotation.
“The days of the one-size-fits-all monolithic database are behind us,” he said. “Our customers are changing how they develop applications and they need particular databases to do that.”
This is particularly prevalent for databases that are analytical in nature rather than designed to support an application (because of the lower degree of lock-in due to application certification). What have we observed in this market? The data from applications is increasingly being moved into data lakes or data staging areas at a lower degree of normalization (increasing the data size but reducing its maintenance). This data is being pushed out of more structured, higher-maintenance databases like Oracle (version 18 presently) and into data stores like Hadoop or MongoDB. However, SAP and Oracle are trying to “get into” this open-source environment by declaring that their commercial software is necessary. Let us review a graphic in this area.
Notice the data coming from SCM (SAP) goes to HANA, then to MAPR. However, the sensor and social media data (which is not from SAP) goes directly to MAPR. What is the benefit of HANA in this design? SAP makes several proposals, but non-SAP projects are pleased with 100% open source Hadoop. If the ERP, ECM, CRM (SAP) data is in Oracle, it can go directly to MAPR. However, if any application sits on HANA, its data must then go through a second HANA database due to indirect access rules enforced by SAP. There is no reason for SAP to have HANA in this design except to infiltrate the open-source MAPR solution.
What Oracle Offers
Oracle offers various database types, but its primary strength is in the structured (highly normalized) relational database design. However, the database market’s growth is more in the unstructured database design (which often means it holds less normalized data than a database that supports an application that requires highly normalized data). Many people with excellent database knowledge, like Werner Vogels, propose that the highly normalized relational database design has been over-applied. Open-source offerings, rather than commercial offerings, dominate less normalized databases.
Of course, Oracle has no interest in using its Oracle Cloud to allow companies to host non-Oracle databases or applications. Furthermore, the Oracle Cloud is so uncompetitive with AWS or Google Cloud that it would make little sense to use Oracle Cloud even if Oracle were interested in opening up to other vendors and open source. The Oracle Cloud, much like the SAP Cloud, is for hosting Oracle databases and applications. That is the extent of Oracle’s vision for the Oracle Cloud. By contrast, AWS offers the ability to run all of the different databases (including SAP and Oracle databases), which dramatically increases the ability of a company to test different databases and compare and contrast the offerings.
One such area to test is covered in the following quotation.
“And, if you’ve mixed online transaction processing (OLTP) and analytics-style data access, moving from a one-tool-for-everything Oracle setup to using a separate warehouse for reporting and analytics can improve both your application responsiveness and your analytics capabilities. There are options to create a dedicated Postgres-XL–based warehouse or use Amazon Redshift as a powerful managed warehouse.” – David Rader
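The separation described in the quotation can be sketched in a few lines. This is only an illustration under stated assumptions: two in-memory SQLite databases stand in for the transactional system and the reporting warehouse (not Oracle, Postgres-XL, or Redshift), and the table names, sample rows, and the `load_warehouse` and `revenue_by_region` helpers are hypothetical.

```python
import sqlite3

# Two separate in-memory SQLite databases stand in for the two systems.
oltp = sqlite3.connect(":memory:")       # normalized, transaction-oriented
warehouse = sqlite3.connect(":memory:")  # denormalized, reporting-oriented

# Hypothetical normalized schema and sample rows for the application side.
oltp.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0);
""")

# The warehouse holds flat, pre-joined rows (less normalized, larger, simpler).
warehouse.execute(
    "CREATE TABLE order_facts (order_id INTEGER, region TEXT, amount REAL)"
)


def load_warehouse(source, target):
    """Copy orders into the warehouse as flat rows (full reload each time)."""
    rows = source.execute(
        "SELECT o.id, c.region, o.amount "
        "FROM orders o JOIN customers c ON c.id = o.customer_id"
    ).fetchall()
    with target:  # commits on success, rolls back on error
        target.execute("DELETE FROM order_facts")
        target.executemany("INSERT INTO order_facts VALUES (?, ?, ?)", rows)
    return len(rows)


def revenue_by_region(target):
    """Reporting query that never touches the transactional database."""
    return dict(target.execute(
        "SELECT region, SUM(amount) FROM order_facts GROUP BY region"
    ).fetchall())


loaded = load_warehouse(oltp, warehouse)
report = revenue_by_region(warehouse)
print(loaded, report)
```

The point of the design is that the reporting query runs against its own store, so heavy analytics do not compete with the application’s transactions. A real setup would load incrementally rather than reloading everything.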
Increasing the Types of Databases Put Into Use
This shift is increasing the types of databases put into use. For years SAP, Oracle, and even IBM have been telling customers that they offer the database processing types customers need and that the various processing types could all be met by their RDBMS. When SAP promoted its HANA in-memory database as superior to all other databases, Oracle and IBM copied SAP by adding column-oriented “in memory” capabilities to their own RDBMSs. Bloor Research questioned whether this was really worth the extra overhead, as we covered in the article How Accurate Was Bloor on Oracle In-Memory? The sizing of each of these databases is by itself a lengthy process, and the commitments to specific hardware have greatly restricted customers’ ability to test the proposals made by SAP and Oracle. When the performance does not match what SAP or Oracle says, some excuse is often given. After the customer has purchased the software and the hardware and paid for the implementation, the vendor has the power in the relationship.
In terms of options, AWS offers EC2 (Elastic Compute Cloud), which is AWS’s compute cloud. With EC2, AWS offers over 60 different instance types that are grouped into the following categories.
- General Purpose
- Compute Optimized
- Accelerated Computing
- Storage Optimized
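As a rough illustration of how instance types fall into these categories, the sketch below groups a handful of instance type names by the family letter at the start of the name. The letter-to-category mapping is an assumption based on AWS’s public naming convention as we understand it, covers only the four categories listed above, and is not an exhaustive or official list. The `categorize` and `group_by_category` helpers are hypothetical.

```python
import re

# Family letter -> category. Assumed from AWS's public naming convention;
# only the four categories listed in the text are covered, so every other
# family (for example "r") is reported as "Unknown".
FAMILY_CATEGORY = {
    "t": "General Purpose",
    "m": "General Purpose",
    "c": "Compute Optimized",
    "p": "Accelerated Computing",
    "g": "Accelerated Computing",
    "i": "Storage Optimized",
    "d": "Storage Optimized",
}


def categorize(instance_type):
    """Return the category for a name such as 'm5.large'."""
    match = re.match(r"([a-z]+)\d", instance_type)
    if match is None:
        return "Unknown"
    return FAMILY_CATEGORY.get(match.group(1), "Unknown")


def group_by_category(instance_types):
    """Group instance type names by category, preserving input order."""
    groups = {}
    for name in instance_types:
        groups.setdefault(categorize(name), []).append(name)
    return groups


groups = group_by_category(
    ["t3.micro", "m5.large", "c5.xlarge", "p3.2xlarge", "i3.large", "r5.large"]
)
for category, names in groups.items():
    print(f"{category}: {', '.join(names)}")
```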
AWS’s RDS has 35 different instance types. These vary depending upon the number of CPUs, the amount of memory, whether the instance is EBS optimized, and the network’s speed.
AWS S3, a storage offering, has four different storage classes (Standard, Standard-IA, One Zone-IA, and Glacier). It also has options concerning access control. In each AWS offering, SAP and Oracle customers will observe options that they are not accustomed to in SAP or Oracle. All of these options are public; they do not need to be communicated through an account rep.
All of this allows AWS to support new approaches to data management, as is covered in the following quotation by Werner Vogels, the AWS CTO.
“If there’s a unifying theme to AWS’s disparate set of databases, he said, it’s that they’re all aimed at supporting cloud-native methods of creating applications that aren’t driven by the way the data needs to be stored in a single kind of database. Instead, the cloud application, often composed of smaller bits of code widely distributed in multiple data centers and the cloud, drives the way the data needs to be accessed and used. That, Vogels contends, requires different kinds of databases for different kinds of applications.”
SAP and Oracle’s Opposition to Open Source
Also, there is just no way for SAP or Oracle to provide such a variety of databases on their cloud offerings. One reason is that both SAP and Oracle are very opposed to open source options. SAP is an expert at taking open source offerings and then making them closed source. SAP has a product called Vora, a copy of an open-source Spark component, that connects HANA to Hadoop (as we covered in the article How Accurate is SAP on Vora?). Almost no one uses it, nor should anyone. Hadoop does not need HANA. A company intelligent enough to figure out Hadoop will also figure out that there are far better column-oriented in-memory databases than HANA to connect to Hadoop. SAP is always coming up with some intrusion into open source with a commercialized offering. Oracle’s history with open source has been one of hostility and neglect. Several very prominent examples include the following:
- Oracle purchased Java, OpenSolaris, OpenOffice.org, MySQL, and others, and is widely considered to have worsened each of these open source offerings.
- Oracle’s acquisition of MySQL is a significant factor that drove the growth of other open-source database projects like MariaDB and PostgreSQL.
AWS is opening up its customers’ horizons in a way the customers have not had in the past, as explained by Werner Vogels.
“More generally, Vogels contended, AWS’ own enterprise customers were looking for alternatives. “With many of our enterprise customers migrating from on-premises into the cloud, there’s a desire to move away from commercial databases, mostly because of the licensing restrictions and the lack of control over the cost.”
And AWS is the best in the market at offering these options.
“Now, he noted, a lot of companies are using multiple Amazon databases for various parts of their business. “What we’re seeing in AWS customers is they’re using a multiplicity of databases,” he said. “They’re looking for the best tool for each application, or maybe multiple tools.”
“For instance, Airbnb Inc. uses DynamoDB for storing users’ search history, ElastiCache for storing site sessions for faster site rendering, and MySQL on another AWS relational database, RDS, as its main transactional database. Besides Elasticsearch, Expedia also uses Aurora, ElastiCache and Amazon’s Redshift data warehouse.”
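The pattern in the quotation, one application using a different store for each kind of work, can be sketched as follows. This is an illustration only: the classes are in-memory stand-ins rather than the AWS services named above, and every name in it (`KeyValueStore`, `TTLCache`, the `handle_*` functions, the `bookings` table) is hypothetical.

```python
import sqlite3
import time


class KeyValueStore:
    """Stand-in for a key-value database (the role DynamoDB plays in the quote)."""

    def __init__(self):
        self._items = {}

    def append(self, key, value):
        self._items.setdefault(key, []).append(value)

    def get(self, key):
        return list(self._items.get(key, []))


class TTLCache:
    """Stand-in for an in-memory cache with expiry (the ElastiCache role)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}

    def set(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop the expired entry
            return None
        return value


# One store per workload, chosen by access pattern rather than by vendor.
search_history = KeyValueStore()
sessions = TTLCache(ttl_seconds=1800)
bookings = sqlite3.connect(":memory:")  # relational store for transactions
bookings.execute(
    "CREATE TABLE bookings (id INTEGER PRIMARY KEY, user_id TEXT, nights INTEGER)"
)


def handle_search(user_id, query):
    search_history.append(user_id, query)


def handle_login(user_id, token):
    sessions.set(token, user_id)


def handle_booking(user_id, nights):
    with bookings:  # transactional write, committed on success
        cursor = bookings.execute(
            "INSERT INTO bookings (user_id, nights) VALUES (?, ?)",
            (user_id, nights),
        )
    return cursor.lastrowid


handle_search("user-1", "lisbon")
handle_login("user-1", "token-abc")
booking_id = handle_booking("user-1", 3)
```

Each handler only knows about the store that suits its access pattern, which is what makes it practical to swap or test one database type without touching the others.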
This is a fundamental change in how databases are evaluated and then used. It means that the structured, row-oriented database that was over-applied is giving way to a multitude of specialized database types. The educational challenge in leveraging these different database types is emphasized in the following quotation.
“The biggest challenge is education; there is another way, but it means learning something new,” Jim Webber, chief scientist at the graph database maker Neo4j Inc., also told SiliconANGLE.
“If all I’ve got is a hammer, then every problem is a nail. Relational is a beautiful hammer.”
Changes are afoot in software development that are intertwined with the cloud, and the cloud is reinforcing these changes because it makes so many options available to developers. However, these changes are reinforced only by specific cloud providers, not all of them. SAP Cloud and Oracle Cloud make it very difficult to bring up services and have all manner of quality problems. SAP and Oracle are overpromising in the cloud to the degree that it is difficult to believe what they say about their cloud offerings. AWS and Google Cloud offer so many more services than SAP Cloud and Oracle Cloud that there is no way to compare these two sets of clouds. Two clouds have significant numbers of users logging in, trying new things, and running production workloads. The other two are closer to brochureware, designed to help those opposed to the cloud and to open source projects “cloud wash” for Wall Street.
One of the book’s authors, Ahmed Azmi, recently tested SAP Cloud for a prospect, and it was a resounding “pass.” In AWS, Azure, and GCP, we can spin up and dispose of a container within a few seconds. On SAP Cloud, containers take as long as 8 minutes to start. That’s not an ideal environment for building lightweight microservices.
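Start-up time of this kind is easy to measure for oneself. The sketch below is a minimal timing harness, not the procedure used in the test described above. It runs a command several times and records the wall-clock duration of each run. The `time_command` helper is hypothetical, and the command shown is a placeholder that simply starts a Python interpreter; on a machine with Docker installed, it could be swapped (as an assumption about the reader’s environment) for something like `["docker", "run", "--rm", "hello-world"]`, where the first run would also include the image download.

```python
import subprocess
import sys
import time


def time_command(command, runs=3):
    """Run a command several times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        started = time.perf_counter()
        subprocess.run(
            command,
            check=True,  # raise if the command fails
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - started)
    return durations


# Placeholder command: start and exit a Python interpreter.
command = [sys.executable, "-c", "pass"]
durations = time_command(command)
print(f"fastest: {min(durations):.2f}s, slowest: {max(durations):.2f}s")
```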
These changes are allowing a movement from monolithic designs with little choice to containers, where the number of options seems endless and where much more time must be spent evaluating individual components rather than simplistically choosing the Oracle database because one is an Oracle shop, or ABAP because SAP says that it is “standard SAP.” This means using multiple programming languages and multiple databases to develop applications that are “composites” of multiple containers. It also means being able to leverage custom and, in many cases, self-configured hardware in a way that was not possible before and that removes the necessity to perform sizing.
Never before in IT has there been such a necessity to perform testing or such a fast and straightforward way to perform that testing. The previous on-premises approach, where vendor sales reps could make assertions that could not first be tested before a purchase, is diminishing. These are all positive developments, but they mean a new day for IT departments. The previous structure of IT departments is not the structure required to leverage the new cloud-led alternatives.
Now that we have covered how the four different clouds compare, let us jump into what the cloud means for the data warehouse and data lake.