How to Perform Homogeneous and Heterogeneous Database Migration

Executive Summary

  • AWS and Google Cloud can allow for both homogeneous and heterogeneous database migration
  • Learn why cloud services are enabling databases.

Introduction

  • When the source and target database are the same, this is called a “homogeneous” conversion.
  • When the databases are different, this is referred to as a “heterogeneous” migration.
  • Homogeneous migrations are often the starting point for migration. This is because it allows the customer to be used to the AWS or Google Cloud environment. As the customer learns more, they can then migrate the Oracle RDS to PostgreSQL, MariaDB, or others.

The Progression

We see the progression as follows.

  1. The Heterogeneous Migration: The on-premises Oracle application is migrated to Oracle for RDS, and a VM or container is created for the application software. The heterogeneous migration is tested, and when it passes testing, the AWS instance becomes production, and the on-premises instance can be decommissioned.
  2. The Homogeneous Migration: A copy is made of the VM or container for the application. The Oracle RDS is then migrated to a new (for example) PostgreSQL RDS. The VM or container application is connected to the PostgreSQL RDS. The new instance is tested, and when all tests are passed, the Oracle RDS and the first VM or container are shut down. The PostgreSQL and VM/container copy become the production instance.

Moving Away from the Oracle Database

When moving away from Oracle databases, performing a “heterogeneous” conversion is necessary, which is more complicated. A database like PostgreSQL usually does an excellent job of managing the same workloads as Oracle 11, 12, or 18. There are pretty good open-source databases for any Oracle database version that can do virtually anything Oracle can do. Moreover, they can do it with far less overhead.

This is covered in the following quotation from Gartner.

“Today, there are about 300 database options available, including open-source and cloud alternatives such as MongoDB, PostgreSQL and Amazon. The database market has been commoditized. According to Gartner,* open-source databases have now reached parity with traditional databases in terms of functionality, tools and available resources — all with a lower TCO.”

This movement from commercial databases to open-source databases is a growing trend and one we see speeding up in the future. And we are not the only ones. Seeking Alpha agrees.

“We see the $29.6b commercial database market contracting 20-30% by 2021, and do not believe Oracle (NYSE:ORCL) can transition its revenue streams (from legacy commercial database to cloud-based subscription offerings) fast enough to offset the decline of this market, which represents a major legacy core of its revenue.”

Many Oracle instances can be migrated to PostgreSQL or MariaDB, etc.. as they do not use Oracle DB’s more advanced functions. Several items must be addressed, which include the following steps:

  1. Modify Incompatible SQL
  2. Schema Conversion
  3. Data Replication
  4. Migrating Code (Converting PL SQL Stored Procedures)

Something important to consider is that no database is precisely like another database. Therefore, moving from Oracle will mean changes.

This is explained very well by this anonymous quote.

“The thing is, you don’t switch from Oracle (say, Exadata boxes) to something comparable as there is nothing directly comparable. What companies do is rethink their approach to BI, reporting and such and move to SaaS solutions, while simplifying greatly.”

Moving from anyone one is familiar with comes with an adjustment. The question is whether the change is worth it. And for a large percentage of the Oracle databases out there, the answer is a resounding yes.

Let us go into the steps that are required for migration.

Step 1: Modify Incompatible SQL

There are differences in SQL from Oracle SQL to PostgreSQL. However, the modification can be performed, and Oracle SQL is considered one of the least efficient variations on SQL.

  • Many consider the language overspecialized.
  • Oracle SQL is so complex that it is expensive to operate.
  • Oracle SQL is more difficult to troubleshoot and has become decreasingly user-friendly.
  • Most companies do not leverage Oracle SQL overhead, so they pay the extra cost but do not benefit from the additional complexity.

After the SQL modification, one gets the benefit of cleaner SQL. In most cases, most of the effort in migration is SQL modification.

Step 2: Schema Conversion

This is where the schema (the tables and the relationships) from the Oracle database is moved to PostgreSQL (in this example).

Invariably, objects are built up in any database that is no longer used. So, part of the conversion process is only migrating what is necessary. AWS offers a schema conversion tool that provides a wide variety of conversions.

This graphic is from AWS’s RDS migration page.

The list is included below:

Step 3: Data Replication

This is where data is copied from the schema in the source to the schema in the target.
For both AWS RDS and Google Cloud SQL, the source database can be replicated to one or more regional databases.

Step 4: Migrating Code (Converting PL SQL Stored Procedures)

Stored procedures were for a long time promoted by Oracle to improve performance. Something called PL-SQL manages Oracle’s stored procedures. (procedural language SQL). However, this reduced database portability and enabled lock-in.
Since SAP created HANA, they began promoting stored procedures for the same reason that Oracle did, to increase lock-in. However, SAP has been pushing aggressively to recast all of its reverse engineering of other databases, which Brightwork Research & Analysis covered in the article Did SAP Reinvent the Wheel with HANA? Therefore, SAP decided to call their stored procedures “code pushdown.” Brightwork Research & Analysis covered in the article How Accurate is SAP’s Statements on Code Pushdown?

PL-SQL in Oracle can be converted to PL/pgSQ in part automatically, but some of the conversion will be manual.

Database Bloat

The fact is that a huge number of Oracle databases are just not doing that much that is particularly interesting or advanced. They are doing things other open-source DBs could very quickly do at a far lower expense. Yet many of these companies have Oracle DBAs managing them inefficiently on-premises and at a high cost.

This is referred to as database cost bloat.

Database bloat pulls resources away from other things, such as IT departments are perpetually short on budget for value-added items. Oracle supported and reinforced this entire inefficient approach, and Oracle is only now reacting to efficiencies imposed from the outside — in this case, AWS, Google Cloud, and Azure. Without this change being brought from the outside, Oracle, which makes enormous amounts of money from the status quo, would prefer to keep things as they are as their database is a cash machine.

AWS and Google Cloud can cut the Oracle bill in two critical ways: the database license and the second is the support. Oracle to Oracle migration makes a lot of sense for databases that support applications without open-source database certification. It means turning an internally managed database into a managed service database. If the open-source database can be used, huge savings and efficiencies can be achieved. Companies that use Oracle should focus on this because a company can’t migrate a chunk of their databases away from Oracle DB by using Oracle for IaaS/PaaS.

How Does Oracle View the Cloud?

Oracle has a curious view of the future of the cloud and Oracle’s place in it. Mark Hurd, the CEO of Oracle, has stated that Oracle gets 3x in revenue for every customer that moves from on-premises to cloud. That is where Oracle’s head is at regarding the cloud. Customers need to prepare to pay Oracle much more for the same thing.

Oracle’s support has a 93%+ margin. Why should customers pay such a support margin if the open-source databases are so well documented and one can have a managed database (either Oracle or open-source) from AWS or Google Cloud? That is as long as the application support for open source is there and or if an older version of Oracle can be leveraged (allowing support to be dropped)? A great deal of money is wasted with the Oracle database and database support. We are not proposing that all Oracle DB’s can be ripped out. We fully appreciate the case-by-case nature of the question, but the more Oracle databases a customer has, the more liability it creates. This is a liability in dealing with Oracle account managers’ visits, audits, lots of DBAs consuming resources without scaling, and many other negative implications. One can minimize those liabilities by migrating some Oracle DBs to AWS or Google Cloud and other Oracle DBs to non-Oracle using managed DBs on AWS or Google Cloud.

Properly Leveraging AWS RDS

We listed the significant steps to using RDS, but there are others. The RDS environment differs from an on-premises environment, and we do not intend to gloss over the differences.

For example, some things can be leveraged for backup that doesn’t exist in an on-premises environment. One of these is Lambda, the usage of which is described in the following quotation from House of Brick.

“The House of Brick best practice for proper backup retention is to use manual snapshots to store snapshots of the database in S3 on a monthly and annual basis. Using an encrypted S3 bucket is important for ensuring the security of the backup data. Because running these backups manually would be tedious and error prone, using a scheduled Lambda function is highly recommended for this purpose. The following example python 2.7 code illustrates a Lambda function that can be used for monthly snapshots for the instance OracleRDStest1 in the US-EAST-1 region.

import botocore
import datetime
import re
import logging
import boto3
region=’us-east-1′
rds_instances = [‘OracleRDStest1’] def lambda_handler(event, context):
source = boto3.client(‘rds’, region_name=region)
for instance in rds_instances:
try:
timestamplabel = str(datetime.datetime.now().strftime(‘%Y-%m-%d-%H-%-M-%S’)) + “monthly-snap”
snapshot = “{0}-{1}-{2}”.format(“mysnapshot”, instance,timestamplabel)
response = source.create_db_snapshot(DBSnapshotIdentifier=snapshot, DBInstanceIdentifier=instance)
print(response)
except botocore.exceptions.ClientError as e:
raise Exception(“Could not create snapshot: %s” % e)

Once appropriate retention is established, backup reliability and availability also needs to be considered. It is possible for an entire AWS region to become unavailable, which has happened in the past. Keeping the backups in a S3 bucket in the same region as the RDS instance creates the potential nightmare scenario of simultaneously losing access to the database and all database backups. To combat this, House of Brick recommends mirroring the monthly, annual, and most recent daily backup to an encrypted S3 bucket in another region and on a different AWS account. The latter point is key, as it offers protection of the backup data from technical failure, human error, and malicious intruders. If the protected bucket is properly configured to allow versioned read/write access on a cross-account basis to the production account hosting the RDS instance, then the production account can never damage or delete retained backups, even if the production account is compromised to the root account level.”

This quotation was included to emphasize that once in AWS or Google Cloud, there are things to leverage new and different ways of doing things. One does not manage an on-premises environment like a cloud environment.

Let us take a moment to discuss the AWS and Google Cloud managed database.

Understanding the AWS and Google Cloud Managed Database Services

This relatively new service has profound implications for changing how databases are administered.

The AWS-managed Brian Hostetter describes database service.

“Right, even if you run Oracle in AWS, you don’t have to deal with capacity planning, patching, high availability, backups. RDS takes away the undifferentiated heavy lifting and allows your DBA’s to tune and work on better ways of querying, that is much more valuable than backing up LUN’s like on prem.”

The TCO database reductions are pretty impressive, with the most significant impact being labor and M&S ongoing fees. There is also the agility of immediate access and the ability to scale up dynamically and down, so companies don’t have to spend on unused capacity. This removes the time and expense of performing the database planning.

Notice the “Serverless” option for Aurora, which auto-scales the database.

The development instance of PostgreSQL on Cloud SQL does not scale. However, the Staging and Production options do scale, but only for storage, not for the virtual CPUs. The vCPUs are set per category.

Differences from the On-Premises Model

Under the on-premises model, everything had to be planned out, and companies invariably made forecast errors, which resulted in under or over-capacity.

  • If there’s a usage increase due to seasonality or promotions, the customer can easily scale without the huge cost of RAC hardware, software, and implementation costs.
  • Scaling is built into RDS and Google Cloud SQL across data centers, AZs, and regions.

A natural question arises.

Why didn’t Oracle offer the managed database service to customers long ago? Oracle has not been innovating in the ways that AWS and Google Cloud have been innovating. First, Oracle lacks AWS and Google Cloud’s cloud capabilities. Second, AWS and Google Cloud do not have on-premises-based partners. They have to worry about alienating. AWS can apply its technology advantage without protecting the status quo because AWS and Google have nothing invested in it.

Price Reductions Rather than Constant Price Increases

Furthermore, Oracle has another problem, and this is related to AWS and Google’s pricing strategy. AWS and Google’s strategy is to cut prices and to keep cutting prices. Last year, AWS had a significant round of price cuts across its services. In some cases, up to 21%. It was their 62nd price cut to that date. Oracle has no history of cutting prices. Quite the opposite.

Customers see this, and they want to participate. Labor has an 85% deflation in the cloud. The fully loaded cost of a single Oracle DBA is $120,000/year. The complete hardware and software infrastructure admin labor is redundant when automation fully manages the service. There is a 22% reduction in software maintenance and a 10% reduction in hardware maintenance. Even the support reduction alone will pay for many AWS and Google Cloud migrations.

The following quote highlights this.

“Regarding database administration, RDS has allowed us to forget about things like backups, capacity management, and patch management. We project that we will pay about one-quarter of what we were paying in our private infrastructure.” – Todd Hofert, Director of Infrastructure Operations, Trimble.

As well as this one,

“While license fee savings from open source hit 100 per cent, open-source databases also yield significant savings in hardware costs. All in, companies can expect to save 70 per cent by shifting from Oracle to a database like MongoDB (even once you account for the cost of migration, re-skilling DBAs, etc.) On the AWS platform, the list price for running Oracle (RDS) is $25.68 per hour. Running PostgreSQL or MySQL (RDS) is 1/8th to 1/10th that cost.

As big as those savings are, however, the bigger cost differential derives from developer and DBA productivity.

For DBAs skilled with Oracle’s database, they can often manage up to 25 database servers, on average. That same DBA can manage a million database servers on Amazon’s RDS, thanks to the benefits of automation. Talk about scale.”

Conclusion

AWS’s RDS is one of their most popular services. It is highly successful; however, in our view, it is still underutilized. There are so many opportunities for either homogeneous or heterogeneous migration. A big part of more companies taking advantage of AWS is merely understanding and getting comfortable using AWS. AWS explains how to manage the migration, provides a migration service, and can put customers in contact with partners that will support database migration. Database migration is an early step that opens up many other options for companies that do not have to keep their databases on-premises.

SAP and Oracle have no answer for the managed database service, and Oracle’s autonomous database, which we believe is fake, is an example. AWS and Google Cloud have such an advantage here because SAP and Oracle have little experience managing customer databases. Managing databases has never been a part of SAP and Oracle’s business model.

Now that we have covered the database migration topic, with its associated cost improvements, let us switch gears and move to another benefit of AWS and Google Cloud. This is how AWS and Google Cloud can speed the development of software.