Oracle Announces Improved MySQL HeatWave Lakehouse

by Harold Fritts

On the opening day of Oracle CloudWorld, Oracle announced an improved MySQL HeatWave Lakehouse whose query performance, by Oracle's account, is 17X faster than Snowflake and 6X faster than Redshift on a 400 TB workload. According to Oracle, MySQL HeatWave Lakehouse can also load 400 TB of data from object storage 8X faster than Redshift and 2.7X faster than Snowflake.

MySQL HeatWave Lakehouse is the newest addition to the MySQL HeatWave portfolio, which combines transaction processing, analytics, machine learning, and machine learning-based automation within a single MySQL database. MySQL HeatWave Lakehouse scales to 512 nodes and offers customers the ability to process and query hundreds of terabytes of data in an object store in a variety of file formats, such as CSV and Parquet, as well as Aurora and Redshift backups.

Powered by the massively parallel scale-out MySQL HeatWave architecture, MySQL HeatWave Lakehouse is said to deliver significantly better performance than competitive cloud database services for running queries and loading data, as demonstrated by industry-standard benchmarks.

In a single query, customers can query transactional data in the MySQL database and combine it with data in the object store using standard MySQL syntax. Oracle also announced new MySQL Autopilot capabilities that improve performance and make the service easier to use. MySQL HeatWave Lakehouse is now available in beta for customers to try and is slated for general availability in the first half of calendar year 2023.

Customers migrating from AWS, Google, and on-premises have been using MySQL HeatWave for a broad set of use cases including marketing analytics, particularly real-time analysis of advertising campaign performance and customer data analytics to build effective campaigns. Migrating AWS customers include leaders in the automotive, telecommunications, retail, high-tech, and healthcare industries.

Oracle is also publishing new lakehouse benchmarks and introducing several capabilities for MySQL HeatWave Lakehouse and MySQL Autopilot. Oracle customers can try MySQL HeatWave free for 30 days.

Benchmarks

As demonstrated by a publicly available 400 TB TPC-H benchmark, with scripts available on GitHub, the query performance of MySQL HeatWave Lakehouse is:

  • 17X faster than Snowflake
  • 6X faster than Amazon Redshift

Loading data from object store into MySQL HeatWave Lakehouse is also significantly faster. For a 400 TB TPC-H workload, load performance of MySQL HeatWave Lakehouse is:

  • 8X faster than Amazon Redshift
  • 2.7X faster than Snowflake
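
Because all four figures are stated relative to MySQL HeatWave Lakehouse, they also imply how Snowflake and Redshift compare with each other on the same workload. The short Python sketch below only does that arithmetic on the factors quoted above. The function and variable names are illustrative, and the derived ratios are implied by Oracle's figures rather than published results.

```python
# Speedup factors reported by Oracle for the 400 TB TPC-H workload,
# each expressed as "HeatWave Lakehouse is N times faster than this system".
QUERY_SPEEDUP = {"Snowflake": 17.0, "Amazon Redshift": 6.0}
LOAD_SPEEDUP = {"Amazon Redshift": 8.0, "Snowflake": 2.7}


def implied_ratio(speedups, slower, faster):
    """Return how many times faster `faster` is than `slower`.

    If HeatWave is a times faster than system A and b times faster than
    system B, then B is a / b times faster than A.
    """
    return speedups[slower] / speedups[faster]


# Queries: Redshift relative to Snowflake.
query_ratio = implied_ratio(QUERY_SPEEDUP, "Snowflake", "Amazon Redshift")

# Loading: Snowflake relative to Redshift.
load_ratio = implied_ratio(LOAD_SPEEDUP, "Amazon Redshift", "Snowflake")

print(f"Implied query ratio (Redshift vs. Snowflake): {query_ratio:.1f}x")
print(f"Implied load ratio (Snowflake vs. Redshift): {load_ratio:.1f}x")
```

Read this way, Oracle's numbers put Redshift at about 2.8 times faster than Snowflake on queries, and Snowflake at about 3 times faster than Redshift on loading.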

Innovative new capabilities for MySQL HeatWave Lakehouse

New MySQL HeatWave Lakehouse capabilities include:

  • Larger data size, standard MySQL syntax: Customers can query up to 400 TB of data with MySQL HeatWave Lakehouse, and the HeatWave cluster scales to 512 nodes. Customers use standard MySQL syntax for querying the data.
  • Identical performance and compression: MySQL HeatWave offers the same query performance for data stored inside the MySQL database or in the object store, as demonstrated by both 10 TB and 30 TB TPC-H benchmarks. The amount of compression achieved and the amount of data that can be processed per node are the same in both instances.
  • Support for multiple file formats: With MySQL HeatWave Lakehouse, customers can load and process data stored in a variety of file formats, such as CSV and Parquet, as well as Aurora and Redshift backups from AWS. This enables customers to leverage the benefits of MySQL HeatWave even when their data is not stored inside a MySQL database. The query performance is the same regardless of the file format in which the data is stored.
  • Ability to query data in MySQL and combine it with data in the object store: With MySQL HeatWave Lakehouse, customers can query their OLTP data stored inside the MySQL database and combine it with data stored in the object store. Any change made to the OLTP data is updated in real time and reflected in the query result.

New MySQL Autopilot capabilities

MySQL Autopilot provides machine learning-based automation for MySQL HeatWave. Existing MySQL Autopilot capabilities such as auto-provisioning and auto query plan improvement have been enhanced for MySQL HeatWave Lakehouse, reducing database administration overhead and improving performance.

New MySQL Autopilot capabilities include:

  • Auto schema inference: Autopilot automatically infers the mapping of the file data to datatypes in the database. As a result, customers don’t need to manually specify the mapping for each new file to be queried by MySQL HeatWave Lakehouse.
  • Adaptive data sampling: Autopilot intelligently samples portions of files in object storage, collecting accurate statistics with minimal data access. MySQL HeatWave uses these statistics to generate and improve query plans, determine the optimal schema mapping, and more.
  • Auto load: Autopilot analyzes the data to predict the load time into MySQL HeatWave, determines the mapping of the datatypes, and automatically generates the loading scripts.
  • Adaptive data flow: MySQL HeatWave Lakehouse dynamically adapts to the performance of the underlying object store. As a result, MySQL HeatWave can get the maximum available performance from the underlying cloud infrastructure, which improves overall performance, price performance, and availability.
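
Oracle has not published how Autopilot infers schemas, so the following is only a toy Python sketch of the general idea behind schema inference from a sample: read a limited number of rows from a CSV file and pick the narrowest column type that fits every sampled value. The type names, the `infer_schema` function, and the sample data are all hypothetical, and reading the first rows of a file is a much simpler strategy than the adaptive sampling described above.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Return the narrowest of BIGINT, DOUBLE, DATE, or VARCHAR fitting all non-empty values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR"

    def all_parse(parser):
        # True only if the parser accepts every sampled value.
        for v in non_empty:
            try:
                parser(v)
            except ValueError:
                return False
        return True

    if all_parse(int):
        return "BIGINT"
    if all_parse(float):
        return "DOUBLE"
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "DATE"
    return "VARCHAR"


def infer_schema(csv_text, sample_rows=100):
    """Map each column name to an inferred type using at most `sample_rows` data rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for i, row in enumerate(reader):
        if i >= sample_rows:
            break
        for name in columns:
            columns[name].append(row[name] or "")
    return {name: infer_type(vals) for name, vals in columns.items()}


# Hypothetical sample file contents, for illustration only.
SAMPLE = (
    "order_id,amount,order_date,region\n"
    "1,19.99,2022-10-18,EMEA\n"
    "2,5,2022-10-19,APAC\n"
)

schema = infer_schema(SAMPLE)
print(schema)
```

A production system would also have to handle conflicting values beyond the sample, nullability, and formats such as Parquet that already carry type metadata.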

Additional enhancements to MySQL HeatWave

Oracle announced a number of other enhancements to MySQL HeatWave, ranging from machine learning to the VS Code plug-in. The in-database ML capabilities of MySQL HeatWave have been further enriched to include support for forecasting models. New ML explanation techniques, optimized for MySQL HeatWave, have been added. Data scientists can now influence various stages of the automated HeatWave ML training pipeline, including the choice of algorithm, feature selection, scoring metric, and explanation technique. HeatWave ML has also been enhanced to allow customers to import ML models into HeatWave.

A new multi-engine Hypergraph query optimizer further improves the performance of complex queries and eliminates the need to specify the join order. Zone maps have been added, which accelerate a broader set of queries with MySQL HeatWave. And the VS Code plug-in for MySQL has been enhanced to support MySQL HeatWave capabilities.
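
For readers unfamiliar with zone maps, the general technique is to keep the minimum and maximum value for each block of rows so that a range query can skip blocks that cannot contain a match. The Python sketch below illustrates that idea on an in-memory list. It is not HeatWave's implementation, and the `Zone`, `build_zone_map`, and `range_scan` names are made up for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Zone:
    start: int      # index of the first row in this zone
    end: int        # index one past the last row in this zone
    min_value: int  # smallest value stored in the zone
    max_value: int  # largest value stored in the zone


def build_zone_map(values: List[int], zone_size: int) -> List[Zone]:
    """Split `values` into fixed-size zones and record each zone's min and max."""
    zones = []
    for start in range(0, len(values), zone_size):
        chunk = values[start:start + zone_size]
        zones.append(Zone(start, start + len(chunk), min(chunk), max(chunk)))
    return zones


def range_scan(values: List[int], zones: List[Zone],
               low: int, high: int) -> Tuple[List[int], int]:
    """Return the values in [low, high] and the number of zones actually scanned."""
    matches: List[int] = []
    scanned = 0
    for zone in zones:
        if zone.max_value < low or zone.min_value > high:
            continue  # no value in this zone can match, so skip it entirely
        scanned += 1
        matches.extend(v for v in values[zone.start:zone.end] if low <= v <= high)
    return matches, scanned


# 100 sorted values in 10 zones; the query range overlaps only two of them.
data = list(range(100))
zone_map = build_zone_map(data, zone_size=10)
result, zones_scanned = range_scan(data, zone_map, low=25, high=34)
print(len(result), "rows matched,", zones_scanned, "of", len(zone_map), "zones scanned")
```

The benefit depends on how well the data is clustered: on sorted or nearly sorted columns most zones are skipped, while on randomly ordered data nearly every zone overlaps the query range.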

Ready for the Distributed Cloud

MySQL HeatWave is available in multiple clouds including OCI, AWS, and now Microsoft Azure. It’s available on-premises as part of OCI Dedicated Region for organizations that prefer not to move their database workloads to the public cloud. Customers can also replicate data from their on-premises MySQL OLTP applications to MySQL HeatWave to obtain near-real-time analytics. MySQL HeatWave is always on the latest version of the MySQL database.
