Skip to main content

Benchmarking Results: Internal vs External Table in Oracle Autonomous Data warehouse with 1 Billion Records

In this benchmarking test, the objective was to compare the query performance between an internal relational table and an external table (created from a Parquet file in OCI Object Storage) in Oracle ADW

The Parquet file contained 1 billion records, and the same data was loaded into both an internal ADW table and an external table using the Parquet file as the data source.


Queries were executed using SQL Developer

SET TIMING ON;

 

-- Query to count rows in the internal table

SELECT COUNT(*) FROM internal_table;

 

-- Query to count rows in the external table (Parquet-based)

SELECT COUNT(*) FROM BIG_EXTERNAL_TABLE;


Execution Plans

Internal Table Execution Plan:

•     Operation: TABLE ACCESS STORAGE FULL — Oracle retrieves data directly from its optimized storage.

•     resourceCost: 4329

•     Result Cache: Enabled, allowing Oracle to cache the result for faster retrieval in subsequent queries.

•     Parallelism: Automatic Degree of Parallelism (DOP) set to 12.


External Table Execution Plan:

•     Operation: EXTERNAL TABLE ACCESS STORAGE FULL — Oracle retrieves data from OCI Object Storage, involving more overhead.

•     resource Cost: 16785 (significantly higher than the internal table due to accessing external storage).

•     Result Cache: Not enabled, so each query requires fetching data from Object Storage without caching.

•     Parallelism: Automatic DOP set to 12, similar to the internal table.



Key differences between this 2 strategy


(1)in terms of result cache - internal table uses result cache but external tables does not. Result caching typically improves performance especially for repeated queries

(2)as per sql plan - the overall resource cost was much higher for external parquet table - external table query requires more resources

(3)internal table uses - TABLE ACCESS STORAGE FULL on an Oracle-managed storage system, while the external table uses EXTERNAL TABLE ACCESS STORAGE FULL, accessing Object Storage, which involves more overhead


Timing Results

•     Internal Table: 0.1 seconds

The internal table query was fast, completing in just 0.1 seconds. This speed is largely due to the result caching and the optimized ADW storage.

•     External Table: 12-15 seconds

The query on the external Parquet table took considerably longer, ranging from 12 to 15 seconds. This is because the data had to be retrieved from OCI Object Storage, and the lack of result caching added to the latency. 

Comments

Popular posts from this blog

AWR Made Easy - Part 1: The Basics

Intro Reading and interpreting Automatic Workload Repository (AWR) reports can often seem like a daunting task, especially considering the sheer volume of information that they contain. While setting up basic monitoring and alarms for your Oracle database via the OCI console is recommended, there will come a time when a deeper understanding and interpretation of the AWR report will be required for advanced analysis. In this first part of our three-part blog series on AWR reports, we will start slow and cover some basic aspects of AWR analysis. Elapsed Time vs DB Time The first thing you should look at in your AWR report is the difference between elapsed time and DB time.High elapsed time often indicates high concurrency, where multiple sessions are running concurrently, each taking a fraction of the total time. This scenario is quite common in environments with multiple cores running in parallel, such as Exadata Cloud Service (ExaCS). Inefficient SQL queries often come to the surface h...

Mastering High Availability and Disaster Recovery in Oracle ATP Databases

Introduction In today's fast-paced digital world, High Availability and Disaster Recovery (HA/DR) are not just technical jargon but essential strategies for safeguarding your data and business continuity. If you're using Oracle Autonomous Transaction Processing (ATP) databases, you're in luck. Oracle offers an array of HA/DR options, which can be easily managed from the ATP console itself. In this blog post, we'll delve into these options, their costs, and key performance indicators like Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Most importantly, we'll guide you through best practices for setting up and testing these HA/DR configurations. Your HA/DR Options in Oracle ATP - At a Glance Here's a summary table outlining the available HA/DR options for Oracle ATP databases: Best Practices for HA/DR Setup 1. Evaluation Before implementing any HA/DR solution, perform a thorough assessment of your business needs, budget, and risk tolerance. The ...

15 Reasons You Should Consider Oracle Cloud for Your Oracle Database Workloads

Whether you're a startup or a multinational corporation, the cloud has become a critical part of any business's IT strategy. When it comes to hosting Oracle Database workloads, Oracle Cloud Infrastructure (OCI) provides a compelling suite of benefits. Here are 15 reasons why you should consider OCI for your Oracle Database needs: 1. Unique Oracle Database Features: OCI provides exclusive access to features such as Oracle Autonomous Database, Oracle Real Application Clusters (RAC), and Oracle Maximum Availability Architecture (MAA). These unique capabilities can dramatically enhance your database efficiency and reliability. 2. Seamless Migration with Zero Downtime: OCI's Zero Downtime Migration (ZDM) feature ensures a seamless transition from your on-premises databases to the cloud, with no service interruption. 3. Exadata on Cloud: Leverage the power of Oracle Exadata, a pre-configured, pre-tested, and optimized platform for Oracle Database workloads, in the cloud. It offer...