Redefine tables within object storage connectors like Iceberg and Delta Lake without any downtime
Share
More deployment options
Since Trino 431 release, support for CREATE [OR REPLACE] TABLE
statements was added. The relevant issue is available here.
What is a CREATE [OR REPLACE] TABLE statement?
This new capability allows you to redefine tables within object storage connectors like Iceberg and Delta Lake. It enables atomic table replacements, ensuring that concurrent queries always access a valid version of the table. This guarantees seamless transitions and uninterrupted data availability.
Why is CREATE [OR REPLACE] TABLE important?
- Continuous Availability: BI tools and ETL processes won’t be impacted by updating tables as all tables continue to exist without any downtime
- Historical Data Retention: Table history is preserved for auditing purposes and easy rollback. to previous versions if needed.
- Enhanced Reliability: Transactions are managed atomically, providing consistency and reliability during table updates.
Let’s take a closer look below
Table History
The CREATE [OR REPLACE] TABLE
statement allows specific object storage connectors (currently Iceberg and Delta Lake) to set a new version of the table with totally different data and structure while retaining the history of the table.
Atomic table replacement
This change leverages these specific object storage connectors’ support for transactions (Iceberg and Delta), permitting atomic table replacements. This means that any queries concurrent with the CREATE OR REPLACE TABLE
operation use either the old or new table version.
Let’s go through an example (we are using Iceberg here, because we believe Iceberg is the table format that is the most easy to use and evolve). When creating a new table or replacing an existing one, the operation is performed atomically. This means your data remains consistent, and users can query the old or new table version seamlessly.
CREATE TABLE Example
Creating a new table is simple:
CREATE TABLE my_table AS SELECT 'test' a, DOUBLE '38.5' b;
Now let’s overwrite this table with new data and a totally different table definition:
CREATE OR REPLACE TABLE my_table AS SELECT BIGINT '-42' a, DOUBLE '38.5' c;
The last statement will atomically replace the table definition with the new table definition without impacting reads on the first version.
We can now check the versions (snapshots in Iceberg lingo) of this table by executing the following query:
SELECT * FROM "my_table$snapshots" ORDER BY committed_at DESC
Let’s use the snapshot id from the first version to retrieve the data as of that time (SELECT 'test' a, DOUBLE '38.5' b
).
SELECT * FROM my_table FOR VERSION AS OF <snapshot_id>
If you like this old version better than the current one, you can easily roll back to the earlier version by executing the following statement, which will atomically replace the current state with the old state:
CALL <catalog_name>.system.rollback_to_snapshot('<schema_name>', 'my_table', <snapshot_id>)
Voila, now the data in my_table is rolled back to the desired state.
How is CREATE [OR REPLACE] TABLE statement relevant to dbt?
One tool which especially benefits from implementing CREATE [OR REPLACE] TABLE
is the Trino dbt adapter – dbt-trino.
During the dbt run command, SQL queries representing dbt models are executed against Trino. If a table representing a certain dbt model already exists, particular actions are taken depending on materialization type and model configuration.
To take advantage of CREATE [OR REPLACE] TABLE
in Trino, we added a new mode to table materialization in dbt-trino 1.7.1 – replace.
Until now, there were two modes available in table materialization for handling existing tables:
- `rename` — Creates an intermediate table, renames the target table to the backup one, and renames the intermediate table to the target one.
In case of error during CREATE TABLE
operation we still have the original table. Any reads on the older table will fail when the table is dropped. There is a short period of time when the table is not available. Table history is also lost, as we are creating a new table.
- `drop` — Drops and re-creates a table.
Table is unavailable until CREATE TABLE
finishes. This is unsafe operation, as after dropping a table we could encounter an error during table re-creation, and end up with a missing data.
`replace` is superior to both of the old modes, as it safely replaces a table ‘in place’ without additional overhead. The table is always ready to read, without downtime, and we retain its history.
Documentation about `on_table_exists` modes is available here
To better visualize the time when a table is available vs. unavailable, let’s look at the chart below:
Code example of table materialization and `rename` mode usage
Let’s see the difference between `rename` (default) and `replace` (recommended if connector supports it) mode.
First, let’s see how it works with the default mode, ‘rename’
{{
config(
materialized = 'table',
on_table_exists = 'rename'
)
}}
SELECT * FROM {{ ref('some_sample_seed') }}
Next, let’s see which queries are executed by running `dbt run` after the table has been created:
-- creates new table with __dbt_tmp suffix
create table "iceberg"."default"."cortas_example__dbt_tmp"
...
-- renames old table - adds __dbt_backup suffix
alter table "iceberg"."default"."cortas_example" rename to "iceberg"."default"."cortas_example__dbt_backup"
-- renames new table - removes suffix
alter table "iceberg"."default"."cortas_example__dbt_tmp" rename to "iceberg"."default"."cortas_example"
-- drops old table
drop table if exists "iceberg"."default"."cortas_example__dbt_backup"
Now change `on_table_exists` config to ‘replace’:
{{
config(
materialized = 'table',
on_table_exists = 'replace'
)
}}
SELECT * FROM {{ ref('some_sample_seed') }}
and execute `dbt run` once again. As we can see, only one CREATE [OR REPLACE] TABLE query was executed instead of 4 queries:
-- one CREATE [OR REPLACE] TABLE query
create or replace table "iceberg"."default"."cortas_example"
If you would like to learn more about dbt-trino, there is a blog titled ‘Building a Lakehouse with dbt and Trino‘ authored by Michiel De Smet and Przemek Denkiewicz. It is a great resource for those looking to familiarize themselves with dbt-trino, providing step-by-step examples that you can try on your own. We highly recommend it if you are new to dbt-trino or haven’t yet explored this blog!
Optimize your data lake with Iceberg and Trino
Explore the future of data lakes through our comprehensive technical how-to whitepaper.