Powered by Apache Iceberg & Trino

Icehouse - the next era of the data lakehouse

Build your Iceberg data lakehouse on the architecture trusted by the teams at Netflix, Apple and Stripe. The Icehouse provides a familiar, warehouse-like experience on a truly open foundation.

Leader in Big Data Processing & Distribution

Highest User Adoption for Enterprise Big Data Analytics

What is an Icehouse: the newest Lakehouse Architecture

An Icehouse is a specific type of data lakehouse with Trino as the SQL query engine and Apache Iceberg as the table format. At Starburst, we believe that an Icehouse is the only data architecture that provides teams with a familiar warehouse-like experience on a truly open foundation. Starburst is happy to introduce the “Starburst Icehouse,” providing automations to Starburst Galaxy that will continue to make it easier for teams to build and manage an Icehouse architecture, including specific Icehouse features like data ingestion, data governance, data management and automatic capacity management.

The Next Era of the End-to-End Data Lakehouse

Data Warehouse vs. Data Lakehouse

Understand the benefits of an Icehouse and see how you can get warehouse-like performance at a lower cost.

|  | Data Warehouse | Data Lakehouse |
| --- | --- | --- |
| Data Volume | TBs-PBs | PBs+ |
| Data Types | Structured | All (Structured, Semi-Structured, Unstructured) |
| Performance | High | High |
| Data Quality | High | High |
| Cost | $$$ | $$ (Separation of Storage & Compute) |
| Open | No | Yes |
| Use Cases | Business Intelligence & Reporting; Workloads | Business Intelligence & Reporting; Data Applications; Data Science; Machine Learning |

The Benefits of Adopting an Icehouse Architecture Built With Trino & Apache Iceberg

Optimized for Big Data Analytics

Improve the scalability and responsiveness of your architecture with the lakehouse that’s proven at petabyte scale

Get Speed Without Increased Costs

Achieve data warehouse performance with a more scalable architecture without the added costs

Leverage Cutting Edge Innovation

Adopt the technology that revolutionized Netflix, Apple, Shopify and Stripe

Comparing Icehouse Data Tables

Why Apache Iceberg is the best table format vs. Databricks Delta Lake or Apache Hive

|  | Apache Iceberg | Databricks Delta Lake | Apache Hive |
| --- | --- | --- | --- |
| Transaction Support (ACID Compliance) | Full | Full | Only w/Hive ACID |
| File Format | Parquet, ORC, Avro | Parquet | Parquet, ORC, Avro |
| Schema Evolution | Full | Limited (Only supports adds/reorders of columns) | Limited (No guarantees of correctness) |
| Partition Evolution | Yes | No | No |
| Versioning | Yes | Yes | No |
| Time Travel | Yes | Yes | No |
| Materialized Views | Yes | No | Yes |
| Community & Ecosystem | Growing | Growing | Established |
| Integrations | Interoperable | Tight integration with Databricks | Interoperable |
| Use Cases | General purpose data lakehouses | Optimized for Databricks data lakehouses | General purpose data lake w/limited DML support |
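
Several rows in the comparison above (file format, partition evolution, versioning, time travel) correspond to SQL statements that Trino's Iceberg connector accepts. The following is a minimal sketch in Python that only builds the SQL text; it does not connect to a cluster. The catalog, schema, table, and column names are hypothetical, and the statements should be checked against the Trino version you run.

```python
# Sketch only: builds Trino SQL text for an Iceberg table.
# Nothing is executed; all catalog, schema, table, and column names are hypothetical.

TABLE = "iceberg.sales.orders"


def create_table_sql(table: str = TABLE) -> str:
    """Create an Iceberg table stored as Parquet and partitioned by day."""
    return (
        f"CREATE TABLE {table} (\n"
        "    order_id BIGINT,\n"
        "    customer_id BIGINT,\n"
        "    order_ts TIMESTAMP(6),\n"
        "    total DECIMAL(12, 2)\n"
        ")\n"
        "WITH (format = 'PARQUET', partitioning = ARRAY['day(order_ts)'])"
    )


def evolve_partitioning_sql(table: str = TABLE) -> str:
    """Partition evolution: change the partition spec used for newly written data."""
    return (
        f"ALTER TABLE {table} "
        "SET PROPERTIES partitioning = ARRAY['month(order_ts)']"
    )


def snapshots_sql(table: str = TABLE) -> str:
    """Versioning: list table snapshots through the $snapshots metadata table."""
    catalog, schema, name = table.split(".")
    return (
        f'SELECT snapshot_id, committed_at FROM {catalog}.{schema}."{name}$snapshots" '
        "ORDER BY committed_at"
    )


def time_travel_sql(snapshot_id: int, table: str = TABLE) -> str:
    """Time travel: query the table as of a given snapshot ID."""
    return f"SELECT count(*) FROM {table} FOR VERSION AS OF {snapshot_id}"


if __name__ == "__main__":
    for statement in (
        create_table_sql(),
        evolve_partitioning_sql(),
        snapshots_sql(),
        time_travel_sql(1234567890),  # placeholder snapshot ID
    ):
        print(statement + ";\n")
```

The generated statements can be pasted into any Trino client. A snapshot ID returned by the `$snapshots` query is what `FOR VERSION AS OF` expects.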

Create the Hyperscale architecture you have always dreamed of

Data warehousing solutions simply can’t scale to big data

  • Storage: Separation of storage and compute that supports independent scaling
  • Processing: Trino is a massively parallel processing (MPP) engine that supports high concurrency
  • Table Format: Iceberg is built for cloud storage, with decoupled metadata that supports large tables

Achieve industry-leading price performance for SQL workloads

  • Performance: Achieve the same performance as your data warehouse while optimizing your spend
  • Costs: Expect 4X cost savings over time
  • Risk: Eliminate the risk of restricted access to your data, with none of the lock-in of a data warehouse

Use a familiar SQL interface

The same SQL interface you’re used to working with

  • Support: Ensure you have the right support for DML statements and your table needs
  • Compliance: ACID compliance ensures that database transactions complete reliably
  • Schema Evolution: Built-in schema evolution lets you modify your tables without disruption
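
As an illustration of the DML and schema evolution points above, here is a small sketch in Python that assembles the corresponding Trino SQL statements as text. It is not taken from Starburst documentation; the table and column names are hypothetical, and nothing is run against a cluster.

```python
# Sketch only: Trino SQL text for DML and schema evolution on an Iceberg table.
# Table and column names are hypothetical; no statement is executed here.

TARGET = "iceberg.sales.customers"
STAGING = "iceberg.sales.customers_staging"


def upsert_sql(target: str = TARGET, source: str = STAGING) -> str:
    """DML: merge staged rows into the target table in a single statement."""
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        "ON t.customer_id = s.customer_id\n"
        "WHEN MATCHED THEN UPDATE SET email = s.email\n"
        "WHEN NOT MATCHED THEN INSERT (customer_id, email) "
        "VALUES (s.customer_id, s.email)"
    )


def schema_evolution_sql(target: str = TARGET) -> list[str]:
    """Schema evolution: add, rename, and drop columns on the existing table."""
    return [
        f"ALTER TABLE {target} ADD COLUMN loyalty_tier VARCHAR",
        f"ALTER TABLE {target} RENAME COLUMN email TO contact_email",
        f"ALTER TABLE {target} DROP COLUMN loyalty_tier",
    ]


if __name__ == "__main__":
    print(upsert_sql() + ";\n")
    for statement in schema_evolution_sql():
        print(statement + ";")
```

Each `ALTER TABLE` statement is issued separately, which is why the sketch returns them as a list.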

The perfect data architecture without the hassle

All on a fully-managed platform with end-to-end data pipeline support from ingestion to data sharing

  • Ingestion: Unreliable data ingestion undermines data accuracy and complicates analysis, which ultimately makes the data itself unreliable.
  • Governance: Simplify data governance. With the right architecture, you can eliminate the need to integrate a whole separate governance system.
  • Analyze: Execute SQL queries on Iceberg tables fast with advanced performance optimization tools

An Icehouse is a new data architecture composed of Trino, Apache Iceberg & SQL that will help you lower costs and increase speed and data accuracy.

Activate your data lake today with Starburst