Starburst Data Lakehouse, Powered by Iceberg

The performance of a data warehouse. The scale of a data lake.

Start building your open data lakehouse.

Change the Economics of Your Data Architecture

Fast

Query petabyte-scale data lakes fast with a massively parallel processing (MPP) query engine

Efficient

Reduce costs while adding elasticity to your architecture by separating storage and compute

Reliable

Bring the simplicity of SQL tables to big data with the Apache Iceberg table format

How it works

Land

Starburst works directly with your cloud object storage. Start by choosing where you want to store your data – Amazon S3, Azure Data Lake Storage, or Google Cloud Storage. Gather and land that data in your object storage location of choice.

Structure

Connect Starburst Galaxy to your object storage and transform your raw data files into Iceberg tables. Or, leverage Galaxy’s built-in ingestion solution to automatically stream data into your data lake and land it in Iceberg format.
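
As a rough illustration of the Structure step, the sketch below builds a Trino-style CREATE TABLE AS SELECT statement that copies raw landed data into an Iceberg table. It assumes the raw files are already queryable as a table and that the target catalog uses the Iceberg connector. The helper name build_iceberg_ctas and the table names raw.landing.orders_csv and lakehouse.sales.orders are hypothetical, and this is not a description of how Galaxy performs the conversion internally.

```python
from typing import Optional, Sequence


def build_iceberg_ctas(
    source: str,
    target: str,
    partition_by: Optional[Sequence[str]] = None,
    file_format: str = "PARQUET",
) -> str:
    """Build a CTAS statement that writes `source` into an Iceberg table.

    `format` and `partitioning` are table properties of the Trino Iceberg
    connector; all table and column names passed in are the caller's own.
    """
    props = [f"format = '{file_format}'"]
    if partition_by:
        # Each entry is a column name or a partition transform such as day(order_ts).
        quoted = ", ".join(f"'{p}'" for p in partition_by)
        props.append(f"partitioning = ARRAY[{quoted}]")
    return (
        f"CREATE TABLE {target}\n"
        f"WITH ({', '.join(props)})\n"
        f"AS SELECT * FROM {source}"
    )


# Hypothetical names, for illustration only.
sql = build_iceberg_ctas(
    source="raw.landing.orders_csv",
    target="lakehouse.sales.orders",
    partition_by=["day(order_ts)"],
)
print(sql)
```

The helper only produces SQL text; running it requires a connection to a cluster through whichever SQL client you already use.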

Consume

Curate gold-standard tables that are ready to be queried using SQL or Python data pipelines. Apply data governance with built-in access controls so that each data set is exposed with the appropriate level of access, such as read-only.

Optimize

Automatically run common data lake management tasks like compaction, vacuuming, and retention with Starburst Galaxy to ensure optimal performance.
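
For readers who want to see what these maintenance tasks can look like at the SQL level, the sketch below builds the table maintenance commands offered by the open source Trino Iceberg connector: optimize for compaction, plus expire_snapshots and remove_orphan_files for cleaning up old snapshots and unreferenced files. Mapping these commands to the compaction, vacuuming, and retention tasks named above is an assumption made for illustration, not a statement of what Galaxy runs. The helper name build_maintenance_statements, the table name, and the threshold values are likewise hypothetical.

```python
from typing import List


def build_maintenance_statements(
    table: str,
    file_size_threshold: str = "128MB",
    retention: str = "7d",
) -> List[str]:
    """Return Trino Iceberg maintenance statements for one table.

    The threshold values are illustrative defaults, not recommendations.
    """
    return [
        # Compaction: rewrite files smaller than the threshold into larger ones.
        f"ALTER TABLE {table} EXECUTE optimize"
        f"(file_size_threshold => '{file_size_threshold}')",
        # Drop snapshots older than the retention window.
        f"ALTER TABLE {table} EXECUTE expire_snapshots"
        f"(retention_threshold => '{retention}')",
        # Delete files no longer referenced by any snapshot.
        f"ALTER TABLE {table} EXECUTE remove_orphan_files"
        f"(retention_threshold => '{retention}')",
    ]


# Hypothetical table name, for illustration only.
for statement in build_maintenance_statements("lakehouse.sales.orders"):
    print(statement)
```

The statements are plain strings, so they can be reviewed before being submitted through a SQL client or scheduler of your choice.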

Core Platform Capabilities

Ingest data in near real-time

Continuously ingest data from a Kafka-compliant topic into your data lake and land it in Iceberg format with Galaxy’s fully managed streaming ingest.

Automate lakehouse management

Schedule routine maintenance operations on your data lakehouse, like data compaction, vacuuming, and data retention, directly from the Galaxy UI.

Unify on a single, powerful platform

Build your lakehouse in Galaxy to leverage powerful platform capabilities like built-in access controls, data products, and more.

Available on all three major clouds

Deploy Starburst Galaxy across all three major clouds and leverage cross-cloud connectivity to unify your data.

“The move to Starburst and Iceberg has resulted in a 12x reduction in compute costs versus our previous data warehouse. This efficiency allows us to focus our attention on using analytics for revenue-generating opportunities.”

– Peter Lim, Sr. Data Engineer, Yello

“We moved from a monolithic Snowflake approach to a decentralized approach with Starburst and Iceberg. Now we can skip the data warehouse step completely, and complete analytics on the data right where it sits.”

– Lutz Künneke, Director of Engineering, BestSecret

“The combination of Starburst Galaxy and Apache Iceberg offers exceptional value, delivering far more for the same investment. It’s a clear win for efficiency and productivity in our data-driven environment.”

– Johni Michels, Data Team Lead, Kovi

“The time it took from reporting Trino + Iceberg bugs to deployed fixes was very fast and demonstrated the commitment Trino has to being a leader in Iceberg adoption across the various compute engines.”

– Marc Laforet, Senior Software Engineer, Shopify

“The Starburst integration and acceleration of Iceberg based on data in S3 was the key selling point for us. [It’s] enabling our vision of a lakehouse based on AWS.”

– Anonymous, Enterprise User in Retail

Resources to learn more

How to migrate your Hive tables to Apache Iceberg

How to optimize your data lake with Iceberg and Trino

Building an open data lakehouse using Trino and Apache Iceberg

Starburst’s mission is to free our customers to see the invisible and achieve the impossible