Snowflake has always promised all the power of a data warehouse with the infinite scalability of the cloud. We could call this the “data warehouse in the cloud” model.
In recent years, Snowflake has extended its offering to include data lakes and even data lakehouses in the form of Iceberg managed tables. This shift sees Snowflake join the growing chorus of companies, including Starburst, that are adopting Iceberg as a disruptive technology.
Snowflake compute is expensive
Despite these advancements, Snowflake’s compute cost remains a major concern.
Starburst lowers Snowflake compute costs and enhances openness for users
All this talk of rising costs poses one fundamental question. Is there another way to retain ease of use without the price tag of Snowflake compute? You’ve come to the right place. Disrupting the data industry in favor of openness is what Starburst is all about, and if you’re a Snowflake user suffering from high compute costs, we’ve got you covered.
Enter the Starburst Galaxy Snowflake Catalog Metastore, designed to help Snowflake users reduce compute costs while adding features like data federation. Our new connector allows Snowflake users to leverage their data without incurring high compute costs, bringing openness and flexibility to their data architecture.
Unpacking the Snowflake metastore
Currently, Snowflake allows Iceberg tables to be created in a customer-managed AWS S3 bucket. All Iceberg metadata added to a Snowflake data lakehouse is registered in Snowflake’s proprietary catalog metastore. This approach will be familiar to Databricks users accustomed to that ecosystem’s Unity Catalog: the Snowflake metastore plays a similar role for data held in the Snowflake ecosystem.
Importantly, the Snowflake catalog metastore applies only to customer-managed storage, meaning that users of this approach currently pay for compute costs themselves.
The shift comes from offloading that compute from Snowflake to Starburst.
The example below helps illustrate how this works for most Snowflake users.
How it works
Snowflake Iceberg tables live in customer-managed AWS S3 buckets and are registered in the Snowflake metastore. By offloading compute from Snowflake to Starburst, users can access that data at a significantly lower cost while keeping the benefits of Snowflake’s ecosystem.
Example
Imagine a company using Snowflake with 70% traditional data warehouse tables and 30% Iceberg managed tables. They are locked into Snowflake’s ecosystem and facing high costs. With Starburst Galaxy, they can query the same Iceberg tables at a lower cost, retaining their Snowflake ecosystem while saving money. This allows them to perform the same queries with reduced compute costs and explore additional data sources seamlessly.
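The arithmetic behind this example can be sketched in a few lines of Python. This is a rough illustration, not a pricing model. Only the 70/30 split comes from the example. The monthly spend, the relative engine cost, and the assumption that compute spend divides in the same proportion as the tables are all hypothetical, and none of them are Snowflake or Starburst figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a company's Snowflake compute spend."""
    monthly_compute_cost: float  # assumed total spend, in any currency
    iceberg_share: float         # assumed fraction of spend on Iceberg tables


def cost_after_offload(workload: Workload, relative_engine_cost: float) -> float:
    """Estimate monthly compute cost once Iceberg queries run on another engine.

    relative_engine_cost is a hypothetical ratio: the other engine's cost for
    the same queries divided by the Snowflake cost (0.5 means half the price).
    """
    if not 0.0 <= workload.iceberg_share <= 1.0:
        raise ValueError("iceberg_share must be between 0 and 1")
    if relative_engine_cost < 0.0:
        raise ValueError("relative_engine_cost must not be negative")
    # Warehouse tables stay on Snowflake compute, so their cost is unchanged.
    warehouse_cost = workload.monthly_compute_cost * (1.0 - workload.iceberg_share)
    # Iceberg queries move to the other engine and are scaled by the ratio.
    iceberg_cost = (
        workload.monthly_compute_cost * workload.iceberg_share * relative_engine_cost
    )
    return warehouse_cost + iceberg_cost


# The 70/30 split comes from the example above; the other numbers are made up.
example = Workload(monthly_compute_cost=100_000.0, iceberg_share=0.30)
estimate = cost_after_offload(example, relative_engine_cost=0.5)
print(f"Estimated monthly compute cost after offload: {estimate:,.0f}")
```

Changing `relative_engine_cost` shows how sensitive the saving is to the real price difference, which would need to be measured on actual workloads.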
How Starburst saves Snowflake users money
Snowflake metastore, Starburst compute
The solution is openness: swapping out the Snowflake query engine for Starburst Galaxy. Simply put, Snowflake is not the cheapest way to process an Apache Iceberg workload. This is certainly true for data not held in Snowflake, but it’s also true for data held inside the Snowflake ecosystem.
Users tied to the Snowflake ecosystem can continue using Snowflake Iceberg tables but offload the compute to Starburst. This means that Snowflake users can access the same data lake using Starburst Galaxy, at considerably lower compute cost than with Snowflake.
Open data architecture
This new approach allows users to process data with Starburst Galaxy, avoiding Snowflake’s high compute costs. Users can still perform the same queries but at a reduced cost, benefiting from an open data architecture. A like-for-like swap in compute against the same data source can directly improve a company’s bottom line.
Add data sources using data federation
Starburst Galaxy’s connector not only reduces compute costs but also enables data federation, connecting to various data sources through a rich ecosystem of connectors. This means Snowflake users can easily connect additional data sources, offloading certain workloads to less expensive alternatives. This flexibility allows users to choose the right storage and processing options for their specific needs.
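As a rough sketch of what federation looks like in practice, the Python snippet below builds a single SQL statement that joins an Iceberg table with a table from a second data source. It relies on Trino's `catalog.schema.table` naming for tables. The catalog, schema, table, and column names (`snowflake_iceberg`, `postgres_crm`, `customer_id`, and so on) are hypothetical placeholders, and the snippet only constructs the query text without connecting to any cluster.

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Return a Trino-style fully qualified table name (catalog.schema.table)."""
    parts = (catalog, schema, table)
    for part in parts:
        # Keep the sketch safe: accept only simple identifiers.
        if not part.isidentifier():
            raise ValueError(f"unexpected identifier: {part!r}")
    return ".".join(parts)


def federated_join_sql(left_table: str, right_table: str, key: str) -> str:
    """Build a query counting rows that match across two catalogs."""
    if not key.isidentifier():
        raise ValueError(f"unexpected join key: {key!r}")
    return (
        "SELECT count(*) AS matched_rows\n"
        f"FROM {left_table} AS a\n"
        f"JOIN {right_table} AS b\n"
        f"  ON a.{key} = b.{key}"
    )


# Hypothetical names: an Iceberg table registered in the Snowflake metastore
# and a table in a separate operational database.
orders = qualified_name("snowflake_iceberg", "sales", "orders")
customers = qualified_name("postgres_crm", "public", "customers")
sql = federated_join_sql(orders, customers, key="customer_id")
print(sql)
```

The resulting string could be submitted through any Trino client. The point of the sketch is that both sources appear in one query, so no data needs to be copied between systems first.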
Starburst Galaxy Icehouse architecture powered by Trino
Starburst Galaxy is powered by Trino, an open source query engine originally developed by Facebook to deliver flexible performance with clusters designed for data at any scale. Some of the largest, most complex data on the planet is currently processed by Trino and many users accustomed to other systems report being amazed by the speed and
Accessing your Snowflake data also opens up a world of possibilities to explore Starburst’s own Apache Iceberg tables. In fact, Iceberg was originally developed by Netflix with Trino in mind. The two technologies are naturally intertwined. We call this combination of technologies the data Icehouse architecture, and we think it’s going to remake big data as we know it.
Just like everything else with Starburst, you can approach an Icehouse architecture at your own speed. You’re free to experiment, explore, and optimize your workloads in whatever way makes sense for you, including employing an Icehouse implementation. This total freedom is what really makes Starburst special, and we’re so happy that we’re able to offer compute and federation options to Snowflake data lakehouse users.
Join us
Powered by Trino, Starburst Galaxy offers unparalleled performance and flexibility. It allows Snowflake users to explore Starburst’s own Apache Iceberg tables and the data Icehouse architecture, which pairs Trino with Iceberg, a technology originally developed by Netflix with Trino in mind. This combination is set to transform the big data landscape.
Excited for the open data revolution? Start free today and sign up for our managed Icehouse private preview program.