Federating Data Products during migration

  • Levi Gibson

    Cloud Alliances Sales Director

    Starburst

Organizations interested in adopting Data Mesh strategies have found success with Starburst and Google Cloud Platform (GCP), which together simplify migrations and connect GCP to hybrid and multi-cloud datasets.

Enterprises wanting to modernize to cloud data storage face two complex tasks: staging data migrations and, perhaps more complicated, moving and rebuilding the analytical business logic (the data pipeline) that connects existing BI and data science tools. Data Products are becoming critical to business transformation, so allowing Data Products to federate across cloud and on-prem sources, both during and after migration, is essential.

Starburst is built on Trino (formerly PrestoSQL), an open-source, distributed SQL query engine that provides lightning-fast access to all of an organization's data, no matter where it lives. Starburst addresses both challenges with a data consumption layer that keeps analytics workloads running seamlessly during migration and eliminates the need to reconstruct data pipelines on cloud data lake storage. This query fabric layer abstracts the datasets being queried from the underlying physical datasets and storage, so even as the location of the physical data changes during a migration, queries continue to run against the same virtual dataset without disruption.
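
To make this concrete, here is a minimal sketch of a federated Trino SQL query that joins data still sitting on-premises with data already landed in Google Cloud. The catalog, schema, and table names (teradata_onprem, bigquery_gcp, sales.orders, crm.customers) are hypothetical placeholders rather than a real deployment:

    -- Join on-prem Teradata data with data already migrated to BigQuery
    -- in a single Trino SQL statement.
    SELECT o.order_id,
           o.order_total,
           c.customer_name
    FROM teradata_onprem.sales.orders AS o      -- still on-premises
    JOIN bigquery_gcp.crm.customers AS c        -- already in Google Cloud
      ON o.customer_id = c.customer_id
    WHERE o.order_date >= DATE '2022-01-01';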

Starburst alleviates the pain of data migration by creating a SQL-based “query fabric” over the top of all data sources that acts as a single point of access; we call this the data consumption layer. The data consumption layer allows the underlying data locations, formats, and technologies to change without users needing to be aware. IT can concentrate on moving the data while users continue to get fast, reliable, accurate, and timely data without interruption. As a result, federated data products can consume data with no downtime and no change in user experience during a migration.
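
One way to picture the consumption layer is as a set of views that consumers always query, while the view definitions are repointed as the physical data moves. The catalogs and names below (consumption, teradata_onprem, bigquery_gcp) are illustrative assumptions, not a prescribed Starburst setup:

    -- Before migration: the virtual dataset reads from on-prem Teradata.
    CREATE OR REPLACE VIEW consumption.analytics.orders AS
    SELECT order_id, customer_id, order_total, order_date
    FROM teradata_onprem.sales.orders;

    -- After this table's migration phase: repoint the view to BigQuery.
    -- Dashboards and pipelines querying consumption.analytics.orders
    -- keep working without any change.
    CREATE OR REPLACE VIEW consumption.analytics.orders AS
    SELECT order_id, customer_id, order_total, order_date
    FROM bigquery_gcp.sales.orders;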

“Businesses are migrating their assets into leading cloud data services like Google’s BigQuery, and they are building diverse data products that integrate data from multiple cloud and on-premise locations. 

“On the IT side, we want to accelerate the migration of key datasets while managing a cost-efficient and secure process. On the business side, we want to minimize or avoid any disruption. The business wants their existing dashboards, applications, and active data projects to continue to run without interruption. Where the data resides, in the cloud or on-premise, should be totally abstracted from the business.

“Improved performance, compliance, reliability, and accessibility are the only changes the business should notice when their data is moved to Google Cloud. Starburst enables business teams to maintain the data views they want, even while IT teams physically move data assets on the back end,” said Adrian Estala, VP, Field CDO at Starburst. “Customers are using Starburst’s 50 enterprise connectors to connect BigQuery with data no matter where it lives.”
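
In practice, a connector like the ones referenced above is configured as a catalog on the Starburst or Trino cluster. The sketch below uses the open-source Trino BigQuery connector's property names, and the project ID and credentials path are placeholder assumptions:

    # etc/catalog/bigquery_gcp.properties (placeholder values)
    connector.name=bigquery
    bigquery.project-id=my-gcp-project
    bigquery.credentials-file=/etc/starburst/bigquery-service-account.json

Once the catalog is loaded, BigQuery tables become addressable in SQL as bigquery_gcp.<schema>.<table>, alongside every other configured source.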

Recently, a multinational telecommunications company undertook a large-scale migration from Teradata to Google Cloud. The migration roadmap was phased over three years, and the company also had specific security and data sovereignty rules to contend with, making the entire process even more challenging.

With Starburst, the company was able to migrate from Teradata to Google Cloud in phases without experiencing any disruption, while continuing to query the migrated data and the remaining on-prem Teradata data together. Even better, the company deployed Google Cloud by region, and Starburst’s Stargate technology enabled connectivity to the data within those regions.
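
As a rough sketch of the regional pattern, a central Starburst cluster can expose a remote regional cluster through a Stargate catalog, so region-bound data stays in place while still being queryable. The host, catalog, and credential values below are invented placeholders, and the property names are assumptions based on Starburst's published Stargate connector documentation:

    # etc/catalog/emea_region.properties (illustrative placeholders)
    connector.name=stargate
    connection-url=jdbc:trino://starburst-emea.internal.example:443/bigquery_emea
    connection-user=svc_stargate
    connection-password=********

A query on the central cluster can then reference emea_region.<schema>.<table> as if the regional data were local, which helps keep data subject to sovereignty rules inside its region.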

Download the eBook: Accelerating migrations with a data mesh framework