Introducing external data sharing in Starburst Galaxy

  • Vishal Singh

    Head of Data Products

    Starburst

Ten years ago, organizations were focused on collecting more data: data on their users' behavior, the market, their competitors, and so on. Today, the challenge lies in analyzing all of that data. Truly data-driven organizations not only analyze the data they own but also incorporate third-party data sets (from partners, vendors, etc.) to make informed decisions.

However, getting this external data into a secure, shareable format places a heavy burden on both the data provider (the data producer) and the data receiver (the data consumer). The process is often manual, error-prone, and time-intensive because of incompatible systems and siloed structures.

That’s why we are excited to announce the private preview of secure, external data sharing in Starburst Galaxy. 

The challenges of external data sharing

Cross-organizational data sharing is complex. The easiest way to understand the complexities involved is to break them into three key components:

  1. Technology limitations
  2. Error handling
  3. Security 

The primary limitation of many data sharing systems comes down to the legacy architectures they are built on. Many organizations have outdated or incompatible systems and are using antiquated tooling like SFTP. The absence of standardized protocols between organizations complicates the integration even further, requiring teams from both sides to spend time documenting and mapping out processes.

Second, whenever data moves between different systems, there’s a risk of corruption or alteration that compromises its accuracy and reliability. Sharing data externally compounds this risk, making the data even more error-prone. Teams must not only watch for these errors but also build monitoring and debugging systems to catch and resolve data quality issues as they occur. Otherwise, the organization risks making decisions based on corrupted data.
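
To make the integrity risk concrete, here is a minimal sketch of the kind of check teams often end up writing when files move between systems: the producer publishes a SHA-256 digest alongside the data, and the consumer recomputes it on receipt. The function names and sample payload are hypothetical and are not part of Starburst Galaxy.

```python
import hashlib


def sha256_digest(payload: bytes) -> str:
    """Return the hex SHA-256 digest of a payload."""
    return hashlib.sha256(payload).hexdigest()


def verify_transfer(received: bytes, expected_digest: str) -> bool:
    """True only if the received bytes match the digest the producer published."""
    return sha256_digest(received) == expected_digest


# Hypothetical example data: a small CSV export from the producer.
original = b"customer_id,region\n1,EMEA\n2,APAC\n"
published_digest = sha256_digest(original)

# Simulate a single value being altered in transit.
corrupted = original.replace(b"EMEA", b"EMEB")

intact_ok = verify_transfer(original, published_digest)      # digest matches
corrupted_ok = verify_transfer(corrupted, published_digest)  # digest differs
```

A check like this only detects that something changed; locating and fixing the cause is the monitoring and debugging work described above.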

Finally, teams must spend time putting the proper security measures in place, such as access controls and encryption, to prevent unauthorized access to the data.

External data sharing in Starburst Galaxy

With the introduction of external data sharing in Starburst Galaxy, organizations can now share data with third parties directly from the source, eliminating the typical pains of the data sharing process described above.

At a high level, data sharing in Galaxy allows a user in environment A (the data producer’s Galaxy domain) to establish a secure connection to environment B (the data consumer’s Galaxy domain), and then share a data product from environment A to B. 

 

External data sharing also works with the Gravity layer in Starburst Galaxy, which includes features like attribute-based access control (ABAC) and data observability. This means that organizations can:

  • Maintain robust security protocols, including encryption and fine-grained access controls, for secure external access
  • Quickly resolve data quality issues with built-in data observability features like data lineage
  • Share data without physically relocating it
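
For readers new to attribute-based access control, the sketch below illustrates the general idea: an access decision is computed from attributes of the requester and tags on the data, rather than from a fixed list of users. The attribute names, the rule itself, and the `is_allowed` helper are hypothetical; they do not reflect Galaxy's policy syntax or API.

```python
from dataclasses import dataclass, field


@dataclass
class Principal:
    name: str
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass
class DataProduct:
    name: str
    tags: set[str] = field(default_factory=set)


def is_allowed(principal: Principal, product: DataProduct) -> bool:
    """Hypothetical rule: only partners may read shared products, and
    products tagged 'pii' additionally require a 'pii_cleared' attribute."""
    if principal.attributes.get("org_type") != "partner":
        return False
    if "pii" in product.tags:
        return principal.attributes.get("pii_cleared") == "true"
    return True


# Hypothetical consumers and data products.
analyst = Principal("analyst@consumer.example", {"org_type": "partner"})
auditor = Principal(
    "auditor@consumer.example",
    {"org_type": "partner", "pii_cleared": "true"},
)
sales = DataProduct("regional_sales", {"aggregated"})
customers = DataProduct("customer_profiles", {"pii"})

analyst_reads_sales = is_allowed(analyst, sales)
analyst_reads_customers = is_allowed(analyst, customers)
auditor_reads_customers = is_allowed(auditor, customers)
```

Because the decision depends on attributes, granting a new external user access is a matter of assigning the right attributes rather than editing per-user permissions.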

To get a feel for the in-product experience, follow along with this short demo:

How to get started

As of today, we’re launching our private preview program. If you’d like to join our early testers and help shape our data sharing roadmap, apply here.