in subscription revenue
of data ingested daily
Americas
Telco
Teradata, Oracle, Hadoop, AWS
Enterprise
1000+
“When end users are going into on-prem or cloud environments, they will be presented with all the data sets they have access to, irrespective of where the data is located. This offers a huge value to our end users.”
Anonymous
Director of Software and Engineering
This telecommunications company is the leading pay-TV and cable TV company in the United States. With more than 15 million subscribers, this customer retains and ingests tremendous volumes of data across various platforms. The company collects daily viewing data in Hadoop, while subscriber and account information and other data are stored in Teradata, Oracle, and various systems. Extracting intelligence from all this data is critical, as data-driven insights help retain existing customers, offer them new packages tuned to their preferences, and more.
Querying all this data would be simpler if it were stored in one platform, but the customer’s data is distributed across the above-mentioned platforms and, more recently, Amazon Web Services. “We weren’t born in the cloud,” said our customer. “We’ve been around for a long time. We have a lot of customers, and a lot of data we’re collecting across a lot of different systems.”
This case study details why the customer chose to deploy Starburst, how the platform brought them to their ultimate vision of universal data access, and how it helped generate new revenue.
In 2016, the big data team was looking to move data out of Teradata. The system was expensive and underperforming: it was overloaded with thousands of daily reports, and analytics were last in line. Yet the company’s Enterprise Business Intelligence group still depended on this data and accessed it regularly.
Meanwhile, viewing and event data were streaming into their Hadoop platform, and the team often had to join this information with the data in Oracle and Teradata. “If you wanted to do a big data science project you had to build very large ETL jobs and move data between those platforms,” explains our customer. “We didn’t have cross-platform queries.” At one point, the company was moving everything into Hadoop and querying Hadoop directly. Our customer also experimented with Teradata’s QueryGrid, which would connect to the event data in Hadoop. Ultimately, though, neither of these approaches brought them closer to their desired end state: fast cross-platform queries, universal data access, and freedom from any one proprietary data warehouse.
At the same time, the cable company was looking to unlock new revenue streams, and the CMO wanted to experiment with upselling premium subscriptions to the existing subscriber base. The customer’s billing data sits in Teradata, and the user data, consisting of viewer watching patterns, sits in a large Hadoop cluster. The CMO believed that she could identify upsell opportunities for premium subscription packages based on the shows that premium subscribers watched. With the data sitting in two physically separate locations and in two different formats, querying it together and running these analytics was nearly impossible. The cost, complexity, and time required to ETL such large amounts of data were prohibitive.
By deploying Starburst Enterprise as an abstraction layer that operates between the data sources and business intelligence tools, our customer is now able to query data where it resides and move the data they want to move, all without their end users noticing a thing. “Trino was our secret sauce,” said our customer.
The number of Trino queries has rapidly scaled, from 2,000 queries per day in 2016 to over 350,000 queries per day in 2019. “More connections are getting added but the end user experience is not changing,” our customer said. “End users are able to access data no matter where it is. This made a huge impact.”
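To put the two query volumes above in proportion, here is a minimal arithmetic sketch. The function names are illustrative, and treating 2016 to 2019 as three full years of smooth, compounding growth is an assumption made for this example, not something the customer reported.

```python
def growth_factor(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start`."""
    return end / start


def annual_multiplier(start: float, end: float, years: int) -> float:
    """Return the constant yearly multiplier that would turn `start` into `end`.

    Assumes smooth compounding over `years` full years (an assumption
    for illustration; real growth was likely uneven).
    """
    return (end / start) ** (1 / years)


if __name__ == "__main__":
    queries_2016 = 2_000    # queries per day, from the case study
    queries_2019 = 350_000  # queries per day, from the case study

    print(f"Overall growth: {growth_factor(queries_2016, queries_2019):.0f}x")
    print(f"Implied yearly multiplier: "
          f"{annual_multiplier(queries_2016, queries_2019, 3):.2f}x")
```

Under these assumptions the daily query volume grew 175-fold overall, which corresponds to roughly a 5.6x increase per year.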
Performance at scale
Previously, analytics were last in line, and users might have to wait days for results. Starburst Enterprise is built to accommodate many users hitting the system at the same time. Our customer uses multiple Starburst-to-Teradata connectors running in parallel to maximize throughput for analytical workloads. Multiple Teradata servers can talk to multiple clusters and hundreds of nodes on the Trino side, ensuring the performance is just as good as querying Teradata directly.
Federated access
The big data team wanted to empower end users by giving them access to the data no matter where it lives: Teradata, Oracle, Hadoop, or AWS. Similarly, they wanted to ensure that virtual users could access on-prem resources as easily as on-prem users, and vice versa. Starburst Trino makes this possible while simplifying the process for Business Intelligence users, giving them a single point of access to all of this distributed data. They don’t know, and don’t have to know, where the data lives.
Cost savings
One of the drivers behind moving away from proprietary data warehouses was cost. Our customer wanted to avoid vendor lock-in and be able to move data to less expensive platforms without disrupting end users. Starburst Enterprise facilitates this in two ways. First, it is agnostic to the data source and designed to query data wherever it resides. Second, it facilitates ETL and allows organizations to continue accessing and analyzing data while it is being moved, so the company can accomplish its larger goals without disrupting the critical daily work of end users. The customer said that Starburst Enterprise also allows them to minimize their labor costs related to these processes.
Customer 360
With Starburst, the cable company is able to federate data between Hadoop and Teradata. This unlocks access for the marketing team to run their upsell campaign. As a result, they’ve generated over $200 million in revenue within the first few months of launching, and the campaign continues to generate hundreds of millions of dollars in subscription revenue for the business.
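The kind of cross-source join described above can be sketched in a few lines. This is a simplified illustration, not the customer's actual query: SQLite's `ATTACH DATABASE` stands in for a federated engine so the example runs anywhere, and the database names (`billing`, `viewing`), table names, column names, and sample rows are all hypothetical. In Trino, tables from different sources are addressed in a similar qualified style (`catalog.schema.table`), so the shape of the SQL is comparable.

```python
import sqlite3


def build_demo_connection() -> sqlite3.Connection:
    """Create two separate in-memory databases behind one connection.

    `billing` plays the role of the warehouse holding account data and
    `viewing` plays the role of the cluster holding viewing events.
    All names and rows here are hypothetical sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")
    conn.execute("ATTACH DATABASE ':memory:' AS viewing")
    conn.execute(
        "CREATE TABLE billing.accounts "
        "(subscriber_id INTEGER PRIMARY KEY, package TEXT)"
    )
    conn.execute(
        "CREATE TABLE viewing.events (subscriber_id INTEGER, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO billing.accounts VALUES (?, ?)",
        [(1, "premium"), (2, "basic"), (3, "basic"), (4, "premium")],
    )
    conn.executemany(
        "INSERT INTO viewing.events VALUES (?, ?)",
        [(1, "Show A"), (4, "Show A"), (2, "Show A"),
         (3, "Show B"), (1, "Show C")],
    )
    return conn


def upsell_candidates(conn: sqlite3.Connection) -> list[int]:
    """Return basic-package subscribers who watch titles premium subscribers watch."""
    query = """
        SELECT DISTINCT a.subscriber_id
        FROM billing.accounts AS a
        JOIN viewing.events AS e
          ON e.subscriber_id = a.subscriber_id
        WHERE a.package = 'basic'
          AND e.title IN (
              -- Titles watched by at least one premium subscriber
              SELECT e2.title
              FROM viewing.events AS e2
              JOIN billing.accounts AS a2
                ON a2.subscriber_id = e2.subscriber_id
              WHERE a2.package = 'premium'
          )
        ORDER BY a.subscriber_id
    """
    return [row[0] for row in conn.execute(query)]


if __name__ == "__main__":
    connection = build_demo_connection()
    print(upsell_candidates(connection))
```

The point of the sketch is that a single SQL statement joins billing and viewing data that live in separate stores, with no intermediate ETL step to copy one into the other.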
© Starburst Data, Inc. Starburst and Starburst Data are registered trademarks of Starburst Data, Inc. All rights reserved. Presto®, the Presto logo, Delta Lake, and the Delta Lake logo are trademarks of LF Projects, LLC