In today’s data architecture economy, there is no shortage of options when it comes to choosing a distribution and deployment strategy for a given technology. You can deploy many open-source technologies on-prem yourself, or you can run the same cluster on a cloud platform. If you decide on a managed enterprise solution, you can choose between a cloud vendor and a standalone vendor. Each of these choices comes with tradeoffs, but we believe that some tradeoffs have higher payoffs than others.
Trino is no exception. There are plenty of options when choosing a distribution or deployment strategy for Trino. If you are on one of the teams that chose to deploy your own open-source Trino cluster or to use Trino on EMR, this post aims to highlight the advantages of moving to an independent Starburst Enterprise cluster. If you are investigating Trino and weighing open-source Trino, Trino on EMR, and an enterprise Trino distribution like Starburst, this post offers some insights to help lead you in the right direction for your architecture.
One of the great benefits of Trino is that it does not store any persistent data itself. If you are running Trino on EMR, this makes migration much easier than moving between two different databases, where copying your data would be a time-consuming and costly part of the process.
With that, here are the top 10 reasons you should make the move from EMR Trino to Starburst Enterprise:
1. Cloud Platform Agnosticism
Starburst Enterprise (SEP) is available on all the major cloud platforms, including Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform, as well as Red Hat Marketplace. While there can be some benefit to building cloud-native applications, there is a clear advantage to having the flexibility to architect your system so that it can run on any cloud provider. This expands your options if a cloud provider doesn’t operate in the region where you want to deploy your system, if your clients don’t want their data stored with a particular cloud provider, or if you simply want to shop for the lowest price when it comes to paying those monthly cloud bills. Whatever your reasoning, the ability to adjust to new and unknown use cases is always beneficial for the health of your applications.
2. Improved Security
SEP offers a wide range of security features, such as role-based access control, data masking, and encryption, out of the box. Many more security features and configuration options have also been added to the various data source connectors to augment and simplify securing your cluster.
3. Improved Stability
SEP keeps up to date with the latest innovations from open-source Trino through short-term support (STS) releases, and less frequently ships a much more stable long-term support (LTS) release. LTS releases go through an extra, rigorous validation cycle before they are made available to the general public, so you can rest assured that you are using the most stable Trino available. If we find a vulnerability in any of our releases, you have a team of engineers, including the co-creators of Trino, able to quickly fix the issue.
4. Better Performance
While SEP intentionally does not differ from the open-source core engine, the connectors lie at the heart of how SEP attains speedups over any other version of Trino on the market. The cost-based optimizer is only as good as the metadata we feed it when it devises a query plan. With SEP, we provide connectors that expose statistics from the data source, which improve your query plans and ultimately performance. In addition to exposing table statistics, most connectors are upgraded to read in parallel, enabling faster reads and better use of the parallel operators in core Trino. See more about which connectors have these performance enhancements here.
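To make the role of statistics concrete, here is a minimal toy sketch in Python. It is not Trino's or SEP's optimizer; the `TableStats` class, the `choose_build_side` function, and the row counts are hypothetical names and values invented for illustration. It only shows why a planner with row counts can make a better decision (building the join hash table from the smaller table) than a planner with no statistics, which in this sketch simply falls back to the order the query was written in.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TableStats:
    """Hypothetical container for what a connector reports about a table."""
    name: str
    row_count: Optional[int]  # None means the connector exposes no statistics


def choose_build_side(left: TableStats, right: TableStats) -> Tuple[TableStats, TableStats]:
    """Return (build, probe) for a hash join in this toy model.

    With row counts for both tables, build the hash table from the smaller
    one. Without them, fall back to the written order and treat the right
    table as the build side (an assumption of this sketch).
    """
    if left.row_count is None or right.row_count is None:
        return right, left
    if left.row_count <= right.row_count:
        return left, right
    return right, left


# Made-up row counts for illustration only.
nation = TableStats("nation", 25)
orders = TableStats("orders", 1_000_000)

build, probe = choose_build_side(nation, orders)
print(f"with stats: build={build.name}, probe={probe.name}")

# The same join when the connector reports nothing about the small table.
build_blind, probe_blind = choose_build_side(TableStats("nation", None), orders)
print(f"without stats: build={build_blind.name}, probe={probe_blind.name}")
```

In the second call the toy planner ends up hashing the million-row table, which is the kind of avoidable cost that missing statistics can cause.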
5. Enterprise Connectors
While we’re still talking about connectors, SEP also comes with a plethora of enticing connectors that are not available from any other vendor. These include the Snowflake connector and the Delta Lake connector. For those who are anxious about vendor lock-in with any of these platforms, we have you covered. You can run Starburst over a database like Snowflake, which allows you to query and move data seamlessly throughout your system. For a full list of enterprise connectors, check out our documentation here.
6. High Availability
SEP comes with high availability options baked in. While you will typically have multiple workers running in a cluster, a Trino cluster can have only a single coordinator. This makes the coordinator a single point of failure: if it goes down, you lose all availability of the system. SEP has an option to specify the number of backup coordinators to keep available in case of failure. These backups are configured but shut down to minimize costs, and they only come up when the coordinator fails, which minimizes outage time.
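The failover idea described above can be sketched as a small Python model. This is an illustration of the general cold-standby pattern only, not SEP's implementation or configuration; the `Coordinator` and `Cluster` classes and the `handle_health_check` method are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Coordinator:
    name: str
    running: bool = False  # standbys are configured but not running


@dataclass
class Cluster:
    active: Coordinator
    standbys: List[Coordinator] = field(default_factory=list)

    def handle_health_check(self, healthy: bool) -> Coordinator:
        """Return the coordinator that should serve queries after this check."""
        if healthy:
            return self.active
        # The active coordinator failed: stop it and start the next standby.
        self.active.running = False
        if not self.standbys:
            raise RuntimeError("no standby coordinator left to promote")
        replacement = self.standbys.pop(0)
        replacement.running = True
        self.active = replacement
        return replacement


cluster = Cluster(
    active=Coordinator("coordinator-0", running=True),
    standbys=[Coordinator("coordinator-1"), Coordinator("coordinator-2")],
)

# A failed health check promotes the first standby.
serving = cluster.handle_health_check(healthy=False)
print(f"now serving: {serving.name}, standbys left: {len(cluster.standbys)}")
```

Because the standbys stay shut down until they are needed, they cost nothing in compute while idle; the tradeoff in this model is the startup time of the replacement.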
7. Support
You’ve likely dealt with support teams that help with multiple cloud products, EMR Trino being one of them. If you’re experiencing an issue, do you want the jack-of-all-trades, or a team of Trino experts? Starburst offers 24×7 support from a team that includes the overwhelming majority of Trino contributors, among them the co-creators of Trino, the co-founders of Starburst, and early adopters of the project.
8. Trino Roadmap Influence
Because our team contributes much of its effort to the open-source project, we are able to take the experiences of our users and translate their needs into a roadmap for Trino at large. This is a mutually beneficial relationship, as the open-source project grows with the larger community’s needs.
9. Ease of Deployment
SEP offers out-of-the-box Kubernetes deployment, CloudFormation templates, and AMIs, all as options to deploy your cluster in minutes. It’s something you will want to experience yourself after tediously poring over difficult and outdated documentation.
10. Lower Compute and Deployment Costs
With SEP, you’ll save time and cut compute costs with autoscaling. If your organization occasionally needs high concurrency or large-scale data processing, SEP automatically scales up to meet the demand and scales back down once you’re done using those resources. You’ll also save your engineers hours of building security and management solutions that our distribution has already solved.
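As a rough illustration of why autoscaling cuts compute costs, the following Python sketch computes a target worker count from the current query load. It is a generic scaling rule, not SEP's autoscaling logic; the `desired_workers` function, its parameters, and the numbers used are all assumptions made for this example.

```python
import math


def desired_workers(
    pending_queries: int,
    queries_per_worker: int,
    min_workers: int,
    max_workers: int,
) -> int:
    """Return a worker count sized to the load, clamped to the allowed range."""
    if queries_per_worker <= 0:
        raise ValueError("queries_per_worker must be positive")
    needed = math.ceil(pending_queries / queries_per_worker)
    return max(min_workers, min(max_workers, needed))


# Hypothetical limits: between 2 and 8 workers, 10 queries per worker.
busy = desired_workers(pending_queries=45, queries_per_worker=10, min_workers=2, max_workers=8)
idle = desired_workers(pending_queries=0, queries_per_worker=10, min_workers=2, max_workers=8)
spike = desired_workers(pending_queries=500, queries_per_worker=10, min_workers=2, max_workers=8)
print(busy, idle, spike)
```

The cluster pays for the larger worker count only while the load is there, and drops back to the minimum once the queue drains.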
That concludes the 10 reasons you should migrate from EMR Trino to Starburst Enterprise.
There are many other wins to moving off of EMR specifically, and our team would be happy to discuss them with you whenever you’re ready to make your move. If you want to learn more about migrating your existing cluster or starting a new Trino cluster with us, check out our new page on Migrating from EMR to Starburst Enterprise or feel free to reach out to us.
I hope this was enlightening and wish you well on your journey to free your data.