Recently, I had the pleasure of chatting with Ravit Jain on his show “The Ravit Show” to discuss the evolution of Trino and where things are headed for the open-source project. It’s been an exciting time for us since rebranding PrestoSQL to Trino and bringing on many of the original Presto creators and contributors to the Starburst team. The Trino brand, with the help of our popular mascot Commander Bun Bun, has gained a huge following and exists within a supportive and active open-source community. But how did Trino come to be? What’s the Starburst connection? And what does the future hold for Trino?
The origins of Trino, formerly PrestoSQL
In 2012, I joined the Facebook data infrastructure team alongside Martin Traverso and David Phillips. Together with Eric Hwang, we created Presto to address the problem of low-latency interactive analytics over Facebook’s massive Hadoop data warehouse. One of our non-negotiable conditions was for Presto to be an open-source project. Open source is in our DNA: we had all used and participated in open-source projects to various degrees in the past, and we recognized the power of open communities and developers coming together to build successful software that can stand the test of time.
Over the next six years, we worked hard to build a healthy open-source community and ecosystem around the project. Unfortunately, it became clear that Facebook management wanted tighter control over the project and its future. As a matter of principle, we had no choice but to leave Facebook in order to focus on making sure Presto continued to be a successful project with an open, collaborative, and independent community.
In reality, the choice was easy. We knew Presto had a lot of potential. Facebook only uses part of the engine, so the decision to expand it on our own was made much easier when we saw interest and support from developers in the community. After attending our first few Presto conferences and connecting with the growing community, we had the validation we needed to pursue the project on our own, full-time.
A fast, distributed SQL query engine for big data analytics
Trino has grown tremendously since its inception at Facebook. I’ve since joined Starburst along with Martin Traverso and David Phillips to spearhead the enterprise offerings of Trino while simultaneously improving the open-source engine. Today, companies are using Trino in a variety of ways — for single reporting applications, as a single enterprise-wide query engine, and much more. With the tremendous growth of cloud computing and self-service BI demand, Trino offers a more agile approach to data access by and for data consumers.
Making data available to anyone who needs it, and shortening the time it takes to do so, has become increasingly important to organizations. Trino powers Starburst Enterprise and Starburst Galaxy to do just that, providing a highly efficient, parallelized execution path that speeds up queries and cuts time-to-insight to minutes.
On the open-source side, the Trino community continues to be a strong force with close to 6,675 members in the Slack community and more than 500 active contributors from around the globe. Trino operates under a number of project values:
- Correct: Trino is used for critical decisions (e.g., financial results for public markets), and results must always be correct.
- Secure: Trino is a gateway to sensitive information, and must protect that information.
- Long Term: We expect that Trino will be used for at least the next 20 years. We build for the long term.
- Standards-based: These can be formal standards like ANSI SQL, JDBC, or ODBC, or implicit conventions of industry-standard databases. This makes it easier for users and integrators, because their existing skills transfer.
- Just works: Simple to get started. Trino should just work out of the box and provide good performance with minimal setup. Trino is a large complex system, so simplification makes everything better.
- Supported: Everything that ships with Trino is supported. This means that features that cannot be tested and supported are not added, e.g., PowerPC support is only being added now that test hardware is available.
- Real-world uses: Trino is designed and tuned for real-world workloads over synthetic benchmarks.
- Commercially friendly: We encourage enterprises to use Trino for their analytics needs, and we encourage vendors to base products on Trino. We appreciate contributions back, but do not require them.
With the support of Starburst and our growing community, we’re constantly working towards building the next generation of fast, distributed SQL query engines for big data analytics.
The future of Trino
As the big data landscape continues to grow and the standards of the industry change over time, we continue to update and improve Trino in parallel. When developing and iterating on an open-source project like this one, it’s important to always be forward-thinking. What is considered a new and exciting feature at one point in time can quickly become a standard practice in just a few months or years.
Trino has already been great at running interactive queries. As the project has grown, we’ve seen more and more demand for an equally amazing experience for long-running queries and batch/ETL workloads, and we’re building just that in a new release called “Project Tardigrade.” We shared more details about the release, along with other exciting announcements, at our mini Trino Summit, Cinco de Trino.
As for the future of Trino, our eyes are always on performance, ease of use, and making sure everything “just works.” We’re constantly working to improve Trino by adding new connectors and integrations and by improving support for Iceberg, Delta Lake, and more. To keep up with the evolving SQL standard, we’ll be adding things like polymorphic table functions and new JSON functions to the engine as well. With so many cooks in the kitchen, there’s a lot to be churned out, but we know our enterprise partners and supportive community will help us get there.
We’ve got a lot planned this year, from new releases to Meetups and events (hopefully some in-person soon, fingers crossed!) and we hope to see you there!