Why Trino is the PostgreSQL of analytics
Evan Smith
Technical Content Manager
Starburst Data
PostgreSQL has garnered a reputation as the Swiss Army knife of relational databases. In fact, its utility and versatility make it invaluable to development teams across a wide swath of use cases.
Today, another technology is achieving similar ubiquity in analytics: Trino.
The idea is not new. A few months ago, Sanjeev Mohan, a former analyst with Gartner, called the Trino SQL query engine “the PostgreSQL of analytics.”
We couldn’t agree more, and the more you investigate the analogy, the more it makes sense. This article will explain why Trino has become indispensable for analytics workloads and why you should consider adding it to your data stack if you haven’t already.
Why PostgreSQL is so successful
PostgreSQL’s success stems from its adaptability, which has driven adoption. In Stack Overflow’s 2018 developer survey, around 33% of developers reported using the relational database. Today, according to the same survey, almost half (49%) are using it in some capacity. In the DB-Engines ranking, PostgreSQL has consistently placed among the most popular databases.
PostgreSQL is used almost anywhere a workload fits an online transaction processing (OLTP) data storage model. There are several reasons for this:
- Versatility: PostgreSQL’s extension architecture supports many different data types and transaction styles.
- Speed: PostgreSQL is a performant technology that suits many organizations.
- Ease of use: PostgreSQL is designed to be easy to use and accessible, which has made it the first port of call for organizations with smaller teams or fewer resources.
- Open-source and low-cost: PostgreSQL is an affordable, open solution due to the lack of licensing fees.
The Swiss Army knife utility of PostgreSQL has led to a “Postgres for Everything” movement, which champions using the technology wherever applicable. In fact, many teams use it as a cache, message queue, data warehouse, JSON document store, and geospatial and time series data store, among other use cases.
Using PostgreSQL often makes sense from a cost standpoint as well. Given the expensive licensing of other relational database solutions such as Oracle and Microsoft SQL Server, moving to PostgreSQL secures instant savings. It also saves time and money in other ways:
- Developers don’t have to learn a new technology. Instead, they can leverage what they already know.
- Administrators also don’t have to learn how to scale, secure, and manage a new piece of infrastructure. Instead, they can apply lessons learned from previous deployments and manage new instances with existing tools.
Trino: The PostgreSQL of analytics
So what makes Trino “the PostgreSQL of analytics”?
For analytics, organizations have realized they need a solution that can query their data interactively across the organization, pulling in data no matter where it lives. In other words, they need a query tool with a versatility in analytics that mirrors PostgreSQL’s general versatility among relational databases.
Understanding Trino
Enter Trino, a SQL query engine that can query data across data sources.
A standard SQL client tool, like pgAdmin, typically works against a single database instance at a time. Traditional SQL tools work well against a single system. However, they struggle to work across multiple systems, which can encourage the creation of data silos. By contrast, a SQL query engine like Trino supports analytic workloads against extremely large data sets spread across multiple data stores.
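To make the idea of querying across data stores concrete, here is a minimal sketch in Python that assembles a single SQL statement joining tables from two sources. It relies on Trino's fully qualified `catalog.schema.table` naming. The catalog names (`postgresql`, `lake`), the schemas, tables, and columns, and the `qualified` helper are all hypothetical and chosen only for illustration.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Build a Trino-style fully qualified table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


# Hypothetical catalogs: "postgresql" for an operational database,
# "lake" for tables stored in a data lake.
customers = qualified("postgresql", "public", "customers")
orders = qualified("lake", "sales", "orders")

# One statement that joins data living in two different systems.
FEDERATED_QUERY = f"""
SELECT c.region, sum(o.amount) AS revenue
FROM {orders} AS o
JOIN {customers} AS c ON o.customer_id = c.id
GROUP BY c.region
ORDER BY revenue DESC
""".strip()

print(FEDERATED_QUERY)
```

The resulting string could be submitted through whichever Trino client you already use. The point of the sketch is only that both sources appear in one query, so no data has to be copied into a single system first.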
Massively Parallel Processing (MPP) and connectors
Trino uses a Massively Parallel Processing (MPP) architecture to spin up compute, devise a query plan, and then run and synthesize the results of complex queries. It optimizes queries for analytics workloads using techniques like join reordering, predicate pushdown, and partial aggregation.
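Two of the techniques named above, predicate pushdown and partial aggregation, can be illustrated with a toy model in plain Python. This is not Trino's implementation. It is a simplified sketch in which the data splits, the function names, and the values are all made up for illustration.

```python
from collections import Counter

# Hypothetical rows split across three "workers"; each row is (region, amount).
SPLITS = [
    [("eu", 10), ("us", 5), ("eu", 7)],
    [("us", 3), ("apac", 8)],
    [("eu", 1), ("apac", 2), ("us", 20)],
]


def scan(split, min_amount):
    """Predicate pushdown: filter rows while reading, before they move anywhere."""
    return [(region, amount) for region, amount in split if amount >= min_amount]


def partial_sum(rows):
    """Partial aggregation: each worker pre-aggregates only its own rows."""
    totals = Counter()
    for region, amount in rows:
        totals[region] += amount
    return totals


def final_sum(partials):
    """Final aggregation: merge the small per-worker results into one answer."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return dict(merged)


def run_query(splits, min_amount):
    """Roughly: SELECT region, sum(amount) ... WHERE amount >= min_amount GROUP BY region."""
    return final_sum(partial_sum(scan(split, min_amount)) for split in splits)


result = run_query(SPLITS, min_amount=5)
print(result)
```

In this model, each worker ships only a handful of per-region totals to the final step instead of every raw row. That reduction in data movement is the benefit these optimizations aim for.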
Why Trino powers multiple data platforms
Why is there such widespread adoption of Trino? Simple: Trino brings a world of benefits to distributed analytics workloads, much as PostgreSQL does for relational databases. These benefits include:
- Integrations
- Versatility
- Speed
- Ease of use
- Low-cost and open-source
- Reliability
- Governance
Let’s look at each of these in detail and see how they apply to Trino.
Integration
Today, Trino is supported across much of the modern data stack. It’s available through a number of commercial offerings, including Raft, Stackable, and Starburst. It’s also integrated directly into several major data lake and lakehouse platforms, including Starburst and Amazon Athena (which was originally based on Trino’s predecessor, Presto).
Trino can also access data in all of the major data storage and query solutions used in modern enterprise data architectures. Its rich set of connectors allows Trino to access data in Snowflake, ClickHouse, Amazon Redshift, and more. It also supports the major modern data lake formats, including Databricks Delta Lake, Hive, Hudi, and Iceberg.
In other words, if you’re looking for one tool that can query data across your entire data estate, Trino is it.
Versatility
Like PostgreSQL, Trino covers a large variety of modern data use cases. In Trino’s case, these use cases are highly focused on the analytics space:
- High-speed data ingestion at a reasonable cost
- ETL/ELT
- Near real-time streaming analytics
- Lightning-fast ad-hoc analytics
- Support for machine learning (ML) and artificial intelligence (AI) data architecture
Trino can implement these use cases across your entire data ecosystem, including your existing Hive clusters, your data lakes, and your data lakehouses. It can work with both centralized and decentralized data stores.
In other words, Trino doesn’t force you to centralize on a single data solution. You can employ a flexible data architecture that grows and shifts in response to evolving technology and shifting business requirements.
Speed
Older approaches to large-scale queries required importing all your data into a single location, such as a data warehouse, and then performing a long-running batch query to generate a report. These solutions are often too slow and brittle for today’s data users, who need to make snap decisions based on real-time information.
Developers built Trino specifically to overcome these limitations and support efficient analytics. Trino began life as Presto, a project at Meta (then Facebook) that enabled engineers and business users to query the company’s 300PB data lake.
Thanks to this focus on real-time results, Trino outperforms previous solutions in the analytics space. For example, it can run interactive queries faster than Apache Spark thanks to its parallel, in-memory execution, MPP architecture, and advanced performance techniques such as predicate pushdown.
Even better, Trino can still be made faster. At Starburst, we’ve made key performance improvements to open-source Trino. Our Warp Speed feature, for example, increases performance by a factor of seven while simultaneously reducing cloud compute costs.
Low-cost and open-source
For years, PostgreSQL has represented a lower-cost, more versatile alternative to commercial relational databases like Oracle and Microsoft SQL Server. Like PostgreSQL, Trino is built by the community, for the community. It was created as an open-source project by some of the original developers behind Presto and remains an open-source solution to this day.
A look at some numbers shows how successful this strategy has been:
- 138 individual contributors in 2024
- Over 14 releases and 2,822 commits in that year alone
- Contributions from 50+ companies, including Google, Apple, and Amazon
Trino’s open-source project is overseen by the Trino Software Foundation and licensed under Apache 2.0. This open licensing and grass-roots engagement keep Trino focused on the needs of the community at large rather than one specific vendor. This has helped it to both retain and grow its versatile footprint in the modern analytics space.
Reliability
One of the downsides of the old batch approach to analytics is that data might not be available when a business user needs it. A broken data pipeline or a problem with a report could halt business decision-making in its tracks.
Modern analytics applications need to scale, sometimes rapidly, and offer 24/7 availability. Trino’s fault-tolerant execution architecture enables querying data at any scale and at any time. Its support for containerization via Docker means companies can spin up new Trino clusters quickly, whether on-prem or in the cloud.
Governance
Trino is also built to meet modern security and compliance needs. It provides granular access control, data encryption, lineage, audit logging, and support for enforcing regulatory compliance measures.
Because of its versatility, Trino can reduce technical debt by replacing older elements of your data architecture. This, in turn, helps improve governance, which helps speed up the adoption of new use cases such as machine learning and AI.
Other companies are also working hard to provide additional security and governance capabilities on top of Trino. For example, Starburst adds governance improvements such as support for impersonation, role-based access control (RBAC), and a query logger and auditor.
Managing Trino at enterprise scale
What’s the best way to experience Trino? Starburst Galaxy is a fully managed Trino solution that combines the SQL query engine with the Apache Iceberg table format to create a powerful, all-in-one data lakehouse solution. Besides making Trino easy to administer, Starburst Galaxy adds support for advanced features such as access control and data governance, autoscaling, and an easy-to-use interface.
Want to see if Starburst is a good fit for your use cases? Try it for free today.