Presto vs Trino: History
Presto, Formerly PrestoDB
Presto is the query engine originally developed by Martin Traverso, Dain Sundstrom, David Phillips, and Eric Hwang at Facebook in 2012. Hive MapReduce jobs were taking too long at Facebook’s scale, so Presto was created as a massively parallel processing (MPP) query engine that could turn SQL queries running 12 hours or more into queries that took minutes or seconds.
Over the next several years, Presto expanded to cover a wide range of data sources, including databases, data warehouses, and data lakes, and to integrate easily with BI tools. Because it was built as a SQL engine, data scientists and analysts using Presto could draw on their existing skills.
Trino, Formerly PrestoSQL
In 2018, Martin, Dain, and David left Facebook to create a fork of Presto that could better serve the open source community rather than just Facebook. The fork was initially called PrestoSQL and was rebranded as Trino in 2020.
While Presto was primarily developed at Facebook and Uber to serve hyper-scale internet companies, Trino has expanded to serve a much broader variety of customers and use cases, while still serving those hyper-scale web companies. Companies like Amazon, LinkedIn, Lyft, Netflix, Salesforce, Condé Nast, Goldman Sachs, and FINRA use Trino, among thousands of others.
Presto vs Trino: Features
Both Presto and Trino share the same ANSI SQL-compliant core query engine. Their architectures are essentially the same, and the two engines can be used interchangeably for most lightweight analytics workloads. However, each engine has several key differences and unique features that are worth knowing about.
Presto, Formerly PrestoDB
- Presto on Spark, allowing users to use Spark as an execution framework for Presto queries.
- Vector acceleration for the Hive connector, speeding things up when querying Hive.
Trino, Formerly PrestoSQL
- A substantially larger number of minor performance improvements and fixes, all of which have added up to a significant performance edge over time.
- Fault-tolerant execution mode for handling batch and ETL/ELT jobs with high reliability.
- Table functions for making it easier to write powerful queries, or for running syntax native to connectors.
- Dynamic filtering to speed up queries that contain JOINs.
- Expanded SQL support for keywords like MERGE, as well as dozens of extra functions.
- Variable-precision temporal types, with precision down to picoseconds, which can be very important for time-critical systems such as financial transaction processing.
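To make dynamic filtering less abstract, here is a minimal Python sketch of the general idea. It is an illustration under our own assumptions, not Trino's actual implementation: the function name, the row layout, and the `id` and `dim_id` keys are all hypothetical. The join keys collected from the small, already-filtered side of a JOIN are used to skip rows on the large side before they ever reach the join.

```python
from collections import defaultdict


def join_with_dynamic_filter(fact_rows, dim_rows, dim_predicate):
    """Toy hash join that prunes the probe side with a dynamic filter.

    Hypothetical helper for illustration only.
    Returns (joined_pairs, number_of_fact_rows_that_passed_the_filter).
    """
    # Build side: keep only dimension rows that pass the query's own filter.
    build = defaultdict(list)
    for dim in dim_rows:
        if dim_predicate(dim):
            build[dim["id"]].append(dim)

    # The dynamic filter is the set of join keys seen on the build side.
    dynamic_filter = set(build)

    joined = []
    surviving = 0
    for fact in fact_rows:
        # A real engine would apply this check as early as possible,
        # ideally while scanning the data source.
        if fact["dim_id"] not in dynamic_filter:
            continue
        surviving += 1
        for dim in build[fact["dim_id"]]:
            joined.append((fact, dim))
    return joined, surviving


if __name__ == "__main__":
    dims = [
        {"id": 1, "region": "EU"},
        {"id": 2, "region": "US"},
        {"id": 3, "region": "EU"},
    ]
    facts = [
        {"order": 100, "dim_id": 1},
        {"order": 101, "dim_id": 2},
        {"order": 102, "dim_id": 2},
        {"order": 103, "dim_id": 3},
    ]
    joined, surviving = join_with_dynamic_filter(
        facts, dims, lambda d: d["region"] == "EU"
    )
    print(surviving, "of", len(facts), "fact rows reached the join")
```

The benefit grows with the size of the large table: the more rows the filter rules out up front, the less data has to be read and moved before the join runs.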
Which is a better query engine?
We’re not going to mince words. In our opinion, Trino is the superior query engine. Since the two projects split, Trino’s development has moved at roughly three times the pace of Presto’s, and it shows. It runs faster and serves a much wider variety of use cases. If your existing tech stack relies heavily on Spark and Hive for all things data, it may make sense to use Presto. For all other situations, Trino is the better, more versatile, more powerful option.
One more bonus: if you are using Hive, Trino has a built-in procedure to migrate your Hive tables to Apache Iceberg, allowing you to modernize your data stack and reap the performance and cost benefits. Some users who have already migrated have seen certain queries execute 95% faster.
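As a rough illustration of what that migration looks like in practice, the Python sketch below renders the SQL statement for the `migrate` procedure of Trino's Iceberg connector. Treat the details as assumptions to verify against the Iceberg connector documentation for your Trino version: the catalog name `iceberg`, the schema `sales`, the table `orders`, and the helper `render_migrate_call` are all hypothetical, and the sketch only builds the statement rather than running it.

```python
import re

# Conservative pattern for plain, unquoted identifiers (an assumption of
# this sketch; Trino itself accepts more than this).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def render_migrate_call(catalog: str, schema: str, table: str) -> str:
    """Render a CALL statement for the Iceberg connector's migrate procedure.

    Hypothetical helper: it validates the names and returns SQL text,
    which you would then submit through your usual Trino client.
    """
    for name in (catalog, schema, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    return (
        f"CALL {catalog}.system.migrate("
        f"schema_name => '{schema}', table_name => '{table}')"
    )


if __name__ == "__main__":
    print(render_migrate_call("iceberg", "sales", "orders"))
```

As with any in-place migration, it is worth trying this on a copy of a table first rather than on production data.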
Meet the creators of Trino (Formerly PrestoSQL)
Martin Traverso
Creator of Presto, Creator of Trino, Co-Founder of the Trino Software Foundation, and CTO at Starburst. Prior to Starburst, Martin worked as a Software Engineer at Facebook and as a Software Architect at Proofpoint and Ning.
Dain Sundstrom
Creator of Presto, Creator of Trino, Co-Founder of the Trino Software Foundation, and CTO at Starburst. Prior to Starburst, Dain was a Software Engineer at Facebook and a Software Architect at Proofpoint, founded the Apache Geronimo project, and was one of the original JBoss authors.
David Phillips
Creator of Presto, Creator of Trino, Co-Founder of the Trino Software Foundation, and CTO at Starburst. Prior to Starburst, David was a Software Engineer at Facebook, and held senior engineering positions at Proofpoint, Ning and Adknowledge.
Additional resources
- What is Trino?
- Starburst vs Trino
- Trino and Advanced SQL training series
- Why an open source data architecture is the future of data analytics
- When is it time to upgrade from open source to enterprise-grade analytics?