ETL: Extract Trino Load, a Case for Trino as a Batch Processing Engine
Join us virtually or in person at our Boston HQ for our monthly Trino Meetup! Our speaker, Andrii Rosa, will be live in Boston, along with the team behind Project Tardigrade. Please select virtual or live, and we will send you all the info you’ll need to join. We’re looking forward to seeing you there. See below for more details on the talk!
Trino was initially built to replace Hive workloads at Facebook, and it handled massive petabyte-scale batch workloads. Yet across the board, Trino was not widely adopted as a batch ETL engine for these workloads. As it turns out, one of the design choices behind Trino’s incredible speed was forgoing failure recovery in exchange for faster queries. In practice, many users want the system running the query to handle recovery from failures itself. The Trino community has rallied around supporting native, granular failure recovery to improve resiliency when failures occur. This brings Trino to a new frontier by supporting both exploratory queries and long-running workloads with failure recovery, so that engineers and analysts do not have to switch between systems to run their queries.
Andrii Rosa
Software Engineer, Starburst