Hi team,
Does Trino support caching for databases (RDBMS)? AFAIK Trino supports object storage caching via Rubix (and Alluxio), but it seems RDBMS caching is not supported. Can you give us some details on the architectural decisions behind that, and any recommendations if we want to use caching for RDBMS sources? Thank you, much appreciated!
There is no actual data caching in Trino for connectors that use a JDBC connection, which are mostly the RDBMS connectors. These connectors do have some metadata and statistics caching; see, for example, the PostgreSQL connector page in the Trino 436 documentation.
See config properties such as metadata.cache-ttl
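As a rough illustration, metadata caching is set in the catalog properties file. The sketch below assumes a PostgreSQL catalog; the file name, host, database, credentials, and the 10 minute value are placeholders and not recommendations. Check the connector documentation for your Trino version for the full list of caching properties and their defaults.

```properties
# etc/catalog/example.properties (hypothetical catalog file name)
connector.name=postgresql
connection-url=jdbc:postgresql://example.net:5432/database
connection-user=root
connection-password=secret

# Cache metadata (including table and column statistics) for 10 minutes.
# This caches metadata only; table data is still read from PostgreSQL on every query.
metadata.cache-ttl=10m
# Also cache the fact that a schema or table was not found.
metadata.cache-missing=true
```

A longer TTL reduces metadata round trips to the database, at the cost of Trino noticing schema changes later.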
Starburst Galaxy and Starburst Enterprise support other optimizations such as result set caching.
Also note that many client tools use and support all sorts of additional caching, which then varies in terms of impact depending on the client tool. For example, server side client applications might be able to cache for multiple users while pure client side tools only cache on the client machine.
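To make the client-side option concrete, here is a minimal sketch of a result cache with a time-to-live, written in Python. It is not part of Trino or of any particular client tool; `QueryResultCache`, `run_query`, and `ttl_seconds` are hypothetical names, and `run_query` stands in for whatever function your application uses to execute SQL against Trino and return rows.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Rows = List[Tuple[Any, ...]]


class QueryResultCache:
    """Hypothetical client-side cache keyed by SQL text, with a fixed TTL."""

    def __init__(
        self,
        run_query: Callable[[str], Rows],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query  # assumed: executes SQL and returns rows
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Rows]] = {}

    def fetch(self, sql: str) -> Rows:
        """Return cached rows while they are fresh, otherwise re-run the query."""
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        rows = self._run_query(sql)
        self._entries[sql] = (now, rows)
        return rows
```

If an instance like this lives in a server-side application, all users of that application share the cached results; in a desktop tool it only helps the single user on that machine. Either way, results can be up to `ttl_seconds` stale relative to the source database.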
One more thing, Rubix caching will be removed soon and Alluxio will replace it in Trino. Starburst Enterprise and Starburst Galaxy include separate, more powerful indexing and caching with Starburst Warp Speed.
Thanks @simpligility for the detailed information. One more question: as I understand it, caching (Hive connector with Rubix, Alluxio) is implemented only at the connector level and is not actually part of the Trino core/architecture. Is that correct? Thank you!
The Alluxio caching that is being added to Trino will be built in and part of core Trino, just like Rubix is currently. The initial implementation PR adds support for the Delta Lake connector. There are proposals for Iceberg as well, and we will probably follow up with Hive and Hudi over time. It all depends on what the community works on and sends PRs for.
You can follow the discussion and progress in the pull request "Alluxio cache" by Pluies (#18719) in the trinodb/trino repository on GitHub.