Insert performance on Trino JDBC connectors

I’ve put together a small showcase on how to ad-hoc copy data from one database/object store to another database/object store via Trino. This is definitely not the use case for which Trino was initially conceived, but I was curious to see whether such a scenario basically works.

The project findinpath/trino-hive-2-postgres on GitHub (https://github.com/findinpath/trino-hive-2-postgres) contains a small demo on how to copy data from Hive to PostgreSQL via Trino.
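For context, the kind of copy statement involved is a single CTAS along these lines (catalog, schema, and table names here are hypothetical, not the ones from the demo):

```sql
-- Copy a Hive table into PostgreSQL via Trino with one CTAS statement.
CREATE TABLE postgresql.public.orders AS
SELECT * FROM hive.default.orders;
```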

I know that Trino JDBC connectors can benefit from the write_batch_size session property (1000 by default) to speed up writing into the target database.
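For example, the batch size can be raised for a single session like this (the catalog name postgresql is just a placeholder for whatever your JDBC catalog is called):

```sql
-- Raise the JDBC write batch size for the current session.
-- Connector session properties are qualified with the catalog name.
SET SESSION postgresql.write_batch_size = 10000;
```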

Can the writing to a JDBC database be done in parallel, or can only one worker perform the writing?

In case there is parallelism for writing, can a hint be provided to Trino to avoid issuing too many parallel INSERT statements when executing a CTAS or INSERT ... SELECT statement on a larger Trino cluster?

Thank you in advance.


There is also the insert.non-transactional-insert.enabled catalog configuration option that can be set (see the non-transactional INSERT section of the JDBC connector documentation).
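If I remember the docs correctly, there is also a matching per-session toggle, so a sketch of enabling it for one session would look like this (assuming a JDBC catalog named postgresql):

```sql
-- Enable non-transactional inserts for this session only.
-- "postgresql" stands in for the name of your JDBC catalog.
SET SESSION postgresql.non_transactional_insert = true;
```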

The downside to a non-transactional insert is that, as its name implies, it is not a transaction. It may help, but testing would be required.

In general, only object storage lends itself to multiple writers from a connector standpoint. (Otherwise, look at the Enterprise release for some of those features.)


To add a little bit more information, here’s how writes work in Trino today (for almost all JDBC connectors).

The query plan contains a TableWriter node for queries that write data back to a connector.
Depending on the number of tasks that push data into the TableWriter node, there can be multiple table-writer tasks running on the cluster, maybe on all workers or maybe only on a few of them.
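A quick way to see the writer stage for a specific query is a distributed EXPLAIN (catalog, schema, and table names below are only illustrative):

```sql
-- Show the distributed plan; the TableWriter stage appears in the output.
EXPLAIN (TYPE DISTRIBUTED)
INSERT INTO postgresql.public.orders
SELECT * FROM hive.default.orders;
```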

Each writer task writes its own output to a “staging” table. Once the coordinator notices that all table-writer tasks have finished, it “commits” the operation by copying data from the “staging” table into the final table that you used in your query (mostly via SELECT INTO …).
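Roughly, that commit step follows a pattern like the one below (the staging table name is generated internally by Trino; these names are hypothetical and the exact statement varies by target database):

```sql
-- Sketch of the staging-then-commit pattern; names are placeholders.
INSERT INTO orders SELECT * FROM tmp_trino_orders_staging;
DROP TABLE tmp_trino_orders_staging;
```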

That way it can provide atomicity: if any of the writer tasks fail, the staging table can be dropped and the actual table will remain as it was.
It does have a cost though: copying data from the staging table into the final table takes time. If you want to trade off that safety for performance, you can use the non-transactional inserts feature as described by @Minnesota_Cold .

TL;DR: By default, in most cases, writes to JDBC connectors happen in parallel. You can try to reduce the number of nodes that perform writes (and all other operations) by lowering the hash_partition_count session property, but that affects all operations, not just the write, so it may slow down the query.
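For example (the value 4 is arbitrary; the right number depends on your cluster and workload):

```sql
-- Reduce the number of nodes used for hash-partitioned stages,
-- which also caps the number of parallel table-writer tasks.
-- Note: this applies to the entire query, not just the write stage.
SET SESSION hash_partition_count = 4;
```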
