Trino for geospatial analysis

x-malet · November 24, 2021, 3:31pm

Hi everyone!

I’m looking for a solution to query a massive point cloud stored in Iceberg with Trino. While doing some tests, I was wondering if there is a way to improve geospatial queries in Trino? Something like building and storing a KD-tree, or pruning unwanted files by using a geohash? Or partitioning the Iceberg table by geohash and using it transparently in Trino?

If anyone has an optimal solution, even if it means I have to put my soul into it, I would be really interested in it, or in any advice!

Thanks!

bitsondatadev · November 28, 2021, 5:32am

Trino has a pretty rich geospatial functions library (see "Geospatial functions" in the Trino 364 documentation) that deals with the Well-Known Text / Well-Known Binary (WKT / WKB) formats to represent spatial types. Unfortunately, there’s nothing like spatial partitioning.

I would say using a geohash might be the best way to do it. There’s a pair of functions provided to encode and decode a polyline. You could make a partition based on the hash of each envelope for your R-tree. It just doesn’t feel very elegant. I’ll try to think of a better way to do this.