Hive vs Glue query performance


We use Starburst on AWS and store data as Parquet files in S3. From a performance perspective, is there a difference between using Hive (a Hive Metastore) and Glue for queries?

Do you cache metadata explicitly? That may be the quickest way to speed up query execution, assuming analysis, planning, and optimization take noticeable time for your workload. With a self-managed Hive Metastore (HMS) you have more control (for example, you can provision better hardware), while Glue gives you ease of use but comes with some limits, such as the number of requests.
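To illustrate what explicit metadata caching can look like, here is a rough sketch of a Hive catalog file. The file name and all values are assumptions chosen for illustration, and the property names are the ones I know from the Trino Hive connector documentation, so please check them against the docs for your Starburst version before relying on them.

```properties
# etc/catalog/hive.properties (hypothetical catalog file name)
connector.name=hive

# Use AWS Glue as the metastore. For a self-managed HMS you would
# instead set hive.metastore.uri=thrift://<host>:9083
hive.metastore=glue

# Cache metastore metadata on the Starburst side.
# Values below are illustrative, not recommendations.
hive.metastore-cache-ttl=10m
hive.metastore-refresh-interval=5m
hive.metastore-cache-maximum-size=10000
```

The trade-off is freshness: with a TTL like this, changes made to tables or partitions outside Starburst may not be visible until the cached entry is refreshed or expires.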

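If caching metadata on the Starburst side is acceptable for you (most people do that), you shouldn't worry too much about HMS vs Glue. You should also check out Iceberg. It tracks data files in its own table metadata instead of relying on metastore partition lookups and directory listings, which removes much of the metadata overhead when reading a lot of data.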