OPTIMIZE alledgedly not supported by Hive Table

mcbenzing · July 3, 2024, 1:50pm

I am struggeling with the optimize function of the hive connector. It keeps saying that executing OPTIMIZE is not supported by Hive tables.

I created a table with the following statement:

create table tmp_cat_hive.hive_schema.t1 
with ( partitioned_by= ARRAY['c_birth_month'] ) 
AS  
SELECT c_first_name, c_last_name, c_birth_month 
FROM sample_data_tpc.sf1.customer;

Then I did :

SET SESSION tmp_cat_hive.non_transactional_optimize_enabled=true;
ALTER TABLE tmp_cat_hive.hive_schema.t1 EXECUTE optimize;

However i still get:

Executing OPTIMIZE on Hive tables is not supported

Any idea about what I could be doing wrong?

lester · July 3, 2024, 2:09pm

I do know that this is the case in Starburst Galaxy for Hive tables. I also know that many folks do not recommend using optimize with the Hive connector on SEP, but I’d need to spin up a cluster to double-check if it still is operational (I think it is).

On Galaxy, this will work for Iceberg tables.

mcbenzing · July 3, 2024, 2:16pm

I use Starburst Galaxy, so you don’t need to start a cluster to check this. The Starburst Galaxy documentation doesn’t have much information about the connectors, so I looked in the Enterprise documentation. Sometimes it’s hard to tell what will work in Galaxy and what won’t.

Anyways. Thanks for your help!

lester · July 3, 2024, 2:41pm

I do agree with that strategy (leveraging the formal connector docs) AND with things don’t always work the same. I do know the Docs team are always working hard to make things better. It would be nice to know of “things that the Trino/SEP connectors do that just don’t work with Galaxy”.

As a shout out example to the Docs team, they have doc’d Starburst | Iceberg table format showing how to migrate from Hive to Iceberg which is different than how you do it in the SEP Iceberg connector.

In fairness, I guess that’s actually what they are already trying to do. Show what can be done as apposed to what can’t be done. Of course, everything can always be improved and I assure you the Docs team is about continued improvement. Good folks there!!