Hello
Sorry for these naive questions. This is my first time using Trino, and it seems more difficult than using Presto from EMR. I have a large amount of data in HDFS under paths like /user/hadoop/uuid_1/2024/03/01/01/, each holding many Parquet files. The data is partitioned by year/month/day/hour, as the directory layout suggests. I already created a Hive external table over it and added the partitions (the Presto CLI queries it just fine). Now I am trying to query it through JDBC/Trino, but I find the tables are not set up, so I don't even know how to query it from the Trino CLI.
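For context, the Hive side was set up roughly like this (col1/col2 are just placeholders for the real columns, which I'm omitting):

create external table uuid_1 (col1 string, col2 bigint)
partitioned by (year string, month string, day string, hour string)
stored as parquet
location '/user/hadoop/uuid_1/';

alter table uuid_1 add partition (year='2024', month='03', day='01', hour='01')
location '/user/hadoop/uuid_1/2024/03/01/01/';

(The partitions were added with explicit locations like that because the directories are not in the year=2024/month=03 style.)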
If I create a Hive table from the Hive CLI, should that table already show up in the Trino CLI? I cannot find it in the hive catalog.
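This is how I have been looking for it, assuming my catalog is named hive and the table would land in the default schema:

show schemas from hive;
show tables from hive.default;

The table created from the Hive CLI does not appear in either.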
So, I create a schema like this:
create schema hive.schemaName with (location='/user/hadoop/');
I notice that external_location does not work here. Should I be using external_location to indicate that I don't need to insert data?
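To be concrete, this is the variant that gets rejected:

create schema hive.schemaName with (external_location = '/user/hadoop/');

so it seems location is the only storage property accepted at the schema level.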
Then I run
create table hive.schemaName."uuid_1" (/* detailed schema omitted */) with (format = 'PARQUET');
But if I then run
select * from hive.schemaName."uuid_1" limit 1;
it does not return anything.
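To double-check, I also ran:

select count(*) from hive.schemaName."uuid_1";
show create table hive.schemaName."uuid_1";

The count comes back as 0, so Trino really does treat the table as empty, even though the Parquet files are sitting right there in HDFS.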
How can I easily solve this?
From my testing, whenever I create a table inside a schema, Trino creates a new subdirectory under the schema's location. So in the example above I created the table as hive.schemaName."uuid_1", where uuid_1 is already the name of the existing directory that holds the data.
I assume external_location is supposed to be used when creating the schema; is that right?
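Or does it belong on the table instead? My guess from reading the docs (real columns omitted, partition columns spelled out at the end) would be something like:

create table hive.schemaName."uuid_1" (
  /* detailed schema omitted */
  year varchar,
  month varchar,
  day varchar,
  hour varchar
)
with (
  format = 'PARQUET',
  partitioned_by = ARRAY['year', 'month', 'day', 'hour'],
  external_location = '/user/hadoop/uuid_1/'
);

And since my directories look like /2024/03/01/01/ rather than /year=2024/month=03/..., I suspect I would then have to register each partition by hand, maybe with the register_partition procedure (again, just my reading of the docs):

call hive.system.register_partition(
  schema_name => 'schemaName',
  table_name => 'uuid_1',
  partition_columns => ARRAY['year', 'month', 'day', 'hour'],
  partition_values => ARRAY['2024', '03', '01', '01'],
  location => '/user/hadoop/uuid_1/2024/03/01/01');

Does that sound like the right direction?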
Thanks.