Trino Hive Connector fails to query data: Could not obtain block

maatouk-j1 · August 18, 2022, 1:04pm

Hello,

I’m using the Trino Hive connector to query data in my Kerberized Cloudera CDH 6.x environment.
I have a Trino master and a Trino worker installed and configured on remote servers (not on same servers as the CDH).
I configured the “hive.properties” catalog and when I’m using the Trino CLI to test the connection.
I’m able to fetch metadata when executing commands such as “show schemas from hive” and “show tables from hive.default” but when trying to retrieve data using queries such as “select * from hive.default.test” I’m getting the following error:

org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: ….

I executed the command “hdfs fsck /” from my HDFS node to check if my filesystem is healthy, and indeed it is, there are no missing blocks.

Also, I tried reading the file which the Trino connector is trying to access and it was successful, I can easily check the data using “hdfs dfs -cat ”.

Also, since the HDFS namenode is highly available, I added the configuration resources path (core-site.xml and hdfs-site.xml) to the hive catalog. To note that I downloaded these resources from the node running HDFS on CDH.

I tried researching the error mentioned above but couldn’t resolve it.

Anyone has any idea about this ?

zeeshan · November 8, 2022, 4:19am

this occurs usually HDFS worker nodes are not reachable from Trino .
check telnet or NC to worker nodes also there is a limitation that all workers nodes must be reachable even if the one worker node is unreachable query will fail .

Scorpion2115 · May 20, 2023, 1:15pm

Hi, I am having exactly the same issue with you. But the odds thing is, my queries from Trino to HDFS do not always failed…Every time after I restart the trino service, the query to HDFS would succeed for couples of times and then started to through out this error

Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:

So have you resolved this issue yet?

LuciferCoder · May 11, 2024, 8:02am

you are the hero! I checked my hadoop cluster’s datanode which firewalld service;and stoped the firewalld;
trino come to right!