Hive connector query fails

Dear Community,

Trino version: 364
We are facing an error while running a query. The error message is given below.

Query 20220111_104442_00007_z8uk4, FAILED, 7 nodes
Splits: 2,864 total, 13 done (0.45%)
8:16 [2.39M rows, 966MB] [4.82K rows/s, 1.95MB/s]

Query 20220111_104442_00007_z8uk4 failed: Error reading tail from hdfs://hacluster/user/hive/warehouse/psn.db/detail_uf/dt=1640628000/no=0/154d6b916cc5b128-8ea2b6547d0d1bbb_2066729209_data.0.parq with length 16384
io.trino.spi.TrinoException: Error reading tail from hdfs://hacluster/user/hive/warehouse/psn.db/detail_uf/dt=1640628000/no=0/154d6b916cc5b128-8ea2b6547d0d1bbb_2066729209_data.0.parq with length 16384
	at io.trino.plugin.hive.parquet.HdfsParquetDataSource.readTail(HdfsParquetDataSource.java:113)
	at io.trino.parquet.reader.MetadataReader.readFooter(MetadataReader.java:94)
	at io.trino.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:213)
	at io.trino.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:164)
	at io.trino.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:286)
	at io.trino.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:175)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:49)
	at io.trino.split.PageSourceManager.createPageSource(PageSourceManager.java:68)
	at io.trino.operator.ScanFilterAndProjectOperator$SplitToPages.process(ScanFilterAndProjectOperator.java:268)
	at io.trino.operator.ScanFilterAndProjectOperator$SplitToPages.process(ScanFilterAndProjectOperator.java:196)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:319)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:306)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:306)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:221)
	at io.trino.operator.WorkProcessorUtils.lambda$processStateMonitor$2(WorkProcessorUtils.java:200)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:221)
	at io.trino.operator.WorkProcessorUtils.lambda$finishWhen$3(WorkProcessorUtils.java:215)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput(WorkProcessorSourceOperatorAdapter.java:151)
	at io.trino.operator.Driver.processInternal(Driver.java:388)
	at io.trino.operator.Driver.lambda$processFor$9(Driver.java:292)
	at io.trino.operator.Driver.tryWithLock(Driver.java:685)
	at io.trino.operator.Driver.processFor(Driver.java:285)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1078)
	at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
	at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
	at io.trino.$gen.Trino_364____20220111_103516_2.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-952478870-192.168.212.2-1451608027649:blk_33418479072_36961430745 file=/user/hive/warehouse/ps.db/detail_uf/dt=1640628000/no=0/154d6b916cc5b128-8ea2b6547d0d1bbb_2066729209_data.0.parq
	at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:879)
	at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:862)
	at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:841)
	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:567)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
	at java.base/java.io.DataInputStream.read(DataInputStream.java:149)
	at java.base/java.io.DataInputStream.read(DataInputStream.java:149)
	at io.trino.plugin.hive.util.FSDataInputStreamTail.readTail(FSDataInputStreamTail.java:59)
	at io.trino.plugin.hive.parquet.HdfsParquetDataSource.readTail(HdfsParquetDataSource.java:109)
	... 33 more

The same query works fine in beeline.

It seems like a decryption problem.
Which process originally wrote the Parquet file?

Can you provide some more info to help us reproduce the issue in a testing environment?

We have fixed this problem.
There was a connectivity issue with one of the datanodes.
After we fixed the connectivity to that datanode, the query works fine.
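
For anyone hitting a similar BlockMissingException: one way to confirm which datanode is the problem is to ask the NameNode which hosts are supposed to serve the file's blocks, then check connectivity to those hosts from the Trino workers. The following is only a rough diagnostic sketch using the Hadoop FileSystem API; the path is the one from the error above, and the class and variable names are made up for illustration.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Diagnostic sketch: list the datanodes expected to serve each block of the
// Parquet file from the error message. A host that is unreachable from the
// Trino workers is a likely source of the BlockMissingException.
public class ListBlockHosts {
    public static void main(String[] args) throws Exception {
        Path file = new Path("hdfs://hacluster/user/hive/warehouse/psn.db/detail_uf/"
                + "dt=1640628000/no=0/154d6b916cc5b128-8ea2b6547d0d1bbb_2066729209_data.0.parq");
        Configuration conf = new Configuration(); // assumes the HDFS client config is on the classpath
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://hacluster"), conf)) {
            FileStatus status = fs.getFileStatus(file);
            for (BlockLocation block : fs.getFileBlockLocations(status, 0, status.getLen())) {
                System.out.printf("offset=%d length=%d hosts=%s%n",
                        block.getOffset(), block.getLength(), String.join(",", block.getHosts()));
            }
        }
    }
}

The standard hdfs fsck <path> -files -blocks -locations command reports the same block-to-datanode mapping and also flags missing or corrupt blocks directly.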


I am encountering the following error intermittently; however, it seems to work fine upon the next attempt. What could be the root cause?
Error: Connector reply error: SQL##f - SqlState: S1000, ErrorCode: 1060, ErrorMsg: [Starburst][Trino] (1060) Trino Query Error: Failed to read Parquet file: s3://ssss/parquet/xyz/dtl/cur_dts=2020-08-21/part-00003-74c8013a-15f2-47ea-ace6-96a61199c8ba.c000.snappy.parquet (16777217))


Since this is an intermittent error, it is most likely related to instability in your cluster and setup. One or more of the following issues could be the cause; a sketch of related retry and timeout settings follows the list.

  • Network connectivity to S3 from one or more workers is bad
  • One or more workers have CPU or memory issues that prevent stable I/O
  • S3 itself, or your connection to it (for example, quotas or throttling), has performance issues
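
If the underlying instability cannot be removed entirely, making the connector more tolerant of transient S3 failures can reduce how often such queries fail. As a rough sketch only, using property names from the Hive connector's S3 configuration (the values here are illustrative, not recommendations), the Hive catalog properties file could set:

hive.s3.max-error-retries=20
hive.s3.max-retry-time=10m
hive.s3.connect-timeout=10s
hive.s3.socket-timeout=10s

Raising retries and timeouts only masks the symptom, so it is still worth checking worker health and S3 request metrics for the real cause.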