Known issues in Impala

This topic describes the known issues for Impala in Cloudera Data Warehouse on cloud, version 2025.0.21.0-185.

Known issues identified in the March 31, 2026 release

There are no new known issues identified in this release.

Known issues identified before the March 31, 2026 release

CDPD-91992: Impala crashes when writing multiple delete files per partition in a single DELETE operation
Impala may crash during a DELETE operation that requires writing multiple delete files to the same partition. This issue occurs due to a state management conflict, leading to instability.
To avoid this issue, perform one of the following workarounds:
  • Modify the DELETE statement to affect fewer records in a single operation, thereby reducing the number of delete files generated per partition.
  • Increase the parquet_file_size query option for the DELETE statement. A larger file size can consolidate deletes into fewer files.
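As a rough illustration of the two workarounds above, the following Python sketch generates a series of smaller DELETE statements, each covering a bounded key range, preceded by a SET statement that raises parquet_file_size. The table name, the event_id column, the range bounds, and the 1g value are hypothetical placeholders rather than recommendations from this release note. Choose a batching column and file size that suit your table.

```python
def batched_deletes(table, column, start, stop, step, parquet_file_size="1g"):
    """Return SQL statements that delete rows with start <= column < stop in batches."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Second workaround: raise the target file size for the session.
    # The value "1g" is an assumed example, not a tuned recommendation.
    statements = [f"SET PARQUET_FILE_SIZE={parquet_file_size};"]
    # First workaround: each DELETE touches only one slice of the key range.
    for low in range(start, stop, step):
        high = min(low + step, stop)
        statements.append(
            f"DELETE FROM {table} WHERE {column} >= {low} AND {column} < {high};"
        )
    return statements


# Hypothetical table and column names, for illustration only.
for stmt in batched_deletes("sales_db.events", "event_id", 0, 2500, 1000):
    print(stmt)
```

The generated statements can be saved to a file and run in a single session so that the SET statement applies to every DELETE that follows it.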

Apache Jira: IMPALA-14496

CDPD-64615: Underestimated or overestimated coordinator memory
The Impala planner's memory estimation for coordinators is sometimes inaccurate, which can cause queries to fail or to be queued unnecessarily. This is most noticeable in edge cases where the estimated memory is either too low or too high.
None.

Apache Jira: IMPALA-12622

CDPD-62248: Unnecessary profiles for metadata commands
Impala generates query profiles for internal metadata commands such as SET, USE, GET_TABLES, and GET_SCHEMAS. These profiles create unnecessary overhead and can lead to a large number of files, which affects performance for both the Hue query process and customer scripts that parse these profiles.
None.
DWX-20490: Impala queries fail with “Caught exception The read operation timed out, type=<class 'socket.timeout'> in ExecuteStatement”
Queries in impala-shell fail with a socket timeout error in the ExecuteStatement call, which submits the query to the coordinator. The error occurs when query execution takes too long to start, mainly when query planning is slow because of frequent metadata changes.
Increase the socket timeout on the client side by setting --client_connect_timeout_ms to a higher value. For example, add --client_connect_timeout_ms=600000 to the impala-shell command line.
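For scripts that launch impala-shell, the workaround could be applied along these lines. This is a minimal sketch that only builds and prints the command line. The coordinator address and the query are hypothetical placeholders, and any connection options your environment requires (for example, authentication or protocol flags) would need to be added.

```python
import shlex


def impala_shell_command(impalad, query, client_connect_timeout_ms=600000):
    """Build an impala-shell argument list with a raised client connect timeout."""
    return [
        "impala-shell",
        "-i", impalad,  # coordinator host:port (placeholder value below)
        f"--client_connect_timeout_ms={client_connect_timeout_ms}",
        "-q", query,
    ]


# Hypothetical coordinator address and query, for illustration only.
cmd = impala_shell_command(
    "coordinator.example.com:21050",
    "SELECT COUNT(*) FROM sales_db.events",
)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run as is, which avoids shell quoting problems in the query text.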
DWX-20491: Impala queries fail with "EOFException: End of file reached before reading fully"
Impala queries fail with an EOFException when reading from an HDFS file stored in an S3A location. The error occurs when the file has been removed. If the file was removed by a SQL command such as DROP PARTITION, the error can be caused by a significant lag in Hive Metastore event processing. If the file was removed by a non-SQL operation, run REFRESH or INVALIDATE METADATA on the table to resolve the issue.
Run REFRESH <table>; or INVALIDATE METADATA <table>;
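In a client script, this workaround might be automated as follows. The sketch assumes a caller-supplied execute function that sends one SQL statement to Impala and raises an exception whose message contains the server error text. That function, the single-retry policy, and the table name are assumptions made for illustration, not behavior documented here.

```python
def run_with_refresh(execute, query, table, marker="EOFException"):
    """Run query; if the error mentions marker, REFRESH the table and retry once."""
    try:
        return execute(query)
    except Exception as exc:
        if marker not in str(exc):
            raise  # unrelated failure: do not mask it
        execute(f"REFRESH {table}")  # reload file metadata for the table
        return execute(query)


# Stand-in for a real client call, used only to demonstrate the flow.
calls = []


def fake_execute(sql):
    calls.append(sql)
    if sql.startswith("SELECT") and "REFRESH sales_db.events" not in calls:
        raise RuntimeError("EOFException: End of file reached before reading fully")
    return "ok"


print(run_with_refresh(fake_execute, "SELECT COUNT(*) FROM sales_db.events", "sales_db.events"))
```

If REFRESH does not resolve the error, INVALIDATE METADATA on the table is the heavier alternative named in the workaround.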
Delay in listing queries in Impala Queries in the Job browser
Listing an Impala query in the Job browser can take an inordinate amount of time.
None.
IMPALA-11045: Impala Virtual Warehouses might produce an error when querying a transactional (ACID) table even after you enable automatic metadata refresh (version DWX 1.1.2-b2008)
Impala does not open a transaction for SELECT queries, so you might get a FileNotFound error after compaction even though automatic metadata refresh is enabled.
Run the INVALIDATE METADATA statement on the transactional (ACID) table to refresh the metadata. This fixes the problem until the next compaction occurs. For information about running this statement, see INVALIDATE METADATA statement.
Impala Virtual Warehouses might produce an error when querying transactional (ACID) tables (DWX 1.1.2-b1949 or earlier)
If you query transactional (ACID) tables with an Impala Virtual Warehouse while compaction runs on the compacting Hive Virtual Warehouse, the query might fail. The compaction process deletes files, and the Impala Virtual Warehouse might not be aware of the deletion. When the Impala Virtual Warehouse then attempts to read a deleted file, an error can occur. This situation occurs randomly.
Run the INVALIDATE METADATA statement on the transactional (ACID) table to refresh the metadata. This fixes the problem until the next compaction occurs. For information about running this statement, see INVALIDATE METADATA statement.
Sessions with Impala continue to run for 15 minutes after the connection is disconnected.
When a connection to Impala is disconnected, the session continues to run for 15 minutes so that the user or client can reconnect to the same session by presenting the session_token. After 15 minutes, the client must re-authenticate to Impala to establish a new connection.
None.