Fixed issues in Cloudera Data Warehouse on premises 1.5.5 SP2

Review the issues fixed in this release of the Cloudera Data Warehouse service.

DWX-22261: Resource quota limitations causing pod scheduling failures in Cloudera Data Warehouse on premises
Previously, when quota management was enabled in a Cloudera Data Warehouse on premises environment, the service reserved the required resources in the quota management system for components deployed within the Database Catalog, Virtual Warehouse, and other relevant namespaces. However, external tools that injected additional sidecar containers or pods into these namespaces could push the overall resource consumption beyond the quota initially allocated by Cloudera Data Warehouse, triggering pod scheduling failures.

This issue is now resolved.

DWX-21395: Log routers are not requesting enough resources
Previously, the Cloudera Data Warehouse log router component's resource quota was insufficient. Because the log router runs as a DaemonSet (with instances on multiple nodes), the resource request calculation did not account for the number of nodes or pod instances in the cluster. This led to resource constraints and issues with log router pods.

This issue is now resolved, ensuring that sufficient resources are requested and allocated. Cloudera Data Warehouse on premises now correctly calculates the resource quota of the log router by multiplying the resource request of the underlying pod by the number of nodes it runs on.
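
For example (hypothetical values): if each log router pod requests 200m CPU and 256Mi of memory and the DaemonSet runs on 10 nodes, the service now reserves 2000m CPU and 2560Mi of memory in the quota management system.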

DWX-22286: Missing cleanup of stale and empty Database Catalog configuration entries in Cloudera Data Warehouse on premises
In Cloudera Data Warehouse on premises, the Database Catalog configuration view previously displayed several stale or empty configuration entries, such as the following:
  • hadoop-metrics2-s3a-file-system
  • hadoop-metrics2
  • hadoop-metrics2-hivemetastore
  • hadoop-core-site-embedded-hs2

These entries appeared as blank pages without any keys or values.

This issue is now resolved. The stale hadoop-metrics2 configurations are removed, and empty configurations are filtered out from the Database Catalog configuration view.

DWX-22401: User group configuration visibility and editing limitations in Hive Virtual Warehouse
In Cloudera Data Warehouse 1.5.5, the Hive Virtual Warehouse details page did not display the configured user group information or allow editing of user groups. This limitation prevented administrators from reviewing or modifying user group configurations for existing Virtual Warehouses through the UI.

This issue is now resolved.

DWX-22273: Incorrect SSO authentication checkbox in Virtual Warehouse creation
In Cloudera Data Warehouse on premises, the Enable SSO checkbox was incorrectly displayed when creating or editing Virtual Warehouses, despite SSO authentication not being supported for on-premises deployments. This could mislead users into believing that SSO-based authentication was available and configurable, causing potential confusion during setup.

This issue is now resolved, and the Enable SSO checkbox is no longer displayed.

DWX-22306: DRS restore with rebuild returned only a generic message instead of a restore plan
Previously, when running a DRS restore with rebuild using the Cloudera Data Warehouse API or the CDP CLI (cdp dw restore-backup --rebuild), the response only provided a generic message indicating that the restore had been started. This lack of detail offered no visibility into the specific steps involved in the restore operation.

This issue is now resolved. The restore response is enhanced to include a restorePlan field, which provides a structured summary of the actions that will be performed during the restore process (similar to the existing RestoreCluster response).
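
For example, a rebuild restore response now includes a structure of roughly the following shape (illustrative only; the actual fields and contents depend on your environment and CLI version):
{
  "restorePlan": {
    ...
  }
}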

DWX-21896: Transient UI error for failed resource pool runtime upgrade
Previously, when creating a resource pool, if the associated Database Catalog or Virtual Warehouse runtime upgrade failed, the resulting error message was displayed only temporarily in the UI. Shortly after the failure, the error message vanished, leaving the UI with no persistent indication that the runtime upgrade had failed. This behavior could be misleading, as it might have appeared that the operation was successful or still in progress. Users could not identify the root cause of the failure directly from the UI and had to inspect the dwxserver or resource pool manager logs to diagnose the issue.

This issue is now resolved.

CDPD-89414: Incorrect results for window functions with IGNORE NULLS
When you used the FIRST_VALUE and LAST_VALUE window functions with the IGNORE NULLS clause while vectorization was enabled, the results were incorrect. This occurred because the vectorized execution engine did not properly handle the IGNORE NULLS setting for these functions.
This issue is addressed by modifying the vectorized processing for FIRST_VALUE and LAST_VALUE to correctly respect the IGNORE NULLS clause, ensuring the same results are produced whether vectorization is enabled or disabled.
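
The following is a minimal sketch of the affected pattern (table and column names are hypothetical); with vectorization enabled, these calls previously behaved as if IGNORE NULLS were absent:
SET hive.vectorized.execution.enabled=true;
SELECT id,
       FIRST_VALUE(val) IGNORE NULLS OVER (PARTITION BY grp ORDER BY id) AS first_non_null,
       LAST_VALUE(val) IGNORE NULLS OVER (PARTITION BY grp ORDER BY id) AS last_non_null
FROM t;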

Apache Jira: HIVE-29122

CDPD-60770: Passwords with special characters fail to connect with Beeline
When you used a password containing special characters like #, ^, or ; in a JDBC URL for a Beeline connection, the connection failed with a 401 error. This happened because Beeline did not correctly interpret these special characters in the password.
This issue is resolved by introducing a new method to reparse the password from the original JDBC URL, allowing Beeline to correctly handle and authenticate passwords containing special characters.
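
For example, a connection of the following shape, with the password embedded in the URL, previously failed with a 401 error (host, user, and password are hypothetical):
beeline -u "jdbc:hive2://hs2.example.com:10000/default;user=alice;password=p#ss^word"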

Apache Jira: HIVE-28805

CDPD-85600: Select queries with ORDER BY fail due to compression error
When you ran a Hive SELECT query with an ORDER BY clause, it failed with a java.io.IOException and java.lang.UnsatisfiedLinkError related to the zlib decompressor.
The issue was addressed by ensuring the zlib native library is correctly loaded.

Apache Jira: HIVE-28805

CDPD-90301: Stack overflow error from queries with OR and MINUTE filters
Queries caused a stack overflow error when they contained multiple OR conditions on the same expression, such as MINUTE(date_) = 2 OR MINUTE(date_) = 10.
This issue is addressed by modifying the HivePointLookupOptimizerRule to keep the original order of expressions and to check if a merge can be performed before creating a new expression.

Apache Jira: HIVE-29208

CDPD-90303: Incorrect results from a CASE expression
A query that used a CASE expression to conditionally return values produced an incorrect result. The query plan incorrectly folded the CASE statement into a COALESCE function, which led to a logic error that filtered out some of the expected results.
This issue is addressed by adding a more strict check when converting CASE expressions into COALESCE during query optimization.
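
For context, CASE WHEN x IS NOT NULL THEN x ELSE y END is equivalent to COALESCE(x, y) only when the THEN branch is the tested expression itself. A hypothetical query of the following shape could be misfolded and return too few rows:
SELECT * FROM t
WHERE CASE WHEN a IS NOT NULL THEN b ELSE c END = 1;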

Apache Jira: HIVE-24902

CDPD-80655: Compile error with ambiguous column reference
A Hive query using CREATE TABLE AS SELECT with a GROUP BY clause and a window function failed with an "Ambiguous column reference" error. This happened because the query plan couldn't correctly handle redundant keys in the GROUP BY clause.
This issue is fixed by improving the query planner's logic to properly handle complex expressions and their aliases within window functions, allowing the query to compile and run successfully.
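
A minimal sketch of the previously failing shape (table and column names are hypothetical):
CREATE TABLE agg AS
SELECT upper(name) AS uname,
       count(*) AS cnt,
       rank() OVER (ORDER BY upper(name)) AS rnk
FROM people
GROUP BY upper(name);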

Apache Jira: HIVE-28878

DWX-20754: Invalid column reference in lateral view queries
The virtual column BLOCK__OFFSET__INSIDE__FILE could not be referenced in queries using lateral views, resulting in the following error:
FAILED: SemanticException Line 0:-1 Invalid column reference 'BLOCK__OFFSET__INSIDE__FILE'
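
A hypothetical query of this shape triggered the error:
SELECT BLOCK__OFFSET__INSIDE__FILE, item
FROM src LATERAL VIEW explode(items) t AS item;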

This issue is now resolved.

Apache Jira: HIVE-28938

CDPD-57938: LDAP-based access control for Hive Kerberos users
Previously, when you used Kerberos authentication in Cloudera Data Warehouse for Hive Virtual Warehouses, LDAP filters were not executed.
This issue is resolved by implementing LDAP-based access control for Hive Virtual Warehouses. Now, when you authenticate with Kerberos through a client such as JDBC or Beeline, Hive executes the configured LDAP filters. If the Kerberos-authenticated user does not belong to the required LDAP groups, Hive returns an unauthorized error.
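
For example, a Kerberos-authenticated Beeline connection of the following shape (host and principal are hypothetical) is now subject to the configured LDAP group filters:
beeline -u "jdbc:hive2://hs2.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM"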

CDPD-93737: Alter table failure on non-ACID tables
Previously, an ALTER TABLE command failed when executed on a non-transactional managed table in Cloudera Data Warehouse. This occurred because the table was identified as a managed table but was not transactional, which violated strict managed table requirements.
This issue is resolved by aligning the environment behavior and ensuring that managed tables adhere to the required standards.

HIVE-29210: Compaction duplicates when HMS initiator crashes
Minor compaction could conditionally produce duplicate records if the Hive Metastore (HMS) instance running the initiator unexpectedly crashed.
This issue is addressed by a compactor cleaner fix that prevents the creation of duplicate directories from multiple jobs running the same compaction.

Apache Jira: HIVE-29210

HIVE-29272: Data loss in Insert-Only tables during MINOR compaction
In insert-only tables, a query-based MINOR compaction could cause data loss if the compacted table had both an aborted and an open transaction. This occurred because the compaction process incorrectly used the minOpenWriteId as a lower limit when selecting delta files to merge. This led to the compactor creating an empty delta file, and when the Cleaner ran, it removed the original, non-compacted delta files, resulting in data loss.
This issue is addressed by changing the query-based MINOR compaction logic to not consider the minOpenWriteId when selecting the delta files for compaction. The highWatermark already correctly sets the upper limit for the compaction range.

Apache Jira: HIVE-29272

DWX-21855: Impala executors fail to shut down gracefully
During a graceful shutdown, Impala executors wait for running queries to finish, up to the graceful shutdown deadline (--shutdown_deadline_s). Previously, the istio-proxy container on the Impala executor pod was terminated immediately during graceful shutdown; as a result, the executors became unreachable, were removed from the Impala cluster membership, and their running queries were cancelled.
This issue is now resolved by ensuring that the istio-proxy container's lifecycle does not affect the executor's cluster membership.

IMPALA-14263: Enhanced join strategy for large clusters
The query planner's cost model for broadcast joins could be skewed by the number of nodes in a cluster. This could lead to suboptimal join strategy choices, especially in large clusters with skewed data, where a partitioned join was chosen over a more efficient broadcast join.
This issue is now resolved by introducing the broadcast_cost_scale_factor query option as an additional tuning option, besides query hints, for overriding the query planner's decision. To set it cluster-wide for all queries, add the following key-value pair to the default_query_options startup option:
broadcast_cost_scale_factor=<less than 1.0>
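
For example, to bias the planner toward broadcast joins for all queries, set the impalad startup flag as follows (0.5 is an illustrative value; tune it for your workload):
--default_query_options=broadcast_cost_scale_factor=0.5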

Apache Jira: IMPALA-14263

IMPALA-11402: Fetching metadata for tables with huge numbers of files no longer fails with OutOfMemoryError
Previously, when Impala Coordinator tried to fetch file metadata for extremely large tables (those with millions of files or partitions), the Impala Catalog service would attempt to return all the file details at once. This often exceeded the Java memory limits, causing the service to crash with an OutOfMemoryError.
This issue is addressed by configuring the Catalog service to limit the number of file descriptors included in a single getPartialCatalogObject response. A new configuration flag, catalog_partial_fetch_max_files, is introduced to define the maximum number of file descriptors allowed per response (with a default of 1,000,000 files).
If a request exceeds this limit, the Catalog service will truncate the response and return metadata for only a subset of the requested partitions. The coordinator is now designed to detect this truncated response and automatically send new batch requests to fetch the remaining partitions until all required metadata is retrieved. This change ensures that the coordinator can successfully fetch and process the metadata for extremely large tables without crashing due to memory limits.
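
For example, to lower the per-response limit, set the following Catalog server startup flag (500000 is an illustrative value):
--catalog_partial_fetch_max_files=500000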

Apache Jira: IMPALA-11402

CDPD-77261: Impala can now read Parquet integer data as DECIMAL after schema changes
Previously, if you changed a column type from an integer (INT or BIGINT) to a DECIMAL using ALTER TABLE, Impala could fail to read the original Parquet data files. This happened because the files lacked the specific metadata (logical types) Impala expected for decimals, resulting in an error.
Impala is now more flexible when reading Parquet files following schema evolution. If Impala encounters an integer type but the schema expects a DECIMAL, it automatically assumes a suitable decimal precision and scale, allowing you to successfully query the updated table:
  • INT32 is read as DECIMAL(9, 0).
  • INT64 is read as DECIMAL(18, 0).
This change supports common schema evolution practices by allowing you to update column types without manually rewriting old data files.
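
A hypothetical illustration (table and column names are examples):
-- The column was originally written to Parquet as BIGINT, then altered:
ALTER TABLE sales CHANGE amount amount DECIMAL(18,0);
-- Old files are now read as DECIMAL(18,0) instead of failing:
SELECT SUM(amount) FROM sales;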

Apache Jira: IMPALA-13625

IMPALA-12927: Impala can now correctly read BINARY columns in JSON tables
Previously, Impala couldn't correctly read BINARY columns in JSON tables, often resulting in errors or incorrect data. This happened because Impala assumed the data was always Base64 encoded, which wasn't true for files written by older Hive versions.
Impala now supports a new table property, 'json.binary.format' (BASE64 or RAWSTRING), and a query option, JSON_BINARY_FORMAT, to explicitly define the binary encoding. This ensures Impala reads the data correctly. If no format is specified, Impala will now return an error instead of risking silent data corruption.
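
For example (table name is hypothetical):
-- Declare the encoding on the table:
ALTER TABLE json_tbl SET TBLPROPERTIES ('json.binary.format'='BASE64');
-- Or override it for the current session:
SET JSON_BINARY_FORMAT=RAWSTRING;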

Apache Jira: IMPALA-12927

DWX-22191: Improved session routing in Impala-proxy
Previously, the Impala-proxy determined which coordinator should handle new session requests by counting all open HiveServer2 (HS2) sessions on each coordinator, including those without active client connections. This approach led to situations in which the proxy blocked new client connections even when many existing sessions were idle. Consequently, coordinators with several open but inactive sessions were marked as fully loaded, causing reduced availability.

This issue is now resolved. The Impala-proxy now checks only the number of sessions currently in use, which are sessions with active client connections, when making routing decisions. Additionally, users can disable this session-based check entirely by setting the coordinator-load-based-routing CPU and memory weights to 0 in the Impala-proxy configurations. This ensures that routing decisions are based on actual load, preventing unnecessary blocking and improving connection availability.

CDPD-91651: Catalogd memory exhaustion during metadata loading
Previously, the Catalog server (Catalogd) could run out of memory when you managed a large number of tables, such as one million tables. This occurred because tables that were not yet loaded tracked individual metrics that used a significant amount of memory. In environments with many databases and tables, this memory usage caused the server to stop responding, which prevented you from seeing your tables or running commands.
This issue is now resolved by skipping the initialization of these metrics for unloaded tables.

Apache Jira: IMPALA-14502

CDPD-81076: LEFT ANTI JOIN fails on Iceberg V2 tables with Delete files
Queries using a LEFT ANTI JOIN fail with an AnalysisException if the right-side table is an Iceberg V2 table containing delete files. For example, consider the following query:
SELECT * FROM table_a a
LEFT ANTI JOIN iceberg_v2_table b
ON a.id = b.id;

The error Illegal column/field reference 'b.input_file_name' of semi-/anti-joined table 'b' is displayed because semi-joined tuples need to be explicitly made visible for paths pointing inside them to be resolvable.

The fix updates the IcebergScanPlanner to ensure that the tuple containing the virtual fields is made visible when it is semi-joined.

Apache Jira: IMPALA-13888

CDPD-78427: Enable MERGE statement for Iceberg tables with equality deletes
This patch fixes an issue that caused MERGE statements to fail on Iceberg tables that use equality deletes.

The failure occurred because the delete expression calculation was missing the data sequence number, even though the underlying data description included it. This mismatch caused row evaluation to fail.

The fix ensures the data sequence number is correctly included in the result expressions, allowing MERGE operations to complete successfully on these tables.
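
A minimal sketch of a MERGE that previously failed on such tables (table and column names are hypothetical):
MERGE INTO ice_target t USING staging s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.val);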

Apache Jira: IMPALA-13674

CDPD-77773: Tolerate missing data files during Iceberg table loading
This fix addresses an issue where an Iceberg table would fail to load if any of its data files were missing from the file system. The resulting TableLoadingException left the table in an incomplete state, blocking all operations on it.

Impala now tolerates missing data files during the table loading process. An exception will only be thrown if a query subsequently attempts to read one of the specific files that is missing.

This change allows other operations that do not depend on the missing data—such as ROLLBACK, DROP PARTITION, or SELECT statements on valid partitions—to execute successfully.

Apache Jira: IMPALA-13654

CDPD-78508: Skip reloading Iceberg tables when metadata JSON file is the same
This patch optimizes metadata handling for Iceberg tables, particularly those that are updated frequently.

Previously, if an event processor was lagging, Impala might receive numerous update events for the same table (for example, 100 events). Impala would attempt to reload the table 100 times, even if the table's state was already up-to-date after processing the first event.

With this fix, Impala now compares the path of the incoming metadata JSON file with the one that is currently loaded. If the metadata file location is the same, Impala skips the reload because the table is already up to date. This significantly reduces unnecessary metadata processing.

Apache Jira: IMPALA-13718

CDPD-91992: Impala crashes when writing multiple delete files per partition in a single DELETE operation
Impala queries might crash during a DELETE operation that requires writing multiple delete files to the same partition. This issue was caused by a state management conflict.

The issue has now been resolved, and queries execute successfully without crashes.

Apache Jira: IMPALA-14496

DWX-22353: Trino coordinator crashes due to case-sensitive configuration units
The Trino coordinator fails to start and enters an infinite loop if configuration values are entered with incorrect unit casing, for example, 100mb instead of 100MB. This occurs because Trino configuration values for data size and duration are strictly case-sensitive.

This issue has been fixed by introducing client-side validation. When a user enters a configuration value with an incorrect case, a warning message is displayed requesting users to correct the unit case.

To avoid configuration failures, ensure the following unit casing is used:

  • For size: Use uppercase (B, kB, MB, GB, TB, PB, EB)
  • For duration: Use lowercase (ns, us, ms, s, m, h, d)
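
For example (property names are illustrative of common Trino settings):
# Size units are uppercase:
query.max-memory=100GB
# Duration units are lowercase:
query.client.timeout=5m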