What's new in 1.5.5 SP2
Cloudera AI on premises 1.5.5 SP2 delivers the following new features and enhancements.
Cloudera Generative AI readiness
- The Cloudera AI Inference service is now Generally Available. Starting with this release, upgrading your Cloudera AI Inference service enables access to new features, enhanced performance, improved security, and updated component versions. For more information, see Upgrading Cloudera AI Inference service.
- To learn which LLM architectures are supported for Cloudera AI on premises 1.5.5 SP2, see Supported transformer model architectures for vLLM 0.8.5 and vLLM 0.8.4.
- Registering extra-large LLMs is now more reliable, with enhancements that support importing the latest and largest models.
- The model download process for registering models from third-party sources has been improved to prevent errors related to very large files. For more information, see Downloading and uploading Model Repositories for an air-gapped environment.
- The Cloudera AI Registry service can now be configured with a single replica to avoid synchronization and storage constraints related to very large model files. For more information, see Configuring model import script in air-gapped environment.
- For related information, see Configuring Traefik timeout values for large file uploads and downloads.
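When tuning Traefik for large model uploads and downloads, the relevant settings are the responding timeouts on the entry point handling the traffic. The fragment below is a generic illustration using upstream Traefik static-configuration syntax; the entry-point name and timeout values are placeholders, and the exact settings and procedure for Cloudera AI are described in Configuring Traefik timeout values for large file uploads and downloads.

```yaml
# Generic Traefik static configuration sketch (placeholder values).
# Longer read/write timeouts keep very large model transfers from
# being cut off mid-stream by the proxy.
entryPoints:
  websecure:
    address: ":443"
    transport:
      respondingTimeouts:
        readTimeout: "3600s"   # time allowed to read the full request
        writeTimeout: "3600s"  # time allowed to write the full response
        idleTimeout: "180s"    # keep-alive idle connection timeout
```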
Usability and performance enhancements
- Improved UI performance in Cloudera AI reduces waiting times, offering faster page loads for key landing pages, session launches, and listing pages.
- To ensure seamless integration between your Cloudera AI environment and the broader Cloudera platform, the Hadoop and Spark runtimes are now aligned with Cloudera Data Engineering and Data Hub builds. This alignment allows you to move code between Cloudera AI and other services without concerns about version mismatches. For more information, see Apache Spark supported versions.
- Traffic management in Embedded Container Service-based environments is now more robust, featuring enhanced encryption and deeper observability. In addition to NGINX, the Istio Gateway API is also available. For more details, see Gateway API support for Embedded Container Service.
- Collected logs provide comprehensive insight into both past and present system behavior, streamlining troubleshooting. For more information, see Diagnostic bundle support for Cloudera AI Registry and Diagnostic bundle support for Cloudera AI Inference service.
- Several underlying infrastructure enhancements improve performance, including disk space optimization, longer retention of critical logs, and the ability to ring-fence key infrastructure nodes for critical system functions.
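To illustrate the kind of resources the Istio Gateway API support mentioned above builds on, the sketch below shows a generic Kubernetes Gateway and HTTPRoute using the standard `gateway.networking.k8s.io/v1` API with Istio's gateway class. All names, hostnames, and backend references here are placeholders; the Cloudera-specific configuration is described in Gateway API support for Embedded Container Service.

```yaml
# Generic Kubernetes Gateway API sketch (placeholder names and hosts).
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-gateway
spec:
  gatewayClassName: istio        # Istio's Gateway API implementation
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - name: example-tls-cert   # TLS secret for encrypted ingress
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example-route
spec:
  parentRefs:
  - name: example-gateway
  hostnames:
  - "ai.example.com"
  rules:
  - backendRefs:
    - name: example-service      # backend Service receiving the traffic
      port: 8080
```

Compared with annotation-driven NGINX Ingress, the Gateway API splits infrastructure concerns (the Gateway) from routing rules (the HTTPRoute), which is what enables the finer-grained encryption and observability controls noted above.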
Governance and resource control enhancements
- Workbench Quota Management is in Technical Preview, allowing administrators to define maximum limits for memory and compute usage per workbench. This feature helps prevent cost overruns and minimizes operational risk. For more information, see Quota Management overview.
- Team quotas can now be applied in mixed-GPU environments, enabling administrators to precisely allocate access to specific hardware types for each team. For more information, see Quota for Cloudera AI workloads.
Security improvements
- Non-transparent proxy support on the Cloudera AI Inference service ensures strict network control by requiring all outbound communication to pass through the proxy. This approach ensures adherence to internal security policies and mandates for air-gapped environments. For more information, see Non-transparent proxy support on Cloudera AI Inference service.
- Routine security improvements are now implemented, including Common Vulnerabilities and Exposures (CVE) remediations.
