New features
New features recently added to the NexusOne platform.

Kyuubi batch-submit
A new Python command-line tool for submitting and monitoring Apache Kyuubi batch jobs with full support for local file uploads. It includes the following capabilities:
- Supports file uploads
- Automatically detects local files vs remote S3 or HDFS URIs
- Uploads local JARs, Python, and data files directly via the Kyuubi REST API
- Supports mixed submissions combining local uploads with remote resources in a single job
- Supports command-line flags for different file types
  - `--resource`: Main resource containing a JAR or Python file
  - `--jars`: JAR files
  - `--pyfiles`: Python files
  - `--files`: Data/config files
- Real-time job monitoring with status updates and progress spinner
- Automatic log retrieval upon job completion
- YAML configuration support for reusable job configurations
- Flexible authentication via a command-line option, YAML, an environment variable, or an interactive prompt
- Spark History Server integration with formatted URLs
- YuniKorn queue support for Kubernetes deployments
- Exit codes for scripting and automation
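As a rough sketch of what such a submission looks like on the wire: Kyuubi's REST batch API accepts a POST to `/api/v1/batches` with a JSON body describing the Spark job. The flag-to-configuration mapping below is illustrative, not the tool's exact implementation:

```python
import json

# Illustrative sketch: build a Kyuubi REST batch payload from CLI-style
# inputs. The Spark conf keys (spark.jars, spark.submit.pyFiles,
# spark.files) are standard Spark settings; the mapping is an assumption.
def build_batch_payload(resource, class_name=None, jars=(), pyfiles=(), files=(), args=()):
    conf = {}
    if jars:
        conf["spark.jars"] = ",".join(jars)
    if pyfiles:
        conf["spark.submit.pyFiles"] = ",".join(pyfiles)
    if files:
        conf["spark.files"] = ",".join(files)
    payload = {
        "batchType": "SPARK",       # Kyuubi batch type
        "resource": resource,       # main JAR or Python file
        "conf": conf,
        "args": list(args),
    }
    if class_name:
        payload["className"] = class_name
    return payload

# Usage: mixed remote resources, serialized for the POST body.
payload = build_batch_payload(
    "s3://bucket/app.jar",
    class_name="com.example.Main",
    jars=["s3://bucket/dep.jar"],
)
body = json.dumps(payload)
```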
s3Cli
A new S3 command-line tool, `s3Cli`, is replacing the AWS command-line tool in Jupyter environments. The S3 command-line tool is fully integrated with NexusOne’s multi-bucket architecture and Apache Ranger authorization.
It includes the following capabilities:
- Ability to map each bucket to a different S3 endpoint with Hadoop’s `core-site.xml` `fs.s3a.bucket.<name>.endpoint` configuration working behind the scenes
- Supports multiple credentials
  - Each bucket loads its credentials from its own Java Cryptography Extension KeyStore (JCEKS) keystore at `/jceks/<bucket>.jceks`
  - Credential lookup falls back to a default keystore, `/jceks/default.jceks`, if no bucket-specific keystore exists
- Mandatory authorization enforcement through Apache Ranger
- Ability to manage buckets, directories, and files
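The keystore fallback rule can be sketched as follows (illustrative only, not `s3Cli`'s actual code):

```python
import tempfile
from pathlib import Path

# Sketch of the credential-resolution rule described above: prefer
# /jceks/<bucket>.jceks, otherwise fall back to /jceks/default.jceks.
def keystore_for(bucket: str, root: str = "/jceks") -> Path:
    candidate = Path(root) / f"{bucket}.jceks"
    return candidate if candidate.exists() else Path(root) / "default.jceks"

# Usage against a temporary directory standing in for /jceks.
root = tempfile.mkdtemp()
(Path(root) / "sales.jceks").touch()
(Path(root) / "default.jceks").touch()
resolved_sales = keystore_for("sales", root)      # bucket-specific keystore
resolved_other = keystore_for("marketing", root)  # falls back to default
```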
nx1-sdk
A Python SDK, `nx1-sdk`, is now available for programmatic interaction with the NexusOne platform services.
It includes the following capabilities:
- Authentication
  - Keycloak OAuth 2.0 integration
  - Access token lifecycle management
  - Service account support
- Data operations
  - Encrypted file ingestion
  - Schema management
  - Table operations such as create, drop, or alter
- Platform integration
  - Policy management via Apache Ranger
  - Catalog operations via Apache Gravitino
  - S3 bucket management
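These notes don't show the SDK's API, so here is an illustrative model of the token-lifecycle capability: cache an access token and refresh it shortly before expiry. Names and the fetch signature are assumptions for the sketch:

```python
import time

# Illustrative only -- not nx1-sdk's real API. Models "access token
# lifecycle management": cache a Keycloak token and refresh it early.
class TokenCache:
    def __init__(self, fetch, skew: float = 30.0):
        self._fetch = fetch        # callable returning (token, ttl_seconds)
        self._skew = skew          # refresh this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = time.monotonic()
        if self._token is None or now >= self._expires_at - self._skew:
            self._token, ttl = self._fetch()
            self._expires_at = now + ttl
        return self._token

# Usage with a stub fetcher standing in for a Keycloak client-credentials call.
calls = []
def fake_fetch():
    calls.append(1)
    return f"token-{len(calls)}", 300.0

cache = TokenCache(fake_fetch)
first, second = cache.get(), cache.get()  # second call hits the cache
```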
Upgrades
Version upgrades to existing apps on the NexusOne platform.

Airflow v3.1 upgrade
Upgraded to Airflow v3.1. This brings significant architectural changes and new features such as the following:
- API migration
  - The REST API `/api/v1/*` endpoints are no longer available
  - Migrated all integrations to the `/api/v2/*` API
  - `execution_date` replaced with `logical_date`
  - DateTime formats are now RFC3339-compliant
- Authentication changes
  - Moved the Flask App Builder (FAB) authentication manager to a provider package
  - Installed `apache-airflow-providers-fab` for OAuth 2.0 or LDAP authentication
  - Changed the OAuth 2.0 callback URL from `/oauth-authorized/keycloak` to `/auth/oauth-authorized/keycloak`
  - Updated Keycloak redirect URIs accordingly
- Task Execution Interface (TEI)
  - New SDK-based task execution architecture
  - Workers now communicate via the internal API server
  - Configures `AIRFLOW__CORE__INTERNAL_API_URL` for distributed deployments
- Database changes
  - Requires a new `session` table for the FAB provider
  - After an upgrade, run `airflow db migrate`
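For clients migrating off `execution_date`, the practical change is small: read `logical_date` from v2 responses and parse it as an RFC3339 timestamp. A minimal illustration (the response fields shown are a hypothetical excerpt, not a real API response):

```python
from datetime import datetime

# Hypothetical excerpt of a DAG-run object from the /api/v2/* endpoints:
# `logical_date` replaces the old `execution_date`, in RFC3339 format.
run = {"dag_id": "daily_report", "logical_date": "2025-01-15T08:30:00+00:00"}

# RFC3339 timestamps with an explicit offset parse directly.
logical_date = datetime.fromisoformat(run["logical_date"])
```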
Auth configuration updates
The following `auth_manager` and `auth_backends` settings reflect the new provider-based package for the FAB auth manager.
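A sketch of what that configuration might look like in `airflow.cfg`, using the class paths shipped by the `apache-airflow-providers-fab` package (verify against your installed provider version):

```ini
[core]
; FAB auth manager now lives in the provider package
auth_manager = airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager

[api]
; session-based API auth backend from the FAB provider
auth_backends = airflow.providers.fab.auth_manager.api.auth.backend.session
```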
Migration steps
Before upgrading, use the following steps:
- Back up your existing database
- Update your DAGs for Airflow 3 compatibility
- Run `airflow db migrate`
- Update OAuth 2.0 redirect URIs in Keycloak
- Verify that API integrations now use v2 endpoints
DataHub v1.3.0.1 upgrade
Upgraded DataHub to v1.3.0.1 for improved metadata management and lineage tracking.
The key changes include the following:
- Bootstrap process
  - New bootstrap dependency handling
  - The `system-update` job now runs before the Generalized Metadata Service (GMS) starts
- Policy management
  - Improved policy population on startup
  - Domain-level access controls
  - Enhanced RBAC for data assets
- Airflow integration
  - Updated DataHub Airflow plugin for Airflow v3 compatibility
  - OpenLineage-based collection
  - `RuntimeTaskInstance` support to track state changes of a task and manage the environment
Keycloak v26.0.5 upgrade
Upgraded Keycloak to v26.0.5 using the codecentric/keycloak Helm chart.
The key changes include the following:
- Deployment
  - Runtime moved from WildFly to Quarkus
  - Support for JGroups DNS-based discovery for Kubernetes cache clustering
  - External PostgreSQL backend instead of the default internal relational database
- Configuration
  - The `KC_*` prefix is now the standardized way to configure environment variables
  - Proxy mode support: `edge` for TLS termination at ingress
  - Health endpoints exposed at `/health/live` and `/health/ready`
- Breaking changes
  - Admin account setup now uses the `KC_BOOTSTRAP_ADMIN_USERNAME` and `KC_BOOTSTRAP_ADMIN_PASSWORD` environment variables
  - Truststore configuration now uses the `KC_SPI_TRUSTSTORE_FILE_*` environment variable prefix
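By way of illustration, a deployment might now bootstrap the admin account and point at a truststore as below. All values are placeholders, and the specific `KC_SPI_TRUSTSTORE_FILE_FILE` / `KC_SPI_TRUSTSTORE_FILE_PASSWORD` names are assumptions derived from the prefix above:

```shell
# Placeholder values; variable names follow the Keycloak 26 conventions
# noted above (KC_BOOTSTRAP_ADMIN_* for the initial admin account,
# KC_SPI_TRUSTSTORE_FILE_* for the truststore SPI).
export KC_BOOTSTRAP_ADMIN_USERNAME=admin
export KC_BOOTSTRAP_ADMIN_PASSWORD='change-me'
export KC_SPI_TRUSTSTORE_FILE_FILE=/opt/keycloak/conf/truststore.jks
export KC_SPI_TRUSTSTORE_FILE_PASSWORD='change-me'
```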
Enhancements
Enhancements to existing app features on the NexusOne platform.

JupyterHub S3 browser extension
Major enhancements to the JupyterLab S3 browser plugin, providing a full-featured file management interface with enterprise security. It includes the following changes:
- Multi-bucket and multi-endpoint support
  - Consolidated bucket listing from all configured endpoints, so you don’t need to switch endpoints manually
  - Can now map each bucket to a different S3 endpoint with Hadoop’s `core-site.xml` `fs.s3a.bucket.<name>.endpoint` configuration working behind the scenes
  - Each bucket can now load credentials from its own Java Cryptography Extension KeyStore (JCEKS) file
  - Seamless navigation across buckets from different S3-compatible storage systems
- Full Apache Ranger integration
  - When listing buckets, Ranger user permissions are now checked
  - All file operations, such as read, write, or delete, now trigger a Ranger check
  - Real-time permission checks on navigation and actions
  - Users only see content they’re authorized to access
- Improved user experience
  - Content previews for file formats such as `.txt`, `.csv`, and more
  - An option to download files into the Jupyter workspace or directly to your local machine
  - Downloading a directory automatically zips the content
  - Large files are now read and sent in small chunks using file streaming
- Enhanced file operations

  | Feature | Description |
  | --- | --- |
  | Recursive delete | Delete directories and all contents with a single action |
  | Rename/Move | Rename files and folders within the browser |
  | Create directories | Create new folders directly in the UI |
  | Directory download | Download entire directories as `.zip` files |
  | Recursive upload | Upload local directories while preserving folder structure |
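The chunked streaming behavior can be sketched as a simple generator (illustrative only, not the plugin's actual code):

```python
import io

# Sketch: stream a large file in fixed-size chunks instead of loading it
# into memory at once. The chunk size is an arbitrary example value.
CHUNK_SIZE = 1024 * 1024  # 1 MiB per chunk

def iter_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield successive chunks until the file is exhausted."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Usage: any file-like object works; an in-memory buffer stands in here.
data = io.BytesIO(b"x" * (2 * CHUNK_SIZE + 10))
sizes = [len(c) for c in iter_chunks(data)]
```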
Portal enhancements
- OAuth 2.0 credential management for Gravitino
  - OAuth 2.0 client credentials are now automatically generated for Gravitino REST catalogs
  - Tenant credentials are now individually managed
  - OAuth 2.0 tokens are now issued and validated using Keycloak
- Iceberg catalog support
  - Iceberg is now a configurable catalog type in the portal
  - You can now provision Iceberg REST catalogs
  - Spark and Trino configurations can now be automatically generated
- S3 bucket management
  - Naming validation
    - Bucket names now follow DNS-compliant rules
    - The system enforces prefixes per tenant or environment
  - Bucket operations
    - You can now delete S3 buckets, update bucket configurations, and manage bucket lifecycle policies
  - Credential masking
    - S3 access keys and secrets are now hidden in the portal UI after creation
    - Credentials are only visible at the moment they’re generated
    - Credential rotation is now available
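The naming checks above can be sketched as follows. The DNS rules mirror common S3 constraints, and the `<tenant>-` prefix convention is an assumption for illustration, not the portal's actual rule:

```python
import re

# Illustrative sketch of the naming validation: DNS-compliant bucket names
# (lowercase letters, digits, hyphens; 3-63 chars; must start and end with
# a letter or digit) plus an assumed per-tenant prefix requirement.
DNS_NAME = re.compile(r"^[a-z0-9](?:[a-z0-9-]{1,61}[a-z0-9])?$")

def is_valid_bucket_name(name: str, tenant: str) -> bool:
    return bool(DNS_NAME.match(name)) and name.startswith(f"{tenant}-")

# Usage: a conforming name passes, an uppercase/underscore name does not.
ok = is_valid_bucket_name("acme-analytics-raw", tenant="acme")
bad = is_valid_bucket_name("Acme_Raw", tenant="acme")
```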