2026.1 Series Release Notes
16.0.0-8
Upgrade Notes
A new cyborg-dbsync online_data_migrations subcommand backfills the project_id column on existing accelerator requests (ARQs). Expected operator order:

1. Upgrade the cyborg-dbsync package (and related shared code) so cyborg-dbsync upgrade can apply pending schema migrations.
2. Run cyborg-dbsync online_data_migrations to backfill project_id on existing ARQ rows using Nova instance data.
3. Upgrade Cyborg services, starting with conductor and API, then agents.
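As a sketch, the sequence might look like this on a systemd-based deployment (service names and restart tooling vary by distribution; treat this as illustrative, not prescriptive):

```shell
# 1. With the upgraded code installed, apply pending schema migrations:
cyborg-dbsync upgrade

# 2. Backfill project_id on existing ARQ rows from Nova instance data:
cyborg-dbsync online_data_migrations

# 3. Restart services: conductor and API first, then agents:
systemctl restart cyborg-conductor cyborg-api
systemctl restart cyborg-agent
```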
The cyborg-conductor service also heals remaining NULL project_id values on startup as a safety net.

Nova GET /servers/{id} calls for this migration pass microversion 2.82 explicitly so the tenant_id field shape used for backfill stays consistent.
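The microversion pin can be sketched as follows. This is an illustrative helper, not Cyborg's actual client code; the header names are the standard Nova microversion headers:

```python
# Illustrative sketch: building headers for a Nova GET /servers/{id} call
# pinned to microversion 2.82, so the response body shape (including
# tenant_id) stays consistent across Nova deployments.
NOVA_MICROVERSION = "2.82"

def nova_request_headers(token):
    """Return headers for a microversion-pinned Nova API request."""
    return {
        "X-Auth-Token": token,
        # Legacy header spelling, still accepted by Nova:
        "X-OpenStack-Nova-API-Version": NOVA_MICROVERSION,
        # Current cross-project spelling:
        "OpenStack-API-Version": f"compute {NOVA_MICROVERSION}",
        "Accept": "application/json",
    }
```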
Nova must be configured with [service_user] send_service_user_token = true for Cyborg to accept bound-ARQ operations (bind, unbind, delete). This is the same requirement as for Cinder volume attachments since OSSA-2023-003.

Cyborg now defaults [keystone_authtoken] service_token_roles_required to true so that keystonemiddleware validates the service token roles. Operators who have not already set this should ensure the service user has the service role in Keystone.
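A minimal configuration sketch for both sides (the [service_user] auth settings are placeholders for your deployment's actual credentials):

```ini
# nova.conf: send the service token on calls into Cyborg
[service_user]
send_service_user_token = true
auth_type = password
auth_url = https://keystone.example.com/v3
username = nova
user_domain_name = Default
project_name = service
project_domain_name = Default
password = <nova service password>

# cyborg.conf: now the default, shown here explicitly
[keystone_authtoken]
service_token_roles_required = true
```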
Cyborg API policies now declare scope_types=['project'] and reject Keystone system-scoped tokens via oslo.policy scope enforcement. Keep [oslo_policy] enforce_scope = True. Disabling it weakens project isolation and is discouraged; prefer custom policy rules if you need different access behavior.
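In cyborg.conf this means leaving the default in place, shown explicitly as a sketch:

```ini
# cyborg.conf: keep oslo.policy scope enforcement enabled (the default)
[oslo_policy]
enforce_scope = true
```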
Security Issues
This issue is assigned CVE-2026-40214.

Fixed a cross-tenant access control vulnerability in accelerator request (ARQ) management. The project_id field was never populated on ARQ records, which meant non-admin users could list, view, and delete ARQs belonging to other projects. This could lead to information disclosure (leaking instance UUIDs across tenants) and denial of service (deleting another tenant’s ARQ prevents their instance from restarting).

ARQs are now scoped to the requesting project. Non-admin users can only see and manage their own project’s ARQs.
Additionally, binding, unbinding, and deleting bound ARQs now require a service token. Only Nova, identified by a valid service token with the service role, may set or clear the instance_uuid on an ARQ or delete a bound ARQ. This prevents users from directly manipulating ARQs that Nova is managing, following the same pattern as the Cinder OSSA-2023-003 fix.
This issue is assigned CVE-2026-40213.

Replaced permissive rule:allow defaults with rule:admin_api on device, deployable, and attribute API policies so authenticated low-privilege users cannot read or change hardware topology and management data without the admin role. System-scoped tokens are not supported by Cyborg. Deployments that relied on the old defaults must grant admin or define custom policy rules for these APIs.
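If a deployment needs broader read access, a policy override is preferable to reverting the defaults. A hypothetical policy.yaml sketch; the policy names shown are assumptions, so check your deployed defaults with oslopolicy-sample-generator before copying:

```yaml
# policy.yaml: grant a dedicated role read-only access to device data
# instead of handing out admin. Policy names here are illustrative.
"cyborg:device:get_all": "rule:admin_api or role:hw_viewer"
"cyborg:device:get_one": "rule:admin_api or role:hw_viewer"
```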
16.0.0
Prelude
The 2026.1 release focuses on reducing technical debt and improving operational reliability. The most significant change is the removal of the eventlet dependency, completing Cyborg’s migration to native threading for oslo.service and oslo.messaging. Database session management has been modernized to use the current oslo.db enginefacade API, replacing the legacy facade that was removed upstream. Additionally, the cyborg-agent now validates its Placement resource provider at startup, fixing a long-standing hostname mismatch issue between Cyborg and Nova. Several accelerator driver bugs have also been corrected.
The documentation has also been enhanced: a new driver configuration guide covers all supported accelerator drivers, with configuration examples and troubleshooting tips, and a release liaison guide has been added for contributors.
Upgrade Notes
The cyborg-agent now requires Placement to be reachable at startup. Previously the agent would start and only fail during the update_usage periodic task. This aligns with nova-compute behavior, where the compute service validates its resource provider at startup. Ensure Placement is available before starting cyborg-agent.

The agent retries the resource provider lookup up to 3 times by default (configurable via [agent] resource_provider_startup_retries), using exponential backoff. The default of 3 retries gives approximately 7 seconds of tolerance (1s + 2s + 4s) for nova-compute to create the resource provider.
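The retry behavior described above can be sketched like this. Function names are assumptions for illustration, not the agent's actual code:

```python
import time

def backoff_delays(retries, base=1):
    """Exponential backoff schedule: 1s, 2s, 4s, ... for `retries` retries."""
    return [base * (2 ** i) for i in range(retries)]

def lookup_with_retries(lookup, retries=3, sleep=time.sleep):
    """Try `lookup()` once, then retry after each backoff delay.

    With the default of 3 retries this tolerates 1 + 2 + 4 = 7 seconds
    for nova-compute to create the resource provider in Placement.
    """
    provider = lookup()
    if provider is not None:
        return provider
    for delay in backoff_delays(retries):
        sleep(delay)
        provider = lookup()
        if provider is not None:
            return provider
    raise RuntimeError("resource provider not found in Placement")
```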
Cyborg no longer uses eventlet. The eventlet dependency has been removed and Cyborg now uses the native threading backend for oslo.service and the threading RPC server. No configuration changes are required. Packagers should ensure that oslo.service[threading] is installed as a dependency.
The cyborg-wsgi-api wsgi_scripts entry point has been removed. This setuptools-specific mechanism for installing a WSGI script was deprecated in the 2025.1 cycle. Deployments using mod_wsgi or uWSGI should reference the WSGI application directly via cyborg.wsgi.api:application, as described in the WSGI deployment documentation.
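For uWSGI deployments, a minimal sketch (socket address and process count are illustrative; adapt them to your deployment):

```ini
[uwsgi]
module = cyborg.wsgi.api:application
master = true
processes = 2
http-socket = 127.0.0.1:6666
```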
Bug Fixes
NVIDIA GPU cards such as the A100 expose SR-IOV Virtual Functions (VFs) when virtualized. The NVIDIA GPU driver was incorrectly reporting these VFs as standalone GPU accelerators. A new [gpu_devices]/filter_sriov_vfs configuration option (default False) allows operators to filter out VF devices so that only Physical Functions and mediated devices are reported. The option defaults to False because enabling it may remove VF-backed resources from Placement without first cleaning up existing allocations. Operators should ensure no instances hold VF allocations before enabling this option.
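The effect of the option can be sketched like this (hypothetical device dicts and is_vf flag; the real driver inspects sysfs to detect VFs):

```python
def filter_gpu_devices(devices, filter_sriov_vfs=False):
    """Drop SR-IOV VFs when filter_sriov_vfs is enabled, so only
    Physical Functions and mediated devices are reported."""
    if not filter_sriov_vfs:
        return list(devices)
    return [d for d in devices if not d.get("is_vf", False)]
```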
Fixed a crash in the PCI driver where discovery of devices with vendor IDs not present in the vendor map would raise an AttributeError. The PCI driver now gracefully handles unknown vendors by skipping the vendor-specific trait and falling back to using the raw vendor ID as the driver name.
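The fallback behaves roughly like this (hypothetical vendor map excerpt for illustration; the real map lives in the driver):

```python
# Illustrative PCI vendor-ID-to-name map; real contents differ.
VENDOR_MAP = {"10de": "nvidia", "8086": "intel"}

def vendor_driver_name(vendor_id):
    """Map a PCI vendor ID to a driver name, falling back to the raw
    vendor ID for unknown vendors instead of raising."""
    return VENDOR_MAP.get(vendor_id, vendor_id)
```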
A new configuration option, [agent] resource_provider_name, allows operators to explicitly specify the name of the compute resource provider in Placement. It defaults to socket.getfqdn(), which typically matches libvirt’s hypervisor hostname behavior. If the resource provider lookup fails with this name, Cyborg automatically falls back to CONF.host, providing backward compatibility for existing deployments.

The cyborg-agent now validates the resource provider name at startup by querying Placement directly. It tries the configured name (defaulting to socket.getfqdn()) first, then falls back to CONF.host. If neither hostname finds a valid resource provider, the agent fails to start with a clear error message.

A new [agent] resource_provider_startup_retries option (default: 3) controls how many times the agent retries the Placement lookup at startup, using exponential backoff (1s, 2s, 4s, …). This tolerates a startup race where nova-compute has not yet created the resource provider in Placement.

This fixes Bug 2139369 and addresses the disconnect between how Cyborg identifies the compute resource provider name and how Nova/Placement name it.
Previously, Cyborg used CONF.host (which defaults to socket.gethostname()) to look up the compute resource provider in Placement. However, Nova uses libvirt’s hostname determination, which typically returns the FQDN. This mismatch caused PlacementResourceProviderNotFound errors on systems where the short hostname differs from the FQDN.
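The lookup order described above can be sketched as follows (assumed helper names, not the agent's actual code; `lookup` stands in for a Placement GET /resource_providers?name=… query):

```python
import socket

def candidate_provider_names(configured_name=None, conf_host=None):
    """Resource provider names to try, in order: the configured name
    (defaulting to the FQDN), then CONF.host as a backward-compatible
    fallback."""
    names = [configured_name or socket.getfqdn()]
    fallback = conf_host or socket.gethostname()
    if fallback not in names:
        names.append(fallback)
    return names

def resolve_provider_name(lookup, configured_name=None, conf_host=None):
    """Return the first candidate name Placement recognizes, or None
    (in which case the agent refuses to start)."""
    for name in candidate_provider_names(configured_name, conf_host):
        if lookup(name):
            return name
    return None
```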