Current Series Release Notes

22.0.0.0rc1

Prelude

The 22.0.0 release includes many new features and bug fixes. Please be sure to read the upgrade section which describes the required actions to upgrade your cloud from 21.0.0 (Ussuri) to 22.0.0 (Victoria).

There are a few major changes worth mentioning. This is not an exhaustive list:

  • The latest Compute API microversion supported for Victoria is v2.87. No new microversions were added during this cycle but you can find all of them in the REST API Version History page.

  • Support for a new mixed flavor CPU allocation policy that allows both pinned and floating CPUs within the same instance.

  • Custom Placement resource inventories and traits can now be described using a single providers configuration file.

  • Glance multistore configuration with multiple RBD backends is now supported within Nova for libvirt RBD-backed images, using the [libvirt]/images_rbd_glance_store_name configuration option.

  • An emulated Virtual Trusted Platform Module can be exposed to instances running on a libvirt hypervisor with qemu or kvm backends.

  • Support for the xen, uml, lxc and parallels libvirt backends has been deprecated.

  • XenAPI virt driver has been removed, including the related configuration options.

  • The VMware virt driver is supported again in Victoria, after being deprecated during the Ussuri release, as the testing issues have been addressed.

New Features

  • It is now possible to allocate all cores in an instance to realtime and omit the hw:cpu_realtime_mask extra spec. This requires specifying the hw:emulator_threads_policy extra spec.

  • It is now possible to specify a mask in hw:cpu_realtime_mask without a leading ^. When the ^ is omitted, the value specifies the cores to include in the set of realtime cores, as opposed to those to exclude.
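
    As an illustrative sketch (the flavor name and mask values here are hypothetical, not defaults), the two mask forms might look like:

    ```console
    # Exclude cores 0-1 from the realtime set (legacy form, leading ^)
    $ openstack flavor set rt-flavor --property hw:cpu_realtime=yes \
          --property hw:cpu_realtime_mask=^0-1

    # Include only cores 2-3 in the realtime set (new form, no leading ^)
    $ openstack flavor set rt-flavor --property hw:cpu_realtime=yes \
          --property hw:cpu_realtime_mask=2-3
    ```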

  • Nova now supports adding an emulated virtual Trusted Platform Module to libvirt guests with a virt_type of kvm or qemu. Not all server operations are fully supported yet. See the documentation for details.

  • A new [glance]/enable_rbd_download config option has been introduced. It allows direct downloads of Ceph-hosted Glance images into the libvirt image cache via rbd when [glance]/enable_rbd_download = True and [glance]/rbd_user, [glance]/rbd_pool, [glance]/rbd_connect_timeout and [glance]/rbd_ceph_conf are correctly configured.
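
    A minimal configuration sketch (the user, pool and path values below are deployment-specific examples, not defaults):

    ```ini
    [glance]
    enable_rbd_download = True
    rbd_user = glance
    rbd_pool = images
    rbd_connect_timeout = 5
    rbd_ceph_conf = /etc/ceph/ceph.conf
    ```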

  • Added a --force option to the nova-manage placement heal_allocations command to forcefully heal allocations for a specific instance.
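
    The command form might look like the following sketch (the instance UUID is a placeholder, and this assumes the existing --instance option is used to target a specific instance):

    ```console
    $ nova-manage placement heal_allocations --instance <instance_uuid> --force
    ```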

  • The libvirt RBD image backend module can now handle a Glance multistore environment where multiple RBD clusters are in use across a single Nova/Glance deployment, configured as independent Glance stores. In the case where an instance is booted with an image that does not exist in the RBD cluster that Nova is configured to use, Nova can ask Glance to copy the image from whatever store it is currently in to the one that represents its RBD cluster. To enable this feature, set [libvirt]/images_rbd_glance_store_name to tell Nova the Glance store name of the RBD cluster it uses.
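
    A minimal sketch of the relevant option (the store name shown is hypothetical and deployment-specific):

    ```ini
    [libvirt]
    # Name of the Glance store that represents this node's RBD cluster
    images_rbd_glance_store_name = rbd-store-1
    ```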

  • A new configuration option, [DEFAULT]/max_concurrent_snapshots, has been added. This allows operators to configure the maximum number of concurrent snapshots on a compute host and prevents resource overuse related to snapshotting.
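
    A minimal configuration sketch (5 is the default noted in the upgrade notes below):

    ```ini
    [DEFAULT]
    # Maximum number of snapshots to run concurrently on this compute host
    max_concurrent_snapshots = 5
    ```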

  • Nova now supports defining of additional resource provider traits and inventories by way of YAML configuration files. The location of these files is defined by the new config option [compute]provider_config_location. Nova will look in this directory for *.yaml files. See the specification and admin guide for more details.
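
    As a hedged illustration of the file format (the resource class, trait and values below are placeholders; consult the specification and admin guide for the authoritative schema):

    ```yaml
    meta:
      schema_version: '1.0'
    providers:
      - identification:
          # $COMPUTE_NODE identifies the compute node's own provider
          uuid: $COMPUTE_NODE
        inventories:
          additional:
            - CUSTOM_EXAMPLE_RESOURCE:
                total: 100
                reserved: 0
        traits:
          additional:
            - CUSTOM_EXAMPLE_TRAIT
    ```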

  • Add the ability to use the vmxnet3 NIC on a host using the QEMU/KVM driver. This allows the migration of an ESXi VM to QEMU/KVM without any driver changes. vmxnet3 offers better performance and lower latency compared to an emulated driver like e1000.

  • Added the parameters [libvirt]/rbd_destroy_volume_retries, defaulting to 12, and [libvirt]/rbd_destroy_volume_retry_interval, defaulting to 5, which Nova uses in a retry loop when trying to remove a volume from Ceph. With the defaults, the maximum elapsed time is 60 seconds.
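
    The defaults described above correspond to this configuration sketch:

    ```ini
    [libvirt]
    # 12 retries x 5 second interval = 60 seconds maximum elapsed time
    rbd_destroy_volume_retries = 12
    rbd_destroy_volume_retry_interval = 5
    ```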

  • Nova now supports attaching and detaching PCI device backed Neutron ports to running servers.

  • Add the mixed instance CPU allocation policy, which allows an instance to mix both PCPU and VCPU resources. This is useful for applications that wish to schedule CPU-intensive workloads on PCPUs and other workloads on VCPUs. The mixed policy avoids having to make all instance CPUs pinned CPUs and, as a result, reduces the consumption of pinned CPUs and increases instance density.

    Real-time instances are also extended with the mixed CPU allocation policy. Compared with a real-time instance using the dedicated policy, the non-real-time CPUs are no longer required to be pinned to dedicated host CPUs; instead they float over a range of host CPUs shared with other instances.

  • Add the extra spec hw:cpu_dedicated_mask to set the pinned CPUs for the mixed instance. This is a core mask and can be used to include or exclude CPUs. Any core not included or explicitly excluded is treated as a shared CPU.
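
    A hedged sketch of a mixed-policy flavor (the flavor name and mask are hypothetical): assuming a 4-vCPU flavor with the mask below, CPUs 0-1 would be pinned (PCPU) and CPUs 2-3 shared (VCPU):

    ```console
    $ openstack flavor set mixed-flavor \
          --property hw:cpu_policy=mixed \
          --property hw:cpu_dedicated_mask=0-1
    ```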

  • Export instance pinned CPU list through the dedicated_cpus section in the metadata service API.

Known Issues

  • bug 1894804 documents a known device detachment issue with QEMU 4.2.0 as shipped by the Focal 20.04 Ubuntu release. This can lead to the failure to detach devices from the underlying libvirt domain of an instance as QEMU never emits the correct DEVICE_DELETED event to libvirt. This in turn leaves the device attached within libvirt and OpenStack Nova while it has been detached from the underlying QEMU process. Subsequent attempts to detach the device will also fail as it is no longer found within the QEMU process.

    There is no known workaround within OpenStack Nova to this issue.

Upgrade Notes

  • All APIs except deprecated APIs were modified to implement scope_type and use new defaults in 21.0.0 (Ussuri). The remaining APIs have now been updated.

    Refer to the Nova Policy Concepts for details and migration plan.

  • The default value of the [oslo_policy] policy_file config option has been changed from policy.json to policy.yaml. The new policy defaults introduced in 21.0.0 do not work with the previous default (policy.json) when that file is generated by the oslopolicy-sample-generator tool; refer to bug 1875418 for more details. The oslopolicy-convert-json-to-yaml tool can convert a JSON policy file to YAML in a backward-compatible way.

  • When using file-backed memory, the nova-compute service will now fail to start if the amount of reserved memory configured using [DEFAULT] reserved_host_memory_mb is equal to or greater than the total amount of memory configured using [libvirt] file_backed_memory. Where reserved memory is less than the total amount of memory configured, a warning will be raised. This warning will become an error in a future release.
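
    An illustrative valid combination (the values are placeholders; the constraint is simply that reserved memory be strictly less than file-backed memory):

    ```ini
    [DEFAULT]
    # Must be strictly less than [libvirt] file_backed_memory
    reserved_host_memory_mb = 4096

    [libvirt]
    # Total file-backed memory, in MiB
    file_backed_memory = 1048576
    ```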

    The former combination is invalid as it would suggest reserved memory is greater than total memory available, while the latter is considered incorrect behavior as reserving of file-backed memory can and should be achieved by reducing the filespace allocated as memory by modifying [libvirt] file_backed_memory.

  • The default for [glance] num_retries has changed from 0 to 3. The option controls how many times to retry a Glance API call in response to an HTTP connection failure. When deploying Glance behind HAProxy it is possible for a response to arrive just after the HAProxy idle timeout. As a result, an exception is raised when the connection is closed, resulting in a failed request. By increasing the default value, Nova is more resilient to this scenario, where HAProxy is misconfigured, by retrying the request.

  • Previously, the number of concurrent snapshots was unlimited; it is now limited via [DEFAULT]/max_concurrent_snapshots, which currently defaults to 5.

  • Support for hooks has been removed. In previous versions of nova, these provided a mechanism to extend nova with custom code through a plugin mechanism. However, they were deprecated in 13.0.0 (Mitaka) as unmaintainable long-term. Versioned notifications and vendordata should be used instead. For more information, refer to this thread.

  • The nova.image.download entry point hook has been removed, per the deprecation announcement in the 17.0.0 (Queens) release.

  • Intel CMT perf events - cmt, mbmt, and mbml - are no longer supported by the [libvirt] enabled_perf_events config option. These event types were broken by design and are not supported in recent Linux kernels (4.14+).

  • The [vnc] keymap and [spice] keymap configuration options, first deprecated in 18.0.0 (Rocky), have now been removed. The VNC option affected the libvirt and VMware virt drivers, while the SPICE option only affected libvirt. For the libvirt driver, configuring these options resulted in lossy keymap conversions for the given graphics method. Users can replace this host-level configuration with guest-level configuration. This requires noVNC 1.0.0 or greater, which provides support for QEMU’s Extended Key Event messages. Refer to bug #1682020 and the QEMU RFB pull request for more information.

    For the VMware driver, only the VNC option applied. However, the [vmware] vnc_keymap option was introduced in 18.0.0 (Rocky) and can be used to replace [vnc] keymap.

  • The following deprecated scheduler filters have been removed.

    RetryFilter

    Deprecated in Train (20.0.0). The RetryFilter has not been required since Queens, following the completion of the return-alternate-hosts blueprint.

    AggregateCoreFilter, AggregateRamFilter, AggregateDiskFilter

    Deprecated in Train (20.0.0). These filters have not worked correctly since the introduction of placement in Ocata.

    On upgrade, operators should ensure they have not configured any of the now-removed filters and should instead use placement to control CPU, RAM and disk allocation ratios.

    Refer to the config reference documentation for more information.

  • The XenAPI driver, which was deprecated in the 20.0.0 (Train) release, has now been removed.

  • The following config options only applied to the XenAPI virt driver, which has now been removed. These config options have therefore also been removed.

    • [xenserver] agent_timeout

    • [xenserver] agent_version_timeout

    • [xenserver] agent_resetnetwork_timeout

    • [xenserver] agent_path

    • [xenserver] disable_agent

    • [xenserver] use_agent_default

    • [xenserver] login_timeout

    • [xenserver] connection_concurrent

    • [xenserver] cache_images

    • [xenserver] image_compression_level

    • [xenserver] default_os_type

    • [xenserver] block_device_creation_timeout

    • [xenserver] max_kernel_ramdisk_size

    • [xenserver] sr_matching_filter

    • [xenserver] sparse_copy

    • [xenserver] num_vbd_unplug_retries

    • [xenserver] ipxe_network_name

    • [xenserver] ipxe_boot_menu_url

    • [xenserver] ipxe_mkisofs_cmd

    • [xenserver] connection_url

    • [xenserver] connection_username

    • [xenserver] connection_password

    • [xenserver] vhd_coalesce_poll_interval

    • [xenserver] check_host

    • [xenserver] vhd_coalesce_max_attempts

    • [xenserver] sr_base_path

    • [xenserver] target_host

    • [xenserver] target_port

    • [xenserver] independent_compute

    • [xenserver] running_timeout

    • [xenserver] image_upload_handler

    • [xenserver] image_handler

    • [xenserver] introduce_vdi_retry_wait

    • [xenserver] ovs_integration_bridge

    • [xenserver] use_join_force

    • [xenserver] console_public_hostname

  • The minimum required version of libvirt used by the nova-compute service is now 5.0.0. The minimum required version of QEMU used by the nova-compute service is now 4.0.0. Failing to meet these minimum versions when using the libvirt compute driver will result in the nova-compute service not starting.

Deprecation Notes

  • Support for the xen, uml, lxc and parallels libvirt backends, configured via the [libvirt] virt_type config option, has been deprecated. None of these drivers have upstream testing and the xen and uml backends specifically have never been considered production ready. With this change, only the kvm and qemu backends are considered supported when using the libvirt virt driver.

  • The vmwareapi driver was deprecated in Ussuri due to missing third-party CI coverage and a clear maintainer. These issues have been addressed during the Victoria cycle and the driver is now undeprecated.

Bug Fixes

  • Since Libvirt v.1.12.0 and the introduction of the libvirt issue, setting a cache mode whose write semantics are not O_DIRECT (i.e. “unsafe”, “writeback” or “writethrough”) causes problems with volume drivers (e.g. LibvirtISCSIVolumeDriver, LibvirtNFSVolumeDriver and so on) that explicitly designate native io.

    When driver_cache (which defaults to none) has been configured as neither “none” nor “directsync”, the libvirt driver will force driver_io to “threads” to avoid an instance spawning failure.

  • Add support for the hw:hide_hypervisor_id extra spec. This is an alias for the hide_hypervisor_id extra spec, which was not compatible with the AggregateInstanceExtraSpecsFilter scheduler filter. See bug 1841932 for more details.

  • This release contains a fix for bug 1874032, which delegates snapshot upload to a dedicated thread. This ensures nova-compute service stability in busy environments during snapshot operations, when concurrent snapshots or other tasks slow down storage performance.

  • Bug 1875418 is fixed by changing the default value of [oslo_policy] policy_file config option to YAML format.

  • A new [workarounds]/reserve_disk_resource_for_image_cache config option was added to fix the bug 1878024 where the images in the compute image cache overallocate the local disk. If this new config is set then the libvirt driver will reserve DISK_GB resources in placement based on the actual disk usage of the image cache.
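
    Enabling the workaround is a one-line configuration sketch:

    ```ini
    [workarounds]
    # Reserve DISK_GB in placement for actual image cache disk usage
    reserve_disk_resource_for_image_cache = True
    ```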

  • Previously, attempting to configure an instance with the e1000e or legacy VirtualE1000e VIF types on a host using the QEMU/KVM driver would result in an incorrect UnsupportedHardware exception. These interfaces are now correctly marked as supported.

  • Previously, it was possible to specify values for the hw:cpu_realtime_mask extra spec that were not within the range of valid instance cores. This value is now correctly validated.

  • Bug #1888022: An issue that prevented detach of multi-attached fs-based volumes is resolved.

  • An issue that could result in instances with the isolate thread policy (hw:cpu_thread_policy=isolate) being scheduled to hosts with SMT (HyperThreading) and consuming VCPU instead of PCPU has been resolved. See bug #1889633 for more information.

  • Resolve a race condition that could occur during concurrent interface detach/attach, resulting in an interface being accidentally unbound after it was attached. See bug 1892870 for more details.

  • Addressed an issue that prevented instances using the multiqueue feature from being created successfully when their vif_type is TAP.

  • Resolved an issue whereby providing an empty list for the policies field in the request body of the POST /os-server-groups API would result in a server error. This only affects the 2.1 to 2.63 microversions, as the 2.64 microversion replaces the policies list field with a policy string field. See bug #1894966 for more information.

  • Since the 16.0.0 (Pike) release, nova has collected NIC feature flags via libvirt. To look up the NIC feature flags for a whitelisted PCI device the nova libvirt driver computed the libvirt nodedev name by rendering a format string using the netdev name associated with the interface and its current MAC address. In some environments the libvirt nodedev list can become out of sync with the current MAC address assigned to a netdev and as a result the nodedev look up can fail. Nova now uses PCI addresses, rather than MAC addresses, to look up these PCI network devices.

  • Previously, Nova tried to remove a volume from Ceph in a retry loop of 10 attempts at 1 second intervals, totaling 10 seconds overall, which, due to the 30 second Ceph watcher timeout, could result in intermittent object removal failures on the Ceph side (bug 1856845). The new defaults of 12 for [libvirt]/rbd_destroy_volume_retries and 5 for [libvirt]/rbd_destroy_volume_retry_interval give Ceph a reasonable amount of time to complete the operation successfully.

  • In the Rocky (18.0.0) release support was added to nova to use neutron’s multiple port binding feature when the binding-extended API extension is available. In the Train (20.0.0) release the SR-IOV live migration feature broke the semantics of the vifs field in the migration_data object that signals if the new multiple port binding workflow should be used by always populating it even when the binding-extended API extension is not present. This broke live migration for any deployment that did not support the optional binding-extended API extension. The Rocky behavior has now been restored enabling live migration using the single port binding workflow when multiple port bindings are not available.