2024.1 Series Release Notes

29.2.1-5

Bug Fixes

During the Caracal cycle the libvirt driver was enhanced to use device aliases when detaching devices from a domain (change I1dfe4ad3df81bc810835af9b09cfc6c06e9a5388). This introduced a regression for instances with vGPUs. A prior bugfix, https://bugs.launchpad.net/nova/+bug/1942345, addressed the symptom without correcting the underlying problem, and a related bug for mdev devices was later reported as https://bugs.launchpad.net/nova/+bug/2074219. When the feature was added, Nova introduced a helper method to look up a device by its alias because the libvirt API does not provide one natively. That helper assumed every device would have an alias attribute. The assumption was not valid and has now been corrected. As a result, detaching a volume from an instance with vGPUs should now be possible, and this class of bug should no longer occur.
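The alias lookup described in this note can be pictured with a minimal sketch. It is not Nova's code: the function name find_device_by_alias and the sample XML are hypothetical, and the sketch only assumes that libvirt domain XML places an optional alias element under each device.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML: the disk carries an alias, the mdev hostdev does not.
DOMAIN_XML = """
<domain>
  <devices>
    <disk type="block" device="disk">
      <alias name="ua-vol1"/>
    </disk>
    <hostdev mode="subsystem" type="mdev" model="vfio-pci"/>
  </devices>
</domain>
"""


def find_device_by_alias(domain_xml, alias):
    """Return the first device element whose alias matches, or None."""
    devices = ET.fromstring(domain_xml).find("devices")
    if devices is None:
        return None
    for device in devices:
        alias_elem = device.find("alias")
        # Devices without an <alias> element are skipped rather than
        # assumed to carry one.
        if alias_elem is not None and alias_elem.get("name") == alias:
            return device
    return None
```

With the sample XML, looking up "ua-vol1" returns the disk element, while the alias-less hostdev is simply passed over.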

When live migration fails during pre_live_migration on the destination, during rollback Cinder volumes will now be disconnected from the destination locally instead of remotely over RPC from the source. This should ensure that only connection_info for the destination will be used to disconnect volumes from the destination. See bug #1899835 for more details.

29.2.1

Upgrade Notes

In the Victoria release, the instance_numa_topology object was extended to enable mixed CPUs (pinned and unpinned CPUs) in the same instance. This change added a new field, pcpuset, to the instance_numa_topology object. While the change included object conversion code to handle the upgrade, it did not account for instances that have a numa_topology but are not pinned, i.e. instances using a flavor with hw:mem_page_size or hw:numa_nodes set but without hw:cpu_policy set to dedicated. As a result, instances created between the Liberty and Victoria releases with such a flavor could not be started after an upgrade to Victoria. This has now been fixed. Instances created after Victoria are not affected by this issue. See https://bugs.launchpad.net/nova/+bug/2080556 for more details.
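To check which flavors match the description in this note, a small helper along the following lines may be useful. It is only a sketch: the function name is hypothetical, and it takes the flavor extra specs as a plain dictionary rather than using any Nova object.

```python
def has_unpinned_numa_topology(extra_specs):
    """Return True if the extra specs imply a NUMA topology without pinning.

    Mirrors the condition described in the note: hw:mem_page_size or
    hw:numa_nodes is set, but hw:cpu_policy is not 'dedicated'.
    """
    implies_numa = (
        "hw:numa_nodes" in extra_specs or "hw:mem_page_size" in extra_specs
    )
    pinned = extra_specs.get("hw:cpu_policy") == "dedicated"
    return implies_numa and not pinned
```

For example, a flavor with only hw:mem_page_size=large matches, while one that also sets hw:cpu_policy=dedicated does not.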

Bug Fixes

Fixes an issue seen when using bare metal (Ironic) instances where an instance could fail to delete. See Bug 2019977 for more details.

Fixes a regression for live migration on shared storage that removed the backing disk and instance folder during the cleanup of a virtual machine after live migration. See bug 2080436 for details.

Nova now allows a hyphen in the service-type field of [cinder]catalog_info, so in particular the official block-storage type is now valid. See bug 2092194.
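As an illustration of what accepting a hyphen means, here is a minimal sketch of a validator for the three-part catalog_info value. The regular expression and the function name are assumptions made for this example, not Nova's actual validation code.

```python
import re

# Assumed shape: <service_type>:<service_name>:<endpoint_type>, with hyphens
# permitted in the service type. This pattern is illustrative only.
CATALOG_INFO_RE = re.compile(r"^[\w-]+:\w*:.+$")


def parse_catalog_info(value):
    """Split a catalog_info string into its three fields, or raise ValueError."""
    if not CATALOG_INFO_RE.match(value):
        raise ValueError(f"invalid catalog_info: {value!r}")
    service_type, service_name, endpoint_type = value.split(":", 2)
    return service_type, service_name, endpoint_type
```

Under this sketch, both volumev3::publicURL and block-storage::publicURL parse successfully.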

Fixes the display of reason messages from the Ironic validate node operation, which is called just before the instance is deployed on the bare metal node. The message from Ironic is now correctly logged. Fixes bug 2100009 (https://bugs.launchpad.net/nova/+bug/2100009).

Before bug 2078999 was fixed, the nova-manage image_property set command would update the image properties embedded in the instance but not the ones in the request spec. As a result, image properties updated by the command were unexpectedly rolled back after an instance migration.

Introduced a new compute configuration option sharing_providers_max_uuids_per_request and applied a fix to handle the "Request-Too-Long" error that can occur when querying the placement API with a large number of aggregate UUIDs.
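The note does not spell out how the limit is applied. One plausible reading is that the aggregate UUIDs are split into batches no larger than the configured limit, so that no single query grows too long. The sketch below shows that batching idea only; the function name is hypothetical and no placement API call is made.

```python
def chunk_uuids(uuids, max_per_request):
    """Split a list of UUIDs into batches of at most max_per_request items."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    return [
        uuids[start:start + max_per_request]
        for start in range(0, len(uuids), max_per_request)
    ]
```

Each batch could then be sent as a separate query and the results merged by the caller.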

Bug #2091033: Fixed calls to libvirt listDevices() and listAllDevices() from potentially blocking all other greenthreads in nova-compute. Under certain circumstances, it was possible for the nova-compute service to freeze with all other greenthreads blocked and unable to perform any other activities including logging. This issue has been fixed by wrapping the libvirt listDevices() and listAllDevices() calls with eventlet.tpool.Proxy.

29.0.2

Bug Fixes

Nova now ensures that an instance cannot move between availability zones when the host of the instance is added to or removed from an aggregate that is part of another availability zone. Moving from or to the default availability zone is also rejected.

This resolves bug 1907775, where after such a move the instance became stuck between availability zones.

29.0.1

Prelude

The OpenStack 2024.1 (Nova 29.0.0) release includes many new features and bug fixes. Please be sure to read the upgrade section, which describes the actions required to upgrade your cloud from 28.0.0 (2023.2) to 29.0.0 (2024.1). As a reminder, OpenStack 2024.1 is a Skip-Level-Upgrade Release (from now on called a SLURP release), meaning that you can perform a rolling upgrade from 2023.1 and skip 2023.2.

There are a few major changes worth mentioning. This is not an exhaustive list:

- The latest Compute API microversion supported for 2024.1 is v2.96.
- The Ironic driver [ironic]/peer_list configuration option has been deprecated. The Ironic driver now more closely models other Nova drivers, where compute nodes do not move between compute service instances. If high availability of a single compute service is required, operators should use active/passive failover between two compute service agents configured to share the same compute service host value, [DEFAULT]/host. Ironic nova-compute services can now be configured to target a specific shard of Ironic nodes by setting the [ironic]/shard configuration option, and a new nova-manage db ironic_compute_node_move command can help operators specify which shard each compute service should manage when upgrading.
- Instances using vGPUs can now be live-migrated if both compute nodes support libvirt-8.6.0 and QEMU-8.1.0, as the source mediated device will automatically migrate the GPU memory to the target mediated device. To do this, the [libvirt]/live_migration_downtime config option needs to be modified according to the Nova virtual GPU documentation.
- As of the new 2.96 microversion, the server show and server list APIs return a new parameter called pinned_availability_zone that indicates whether the instance is confined to a specific AZ. This field supplements the existing availability_zone field, which reports the availability zone of the host where the server resides. The two values may differ if the server is shelved or is not pinned to an AZ, which can help operators plan maintenance and better understand workload constraints.
- Instances using virtio-net will see a performance increase of between 10% and 20% if their image uses the new hw_virtio_packed_ring=true property or their flavor contains the hw:virtio_packed_ring=true extra spec, provided the libvirt version is >= 6.3 and the QEMU version is >= 4.2.
- As a security mechanism, a new [consoleauth]/enforce_session_timeout configuration option provides the ability to automatically close a server console session when the token expires. This is disabled by default to preserve the existing behaviour for upgrades.
- The libvirt driver now supports requesting a configurable memory address space for instances. This allows instances with large RAM requirements to be created by specifying either the hw:maxphysaddr_mode=emulate and hw:maxphysaddr_bits flavor extra specs or the hw_maxphysaddr_mode and hw_maxphysaddr_bits image properties. The ImagePropertiesFilter and ComputeCapabilitiesFilter filters are required to support this functionality.
- The Hyper-V virt driver has been removed. It was deprecated in the Nova 27.2.0 (Antelope) release. This driver was untested and had no maintainers. In addition, it had a dependency on the OpenStack Winstacker project, which has also been retired.
- A few other improvements aim to reduce the number of bugs: one automatically detects the maximum number of instances with memory encryption that can run concurrently, another allows a specific IP address or hostname to be set for incoming move operations (via [libvirt]/migration_inbound_addr), and yet another improves the stability of block device management by using libvirt device aliases.

New Features

Added new flavor extra_specs and image properties to control the physical address bits of vCPUs in libvirt guests. These options are used to boot a guest with a large amount of RAM.

Instances using vGPUs can now be correctly live-migrated by the libvirt driver between compute nodes supporting the same mediated device types used by the instance. In order to be able to do this, the compute hosts need to support at least the minimum versions of libvirt-8.6.0, QEMU-8.1.0 and Linux kernel 5.18.0. If operators use multiple vGPU types per compute, they need to make sure they already use custom traits or custom resource classes for the GPUs resource providers and that the instance was created with a flavor using either a custom resource class or asking for a custom trait in order to make sure that Placement API will provide the right target GPU using the same mdev type for the instance.

The new config option [libvirt]migration_inbound_addr is now used to determine the address for incoming move operations (cold migrate, resize, evacuate). This option defaults to [DEFAULT]my_ip to keep the configuration backward compatible. It also accepts an explicit hostname or FQDN, or '%s', which is then resolved to the hostname of the compute host. Note that this option should only be changed from its default after every compute is upgraded.
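The three cases described in this note (default, explicit name, and '%s') can be summarised in a short sketch. The function name and its parameters are hypothetical, and this is not the driver's actual resolution code.

```python
import socket


def resolve_migration_inbound_addr(configured, my_ip, hostname=None):
    """Pick the address used for incoming move operations.

    configured: value of migration_inbound_addr, or None when unset
    my_ip: value of [DEFAULT]my_ip, used as the backward-compatible default
    hostname: compute host name; looked up locally when not given
    """
    if configured is None:
        return my_ip
    if configured == "%s":
        # '%s' stands for the hostname of the compute host.
        return hostname or socket.gethostname()
    # Anything else is taken as an explicit hostname, FQDN or IP address.
    return configured
```

For instance, an unset option yields my_ip, while '%s' on a host named compute1 yields compute1.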

This is a security-enhancing feature that automatically closes console sessions exceeding a defined timeout period. To enable this functionality, operators are required to set the 'enforce_session_timeout' boolean configuration option to True.

The enforcement is implemented via a timer mechanism, initiating when users access the console and concluding upon the expiration of the set console token.

This ensures the graceful closure of console sessions on the server side, aligning with security best practices.
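The timer mechanism described above can be sketched with the standard library. This is an illustrative model only: the class name is hypothetical, and the real proxy closes a network connection rather than setting a flag.

```python
import threading


class TimedConsoleSession:
    """Toy console session that closes itself when its token expires."""

    def __init__(self, seconds_until_token_expiry):
        self.closed = threading.Event()
        # The timer is created per session and fires once at token expiry.
        self._timer = threading.Timer(
            seconds_until_token_expiry, self.closed.set
        )
        self._timer.daemon = True

    def start(self):
        """Begin the countdown; called when the user accesses the console."""
        self._timer.start()

    def wait_closed(self, timeout=None):
        """Block until the session is closed; return False on timeout."""
        return self.closed.wait(timeout)
```

The countdown starts when the user opens the console and ends at the token's expiry, matching the lifecycle described in the note.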

Ironic nova-compute services can now target a specific shard of ironic nodes by setting the config [ironic]shard. This is particularly useful when using active-passive methods to choose on which physical host your ironic nova-compute process is running, while ensuring [DEFAULT]host stays the same for each shard. You can use this alongside [ironic]conductor_group to further limit which ironic nodes are managed by each nova-compute service. Note that when you use [ironic]shard the [ironic]peer_list is hard coded to a single nova-compute service.

There is a new nova-manage command db ironic_compute_node_move that can be used to move ironic nodes, and the associated instances, between nova-compute services. This is useful when migrating from the legacy hash ring based HA towards the new sharding approach.

The libvirt driver can now detect the maximum number of guests with encrypted memory that can run concurrently on its compute host, using the new fields in the libvirt API available since version 8.0.0.

The 2.96 microversion has been added. This microversion adds pinned_availability_zone in server show and server list --long responses.
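A client consuming the new field might treat it as in the sketch below. The sample dictionaries are hypothetical, and the sketch assumes the field is null (None in Python) when the server is not pinned to an AZ.

```python
def is_pinned_to_az(server):
    """Return True if the server body reports a pinned availability zone."""
    return server.get("pinned_availability_zone") is not None


# Hypothetical, heavily trimmed response bodies for illustration only.
pinned_server = {"availability_zone": "az1", "pinned_availability_zone": "az1"}
unpinned_server = {"availability_zone": "az2", "pinned_availability_zone": None}
```

Responses from microversions older than 2.96 lack the key entirely, which this helper also reports as not pinned.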

Handling packed virtqueue requests for an instance is now supported on nodes with QEMU v4.2 and libvirt v6.3 or later.

VMs using virtio-net will see an increase in performance. The increase can range from 10-20% (see DPDK Intel Vhost/virtio perf. reports) to 75% (using Napatech SmartNICs).

Packed Ring can be requested via image property or flavor extra spec:
- hw_virtio_packed_ring=true|false (default false)
- hw:virtio_packed_ring=true|false (default false)

Useful references:
- https://libvirt.org/formatdomain.html#virtio-related-options
- https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html
- https://specs.openstack.org/openstack/nova-specs/specs/2023.2/approved/virtio_packedring_configuration_support.html
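The request and the version requirement for packed ring can be combined in a small sketch. The helper names are hypothetical, versions are modelled as tuples, and treating the feature as requested when either the image property or the extra spec is true is an assumption of this example, not documented Nova behaviour.

```python
def _as_bool(value):
    """Interpret 'true'/'false' style strings the way the note writes them."""
    return str(value).strip().lower() == "true"


def packed_ring_requested(image_props, extra_specs):
    """Return True if either source asks for packed ring (assumed semantics)."""
    return _as_bool(image_props.get("hw_virtio_packed_ring", "false")) or _as_bool(
        extra_specs.get("hw:virtio_packed_ring", "false")
    )


def host_supports_packed_ring(libvirt_version, qemu_version):
    """Check the minimum versions named in the note: libvirt 6.3, QEMU 4.2."""
    return libvirt_version >= (6, 3, 0) and qemu_version >= (4, 2, 0)
```

Both properties default to false, so an instance with neither set is not treated as requesting packed ring.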

Upgrade Notes

The HyperV virt driver has been removed. It was deprecated in the Nova 27.2.0 (Antelope) release. This driver was untested and had no maintainers. In addition, it had a dependency on the OpenStack Winstacker project, which has also been retired.

The RDP console was only available for the HyperV driver, therefore the RDP console related APIs below will return HTTP 400 (BadRequest) error:
- GET RDP console:
  - Server Action Get RDP Console: POST /servers/{server_id}/action (os-getRDPConsole Action)
  - RDP protocol support from remote console API: POST /servers/{server_id}/remote-consoles
- GET RDP console connection information:
  - Show Console Connection Information: GET /os-console-auth-tokens/{console_token}
The following config options which only apply for the HyperV virt driver or RDP console APIs also have been removed:
- [hyperv] dynamic_memory_ratio
- [hyperv] enable_instance_metrics_collection
- [hyperv] instances_path_share
- [hyperv] limit_cpu_features
- [hyperv] mounted_disk_query_retry_count
- [hyperv] mounted_disk_query_retry_interval
- [hyperv] power_state_check_timeframe
- [hyperv] power_state_event_polling_interval
- [hyperv] qemu_img_cmd
- [hyperv] vswitch_name
- [hyperv] wait_soft_reboot_seconds
- [hyperv] config_drive_cdrom
- [hyperv] config_drive_inject_password
- [hyperv] volume_attach_retry_count
- [hyperv] volume_attach_retry_interval
- [hyperv] enable_remotefx
- [hyperv] use_multipath_io
- [hyperv] iscsi_initiator_list
- [rdp] enabled
- [rdp] html5_proxy_base_url
The following extra specs which only apply for the HyperV virt driver have been removed:
- os:resolution
- os:monitors
- os:vram

The deprecated [upgrade_levels] cert option has been removed.

The deprecated [api] use_forwarded_for option has been removed.
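Before upgrading, it may help to scan an existing nova.conf for the options removed in this release. The following is a minimal sketch using only the standard library. The function name is hypothetical, and the option table is transcribed from the lists above.

```python
import configparser

# Options removed in this release, grouped by config section.
REMOVED_OPTIONS = {
    "hyperv": {
        "dynamic_memory_ratio", "enable_instance_metrics_collection",
        "instances_path_share", "limit_cpu_features",
        "mounted_disk_query_retry_count", "mounted_disk_query_retry_interval",
        "power_state_check_timeframe", "power_state_event_polling_interval",
        "qemu_img_cmd", "vswitch_name", "wait_soft_reboot_seconds",
        "config_drive_cdrom", "config_drive_inject_password",
        "volume_attach_retry_count", "volume_attach_retry_interval",
        "enable_remotefx", "use_multipath_io", "iscsi_initiator_list",
    },
    "rdp": {"enabled", "html5_proxy_base_url"},
    "upgrade_levels": {"cert"},
    "api": {"use_forwarded_for"},
}


def find_removed_options(conf_text):
    """Return (section, option) pairs from conf_text that were removed."""
    # default_section is renamed so [DEFAULT] is treated as a normal section
    # and its keys do not leak into every other section.
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, default_section="__unused__"
    )
    parser.read_string(conf_text)
    found = []
    for section, options in REMOVED_OPTIONS.items():
        if parser.has_section(section):
            present = set(parser.options(section))
            found.extend((section, name) for name in sorted(options & present))
    return found
```

Calling find_removed_options(open("/etc/nova/nova.conf").read()) would list any leftovers; the path is only an example.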

Bug Fixes

Some OS platforms do not provide cpufreq resources in sysfs by default, so they have no CPU scaling governors. For this reason, the governor strategy is now optional for CPU power management.

Relaxed the config option checking of the cpu_power_management feature of the libvirt driver. The nova-compute service will now start with [libvirt]cpu_power_management=True and an empty [compute]cpu_dedicated_set configuration. Power management is still only applied to dedicated CPUs, so this configuration is allowed only so that cpu_power_management can be enabled independently of configuring cpu_dedicated_set during deployment.

With the change from ml2/ovs DHCP agents to the OVN implementation in neutron, there is no longer a port with device_owner network:dhcp. Instead, DHCP is provided by a network:distributed port. The fix relies on the enable_dhcp value provided by neutron-api if no port with the network:dhcp owner is found. See bug 2055245 for details.
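The fallback described in this note can be sketched as follows. The function name is hypothetical, and ports and subnets are modelled as plain dictionaries loosely following the Neutron API field names, not as the objects Nova actually handles.

```python
def network_has_dhcp(ports, subnets):
    """Decide whether DHCP is available on a network.

    First look for a classic DHCP agent port; if none exists (as with OVN),
    fall back to the enable_dhcp flag reported for the subnets.
    """
    if any(port.get("device_owner") == "network:dhcp" for port in ports):
        return True
    return any(subnet.get("enable_dhcp") for subnet in subnets)
```

With OVN, the port list would contain a network:distributed port instead, so the result depends on enable_dhcp.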

Ironic virt driver now uses the node cache and respects partition keys, such as conductor group, for list_instances and list_instance_uuids calls. This fix will improve performance of the periodic queries which use these driver methods and reduce API and DB load on the backing Ironic service.

Bug 2009280 has been fixed by no longer enabling the evmcs enlightenment in the libvirt driver. evmcs only works on Intel CPUs, and domains with that enlightenment cannot be started on AMD hosts. There is a possible future feature to enable support for generating this enlightenment only when running on Intel hosts.

Previously, switchdev capabilities had to be configured manually by a user with admin privileges using the port's binding profile. This blocked regular users from managing ports with Open vSwitch hardware offloading, because giving non-admin users write access to a port's binding profile introduces security risks. For example, a binding profile may contain a pci_slot definition, which denotes the host PCI address of the device attached to the VM. A malicious user could use this parameter to pass any host device through to a guest, so in many scenarios it is impossible to give regular users write access to a binding profile.

This patch fixes this situation by translating VF capabilities reported by Libvirt to Neutron port binding profiles. Other VF capabilities are translated as well for possible future use.
