Wallaby Series Release Notes

14.3.0-117

Prelude

Environment file collectd-write-qdr.yaml no longer specifies a default CollectdAmqpInstances hash.

New Features

  • Added new heat role specific parameter option ‘DdpPackage’ to select the required DDP Package.

  • Added new heat role specific param OVNAvailabilityZone to set availability-zones for ovn. This param replaces setting availability-zones through OVNCMSOptions.

  • Since genisoimage was removed from CentOS 9 / RHEL 9, nova’s default mkisofs_cmd option will no longer work. On RHEL/CentOS, mkisofs is an alias managed by alternatives that maps to either xorriso (9) or genisoimage (8).

  • To support Glance Distributed Image Import, worker_self_reference_url is now configured with the internal API URL of each node where glance-api runs with the glance-direct image import method enabled.

  • The new ApacheTimeout parameter has been added, which determines the timeout used for IO operations in Apache.

  • This change adds functionality to enable modular libvirt daemons. Each of these daemons runs in its own container. The default configuration now uses modular libvirt daemons instead of the monolithic libvirt daemon. The following libvirt daemons are added in this change:

    • virtnodedevd

    • virtproxyd

    • virtqemud

    • virtsecretd

    • virtstoraged

    It’s possible to define individual log filters for each of these daemons using the following new parameters:

    • LibvirtVirtlogdLogFilters

    • LibvirtVirtsecretdLogFilters

    • LibvirtVirtnodedevdLogFilters

    • LibvirtVirtstoragedLogFilters

    • LibvirtVirtqemudLogFilters

    • LibvirtVirtproxydLogFilters

    More information regarding modular libvirt daemons is available in Libvirt Daemons (https://libvirt.org/daemons.html).

  • Introduces a new boolean parameter, {{role.name}}NetworkConfigUpdate. When {{role.name}}NetworkConfigUpdate is True, existing network configurations will be updated. By default, this is False and only new deployments will have the networks configured. This parameter is role based only, with no global option.
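
    A minimal sketch of an environment file that enables the update for one role. It assumes a role named Compute, so {{role.name}}NetworkConfigUpdate expands to ComputeNetworkConfigUpdate; substitute the role names used in your own roles data.

    ```yaml
    parameter_defaults:
      # Assumed role name "Compute"; the parameter has no global equivalent.
      ComputeNetworkConfigUpdate: true
    ```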

  • New config options for Neutron logging service plugin configuration were added. There are options added for L3 Agent: NeutronL3AgentLoggingRateLimit, NeutronL3AgentLoggingBurstLimit, NeutronL3AgentLoggingLocalOutputLogBase, for OVS agent: NeutronOVSAgentLoggingRateLimit, NeutronOVSAgentLoggingBurstLimit, NeutronOVSAgentLoggingLocalOutputLogBase and for ML2/OVN backend: NeutronOVNLoggingRateLimit, NeutronOVNLoggingBurstLimit, NeutronOVNLoggingLocalOutputLogBase.

  • With conditional monitoring enabled in OVN, the southbound ovsdb-server takes a lot of time handling the monitoring and sending updates to all its connected clients, and it consumes a lot of CPU. With the monitor-all option, ovn-controllers do not enable conditional monitoring, thereby reducing the load on the southbound ovsdb-server.

  • A heat parameter IronicPowerStateChangeTimeout has been added which sets the number of seconds to wait for power operations to complete, i.e., so that a baremetal node is in the desired power state. If timed out, the power operation is considered a failure. The default is 60 seconds, which is the same as the current Ironic default.
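
    As a sketch, the timeout could be raised for hardware that is slow to change power state. The value of 120 seconds is an illustrative assumption, not a recommendation.

    ```yaml
    parameter_defaults:
      # Seconds to wait for a power operation to complete (default: 60).
      IronicPowerStateChangeTimeout: 120
    ```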

  • Added pure_iscsi_cidr, pure_host_personality and eradicate_on_delete support for the Pure Storage FlashArray Cinder driver.

Upgrade Notes

  • Changes the ironic PXE container TFTP service from in.tftpd to use the dnsmasq TFTP service. This is because the in.tftpd service is not anticipated to be carried by Linux distributions moving forward, and dnsmasq is actively maintained.

  • When upgrading an environment that uses collectd-write-qdr.yaml the CollectdAmqpInstances defaults previously specified need to be added to an administrator provided environment file and used during the overcloud deploy process.
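
    A sketch of what such an administrator-provided environment file might look like. The instance names and keys below are illustrative assumptions; copy the actual defaults from the previous version of collectd-write-qdr.yaml used by your deployment.

    ```yaml
    parameter_defaults:
      # Illustrative values only; verify against the defaults you are replacing.
      CollectdAmqpInstances:
        notify:
          notify: true
          format: JSON
          presettle: false
        telemetry:
          format: JSON
          presettle: false
    ```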

  • Mistral has been removed as it was deprecated in Wallaby and is no longer in use.

  • Zaqar has been removed as it was deprecated in Wallaby and is no longer in use on the undercloud. Additionally, it hasn’t been supported in the overcloud.

Deprecation Notes

  • This change deprecates the nova-libvirt-container-puppet.yaml heat-template, which configures the monolithic libvirt daemon. The newly added heat-template for modular libvirt daemons will be used to configure libvirt services in different containers.

  • This change removes NetworkDeploymentActions and {{role.name}}NetworkDeploymentActions, since we can no longer rely on the Heat stack action when using Ephemeral Heat in tripleo.

  • With the switch to ephemeral heat for the overcloud, the UndercloudMinion is no longer viable. Deploying UndercloudMinion is not supported anymore and the environment files to enable its deployment are dropped.

Bug Fixes

  • The collectd-write-qdr.yaml environment file no longer specifies a default CollectdAmqpInstances hash. When it was specified, it was not possible to override the parameter, resulting in a combined hash of the default values and the administrator’s custom values, which could lead to unexpected issues.

  • The default of the NovaSyncPowerStateInterval parameter has been changed from 0 to 600, to be consistent with the default value defined in nova.

Other Notes

  • Steps are taken to minimize chances of confusion between the default block storage volume type established by the CinderDefaultVolumeType parameter, and cinder’s own __DEFAULT__ volume type.

    In a new deployment where no volumes exist, cinder’s __DEFAULT__ type is deleted because it is redundant. In an upgrade scenario, if volumes exist then the __DEFAULT__ type’s description is updated to indicate the actual default volume type is the one established by the CinderDefaultVolumeType parameter.

14.3.0

New Features

  • The libvirt driver has added support for hardware-offloaded OVS with vDPA (vhost Data Path Acceleration) type interfaces. vDPA allows virtio net interfaces to be presented to the guest while the datapath can be offloaded to a software or hardware implementation. This enables high performance networking with the portability of standard virtio interfaces.

    Nova added support for vhost-vdpa devices in Wallaby.

  • Added OVN DBs clustering support. In this service model, a clustered database runs across multiple hosts in multi-active mode.

  • To help operators protect their workload, they can now enable the KernelArgsDeferReboot role parameter. This will prevent the tripleo-kernel ansible module from automatically rebooting nodes even if KernelArgs were changed unexpectedly.
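
    A minimal sketch of setting this role parameter, assuming a role named Compute and the usual <RoleName>Parameters convention for role-specific values:

    ```yaml
    parameter_defaults:
      ComputeParameters:
        # Do not reboot nodes of this role automatically when KernelArgs change.
        KernelArgsDeferReboot: true
    ```

    With this set, the operator remains responsible for rebooting the nodes at a suitable time so that changed kernel arguments take effect.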

  • Enable image copy for multiple RBD Glance stores

    Previously when using multiple RBD glance stores the operator was responsible for copying the image to all stores. Nova-compute now has the ability to automatically copy an image to the local glance store when required. This change enables the feature and adds the following role specific parameters to control the behaviour.

    • NovaGlanceRbdCopyPollInterval

    • NovaGlanceRbdCopyTimeout
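
    The two parameters above could be set per role as sketched below. The role name Compute and both values are illustrative assumptions; the values are assumed to be in seconds.

    ```yaml
    parameter_defaults:
      ComputeParameters:
        # Assumed units: seconds between checks of the image copy status.
        NovaGlanceRbdCopyPollInterval: 15
        # Assumed units: seconds to wait before giving up on the copy.
        NovaGlanceRbdCopyTimeout: 600
    ```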

Upgrade Notes

  • Upgrades from OVN non-HA and OVN DBs pacemaker to OVN DBs clustered are currently not supported.

Security Issues

  • The OVN database servers in an OVN DBs clustering and TLS-everywhere deployment will listen on all IP addresses (0.0.0.0). This is a caveat that can only be addressed once RHBZ 1952038 is fixed.

Bug Fixes

  • NFSv4.2 has been available for a long time and is the default in RHEL/CentOS 8. The default for NovaNfsVersion is now v4.2 instead of v4.

14.2.0

Prelude

Enablement of data collection and transportation to an STF instance is now handled via existing templates.

New Features

  • The following parameters add support for mounting Cinder’s image conversion directory on an external NFS share.

    • CinderImageConversionNfsShare

    • CinderImageConversionNfsOptions
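
    A sketch of how the two parameters might be combined. The share address, export path and mount options are hypothetical placeholders, not values taken from the templates.

    ```yaml
    parameter_defaults:
      # Hypothetical NFS export in host:/path form.
      CinderImageConversionNfsShare: '192.168.122.1:/export/cinder-conversion'
      # Hypothetical mount options; adjust to your NFS server and SELinux policy.
      CinderImageConversionNfsOptions: '_netdev,bg,intr,context=system_u:object_r:container_file_t:s0'
    ```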

  • The glance_api_cron container has been introduced, which executes db purge job for Glance service. Use GlanceCronDbPurge* parameters to override cron parameters.

  • The new MemcacheUseAdvancedPool parameter has been added, which enables usage of the advanced pool for memcached connections in keystone middleware. This parameter is set to true by default to avoid bursting connections in some services like neutron.

  • When the nova_virtlogd container gets restarted, the instance console auth files will not be reopened by virtlogd. As a result, instances need to be either restarted or live migrated to a different compute node for new console log messages to be logged again. Usually, on receipt of SIGUSR1, virtlogd will re-exec() its binary while maintaining all current logs and clients. This allows for live upgrades of the virtlogd service in non-containerized environments, where updates are done by just performing an RPM update. To reduce the likelihood of this problem in a containerized environment, virtlogd should only be restarted on manual request or on compute node reboot. It should not be restarted on a minor update without migrating instances off the node. This introduces a nova_virtlogd_wrapper container and a virtlogd wrapper script, to only restart virtlogd on either manual request or compute node restart.

  • Add support for OVS DPDK pmd auto balance parameters. This feature adds 3 new role specific THT parameters to set pmd-auto-lb-load-threshold, pmd-auto-lb-improvement-threshold, and pmd-auto-lb-rebal-interval in OVS through OvsPmdLoadThreshold, OvsPmdImprovementThreshold and OvsPmdRebalInterval respectively.
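
    A sketch of the three role-specific parameters, assuming a role named ComputeOvsDpdk. The values are illustrative; in OVS the two thresholds are percentages and the rebalance interval is expressed in minutes.

    ```yaml
    parameter_defaults:
      ComputeOvsDpdkParameters:
        # Maps to pmd-auto-lb-load-threshold (percent).
        OvsPmdLoadThreshold: 70
        # Maps to pmd-auto-lb-improvement-threshold (percent).
        OvsPmdImprovementThreshold: 25
        # Maps to pmd-auto-lb-rebal-interval (minutes).
        OvsPmdRebalInterval: 2
    ```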

  • Introduce new parameter to configure OVS PMD Auto Load Balance for OVS DPDK

  • The new parameter RbdDiskCachemodes allows overriding the disk cache modes for RBD. Defaults to [‘network=writeback’].

  • A new service, OS::TripleO::Services::UndercloudUpgradeEphemeralHeat is added to the Undercloud role. The service is mapped to OS::Heat::None by default, but when environments/lifecycle/undercloud-upgrade-prepare.yaml is included, the service will be enabled and will migrate any already deployed stacks in the undercloud’s Heat instance to be able to be used with the ephemeral Heat deployment option from tripleoclient.

Upgrade Notes

  • When upgrading a deployment with the use of enable-stf.yaml, add the following files to your overcloud deployment command in order to maintain the existing services defined in enable-stf.yaml.

    • environments/metrics/collectd-write-qdr.yaml

    • environments/metrics/ceilometer-write-qdr.yaml

    • environments/metrics/qdr-edge-only.yaml

Bug Fixes

  • On the compute nodes, SSL certificates were previously created for libvirt, qemu-default, qemu-vnc and qemu-nbd. This is not required because all the services use the same NovaLibvirtNetwork network, and therefore multiple certificates for the same hostname get created. Also, from qemu’s point of view, if the default_tls_x509_cert_dir and default_tls_x509_verify parameters are set for all certificates, there is no need to specify any of the other *_tls* config options. From Secure live migration with QEMU-native TLS:

    The intention (of libvirt) is that you can just use the default_tls_x509_* config attributes so that you don’t need to set any other *_tls* parameters, unless you need different certificates for some services. The rationale for that is that some services (e.g. migration / NBD) are only exposed to internal infrastructure; while some services (VNC, Spice) might be exposed publicly, so might need different certificates. For OpenStack this does not matter, though, we will stick with the defaults.

    Therefore, with this change InternalTLSNbdCAFile, InternalTLSVncCAFile and InternalTLSQemuCAFile are removed (they defaulted to /etc/ipa/ca.crt anyway) and only InternalTLSCAFile is used.

    Also, all certificates are now created when EnableInternalTLS is true, and all SSL certificates are mounted from the host. This prevents certificate information from being unavailable in a qemu process’s container environment if features get switched later, which has shown to be problematic.

Other Notes

  • Using enable-stf.yaml now defines the expected configuration in OpenStack for use with Service Telemetry Framework. Removal of the defined resource_registry now requires passing additional environment files to enable the preferred data collectors and transport architecture, providing better flexibility to support additional architectures in the future.

  • These parameters can now be set per-role: DnfStreams, UpgradeInitCommand, UpgradeLeappCommandOptions, UpgradeLeappDevelSkip, UpgradeLeappToRemove, UpgradeLeappToInstall.

14.1.2

New Features

  • The parameters CephHciOsdCount and CephHciOsdType were added in order to support the derive parameters feature for hyperconverged deployments when using cephadm.

14.1.0

Prelude

It’s not necessary to install ceph-ansible or prepare a Ceph container when configuring external Ceph in Wallaby and newer. External Ceph configuration is done with TripleO (not cephadm or ceph-ansible) and should be executed using the related environment file.

New Features

  • Added TripleO support for the Unbound DNS resolver service.

  • Adds a new IronicInspectorStorageBackend parameter that can be used to set the storage backend for introspection data.

  • New environments are added at environments/disable-heat.yaml and environments/disable-neutron.yaml which can be used to disable those services.

  • The new parameter GlanceCinderMountPointBase has been added which will be used for mounting NFS volumes on glance nodes. When glance uses cinder as store and cinder backend is NFS, this parameter must be set to match cinder’s mount point.

  • Added new options for deploying Barbican with PKCS#11 backends: BarbicanPkcs11CryptoTokenLabels and BarbicanPkcs11CryptoOsLockingOk

  • The new GlanceCinderVolumeType parameter has been added, which is required when configuring multiple cinder stores as glance backends.

  • The logic to configure the connection from barbican to nShield HSMs has been augmented to parse a nshield_hsms parameter, which allows the specification of multiple HSMs. The underlying ansible role (ansible-role-thales-hsm) will configure the HSMs in load sharing mode to provide HA.

  • The OS::TripleO::{{role.name}}::PreNetworkConfig resource has been restored. This resource can be used to implement any configuration steps executed before network configurations are applied.

  • It is now possible to deploy Ceph with TripleO using cephadm.

  • New CinderRpcResponseTimeout and CinderApiWsgiTimeout parameters provide a means for configuring Cinder’s RPC response and WSGI connection timeouts, respectively.

  • The Cinder Backup service can be switched from running active/passive under pacemaker, to active-active mode where it runs simultaneously on every node on which it’s deployed. Note that the service will be restarted when switching modes, which will interrupt any backup operations currently in progress.

  • A new CinderBackupCompressionAlgorithm parameter supports specifying the compression algorithm used by Cinder Backup backends that support the feature. The parameter defaults to zlib, which is Cinder’s default value.

  • Two new parameters are added to control the concurrency of Cinder’s backup and restore operations:

    • CinderBackupWorkers

    • CinderBackupMaxOperations
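
    A sketch of tuning backup concurrency with the two parameters above. Both values are illustrative assumptions and should be sized against the memory and CPU available on the nodes running cinder-backup.

    ```yaml
    parameter_defaults:
      # Illustrative: number of cinder-backup worker processes.
      CinderBackupWorkers: 4
      # Illustrative: cap on concurrent backup and restore operations.
      CinderBackupMaxOperations: 15
    ```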

  • Adds support for configuring the cinder-backup service with a Google Cloud Storage (GCS) backend, or an Amazon S3 backend.

  • The cinder-backup service can be configured to store backups on external Ceph clusters defined by the CephExternalMultiConfig parameter. New CinderBackupRbdClusterName and CinderBackupRbdClientUserName parameters can be specified, which override the default CephClusterName and CephClientUserName values respectively.

  • A new CinderRbdMultiConfig parameter may be used to configure additional cinder RBD backends on external Ceph clusters defined by the CephExternalMultiConfig parameter.

  • The environment file environments/external-ceph.yaml has been created and can be used when an external Ceph cluster is used.

  • Added FRR as a new TripleO service. This service allows cloud operators to deploy pure L3 control plane via BGP protocol. This has the following benefits:

    • Obtain multiple routes on multiple uplinks

    • BGP used for ECMP load balancing and BFD for resiliency

    • Advertise routes to API endpoints

    • Less L2 traffic

    Please refer to Install and Configure FRRouter specification for more information.

  • QemuDefaultTLSVerify will allow operators to enable or disable TLS client certificate verification. Enabling this option will reject any client who does not have a certificate signed by the CA in /etc/pki/qemu/ca-cert.pem. The default is true and matches libvirt’s. We will want to disable this by default in train.

  • The LibvirtDebug parameter has been added to enable or disable debug logging of libvirtd and virtlogd.

  • Now the debug logging of libvirtd and virtlogd is enabled automatically when the Debug parameter is true.

  • The manila_api_cron container has been introduced, which executes db purge job for Manila service. Use ManilaCronDbPurge* parameters to override cron parameters.

  • Added the ability to configure the OVN DBs monitor interval in THT via OVNDBSPacemakerMonitorInterval (default 30s). Under load, this can create extra stress and since the timeout has already been bumped, it makes sense to bump this interval to a higher value as a trade off between detecting a failure and stressing the service.

  • Introducing the following parameters:

    • NovaComputeForceRawImages

    • NovaComputeUseCowImages

    • NovaComputeLibvirtPreAllocateImages

    • NovaComputeImageCacheManagerInterval

    • NovaComputeImageCacheRemoveUnusedBaseImages

    • NovaComputeImageCacheRemoveUnusedResizedMinimumAge

    • NovaComputeImageCachePrecacheConcurrency

  • When a node has hugepages enabled, we can help with live migrations by enabling NovaLiveMigrationPermitPostCopy and NovaLiveMigrationPermitAutoConverge. These flags are automatically enabled if hugepages are detected, but operators can override these settings.

  • Add the following parameters to tune the behavior of nova-scheduler to achieve better distribution of instances.

    • NovaSchedulerHostSubsetSize

    • NovaSchedulerShuffleBestSameWeighedHosts
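
    A sketch of using both scheduler parameters to spread instances more evenly. The subset size of 3 is an illustrative assumption.

    ```yaml
    parameter_defaults:
      # Pick randomly among the N best-weighed hosts instead of always the first.
      NovaSchedulerHostSubsetSize: 3
      # Shuffle hosts that share the same best weight.
      NovaSchedulerShuffleBestSameWeighedHosts: true
    ```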

  • Introduces a new compute role based parameter, NovaGlanceEnableRbdDownload, to enable direct download when rbd is used for glance but compute is using local ephemeral storage. In this scenario it allows nova-compute to download images directly from the glance ceph pool via rbd, instead of going through the glance api. If NovaGlanceEnableRbdDownload is set, by default the global RBD glance parameters CephClientUserName, GlanceRbdPoolName and CephClusterName are used for the ceph.conf in use. Glance supports multiple storage backends, which can be configured using GlanceMultistoreConfig. If additional RBD glance backends are configured, NovaGlanceRbdDownloadMultistoreID can be used to point to the hash key (backend ID) of GlanceMultistoreConfig to use. If CephClientUserName or GlanceRbdPoolName are not set in the GlanceMultistoreConfig, the global values of those parameters will be used.

  • Added the NovaLibvirtMaxQueues role parameter to set [libvirt]/max_queues in nova.conf on the compute node. The default of 0 corresponds to not set, meaning the legacy limits based on the reported kernel major version will be used.

  • Security-group logging is now supported under ML2/OVN. A more detailed explanation can be found in bug 1914757.
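
    A minimal sketch of setting the queue limit for one role, assuming a role named Compute. The value 8 is illustrative only.

    ```yaml
    parameter_defaults:
      ComputeParameters:
        # Rendered as [libvirt]/max_queues in nova.conf; 0 leaves it unset.
        NovaLibvirtMaxQueues: 8
    ```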

  • Adds pre_deploy_step_tasks support which is run after kolla files are setup and podman is configured, but before any deployment task or external deployment task. The use case is being able to start containers before any deployment task.

  • Add parameter NovaSchedulerQueryPlacementForRoutedNetworkAggregates that allows the scheduler to verify if the requested networks or the port are related to Neutron routed networks with some specific segments to use. In this case, the routed networks prefilter will require the related aggregates to be reported in Placement, so only hosts within the asked aggregates would be accepted. In order to support this behaviour, operators need to set the [scheduler]/query_placement_for_routed_network_aggregates configuration option which defaults to False.

  • The keystone_cron container was reintroduced to run trust_flush job, which removes expired or soft-deleted trusts from keystone database.

  • The KeystoneEnableDBPurge parameter was re-added, to enable or disable the purge job for Keystone.

  • The following parameters were added, to configure parameters about trust_flush cron job.

    • KeystoneCronTrustFlushEnsure

    • KeystoneCronTrustFlushMinute

    • KeystoneCronTrustFlushHour

    • KeystoneCronTrustFlushMonthday

    • KeystoneCronTrustFlushMonth

    • KeystoneCronTrustFlushWeekday

    • KeystoneCronTrustFlushMaxDelay

    • KeystoneCronTrustFlushDestination

    • KeystoneCronTrustFlushUser

  • Added PTP parameters for timemaster service configuration on overcloud compute nodes. Timemaster will use the already present chrony parameters. PTPMessageTransport and PTPInterfaces are new.

Upgrade Notes

  • All service Debug parameters are now booleans as expected by oslo. This helps with proper validation and reduces service template composition complexity.

  • The Keepalived service has been removed. The OS::TripleO::Services::Keepalived resource should be removed during update/upgrade.

  • The iscsi deploy interface is no longer enabled by default in ironic, making the direct deploy interface the default. You will need to update your nodes to the direct deploy before upgrading or re-enable the iscsi deploy in IronicEnabledDeployInterfaces (but note that it is going to be deprecated in the future).

  • The IronicImageDownloadSource parameter has been changed to http by default making ironic cache glance images and serve them via a local HTTP server. Set the parameter to swift to return the previous behavior of relying on swift temporary URLs.
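
    For deployments that want to keep the previous behaviour, a minimal environment file would look like this:

    ```yaml
    parameter_defaults:
      # Restore the pre-upgrade behaviour of relying on swift temporary URLs.
      IronicImageDownloadSource: swift
    ```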

  • The NovaHWMachineType parameter now defaults x86_64 based instances to the unversioned q35 machine type. The remaining architecture machine type defaults are provided directly by OpenStack Nova.

    An environments/nova-hw-machine-type-upgrade.yaml environment file has been provided to pin NovaHWMachineType to the previous versioned machine type defaults during an upgrade.

    When the upgrade of the overcloud is complete the following OpenStack Nova documentation should then be used to ensure a machine type is recorded for all existing instances before the new NovaHWMachineType default can be used in the environment.

    https://docs.openstack.org/nova/latest/admin/hw-machine-type.html#update

  • Users of the OS::TripleO::Network::Ports::RedisVipPort and OS::TripleO::Network::Ports::OVNDBsVipPort interfaces must update their templates. The interfaces have been removed, and the management of these virtual IPs has been moved to the tripleo-heat-templates service template.

    This change will typically affect deployments using already deployed servers. Typically the virtual IPs for Redis and OVNDBs were overridden using the deployed-neutron-port template. For example:

    resource_registry:
      OS::TripleO::Network::Ports::RedisVipPort: /usr/share/openstack-tripleo-heat-templates/deployed-server/deployed-neutron-port.yaml
      OS::TripleO::Network::Ports::OVNDBsVipPort: /usr/share/openstack-tripleo-heat-templates/deployed-server/deployed-neutron-port.yaml
    
    parameter_defaults:
      DeployedServerPortMap:
        redis_virtual_ip:
          fixed_ips:
            - ip_address: 192.168.100.10
          subnets:
            - cidr: 192.168.100.0/24
          network:
            tags:
              - 192.168.100.0/24
        ovn_dbs_virtual_ip:
          fixed_ips:
            - ip_address: 192.168.100.11
          subnets:
            - cidr: 192.168.100.0/24
          network:
            tags:
              - 192.168.100.0/24
    

    This will have to be changed. The following example shows how to replicate the above configuration:

    parameter_defaults:
      RedisVirtualFixedIPs:
        - ip_address: 192.168.100.10
          use_neutron: false
      OVNDBsVirtualFixedIPs:
        - ip_address: 192.168.100.11
          use_neutron: false
    
  • The legacy DefaultPasswords interface to use passwords from heat resources has been removed as we don’t use it anymore.

  • The OVNVifType parameter has been removed because the parameter was not used in Neutron.

  • The following two services have been removed, and should be removed from role data during upgrade.

    • OS::TripleO::Services::CinderBackendVRTSHyperScale

    • OS::TripleO::Services::VRTSHyperScale

  • Remove deprecated OS::TripleO::Services::CinderBackendDellEMCXTREMIOIscsi. Use OS::TripleO::Services::CinderBackendDellEMCXtremio instead.

Deprecation Notes

  • The IronicInspectorUseSwift parameter has been deprecated in favor of IronicInspectorStorageBackend and will be removed in a future release.

  • The BarbicanPkcs11CryptoTokenLabel option has been deprecated and replaced with the BarbicanPkcs11CryptoTokenLabels option.

  • Some parameters within ThalesVars have been deprecated. These are thales_hsm_ip_address and thales_hsm_config_location. See environments/barbican-backend-pkcs11-thales.yaml for details.

  • Ceph Deployment using Ceph versions older than Octopus is deprecated.

  • The CephOsdPercentageMin parameter has been deprecated and has a new default of 0 so that the validation is not run. There is no need to fail the deployment early if a percentage of the OSDs are not running because the Ceph pools created for OpenStack can now be created even if there are 0 OSDs as the PG number is no longer required on pool creation. TripleO no longer waits for OSD creation and instead only queues the request for OSD creation with the ceph orchestrator.

  • The environment file environments/ceph-ansible/ceph-ansible-external.yaml has been deprecated and will be removed in X.

  • The interfaces OS::TripleO::Network::Ports::RedisVipPort and OS::TripleO::Network::Ports::OVNDBsVipPort have been removed. The resources are no longer used in the overcloud heat stack.

  • Support for the Veritas HyperScale Driver has been removed.

Bug Fixes

  • The ExtraConfigPre and NodeExtraConfig resources are now executed after network configurations are applied on nodes. This is consistent with the previous versions, which used the heat software deployment mechanism instead of config-download.

  • The default value of CinderNfsSnapshotSupport has been changed from true to false, to be consistent with the default value in cinder.

  • Previously, access to the sshd run by the nova-migration-target container was only limited via sshd_config. While login was not possible from other networks, the service was reachable via all networks. This change limits the access to the NovaLibvirt and NovaApi networks, which are used for cold and live migration.

  • Nova vnc configuration previously used NovaVncProxyNetwork, NovaLibvirtNetwork and NovaApiNetwork to configure the different components (novnc proxy, nova-compute and libvirt) for vnc. If one of the networks was changed from internal_api, the service configuration between libvirt, nova-compute and novnc proxy became inconsistent and the console was broken. This has been changed to use only NovaLibvirtNetwork for configuring the vnc endpoints, and NovaVncProxyNetwork has been removed completely.

  • Decrease Swift proxy timeouts for GET/HEAD requests using a new parameter named SwiftProxyRecoverableNodeTimeout. The default node timeout is 10 seconds in Swift, however this has been set to 60 seconds in TripleO in case there are slow nodes. However, this affects all requests - GET, HEAD and PUT. GET/HEAD requests are typically much faster, thus it makes sense to use a lower timeout to recover earlier from node failures. This will increase stability, because the proxy can select another backend node to retry the request.
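
    A sketch of overriding the new parameter. The value of 10 seconds is an illustrative assumption and is not stated as the shipped default.

    ```yaml
    parameter_defaults:
      # Illustrative: seconds to wait on a backend node for GET/HEAD requests
      # before the proxy retries against another node.
      SwiftProxyRecoverableNodeTimeout: 10
    ```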

  • Bug #1915800: Add support for ports filtering in XtremIO driver.

Other Notes

  • The CephPoolDefaultPgNum parameter default is now 16. The Ceph pg_autoscaler is enabled by default in the supported versions of Ceph, though the parameter CephPoolDefaultPgNum may still be used as desired.

  • The default value of the parameter ‘RabbitAdditionalErlArgs’ was updated to include the new options ‘+sbwtdcpu none +sbwtdio none’ which disables busy-wait for dirty cpu schedulers and dirty i/o schedulers respectively. This aligns with the flags recommended by RabbitMQ upstream (https://www.rabbitmq.com/runtime.html#busy-waiting).

14.0.0

New Features

  • Added the MemcachedMaxConnections setting with a default of 8192 maximum connections, in order to allow an operator to override that value in environments where memcached is heavily solicited.

  • The aodh_api_cron container has been added to run aodh-expirer command periodically, to remove expired alarms from Aodh database. Use AodhExpire* parameters to override cron parameters.

  • The new AodhAlarmHistoryTTL parameter has been added, which defines TTL of alarm histories in aodh. This parameter is set as 86400 by default.

  • Support deploying multiple Cinder Netapp Storage backends. CinderNetappBackendName is enhanced to support a list of backend names, and a new CinderNetappMultiConfig parameter provides a way to specify parameter values for each backend.

  • Introducing the new NovaSchedulerEnabledFilters based on the new nova parameter filter_scheduler.enabled_filters.

  • The parameter NovaComputeStartupDelay allows the operator to delay the startup of nova-compute after a compute node reboot. When all the overcloud nodes are rebooted at the same time, it can take a few minutes for the Ceph cluster to reach a healthy state. This delay will prevent the instances from booting before the Ceph cluster is healthy.

  • The NovaApiMaxLimit parameter allows the operator to set Nova API max_limit using a Heat parameter in their templates.

  • The new Heat parameter ClusterFullTag controls how we configure pacemaker container image names for the HA services. Compared to the previous parameter ClusterCommonTag, this new naming convention allows any part of the container image name to change during a minor update without service disruption, e.g., registryA/namespaceA/imgA:tagA to registryB/namespaceB/imgB:tagB. The new parameter ClusterFullTag is enabled by default.

  • Refresh Swift ring files without restarting containers. This makes it possible to update rings without service restarts, lowering the overhead for updates.

  • The SwiftHashPrefix parameter allows the operator to set Swift swift_hash_path_prefix using a Heat parameter in their templates.

  • OVN now supports VXLAN network type for tenant networks.

Known Issues

  • Cell_v2 discovery has been moved from the nova-compute|nova-ironic containers as this requires nova api database credentials which must not be configured for the nova-compute service. As a result scale-up deployments which explicitly omit the Controller nodes will need to make alternative arrangements to run cell_v2 discovery. Either the nova-manage command can be run manually after scale-up, or an additional helper node using the NovaManage role can be deployed that will be used for this task instead of a Controller node. See Bug: 1786961 and Bug: 1871482.

Upgrade Notes

  • The following parameters have been removed since they have had no effect.

    • NovaDbSyncTimeout

    • ExtractedPlacementEnabled

  • The EnableEtcdInternalTLS parameter’s default value changes from false to true. The change is related to the fact that novajoin is deprecated, and the functionality associated with the EnableEtcdInternalTLS parameter is not required when TLS is deployed using the tripleo-ansible ansible module.

  • The AdminEmail parameter has been removed because it has had no effect since TripleO had bootstrap support implemented.

  • Support for Sahara service has been removed.

Deprecation Notes

  • The EnableEtcdInternalTLS parameter is deprecated. It was added to support a workaround that is necessary when novajoin is used to deploy TLS, but novajoin itself is deprecated. The workaround is not necessary when TLS is deployed using the tripleo-ansible ansible module.

  • Deprecating NovaSchedulerDefaultFilters, it’s replaced with the new setting, NovaSchedulerEnabledFilters.
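
    A sketch of moving an existing filter list to the new parameter. The filter list shown is an illustrative selection of standard nova scheduler filters, not the shipped default; carry over whatever list your environment set in NovaSchedulerDefaultFilters.

    ```yaml
    parameter_defaults:
      # Replaces the deprecated NovaSchedulerDefaultFilters.
      NovaSchedulerEnabledFilters:
        - AvailabilityZoneFilter
        - ComputeFilter
        - ComputeCapabilitiesFilter
        - ImagePropertiesFilter
        - ServerGroupAntiAffinityFilter
        - ServerGroupAffinityFilter
    ```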

  • Zaqar services are deprecated for removal.

Bug Fixes

  • When deploying a spine-and-leaf (L3 routed architecture) with TLS enabled for internal endpoints, the deployment would fail because some roles are not connected to the network mapped to the service in ServiceNetMap. To fix this issue, a role specific parameter {{role.name}}ServiceNetMap is introduced (defaults to: {}). The role specific ServiceNetMap parameter allows the operator to override one or more service network mappings per-role. For example:

    ComputeLeaf2ServiceNetMap:
      NovaLibvirtNetwork: internal_api_leaf2
    

    The role specific {{role.name}}ServiceNetMap override is merged with the global ServiceNetMap when it’s passed as a value to the {{role.name}}ServiceChain resources, and the {{role.name}} resource groups so that the correct network for this role is mapped to the service.

    Closes bug: 1904482.

  • Certificates get merged into the containers using the kolla_config mechanism. If a certificate changes, or e.g. UseTLSTransportForNbd gets disabled and enabled at a later point, the containers running the qemu process miss the required certificates and live migration fails. This change moves to using bind mounts for the certificates and, in the case of UseTLSTransportForNbd, creates the required certificates even if UseTLSTransportForNbd is set to False. With this, UseTLSTransportForNbd can be enabled/disabled as the required bind mounts/certificates are already present.

  • https://review.opendev.org/q/I8df21d5d171976cbb8670dc5aef744b5fae657b2 introduced THT parameters to set libvirt/cpu_mode. The patch wrongly set NovaLibvirtCPUMode to the string ‘none’, which results in puppet-nova not handling the default cases correctly and setting libvirt/cpu_mode to none. This in turn results in the ‘qemu64’ CPU model, which is highly buggy and undesirable for production usage. This changes the default to the recommended CPU mode ‘host-model’, for various benefits documented elsewhere.

  • When using RHSM Service (deployment/rhsm/rhsm-baremetal-ansible.yaml) based registration of the overcloud nodes and enabling KSM using NovaComputeEnableKsm=True, the overcloud deployment would fail because the RHSM registration and the ksm task both ran as host_prep tasks. Enabling/disabling ksm is now handled in deploy step 1.

  • In the case of a cellv2 multicell environment, nova-metadata is the only httpd managed service on the cell controller role. In the case of tls-everywhere, it is required that the cell controller host has the needed metadata to be able to request the HTTP certificates. Otherwise the getcert request fails with “Insufficient ‘add’ privilege to add the entry ‘krbprincipalname=HTTP/cell1-cellcontrol-0….’”

  • Do not relabel Swift files on every container (re-)start. These will be relabeled already in step 3 preventing additional delays.

Other Notes

  • A new parameter called RabbitTCPBacklog that specifies the maximum tcp backlog for RabbitMQ has been added. The value defaults to 4096 to match previous behavior and can be modified for the needs of larger scale deployments by operators.
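
    As a sketch, a larger deployment might raise the backlog as follows. The value 8192 is an illustrative assumption.

    ```yaml
    parameter_defaults:
      # Maximum TCP backlog for RabbitMQ (default: 4096).
      RabbitTCPBacklog: 8192
    ```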