Train Series Release Notes

11.5.0-44

New Features

  • Added new options for deploying Barbican with PKCS#11 backends: BarbicanPkcs11CryptoTokenLabels and BarbicanPkcs11CryptoOsLockingOk

  • New CinderRpcResponseTimeout and CinderApiWsgiTimeout parameters provide a means for configuring Cinder’s RPC response and WSGI connection timeouts, respectively.

  • The new EnableCache parameter is added to enable/disable caching using memcached services. The parameter is true by default, but should be false when the memcached service is disabled in the deployment.

  • The MariaDB tuning parameter for Innodb_buffer_pool_size can now be set via a new TripleO Heat Template parameter ‘MysqlInnodbBufferPoolSize’. By default this is undefined.

  • QemuDefaultTLSVerify will allow operators to enable or disable TLS client certificate verification. Enabling this option will reject any client who does not have a certificate signed by the CA in /etc/pki/qemu/ca-cert.pem. The default is true, which matches libvirt’s default. We will want to disable this by default in train.

  • Added the ability to configure the OVN DBs monitor interval in THT via OVNDBSPacemakerMonitorInterval (default 30s). Under load, monitoring can create extra stress; since the timeout has already been raised, it makes sense to raise this interval as well, as a trade-off between detecting a failure quickly and stressing the service.
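
    A minimal environment file sketch for raising the interval, assuming the parameter takes a plain number of seconds; the value 60 is only illustrative:

    parameter_defaults:
      # Illustrative value; the documented default is 30 (seconds)
      OVNDBSPacemakerMonitorInterval: 60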

  • The nova-ironic setting for ‘max_concurrent_builds’ can now be set via the use of a new TripleO Heat templates parameter ‘IronicMaxConcurrentBuilds’. It is set to the service default of 10 by default in TripleO Heat templates.

  • Added PTP parameters for timemaster service configuration on overcloud compute nodes. Timemaster uses the chrony parameters that are already present. The PTPMessageTransport and PTPInterfaces parameters are new.

Deprecation Notes

  • The BarbicanPkcs11CryptoTokenLabel option has been deprecated and replaced with the BarbicanPkcs11CryptoTokenLabels option.

Bug Fixes

  • Previously, access to the sshd run by the nova-migration-target container was limited only via sshd_config. While login was not possible from other networks, the service was reachable via all networks. This change limits access to the NovaLibvirt and NovaApi networks, which are used for cold and live migration.

  • Nova VNC configuration previously used NovaVncProxyNetwork, NovaLibvirtNetwork and NovaApiNetwork to configure the different components (novnc proxy, nova-compute and libvirt) for VNC. If one of the networks was changed from internal_api, the service configuration between libvirt, nova-compute and novnc proxy became inconsistent and the console broke. The configuration now uses only NovaLibvirtNetwork for the VNC endpoints, and NovaVncProxyNetwork is removed completely.

11.5.0

New Features

  • The new parameter GlanceCinderMountPointBase has been added, which will be used for mounting NFS volumes on glance nodes. When glance uses cinder as its store and the cinder backend is NFS, this parameter must be set to match cinder’s mount point.

  • The logic to configure the connection from barbican to nShield HSMs has been augmented to parse a nshield_hsms parameter, which allows the specification of multiple HSMs. The underlying ansible role (ansible-role-thales-hsm) will configure the HSMs in load sharing mode to provide HA.

  • A new multipathd-container-ansible.yaml heat template replaces the multipathd-container.yaml template. The new template adds support for the following new parameters.

    • MultipathdSkipKpartx

    • MultipathdCustomConfigFile

  • When a node has hugepages enabled, we can help with live migrations by enabling NovaLiveMigrationPermitPostCopy and NovaLiveMigrationPermitAutoConverge. These flags are automatically enabled if hugepages are detected, but operators can override these settings.

  • Added the NovaLibvirtMaxQueues role parameter to set [libvirt]/max_queues in nova.conf on the compute node. The default of 0 leaves the option unset, meaning the legacy limits based on the reported kernel major version are used.
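
    As a sketch, a role parameter such as NovaLibvirtMaxQueues would typically be set under the role's <RoleName>Parameters key. The Compute role name and the value 8 are illustrative assumptions:

    parameter_defaults:
      ComputeParameters:
        # Illustrative value; 0 (the default) leaves max_queues unset
        NovaLibvirtMaxQueues: 8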

Known Issues

  • Cell_v2 discovery has been moved from the nova-compute|nova-ironic containers as this requires nova api database credentials which must not be configured for the nova-compute service. As a result scale-up deployments which explicitly omit the Controller nodes will need to make alternative arrangements to run cell_v2 discovery. Either the nova-manage command can be run manually after scale-up, or an additional helper node using the NovaManage role can be deployed that will be used for this task instead of a Controller node. See Bug: 1786961 and Bug: 1871482.

Upgrade Notes

  • When upgrading from the multipathd-container.yaml template to the new multipathd-container-ansible.yaml template, bear in mind the new MultipathdSkipKpartx parameter will configure the corresponding skip_kpartx setting in /etc/multipath.conf.

Deprecation Notes

  • Some parameters within ThalesVars have been deprecated: thales_hsm_ip_address and thales_hsm_config_location. See environments/barbican-backend-pkcs11-thales.yaml for details.

  • The multipathd-container.yaml template is deprecated in favor of a new multipathd-container-ansible.yaml template. The new template is backward compatible with the old template, but see the features and upgrade notes for additional details.

Bug Fixes

  • When deploying a spine-and-leaf (L3 routed architecture) with TLS enabled for internal endpoints, the deployment would fail because some roles are not connected to the network mapped to the service in ServiceNetMap. To fix this issue a role specific parameter {{role.name}}ServiceNetMap is introduced (defaults to: {}). The role specific ServiceNetMap parameter allows the operator to override one or more service network mappings per-role. For example:

    ComputeLeaf2ServiceNetMap:
      NovaLibvirtNetwork: internal_api_leaf2
    

    The role specific {{role.name}}ServiceNetMap override is merged with the global ServiceNetMap when it’s passed as a value to the {{role.name}}ServiceChain resources, and the {{role.name}} resource groups so that the correct network for this role is mapped to the service.

    Closes bug: 1904482.

  • Fixed the Octavia OctaviaTenantLogFacility setting default to 0 to align it with the project default.

  • Previously, the HorizonDebug and Debug parameters changed the value of horizon::django_debug, but they did not set the DEBUG log level for the horizon logger components. With this change, if either parameter is true, horizon::log_level is set to ‘DEBUG’.

  • Do not relabel Swift files on every container (re-)start. These are already relabeled in step 3, so skipping the extra relabel avoids additional delays.

11.4.0

New Features

  • Adds a new ContainerNovaLibvirtPidsLimit parameter in order to set the PIDs limit for nova_libvirt container. Defaults to 65536, set to 0 for unlimited.

  • Adds support for IGMP snooping (Multicast) in the Neutron ML2/OVS driver.

  • Added the configuration option to set reserved_huge_pages. When NovaReservedHugePages is set, “reserved_huge_pages” is set to the value of NovaReservedHugePages. If NovaReservedHugePages is unset and OvsDpdkSocketMemory is set, the reserved_huge_pages value is calculated from KernelArgs and OvsDpdkSocketMemory. KernelArgs helps determine the default huge page size used (the default is 2048kb), and OvsDpdkSocketMemory helps determine the number of hugepages to reserve.
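
    A hedged sketch of setting the value explicitly. The entry format below follows nova's reserved_huge_pages option (NUMA node, page size, page count) and is an assumption about what this parameter accepts; the Compute role name and all values are illustrative:

    parameter_defaults:
      ComputeParameters:
        # Assumed format: one "node:<id>,size:<page size>,count:<pages>" entry per NUMA node
        NovaReservedHugePages:
          - "node:0,size:2048,count:64"
          - "node:1,size:1GB,count:1"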

  • Add new BarbicanClient tripleo service for configuring DCN/Edge nodes to access a barbican service running in the control plane. The client service is disabled by default, and can be enabled by including the environments/services/barbican-edge.yaml environment file when deploying a DCN/Edge stack.

  • Added the Octavia anti-affinity parameters.

  • Added enhancements to Octavia’s OVN driver configuration, so it can connect to OVN_Northbound DB using SSL/TLS.

  • Added new PublicTLSCAFile parameter, that is used to set the ca cert in clouds.yaml for keystone public endpoint. This defaults to empty string (‘’) assuming that the certs are already trusted.

  • Add GlanceImagePrefetcherInterval parameter to run periodic job which fetches the queued images for caching in cache directory, when image cache is enabled.

  • Inclusion and configuration of ReaR service to undercloud and overcloud nodes.

  • Added the MemcachedMaxConnections setting, with a default of 8192 maximum connections, to allow an operator to override that value in environments where memcached is under heavy load.

  • Add parameter NovaAllowResizeToSameHost to allow instances to resize to the host they are currently on. Normally the source host is excluded.

  • To isolate LVM volumes created by compute guests, within Cinder volumes, from the LVM volumes created/managed by the host itself, a new task has been introduced to create an allowlist and denylist of devices which should be accessible (or not) to the host, configured in lvm.conf using the global_filter key. The allowlist is generated gathering the list of existing in-use physical disks (or partitions) and appending to it any user provided device passed via LVMFilterAllowlist parameter. The denylist is configured via LVMFilterDenylist and defaults to [‘.*’], which means it blocks any device not explicitly allowed. Both the list parameters can be specified per-role. The feature is, by default, disabled and can be enabled passing LVMFilterEnabled: true; when disabled the existing lvm.conf won’t be touched and a version of it which includes the global_filter will be left, for debugging, in /tmp/tripleo_lvmfilter.conf.
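
    A minimal sketch of enabling the filter, using only the parameters named in this note. The device path /dev/sdb is a hypothetical example of a user-provided device:

    parameter_defaults:
      LVMFilterEnabled: true
      # Hypothetical extra device, appended to the auto-detected in-use disks
      LVMFilterAllowlist:
        - /dev/sdb
      # Same as the documented default: block any device not explicitly allowed
      LVMFilterDenylist:
        - '.*'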

  • The new parameter CephExternalMultiConfig may be used to configure OpenStack to use multiple external Ceph clusters.

  • Add parameters NovaLibvirtCPUMode, NovaLibvirtCPUModels and NovaLibvirtCPUModelExtraFlags to allow configuration of CPU related parameters libvirt/cpu_mode, libvirt/cpu_model and libvirt/cpu_model_extra_flags respectively.

  • Add a role specific parameter, ContainerCpusetCpus, which defaults to ‘all’ and limits the specific CPUs or cores a container can use. To disable it and rely on the container engine default, set it to ‘’.

  • Add boolean parameter NovaSchedulerEnableIsolatedAggregateFiltering, which sets the scheduler/enable_isolated_aggregate_filtering parameter. This configures the scheduler to restrict hosts in aggregates based on matching required traits in the aggregate metadata and the instance flavor/image. If an aggregate is configured with a property with key trait:$TRAIT_NAME and value required, the instance flavor extra_specs and/or image metadata must also contain trait:$TRAIT_NAME=required to be eligible to be scheduled to hosts in that aggregate. Default value for NovaSchedulerEnableIsolatedAggregateFiltering is False.

  • This change updates the multiple-nics and multiple-nics-vlans templates so that an external bridge is created if either the role uses the External network or the “external_bridge” tag is set in the role definition. This is done instead of checking if the role name is “Controller”. This change also assigns the “external_bridge” tag to the Controller as well as the Compute roles so that both roles can access the Neutron external bridge for floating IPs or SNAT by default so that OVN can use DVR.

  • Introduce “{{role.name}}ExtraGroupVars”, which defines a dictionary of Ansible group vars per role. These extra group vars will override any pre-defined group var from a service.

  • Add parameters for configuring multiple glance-api backends. The existing GlanceBackend parameter represents the default backend, and a new GlanceMultistoreConfig parameter is a hash representing the configuration of additional backends. A new GlanceStoreDescription parameter provides a means of describing each backend.

    The configuration can specify any combination of supported backend types. Multiple rbd backends can be specified, but cinder, file and swift backends are limited to one each.
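
    A rough sketch of a default rbd backend plus one additional rbd backend. The three top-level parameter names come from this note; the backend key rbd2_store and the per-backend keys inside the hash are assumptions about its layout and should be checked against the templates:

    parameter_defaults:
      GlanceBackend: rbd
      GlanceStoreDescription: 'Default rbd store'
      GlanceMultistoreConfig:
        # Hypothetical backend name and per-backend keys
        rbd2_store:
          GlanceBackend: rbd
          GlanceStoreDescription: 'Second rbd store'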

  • The following parameters were added to support configuration of gnocchi nfs backend.

    • GnocchiNfsEnabled

    • GnocchiNfsShare

    • GnocchiNfsOptions
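
    A minimal sketch using these three parameters. The share address and mount options are placeholder assumptions, and other gnocchi settings may be needed alongside them:

    parameter_defaults:
      GnocchiNfsEnabled: true
      # Placeholder NFS export (documentation address range)
      GnocchiNfsShare: '192.0.2.10:/export/gnocchi'
      # Placeholder mount options
      GnocchiNfsOptions: 'rw,sync'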

  • For baremetal operations on DHCPv6-stateful networks multiple IPv6 addresses can now be allocated for neutron ports created for provisioning, cleaning, rescue or inspection. The new parameter IronicDhcpv6StatefulAddressCount controls the number of addresses to allocate.

  • Add Heat parameter EnableMysqlAuthEd25519, which when set to true, configures MySQL user credentials to require ed25519-based authentication to the mariadb server, instead of the default SHA1-based native authentication.

  • Add boolean parameter NeutronDhcpAgentDnsmasqEnableAddr6List to support the dnsmasq_enable_addr6_list option in dhcp agent settings. (See bug: #1861032)

  • Added two parameters to manage vPMEM [0] configuration. The NovaPMEMMappings parameter sets Nova’s configuration option pmem_namespaces, which reflects mappings between vPMEM and physical PMEM namespaces. NovaPMEMNamespaces creates and manages physical backend PMEM namespaces which will be used as the backend for vPMEM. NovaPMEMMappings example: 6GB:ns0|ns1|ns2,LARGE:ns3 will expose namespaces ns0, ns1, ns2 using label 6GB and namespace ns3 using label LARGE. NovaPMEMNamespaces example: 100G:ns0|14096M:ns1 will create two namespaces: ns0 - size 100G, ns1 - size 14096M.
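
    The two example values above could be placed in an environment file roughly as follows. Whether these are set globally or per role is an assumption here, and the strings are quoted only to keep the special characters literal:

    parameter_defaults:
      # Creates ns0 (100G) and ns1 (14096M), per the example above
      NovaPMEMNamespaces: "100G:ns0|14096M:ns1"
      # Exposes ns0, ns1, ns2 under label 6GB and ns3 under label LARGE
      NovaPMEMMappings: "6GB:ns0|ns1|ns2,LARGE:ns3"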

  • The NovaApiMaxLimit parameter allows the operator to set Nova API max_limit using a Heat parameter in their templates.

  • Add the NovaImageCacheTTL parameter to the nova compute service. This exposes remove_unused_original_minimum_age_seconds from nova.conf, which controls the time (in seconds) that nova compute should continue caching an image once it is no longer used by any instances on the host. Defaults to 86400 (24hrs).

  • Add boolean parameter NovaSchedulerPlacementAggregateRequiredForTenants, which sets the scheduler/placement_aggregate_required_for_tenants parameter. It controls whether or not a tenant with no aggregate affinity will be allowed to schedule to any available node. If aggregates are used to limit some tenants but not all, then this should be False. If all tenants should be confined via aggregate, then this should be True. Default value for NovaSchedulerPlacementAggregateRequiredForTenants is false.

  • Add boolean parameter NovaSchedulerQueryPlacementForAvailabilityZone, which sets the scheduler/query_placement_for_availability_zone parameter. It allows the scheduler to look up a host aggregate whose availability zone metadata key is set to the value provided by the incoming request, and to limit the results requested from placement to that aggregate. Default value for NovaSchedulerQueryPlacementForAvailabilityZone is false.

  • Adds the “OctaviaLogOffload” setting to enable amphora log offloading.

  • Adds support for IGMP snooping (Multicast) in the OVN driver. Defaults to False. IGMP snooping requires OVN version 2.12 or above.

  • Support for the PowerMax backend cinder driver. Supports both iSCSI and FC volume drivers, and supports deploying one or multiple cinder PowerMax storage backends.

  • Support for the Dell EMC SC backend cinder driver. Supports both iSCSI and FC volume drivers, and supports deploying one or multiple cinder SC storage backends.

  • Add the ability to deploy the glance-api service at DCN/Edge sites. Glance service at the Edge shares the same database as the Glance service in the central control plane, but allows other services such as Cinder and Nova to access a Glance endpoint that is local to the DCN/Edge site.

  • When SwiftRawDisks is set, try to mount the disks using UUIDs instead of paths. This makes mounts more stable, e.g. if a kernel is updated and the device order changes.

  • The tripleo-hosts-entries Ansible role is now used for adding individual entries to /etc/hosts for each overcloud node. This role is used instead of the output data from the Heat stack.

  • Added support for VxFlexOS cinder block storage backend driver

  • Support for the Dell EMC Xtremio backend cinder driver. Supports both iSCSI and FC volume drivers, and supports deploying one or multiple cinder Xtremio storage backends.

  • A new Heat parameter ‘ZaqarWsTimeout’ exposes the Puppet variable ‘tripleo::haproxy::zaqar_ws_timeout_tunnel’. This allows operators to configure the Mistral API timeout. It currently defaults to four hours.

Upgrade Notes

  • Cinder’s legacy “volume” service and its associated endpoints are automatically removed from the keystone catalog. The “volume” service is associated with Cinder’s v1 API, which was removed in Queens.

  • NotificationDriver is now set to noop by default, as legacy telemetry services are disabled by default. Explicitly set the NotificationDriver parameter to enable notifications from each service.

  • The “external_bridge” tag is now used for the Compute node. An external network bridge is required on the compute nodes in order to host floating IPs when using DVR. OVN deploys with DVR by default.

  • The CIDR for the StorageNFS network in the sample network_data_ganesha.yaml file has been modified to provide more usable IPs for the corresponding Neutron overcloud StorageNFS provider network. Since the CIDR of an existing network cannot be modified, deployments with existing StorageNFS networks should be sure to customize the StorageNFS network definition to use the same CIDR as that in their existing deployment in order to avoid a heat resource failure when updating or upgrading the overcloud.

  • Exclude /var/lib/ironic/* from the container-puppet.sh rsync. This is a leftover from the initial containerization of TripleO; now that we have host prep tasks, the ironic conductor and inspector bind mount /var/lib/ironic and generate the data that they need. This data should not be in the config volume, or the copies can conflict with each other when rsync runs at the same time. See launchpad bug 1868934. TripleO upgrade tasks and host prep tasks will take care of removing the var directory from the config volumes, and the containers will just use the bind mount, as intended. These tasks will run during a minor update, major upgrade, and fast forward upgrade.

Deprecation Notes

  • The deployed-server bootstrap environments, templates, and scripts that were previously deprecated are now removed. These removals include:

    • deployed-server/deployed-server-bootstrap-centos.sh

    • deployed-server/deployed-server-bootstrap-centos.yaml

    • deployed-server/deployed-server-bootstrap-rhel.sh

    • deployed-server/deployed-server-bootstrap-rhel.yaml

    • environments/deployed-server-bootstrap-environment-centos.yaml

    • environments/deployed-server-bootstrap-environment-rhel.yaml

  • As the fast forward upgrade workflow to skip multiple releases now relies on the very same upgrade_tasks, there is no need to maintain the fast_forward_upgrade_tasks, or any of its references.

  • ExternalPublicUrl, ExternalAdminUrl and ExternalInternalUrl are deprecated. ExternalSwiftPublicUrl, ExternalSwiftAdminUrl and ExternalSwiftInternalUrl should now be used.

Bug Fixes

  • The parameter ControlPlaneSubnetCidr was missing in the network/ports/net_vip_map_external.j2.yaml and network/ports/net_vip_map_external_v6.j2.yaml template files. This caused deployment failure since the VipMap resource passes this property. (See Bug: #1864912)

  • Ensure the barbican Key Manager settings are configured on DCN/Edge nodes when the barbican service is deployed in the control plane. See bug 1886070.

  • As per launchpad bug 1855704, the lvmfilter task aims to hide from the host the LVM2 volumes created by compute guests in Cinder volumes or Glance images.

  • When using the Shared File Systems service (manila), you may now use the Heat template parameter “ManilaEnabledShareProtocols” to configure the NAS protocols that users may use. If not set, the value is inferred per the storage backends that have been enabled.

  • Ansible GroupVars incorrectly kept a single subnet prefix per network. This caused a problem when multiple subnets using different subnet prefixes were defined, resulting in the wrong subnet prefix being referenced in the NetworkConfig for roles.

    AnsibleHostVars now stores the networks’ subnet prefixes instead. See bug: 1895899.

  • The keystone catalog is automatically updated to remove any entries associated with Cinder’s v1 API “volume” service. This fixes bug 1897761.

  • All roles now default to using the net-config-static-bridge.yaml nic config when using deployed-server. Since OVN is the default in TripleO, Compute roles need to have br-ex. Previously when using deployed-server, the default nic config for the non-Controller roles was net-config-static.yaml, which did not create br-ex.

  • Fixed issue in the sample network_data_ganesha.yaml file where the IPv4 allocation range for the StorageNFS network occupies almost the whole of its CIDR. If network_data_ganesha.yaml is used without modification in a customer deployment then there are too few IPs left over in its CIDR for use by the corresponding overcloud Neutron StorageNFS provider network for its overcloud DHCP service. (See bug: #1889682)

  • Fixed an issue where disabling one or more networks in network_data.yaml caused deployment failure. (See bug: #1842001)

  • Fixes an issue where the parameter CloudNameStorageManagement was used for all custom networks with service_net_map_replace defined. (See bug: 1862679.)

  • Fixed an issue where containers octavia_api and octavia_driver_agent would fail to start on node reboot.

  • Certificates were merged into the containers using the kolla_config mechanism. If a certificate changed, or e.g. UseTLSTransportForNbd was disabled and enabled again at a later point, the containers running the qemu process missed the required certificates and live migration failed. This change moves to using bind mounts for the certificates and, in the case of UseTLSTransportForNbd, also creates the required certificates even if UseTLSTransportForNbd is set to False. With this, UseTLSTransportForNbd can be enabled/disabled freely, as the required bind mounts/certificates are already present.

  • https://review.opendev.org/q/I8df21d5d171976cbb8670dc5aef744b5fae657b2 introduced THT parameters to set libvirt/cpu_mode. That patch wrongly set the NovaLibvirtCPUMode default to the string ‘none’, which prevents puppet-nova from handling the default cases correctly and sets libvirt/cpu_mode to none. This results in the ‘qemu64’ CPU model, which is highly buggy and undesirable for production usage. This change sets the default to the recommended CPU mode ‘host-model’, for various benefits documented elsewhere.

  • When using RHSM Service (deployment/rhsm/rhsm-baremetal-ansible.yaml) based registration of the overcloud nodes and enabling KSM using NovaComputeEnableKsm=True, the overcloud deployment would fail because the RHSM registration and the ksm task both run as host_prep tasks. Enabling/disabling KSM is now handled in deploy step 1.

  • In a cellv2 multicell environment, nova-metadata is the only httpd managed service on the cell controller role. With tls-everywhere, the cell controller host must have the needed metadata to be able to request the HTTP certificates. Otherwise the getcert request fails with “Insufficient ‘add’ privilege to add the entry ‘krbprincipalname=HTTP/cell1-cellcontrol-0….’”

  • HA container naming scheme has been updated to look like ‘container.common.tag/<servicename>:pcmklatest’, in order for podman to not prepend any host suffix in front of this tag, otherwise this confuses the podman resource agent in pacemaker.

  • Fixes an issue where TripleO fails to set the Barbican key ID for Swift with a permission error if the config files are not relabeled.

  • Fix Swift ring synchronization to ensure every node on the overcloud has the same copy to start with. This is especially required when replacing nodes or using manually modified rings.

Other Notes

  • Moving this chcon call to the specific podman container upgrade part avoids spending time on it needlessly. The chcon call is needed only when moving from docker to podman, that is, when upgrading to train.

  • The ValidateNtp has been removed from the all nodes validation configuration. During the time sync configuration we already do a check to ensure the ntp servers are available. If they are not we will fail with an appropriate message. The ValidateNtp option came from a time before we could fail in a more explicit way.

11.3.1

New Features

  • Added the “connection_logging” parameter for the Octavia service.

  • Added support for running the Octavia driver agent in a container. This will enable features such as the OVN load balancer provider in octavia as well as other third party providers.

  • Added the Octavia log offload parameters.

  • The ManageNetworks parameter has been added. The parameter controls management of the network and related resources (subnets and segments) with either create, update, or delete operations (depending on the stack operation). Does not apply to ports which will always be managed as needed. Defaults to true. For multi-stack use cases where the network related resources have already been managed by a separate stack, this parameter can be set to false.

  • Introduces two new parameters to configure the archive deleted instances cron job:

    1) NovaCronArchiveDeleteAllCells: ensures deleted instances are also archived from cell0 in a single cell deployment, and from the additional cell databases in a multi cell deployment.

    2) NovaCronArchiveDeleteRowsAge: --before is required to prevent the orphaning of libvirt guests if/when nova-compute is down when a db archive cron job fires.

    This change also:

    1) modifies the default for NovaCronArchiveDeleteRowsMaxRows from 100 to 1000, to match the default of the nova-manage command instead of the default of 100 from the puppet-nova parameter.

    2) changes the default for NovaCronPurgeShadowTablesAllCells from false to true, as the nova-manage db purge command also needs to run for all cells instead of only the default cell.
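
    A sketch of how these cron parameters might be set together. The two defaults shown are the ones stated above; the value and unit for NovaCronArchiveDeleteRowsAge (assumed here to be days) are illustrative assumptions:

    parameter_defaults:
      NovaCronArchiveDeleteAllCells: true
      # Assumed unit: days; illustrative value
      NovaCronArchiveDeleteRowsAge: 90
      # New default per this note
      NovaCronArchiveDeleteRowsMaxRows: 1000
      # New default per this note
      NovaCronPurgeShadowTablesAllCells: true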

  • Added a new heat parameter, OVNOpenflowProbeInterval, to set ovn_openflow_probe_interval, which is the inactivity probe interval of the OpenFlow connection to the OpenvSwitch integration bridge, in seconds. If the value is zero, the connection keepalive feature is disabled. By default this value is set to 60s. If the value is nonzero, it will be forced to a value of at least 5s.
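
    For illustration only, assuming the parameter takes a plain number of seconds:

    parameter_defaults:
      # Illustrative value; 0 disables the keepalive, nonzero values below 5 are raised to 5
      OVNOpenflowProbeInterval: 30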

  • Added a TripleO service OvsDpdkNetcontrold to enable netcontrold PMD rebalance tool for OvS-DPDK deployments.

  • HA services use a special container image name derived from the one configured in Heat parameter plus a fixed tag part, i.e. ‘<registry>/<namespace>/<servicename>:pcmklatest’. To implement rolling update without service disruption, this ‘pcmklatest’ tag is adjusted automatically during minor update every time a new image is pulled. A new Heat parameter ClusterCommonTag can now control the prefix part of the container image name. When set to true, the container name for HA services will look like ‘container-common-tag/<servicename>:pcmklatest’. This allows rolling update of HA services even when the <namespace> changes in Heat.

  • Enable the new container image naming scheme for HA services. They are now configured in pacemaker to use container image name like ‘container-common-tag/<servicename>:pcmklatest’. This allows rolling update of HA services even when the <namespace> changes in Heat.

  • Under pressure, the default monitor timeout value of 20 seconds is not enough to prevent unnecessary failovers of the ovn-dbs pacemaker resource. Spawning a few VMs at the same time could lead to unnecessary movements of the master DB, then re-connections of ovn-controllers (slaves are read-only), further peaks of load on the DBs, and in the end a snowball effect. This value can now be configured via OVNDBSPacemakerTimeout, which configures tripleo::profile::pacemaker::ovn_dbs_bundle (the default is set to 60s).

  • Enabled additional healthchecks for Swift to monitor account, container and object replicators as well as the rsync process.

Deprecation Notes

  • The roles file at deployed-server/deployed-server-roles-data.yaml is deprecated in train. Its contents are the same as roles_data.yaml, and no special roles files are needed when using deployed-server.

  • OpenDaylight service templates and environment files have been removed. They were deprecated in Stein and removed in Train.

Bug Fixes

  • Now that the default neutron driver is ovn, NeutronPluginExtensions should also contain dns, because “qos,port_security,dns” is the default value for ovn.

  • Fixed an issue where Octavia controller services were not properly configured.

  • Fixes an issue where filtering of networks for kerberos service principals was too aggressive, causing deployment failure. See bug 1854846.

  • Restart certmonger after registering the system with IPA. This prevents cert requests from not completing correctly when doing a brownfield update.

Other Notes

  • Add “radvd_user” configuration parameter to the Neutron L3 container. This parameter defines the user passed to radvd. The default value is “root”.

11.3.0

New Features

  • Add new role “ComputeSriovIB” for infiniband compute nodes that would contain the required services enabled.

  • Three new parameter options are now added to Octavia service (OctaviaConnectionMaxRetries, OctaviaBuildActiveRetries, OctaviaPortDetachTimeout)

  • Support deploying multiple Cinder Pure Storage backends. CinderPureBackendName is enhanced to support a list of backend names, and a new CinderPureMultiConfig parameter provides a way to specify parameter values for each backend.

  • Add new role parameter NovaComputeCpuDedicatedSet to specify list or range of physical CPU cores to reserve to be used for allocating PCPU resources to virtual machines. Defaults to []

  • Environment files for distributed compute node (DCN) deployments have been added at environments/dcn.yaml and environments/dcn-hci.yaml.

  • deep_compare is now enabled by default for stonith resources, allowing their properties to be updated via stack update. To disable it set ‘tripleo::fencing::deep_compare: false’.
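
    One way to apply that hiera override is through the ExtraConfig parameter, sketched here under the assumption that the key is consumed as ordinary hieradata:

    parameter_defaults:
      ExtraConfig:
        # Disables deep_compare for stonith resources
        tripleo::fencing::deep_compare: false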

  • LibvirtLogLevel is added to configure the libvirt log level. This option also works if environments/stdout-logging.yaml is used to enable stdout logging.

Upgrade Notes

  • The bare metal service (ironic) no longer allows nodes in maintenance to enter deployment or cleaning. If a node enters maintenance during deployment or cleaning, the process will be immediately aborted.

Deprecation Notes

  • The NovaVcpuPinSet parameter is deprecated and superseded by NovaComputeCpuSharedSet and NovaComputeCpuDedicatedSet parameters, which are used to define list or range of VCPU and PCPU resources for virtual machine processes.
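
    A hedged sketch of replacing a NovaVcpuPinSet setting with the two new parameters. The Compute role name, the CPU ranges, and the list-style value types are illustrative assumptions:

    parameter_defaults:
      ComputeParameters:
        # Hypothetical previous setting: NovaVcpuPinSet: ['2-17']
        # Cores reserved for PCPU (pinned) resources
        NovaComputeCpuDedicatedSet: ['2-17']
        # Cores used for VCPU (shared) resources
        NovaComputeCpuSharedSet: ['0-1']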

  • Kubernetes installation via Kubespray has been deprecated.

  • The OpenStack EC2 API project isn’t maintained upstream, therefore we deprecate it.

  • Support for the uuid token provider in keystone was dropped, as its implementation was already removed from Keystone. Options related to db purging and token flushing in keystone were also removed because these are necessary only when the uuid token provider is used.

  • The LibvirtLogOutputs option was removed and now has no effect. Use LibvirtLogLevel to change the log level in libvirt.

Bug Fixes

  • The multiple-nics network template example was rendered without the ExternalMtu parameter when the role tag external_bridge was set. This caused the deployment to fail with parameter not provided error. Bug: 1847360.

  • When using IPv6 provisioning network the tftp server used by the Baremetal service did not start. The address passed as bind host to the tftp server is now wrapped in [] to fix the issue. Bug: 1844713.

  • Deployment or cleaning of bare metal nodes no longer gets stuck if a node is in maintenance mode. The process is aborted instead and has to be restarted after moving the node out of maintenance.

  • This change (with its dependent reviews) creates a separate VIP for the OVN DBS service. A more detailed explanation can be found in https://bugs.launchpad.net/tripleo/+bug/1841811. The short explanation is that the OVN DBS HA service puts some additional constraints on the VIP it uses and that is problematic when that VIP is used by other services (e.g. a change in OVN DBS master will move the VIP and will also reset all mysql connections. It also prevents us splitting OVN DBS from where haproxy runs).

  • We revert I0d9eb663405d1113ea84e3c12651a3f0dbdfc75d and we instead export ovn_dbs_vip on all nodes so it can be used in cells. Reason for this is that we want a separate VIP for OVN because a) composable roles and b) we do not want to impose the extra promote master constraints on the internal_api VIP which ends up being used by OVN.

  • If nova-api is delayed starting then the nova_wait_for_compute_service can timeout. A deployment using a slow/busy remote container repository is particularly susceptible to this issue. To resolve this nova_compute and nova_wait_for_compute_service have been postponed to step_5 and a task has been added to step_4 to ensure nova_api is active before proceeding. Resolves Bug 1842948.
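    The bracket wrapping described in the IPv6 tftp fix above can be illustrated with a small sketch. This is not the code used by TripleO; the function name format_bind_host is hypothetical, and the sketch only assumes that IPv6 literals need [] around them while IPv4 addresses and hostnames are passed through unchanged.

```python
import ipaddress


def format_bind_host(address: str) -> str:
    """Wrap an IPv6 literal in [] and return anything else unchanged.

    Hypothetical helper for illustration only.
    """
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not a literal IP address (for example a hostname): leave it alone.
        return address
    # Only IPv6 literals need brackets to separate the address from a port.
    return f"[{address}]" if ip.version == 6 else address
```

    With this helper, an IPv6 bind host such as fd00::1 becomes [fd00::1], while an IPv4 address is returned as given.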

Other Notes

  • The “port_forwarding” service plugin and L3 agent extension are now enabled by default when the Neutron ML2 plugin with the OVS driver is used. A new config option, “NeutronL3AgentExtensions”, is also added. It sets the list of L3 agent extensions which should be used by the agent.

  • Removed Tacker service definitions. The Tacker containers have not been available since Queens. bug 1838704 <https://bugs.launchpad.net/tripleo/+bug/1714270>

11.2.0

New Features

  • Add CinderRbdFlattenVolumeFromSnapshot parameter to control whether cinder RBD volumes created from a snapshot should be flattened in order to remove a dependency on the snapshot. The default value is False, which is the same as the cinder RBD driver’s default value.

  • Created an ExtraKernelPackages parameter to allow users to install additional kernel related packages prior to loading the kernel modules defined in ExtraKernelModules.

  • Add new role parameters NovaCPUAllocationRatio, NovaRAMAllocationRatio and NovaDiskAllocationRatio, which configure cpu_allocation_ratio, ram_allocation_ratio and disk_allocation_ratio respectively. The default value for NovaCPUAllocationRatio is 0.0, for NovaRAMAllocationRatio 1.0, and for NovaDiskAllocationRatio 0.0.

    The default values for the CPU and disk allocation ratios are set to 0.0, as described in [1].

    [1] https://specs.openstack.org/openstack/nova-specs/specs/stein/implemented/initial-allocation-ratios.html

  • Named debug ansible tasks have been added to the plays that get generated in deploy_steps_playbook.yaml (from common/deploy-steps.j2). The explicitly named tasks allow for using ansible-playbook’s --start-at-task option to resume a deployment from the start of a given play.

  • Added NeutronPermittedEthertypes to allow configuring additional ethertypes on neutron security groups for L2 agents that support it.

  • The NetworkConfig resource now passes in ansible vars as the values for the IP parameters to the nic config templates. This enables the nic config template to be rendered generic per role coming out of Heat by config-download. The templates can then be reused by any node of that same role type.

  • A new parameter, NovaCronPurgeShadowTablesMaxDelay, is introduced to configure the max delay parameter, which controls the randomized sleep before each controller node executes the cron job to purge items in nova shadow tables.

  • Adds LibvirtLogFilters parameter to define filters that select a different logging level for a given category of log outputs, as specified in https://libvirt.org/logging.html . Default: ‘1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util’

  • Adds LibvirtTLSPriority parameter to override the compile time default TLS priority string. Default: ‘NORMAL:-VERS-SSL3.0:-VERS-TLS-ALL:+VERS-TLS1.2’

  • Adds NovaLocalMetadataPerCell cell support, default false. Indicates that the nova-metadata API service has been deployed per-cell, so that we can have better performance and data isolation in a multi-cell deployment. Users should consider the use of this configuration depending on how neutron is setup. If networks span cells, you might need to run nova-metadata API service globally. If your networks are segmented along cell boundaries, then you can run nova-metadata API service per cell.

  • A new parameter, OVNRemoteProbeInterval, sets the inactivity probe interval of the JSON session from ovn-controller to the OVN SB database. The previous default of 5s may not be sufficient on loaded systems or during spikes in control-plane activity, leading to unnecessary reconnections to the OVSDB server. The default is now extended to 1 min and is configurable via OVNRemoteProbeInterval.

  • Introduce a PacemakerTLSPriorities parameter, which sets the PCMK_tls_priorities config option in /etc/sysconfig/pacemaker and the PCMK_tls_priorities variable inside the bundle. When set, it allows an operator to specify which GnuTLS ciphers are desired for the pacemaker control port.
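    The LibvirtLogFilters default listed above is a space-separated list of level:category pairs. The following sketch shows one way to read such a string. It is illustrative only: parse_log_filters is a hypothetical helper, not part of TripleO or libvirt, and it assumes the libvirt levels 1 (debug) through 4 (error).

```python
DEFAULT_FILTERS = (
    "1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util"
)


def parse_log_filters(filters: str) -> dict:
    """Map each category to its numeric level (hypothetical helper)."""
    parsed = {}
    for item in filters.split():
        level, _, category = item.partition(":")
        # Reject entries that are not "<level>:<category>" with level 1..4.
        if not category or not level.isdigit() or not 1 <= int(level) <= 4:
            raise ValueError(f"invalid log filter: {item!r}")
        parsed[category] = int(level)
    return parsed
```

    Parsing the default shows that qemu is logged at level 1 while event, json, file and object are limited to level 3.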

Upgrade Notes

  • During upgrade/update, NeutronSriovNumVFs shall be avoided and the sriov_pf object in nic-configs shall be used instead. The numvfs attribute of the sriov_pf type leads to the equivalent configuration.

  • Removed DeploymentSwiftDataMap parameter that has become unusable with config-download workflow.

Deprecation Notes

  • The NeutronSriovNumVFs parameter is deprecated. Any new or existing deployments using this THT parameter shall perform the equivalent configuration using the sriov_pf network object in nic configs.

  • Support for Cisco N1KV has been removed from TripleO Train, since the N1KV isn’t supported by Cisco anymore.

Bug Fixes

  • Enable the VFIO module on boot for SR-IOV deployments. Before this change, on SR-IOV capable deployments, vfio_iommu_type1 was not loaded when a compute node rebooted, which caused guest instances with a VF/PF to fail to start/spawn.

Other Notes

  • Support for OpenShift deployed by TripleO has been removed in a downstream version of Stein, which makes the upstream support difficult to maintain. Users who want to deploy OpenShift 3.11 onto bare metal nodes can still do so using openshift-ansible directly. The provisioning of the operating system on bare metal can be done with OpenStack Ironic on the Overcloud or with deployed-servers, achieving the same result.

  • The DeployedServerEnvironment output has been removed from the stack as it is no longer needed when using config-download with pre-provisioned nodes.

11.1.0

New Features

  • ContainerImageRegistryLogin has been added to indicate if login calls should be issued by the container engine on deployment. The default is set to false.

  • Values specified in ContainerImageRegistryCredentials will now be used to issue a login call when deploying the container engine on the hosts if ContainerImageRegistryLogin is set to true.

  • The parameter {{role.name}}RemovalPoliciesMode can be set to ‘update’ to reset the existing blacklisted nodes in heat. This will help re-use the node indexes when required.

  • As ceph-dashboard is available on Ceph, the new ceph dashboard composable service enables a user scenario in which the dashboard is deployed along with the other ceph components using TripleO. This feature is disabled by default and can be enabled by operators adding to the deployment the ceph-dashboard.yaml environment file included in tripleo-heat-templates.

  • Add support for the Multipathd service on nodes that access Block Storage (cinder) volumes. Multipathd is an optional service that can be enabled by including environments/multipathd.yaml in the deployment.

  • Introduce new tag into roles that will create external_bridge (usable only for multiple-nics).

  • When running config-download manually, fact gathering at the play level can now be controlled with the gather_facts Ansible boolean variable.

  • Add parameter NovaLiveMigrationWaitForVIFPlug, which sets live_migration_wait_for_vif_plug, which in turn controls whether to wait for network-vif-plugged events before starting guest transfer. The default value for the parameter is true.

  • Add ContainerNovaLibvirtUlimit to configure Ulimit for containerized Libvirt. Defaults to nofile=131072,nproc=126960.

  • Enables new Neutron “kill script” feature in order to avoid dangling containers when it kills an agent.

  • Add parameter NovaLibvirtMemStatsPeriodSeconds, which sets the libvirt/mem_stats_period_seconds parameter to the period, in seconds, of memory usage statistics. A zero or negative value disables memory usage statistics. The default value for NovaLibvirtMemStatsPeriodSeconds is 10.

  • Add boolean parameter NovaSchedulerLimitTenantsToPlacementAggregate, which sets the scheduler/limit_tenants_to_placement_aggregate parameter to provide tenant isolation with placement. It ensures that hosts in tenant-isolated host aggregates and availability zones will only be available to a specific set of tenants. The default value for NovaSchedulerLimitTenantsToPlacementAggregate is false.

  • Parameter scheduler/query_placement_for_image_type_support is enabled by default for all deployments. Setting it causes the scheduler to ask Placement only for compute hosts that support the disk_format of the image used in the request. This is beneficial when, for example, the libvirt driver uses Ceph as an ephemeral backend and does not support qcow2 images (without an expensive conversion step).
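    The ContainerNovaLibvirtUlimit default above is written as name=value pairs. As a rough illustration of that format, the sketch below splits the comma-separated form shown in this note into a dictionary. parse_ulimits is a hypothetical helper, not TripleO code, and it assumes single numeric values only (no soft:hard pairs).

```python
def parse_ulimits(value: str) -> dict:
    """Turn 'nofile=131072,nproc=126960' into {'nofile': 131072, ...}.

    Hypothetical helper; handles only the single-value name=number form.
    """
    limits = {}
    for entry in value.split(","):
        name, sep, number = entry.strip().partition("=")
        if not sep or not name or not number.isdigit():
            raise ValueError(f"invalid ulimit entry: {entry!r}")
        limits[name] = int(number)
    return limits
```

    Applied to the default, this yields a limit of 131072 open files and 126960 processes.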

Upgrade Notes

  • During upgrade, users need to create a custom roles_data.yaml and remove external_bridge from the tags to make sure that the bridge is not added.

  • Removes the environment for the deprecated non-config-download workflow. Specifying --no-config-download/--stack-only in the CLI now creates/updates the heat stack but does not deploy configurations on the nodes.

  • The new role variable update_serial is introduced, allowing parallel update execution. On the Controller role this variable defaults to 1, as pacemaker has to be taken down and up in a rolling fashion. Otherwise, the default value is 25, as that is the default value for parallel ansible execution used by tripleo.
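    To make the effect of update_serial more concrete, the sketch below splits a list of nodes into batches of a given size, in the way a serial-style setting processes hosts in groups. It is an illustration under that assumption only; update_batches and the node names are hypothetical and are not taken from TripleO.

```python
def update_batches(nodes: list, update_serial: int) -> list:
    """Split nodes into consecutive batches of at most update_serial items.

    Hypothetical helper that only illustrates batch sizes.
    """
    if update_serial < 1:
        raise ValueError("update_serial must be at least 1")
    return [
        nodes[start:start + update_serial]
        for start in range(0, len(nodes), update_serial)
    ]
```

    With a value of 1, three controllers are updated one at a time; with a value of 25, thirty compute nodes are updated in one batch of 25 followed by one batch of 5.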

Deprecation Notes

  • The template aide-baremetal-puppet has been deprecated. This template has been replaced by aide-baremetal-ansible which provides for the same functionality and interfaces.

  • Support for the Midonet plugin has been removed from TripleO Train. The reason is the lack of maintainers and testing around this plugin.

  • The environments at environments/deployed-server-bootstrap-environment-centos.yaml and environments/deployed-server-bootstrap-environment-rhel.yaml are deprecated as the functionality they enabled in the bootstrap scripts has been moved to the tripleo-bootstrap ansible role provided by tripleo-common.

  • Deprecated environment files are removed. Removed environments/neutron-sriov.yaml, use environments/services/neutron-sriov.yaml file. Removed environments/neutron-ovs-dpdk.yaml, use environments/services/neutron-ovs-dpdk.yaml file. Removed environments/ovs-dpdk-permissions.yaml, as the required parameter is added to the OvS-DPDK roles.

  • The rhel-registration scripts support has been removed. It was replaced in Rocky by the Ansible RHSM role. Upgrades have been tested and the new configuration is well documented.

  • Support for the Cisco UCSM plugin has been removed from TripleO Train. The reason is the lack of maintainers and testing around this plugin.

Bug Fixes

  • When changing the name_lower of the InternalApi network and using the service_net_map_replace option in network data, the subnet referenced in VipSubnetMapDefaults did not take into account the custom lowercase name for the network, causing a deployment error. See bug: 1832461.

  • The passphrase for config option ‘server_certs_key_passphrase’, is used as a Fernet key in Octavia and thus must be 32 bytes long. In the case of an operator-provided passphrase, TripleO will validate that.

  • Certain nova containers require more locked memory than the default limit of 16KiB. Increase the default memlock to 64MiB via DockerNovaComputeUlimit.

    As this is only a maximum limit and not a pre-allocation, this will not increase the memory requirements for all nova containers. To date the only container to require this is nova_cell_v2_discover_hosts, which is short lived.

  • Recent changes for e.g. edge scenarios intentionally moved host discovery from the controller to the bootstrap compute node. The task is triggered by deploy-identifier to make sure it gets run on any deploy, scale, … run. If a deploy run is triggered with the --skip-deploy-identifier flag, discovery is not triggered at all, causing failures in previously supported scenarios. This change moves the host discovery task to be an ansible deploy_steps_tasks so that it gets triggered even if --skip-deploy-identifier is used, or the compute bootstrap node is blacklisted.

  • Deployment with an NFS share enabled for nova ephemeral storage fails. Podman fails to relabel with NFS mounted in /var/lib/nova/instances, and containers fail to start with “operation not supported”. This change only sets the z flag for /var/lib/nova when NFS is not enabled for the compute.
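    The length requirement on the Octavia ‘server_certs_key_passphrase’ mentioned above can be checked with a few lines of code. The sketch below is not the validation TripleO performs; validate_passphrase is a hypothetical helper that only assumes the passphrase must be exactly 32 bytes when encoded as UTF-8.

```python
def validate_passphrase(passphrase: str) -> None:
    """Raise ValueError unless the passphrase is exactly 32 bytes long.

    Hypothetical check for illustration; length is measured in UTF-8 bytes.
    """
    length = len(passphrase.encode("utf-8"))
    if length != 32:
        raise ValueError(
            f"passphrase must be 32 bytes long, got {length} bytes"
        )
```

    Measuring bytes rather than characters matters when an operator-provided passphrase contains non-ASCII characters, which occupy more than one byte each.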

Other Notes

  • Services that were in extraconfig/services are now in deployment directory among other services.

  • The use of parameter EC2MetadataIp and the configuration of routes to metadata has been removed. Nothing is consuming metadata over the network anymore since config-drive is used as the data source.

  • The environment files to enable/disable config-download at environments/disable-config-download-environment.yaml and environments/config-download-environment.yaml are removed as disabling config-download was deprecated in Stein, and it’s enabled by default.

11.0.0

New Features

  • Allows a deployer to specify the IdM domain with --domain on the ipa-client-install invocation by providing the IdMDomain parameter.

  • Allows a deployer to direct the ipa-client-install to skip NTP setup by specifying the IdMNoNtpSetup parameter. This is useful if the ipa-client-install setup clobbers the NTP setup by puppet.

  • Add GlanceImageCacheDir parameter to set base directory location that the Image Cache uses. Add GlanceImageCacheMaxSize parameter to set the upper limit on cache size, in bytes, after which the cache-pruner cleans up the image cache. Add GlanceImageCacheStallTime parameter to set the amount of time to let an image remain in the cache without being accessed.

  • Bluestore replaces Filestore as the default Ceph backend.

  • New parameters, NovaCronDBArchivedMaxDelay and CinderCronDbPurgeMaxDelay, are introduced to configure max_delay parameter to calculate randomized sleep time before db archive/purge. This avoids db collisions when performing db archive/purge operations on multiple controller nodes.

  • The passphrase for the config option ‘server_certs_key_passphrase’, which was recently added to Octavia, will now be auto-generated by TripleO: OctaviaServerCertsKeyPassphrase is added to the list of parameters TripleO configures in Octavia.

  • To allow PAM to create a home directory for users who do not have one, ipa-client-install needs an option. This change allows enabling it.

  • IronicConductorGroup allows defining an Ironic Conductor Group so that the managed baremetal nodes may later be manually distributed by operators across multiple conductors. By default, IronicConductorGroup takes an empty value, which creates no conductor groups associated with the given Ironic Conductor service instance.

    Note

    There is a default Ironic conductor group named “’’”, but it cannot be re-defined with IronicConductorGroup because the empty value has been reserved for other purposes in t-h-t.

  • IronicRpcTransport controls the remote procedure call transport between Ironic Conductor and API processes. In some cases, like Edge DCN, this parameter may be set to ‘json-rpc’ when the messaging broker in use should not be stretched over WAN. In such cases, this option also plays nicely alongside the Ironic Conductor Groups feature. Defaults to an empty value, which leaves the corresponding service’s default value intact.

  • Neutron can be configured to use availability zones (AZs).

    Note

    OVN does not normally run Neutron agents and does not yet support AZ-aware routing scheduling. Therefore, no effective AZ configurations can be applied for the network services for the NeutronMechanismDrivers: ovn case.

    NeutronDefaultAvailabilityZones, NeutronDhcpAgentAvailabilityZone, NeutronL3AgentAvailabilityZone, NeutronDhcpAgentsPerNetwork, NeutronNetworkSchedulerDriver, NeutronRouterSchedulerDriver and NeutronDhcpLoadType can be used to configure various AZ configurations.

    By default, Neutron*AvailabilityZone(s) takes an empty value, which defines no AZs for the associated Neutron network service.

    Note

    The empty AZ name cannot be re-defined via Neutron*AvailabilityZone(s) because the empty value has been reserved for other purposes in t-h-t.

    For details, see Official Documentation.

  • Configure Neutron API for Nova Placement. When the Neutron Routed Provider Networks feature is used in the overcloud, the Networking service will use those credentials to communicate with the Compute scheduler’s placement API.

  • The parameters NovaNfsEnabled, NovaNfsShare, NovaNfsOptions, NovaNfsVersion are changed to be role specific. This requires the usage of host aggregates as otherwise it will break live migration of instances as we can not do this with different storage backends.

  • Add role parameter NovaLibvirtNumPciePorts, which sets libvirt/num_pcie_ports to specify the number of PCIe ports an instance will get. Libvirt allows a custom number of PCIe ports (pcie-root-port controllers) a target instance will get. Some will be used by default and the rest will be available for hotplug use. The ‘q35’ machine type, by default, allows only a single PCIe device to be hotplugged, and Nova currently sets ‘num_pcie_ports’ to “0” (which means it defaults to libvirt’s “1”), which is not sufficient for hotplug use. The default for NovaLibvirtNumPciePorts is 16.

  • Added OVN-DPDK support.

  • Introduced two new numeric parameters OvsRevalidatorCores and OvsHandlerCores to set values of n-revalidator-threads and n-handler-threads on openvswitch.

  • Composable service templates can now define scale_tasks. They are meant for scale down/up logic of services which need to be stopped/started during the scaling procedure. Everything happens within a single playbook, and the down/up Ansible tags are required to differentiate them during the run.
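    The max_delay behaviour behind NovaCronDBArchivedMaxDelay and CinderCronDbPurgeMaxDelay above can be pictured as each node sleeping for a random time before starting its job. The sketch below is illustrative only: pick_delay is a hypothetical helper and does not reproduce how the cron jobs actually compute their sleep time.

```python
import random
from typing import Optional


def pick_delay(max_delay: int, rng: Optional[random.Random] = None) -> int:
    """Return a random delay in seconds between 0 and max_delay inclusive.

    Hypothetical helper; a node would sleep for this long before running
    its archive/purge job so that nodes do not all start at once.
    """
    if max_delay < 0:
        raise ValueError("max_delay must not be negative")
    rng = rng or random.Random()
    return rng.randint(0, max_delay)
```

    Because each controller draws its own delay, jobs scheduled for the same minute are spread over the window, which is what reduces the chance of db collisions.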

Upgrade Notes

  • Removed the OS::TripleO::Services::Ntp service and related ntp files as chrony is the new default.

Deprecation Notes

  • OpenDaylight service is deprecated in Stein and will be disabled in future releases.

  • OS::TripleO::Services::SELinux has been deprecated. Management of selinux configuration is now handled via ansible during the deployment.

  • The following files are removed (environments/neutron-ml2-ovn-dvr-ha.yaml and environments/neutron-ml2-ovn-ha.yaml). The reason for this is that the maintained versions are kept under environment/services and to avoid confusion we remove the unmaintained ones.

  • The only OVN Tunnel Encap Type that we support in OVN is Geneve, and this is set by default in ovn puppet. So there is no need to set it in TripleO.

  • The Neutron LBaaS project was retired and support for it in TripleO removed.

  • The template tuned-baremetal-puppet has been deprecated. This template has been replaced by tuned-baremetal-ansible which provides for the same functionality and interfaces.

Bug Fixes

  • The OpenDaylight inactivity probe for setting the OVSDB timeout now defaults to 180s. This helps fix scale issues for large numbers of compute nodes in OpenDaylight deployments.

  • Fixes an issue where deployment would fail if a non-default name_lower is used in network data for one of the networks: External, InternalApi or StorageMgmt. (See bug: 1830852.)

  • Fixed service auth URL in Octavia to use the Keystone v3 internal endpoint.

  • As of Rocky [1], the nova-consoleauth service has been deprecated and cell databases are used for storing token authorizations. All new consoles will be supported by the database backend and existing consoles will be reset. Console proxies must be run per cell because the new console token authorizations are stored in cell databases.

    nova-consoleauth was deprecated in tripleo with: I68485a6c4da4476d07ec0ab5e7b5a4c528820a4f

    This change now removes the NovaConsoleauth Service.

    [1] https://docs.openstack.org/releasenotes/nova/rocky.html

  • With 405366fa32583e88c34417e5f46fa574ed8f4e98 the parameters RpcPort, RpcUserName, RpcPassword and RpcUseSSL were deprecated and nova::rabbitmq_port was removed. As a result, the healthcheck was called with a null parameter and failed. We now get the global_config_settings from RabbitMQService and use oslo_messaging_rpc_port for the healthcheck.

  • Change-Id: I1a159a7c2ac286373df2b7c566426b37b7734961 moved the discovery to run on a single compute host to avoid races between simultaneous nova-manage commands. This change makes sure we run the discovery on every deploy run, which is required for scale-up events.

Other Notes

  • The EndpointMap parameter is now required by post_deploy templates. If a user overrides OS::TripleO::NodeExtraConfigPost with another template, that template needs to have the EndpointMap parameter to work.