Train Series Release Notes

11.3.0-74

New Features

  • Added the “connection_logging” parameter for the Octavia service.

  • Added support for running the Octavia driver agent in a container. This will enable features such as the OVN load balancer provider in Octavia as well as other third-party providers.

  • Added the Octavia log offload parameters.

  • The ManageNetworks parameter has been added. The parameter controls management of the network and related resources (subnets and segments) with either create, update, or delete operations (depending on the stack operation). It does not apply to ports, which are always managed as needed. Defaults to true. For multi-stack use cases where the network related resources have already been managed by a separate stack, this parameter can be set to false.

  • Added new heat param OVNOpenflowProbeInterval to set ovn_openflow_probe_interval, which is the inactivity probe interval, in seconds, of the OpenFlow connection to the Open vSwitch integration bridge. A value of zero disables the connection keepalive feature. A nonzero value is forced to at least 5s. The default value is 60s.

  • Under load, the default monitor timeout value of 20 seconds is not enough to prevent unnecessary failovers of the ovn-dbs pacemaker resource. Spawning a few VMs at the same time could lead to unnecessary moves of the master DB, followed by reconnections of ovn-controllers (slaves are read-only) and further load peaks on the DBs, which could end in a snowball effect. This value is now configurable through OVNDBSPacemakerTimeout, which configures tripleo::profile::pacemaker::ovn_dbs_bundle (default is set to 60s).
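
The OVNOpenflowProbeInterval rules described above (zero disables keepalive, any other value is raised to at least 5s) can be illustrated with a small sketch. The function name is hypothetical and this is not how OVN or TripleO implement the rule; it only restates the behaviour from the note, and rejecting negative values is an assumption.

```python
DEFAULT_PROBE_INTERVAL = 60  # default from the release note, in seconds
MINIMUM_PROBE_INTERVAL = 5   # nonzero values are forced to at least this


def effective_probe_interval(seconds: int = DEFAULT_PROBE_INTERVAL) -> int:
    """Return the interval that would take effect for a requested value.

    Hypothetical helper: it only mirrors the rules stated in the note.
    """
    if seconds < 0:
        # Assumption: negative values are treated as invalid input.
        raise ValueError("probe interval must not be negative")
    if seconds == 0:
        return 0  # zero disables the connection keepalive feature
    return max(seconds, MINIMUM_PROBE_INTERVAL)
```

Under these assumptions, a requested value of 3 would take effect as 5, while 0 and 60 are kept as they are.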

Deprecation Notes

  • The roles file at deployed-server/deployed-server-roles-data.yaml is deprecated in Train. Its contents are the same as roles_data.yaml, and no special roles files are needed when using deployed-server.

  • OpenDaylight service templates and environment files have been removed. They were deprecated in Stein and removed in Train.

Bug Fixes

  • Fixed an issue where Octavia controller services were not properly configured.

  • Restart certmonger after registering the system with IPA. This prevents certificate requests from not completing correctly when doing a brownfield update.

Other Notes

  • Add “radvd_user” configuration parameter to the Neutron L3 container. This parameter defines the user passed to radvd. The default value is “root”.

11.3.0

New Features

  • Add new role “ComputeSriovIB” for InfiniBand compute nodes, with the required services enabled.

  • Three new parameter options are now added to the Octavia service (OctaviaConnectionMaxRetries, OctaviaBuildActiveRetries, OctaviaPortDetachTimeout).

  • Support deploying multiple Cinder Pure Storage backends. CinderPureBackendName is enhanced to support a list of backend names, and a new CinderPureMultiConfig parameter provides a way to specify parameter values for each backend.

  • Add new role parameter NovaComputeCpuDedicatedSet to specify the list or range of physical CPU cores reserved for allocating PCPU resources to virtual machines. Defaults to [].

  • Environment files for distributed compute node (DCN) deployments have been added at environments/dcn.yaml and environments/dcn-hci.yaml.

  • deep_compare is now enabled by default for stonith resources, allowing their properties to be updated via stack update. To disable it set ‘tripleo::fencing::deep_compare: false’.

  • LibvirtLogLevel is added to configure the libvirt log level. This option also works if environments/stdout-logging.yaml is used to enable stdout logging.
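
As an illustration of the "list or range" form mentioned for NovaComputeCpuDedicatedSet above, the following sketch expands a CPU set string into individual core IDs. It assumes the syntax mirrors Nova's CPU set notation (comma-separated IDs, x-y ranges, ^z exclusions); the function name is hypothetical and the exact input accepted by the TripleO parameter is not stated in these notes.

```python
def parse_cpu_set(spec: str) -> set:
    """Expand a string such as "4-12,^8,15" into a set of CPU IDs.

    Hypothetical helper; assumes Nova-style CPU set syntax.
    """
    included, excluded = set(), set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty input and stray commas
        target = included
        if part.startswith("^"):
            # A leading caret marks a core (or range) to exclude.
            target = excluded
            part = part[1:]
        if "-" in part:
            start, end = (int(value) for value in part.split("-", 1))
            if start > end:
                raise ValueError("invalid range: " + part)
            target.update(range(start, end + 1))
        else:
            target.add(int(part))
    return included - excluded
```

An empty string yields an empty set, which corresponds to the parameter's default of [].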

Upgrade Notes

  • The bare metal service (ironic) no longer allows nodes in maintenance to enter deployment or cleaning. If a node enters maintenance during deployment or cleaning, the process will be immediately aborted.

Deprecation Notes

  • The NovaVcpuPinSet parameter is deprecated and superseded by the NovaComputeCpuSharedSet and NovaComputeCpuDedicatedSet parameters, which are used to define the list or range of VCPU and PCPU resources for virtual machine processes.

  • Kubernetes installation via Kubespray has been deprecated.

  • The OpenStack EC2 API project isn’t maintained upstream, therefore we deprecate it.

  • Support for the uuid token provider in keystone was dropped, as its implementation was already removed from Keystone. Options related to db purging and token flushing in keystone were also removed because these are necessary only when the uuid token provider is used.

  • LibvirtLogOutputs option was removed and now has no effect. Use LibvirtLogLevel to change the log level in libvirt.

Bug Fixes

  • The multiple-nics network template example was rendered without the ExternalMtu parameter when the role tag external_bridge was set. This caused the deployment to fail with a parameter-not-provided error. Bug: 1847360.

  • When using an IPv6 provisioning network, the tftp server used by the Baremetal service did not start. The address passed as bind host to the tftp server is now wrapped in [] to fix the issue. Bug: 1844713.

  • Deployment or cleaning of bare metal nodes no longer gets stuck if a node is in maintenance mode. The process is aborted instead and has to be restarted after moving the node out of maintenance.

  • This change (with its dependent reviews) creates a separate VIP for the OVN DBS service. A more detailed explanation can be found in https://bugs.launchpad.net/tripleo/+bug/1841811. The short explanation is that the OVN DBS HA service puts some additional constraints on the VIP it uses and that is problematic when that VIP is used by other services (e.g. a change in OVN DBS master will move the VIP and will also reset all mysql connections. It also prevents us splitting OVN DBS from where haproxy runs).

  • We revert I0d9eb663405d1113ea84e3c12651a3f0dbdfc75d and we instead export ovn_dbs_vip on all nodes so it can be used in cells. Reason for this is that we want a separate VIP for OVN because a) composable roles and b) we do not want to impose the extra promote master constraints on the internal_api VIP which ends up being used by OVN.

  • If nova-api is delayed starting, then nova_wait_for_compute_service can time out. A deployment using a slow/busy remote container repository is particularly susceptible to this issue. To resolve this, nova_compute and nova_wait_for_compute_service have been postponed to step_5 and a task has been added to step_4 to ensure nova_api is active before proceeding. Resolves Bug 1842948.
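
The IPv6 bind host fix for the tftp server noted above (Bug 1844713) comes down to wrapping IPv6 literals in square brackets. A minimal sketch of that idea follows; the function name is hypothetical and the actual change lives in the templates, not in Python.

```python
import ipaddress


def format_bind_host(host: str) -> str:
    """Wrap IPv6 literals in [] and leave everything else untouched.

    Hypothetical helper illustrating the bracket-wrapping rule.
    """
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not a bare IP literal (hostname or already bracketed): keep as-is.
        return host
    if address.version == 6:
        return "[{}]".format(host)
    return host
```

IPv4 addresses and hostnames pass through unchanged, so the same helper could be applied regardless of the provisioning network's address family.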

Other Notes

  • Add “port_forwarding” service plugin and L3 agent extension to be enabled by default when the Neutron ML2 plugin with the OVS driver is used. New config option “NeutronL3AgentExtensions” is also added. This new option sets the list of L3 agent extensions that the agent should use.

  • Removed Tacker service definitions. The Tacker containers have not been available since Queens. bug 1838704 <https://bugs.launchpad.net/tripleo/+bug/1714270>

11.2.0

New Features

  • Add CinderRbdFlattenVolumeFromSnapshot parameter to control whether cinder RBD volumes created from a snapshot should be flattened in order to remove a dependency on the snapshot. The default value is False, which is the same as the cinder RBD driver’s default value.

  • Created an ExtraKernelPackages parameter to allow users to install additional kernel related packages prior to loading the kernel modules defined in ExtraKernelModules.

  • Add new role parameters NovaCPUAllocationRatio, NovaRAMAllocationRatio and NovaDiskAllocationRatio, which allow configuring cpu_allocation_ratio, ram_allocation_ratio and disk_allocation_ratio. Default value for NovaCPUAllocationRatio is 0.0. Default value for NovaRAMAllocationRatio is 1.0. Default value for NovaDiskAllocationRatio is 0.0.

    The default values for the CPU and Disk allocation ratios are set to 0.0, as described in [1]. [1] https://specs.openstack.org/openstack/nova-specs/specs/stein/implemented/initial-allocation-ratios.html

  • Named debug ansible tasks have been added to the plays that get generated in deploy_steps_playbook.yaml (from common/deploy-steps.j2). The explicitly named tasks allow for using ansible-playbook’s --start-at-task option to resume a deployment from the start of a given play.

  • Added NeutronPermittedEthertypes to allow configuring additional ethertypes on neutron security groups for L2 agents that support it.

  • The NetworkConfig resource now passes in ansible vars as the values for the IP parameters to the nic config templates. This enables the nic config template to be rendered generic per role coming out of Heat by config-download. The templates can then be reused by any node of that same role type.

  • New parameter, NovaCronPurgeShadowTablesMaxDelay, is introduced to configure the max delay parameter, which controls the randomized sleep before each controller node executes the cron job to purge items in nova shadow tables.

  • Adds LibvirtLogFilters parameter to define a filter to select a different logging level for a given category log outputs, as specified in https://libvirt.org/logging.html . Default: ‘1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util’

  • Adds LibvirtTLSPriority parameter to override the compile time default TLS priority string. Default: ‘NORMAL:-VERS-SSL3.0:-VERS-TLS-ALL:+VERS-TLS1.2’

  • Adds NovaLocalMetadataPerCell cell support, default false. Indicates that the nova-metadata API service has been deployed per-cell, so that we can have better performance and data isolation in a multi-cell deployment. Users should consider the use of this configuration depending on how neutron is setup. If networks span cells, you might need to run nova-metadata API service globally. If your networks are segmented along cell boundaries, then you can run nova-metadata API service per cell.

  • Added the OVNRemoteProbeInterval parameter, which sets the inactivity probe interval of the JSON session from ovn-controller to the OVN SB database. The default of 5s may not be sufficient in loaded systems or during high control-plane activity spikes, leading to unnecessary reconnections to the OVSDB server. It is now extended by default to 1 min and is configurable through OVNRemoteProbeInterval.

  • Introduce a PacemakerTLSPriorities parameter (which will set the PCMK_tls_priorities config option in /etc/sysconfig/pacemaker and the PCMK_tls_priorities variable inside the bundle). When set, this allows an operator to specify what kind of GNUTLS ciphers are desired for the pacemaker control port.
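
The LibvirtLogFilters default listed above is a space-separated list of level:category pairs, where libvirt uses 1 for debug, 2 for info, 3 for warning and 4 for error. The sketch below splits such a string into a mapping to make the default easier to read. The function name is hypothetical, and this is only an illustration of the string's shape, not libvirt's own parser.

```python
DEFAULT_FILTERS = (
    "1:libvirt 1:qemu 1:conf 1:security "
    "3:event 3:json 3:file 3:object 1:util"
)

# libvirt log priorities, as documented at https://libvirt.org/logging.html
LEVEL_NAMES = {1: "debug", 2: "info", 3: "warning", 4: "error"}


def parse_log_filters(filters: str) -> dict:
    """Map each filter category to the name of its minimum log level."""
    result = {}
    for entry in filters.split():
        level_text, _, category = entry.partition(":")
        level = int(level_text)
        if level not in LEVEL_NAMES or not category:
            raise ValueError("invalid filter entry: " + entry)
        result[category] = LEVEL_NAMES[level]
    return result
```

Read this way, the default keeps debug output for libvirt, qemu, conf, security and util, and limits event, json, file and object to warnings and above.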

Upgrade Notes

  • During upgrade/update, the NeutronSriovNumVFs parameter should be avoided and the sriov_pf object in nic-configs should be used instead. The numvfs attribute of the sriov_pf type will lead to the equivalent configuration.

  • Removed DeploymentSwiftDataMap parameter that has become unusable with config-download workflow.

Deprecation Notes

  • The NeutronSriovNumVFs parameter is deprecated. Any new or existing deployments using this THT parameter should perform the equivalent configuration using the sriov_pf network object in nic configs.

  • Support for Cisco N1KV has been removed from TripleO Train, since the N1KV isn’t supported by Cisco anymore.

Bug Fixes

  • Enable VFIO module on boot for SR-IOV deployments. Before this change, on SR-IOV capable deployments, vfio_iommu_type1 was not loaded when rebooting a compute node, which caused guest instances with VF/PF to fail to start/spawn.

Other Notes

  • Support for OpenShift deployed by TripleO has been removed in a downstream version of Stein, which makes the upstream support difficult to maintain. OpenShift can be deployed using OpenShift-Ansible, and users who desire to deploy OpenShift 3.11 onto bare metal nodes can still do so using openshift-ansible directly. The provisioning of the Operating System on baremetal can be done with OpenStack Ironic on the Overcloud or with deployed-servers, achieving the same result.

  • The DeployedServerEnvironment output has been removed from the stack as it is no longer needed when using config-download with pre-provisioned nodes.

11.1.0

New Features

  • ContainerImageRegistryLogin has been added to indicate if login calls should be issued by the container engine on deployment. The default is set to false.

  • Values specified in ContainerImageRegistryCredentials will now be used to issue a login call when deploying the container engine on the hosts if ContainerImageRegistryLogin is set to true.

  • The parameter {{role.name}}RemovalPoliciesMode can be set to ‘update’ to reset the existing blacklisted nodes in heat. This will help re-use the node indexes when required.

  • As ceph-dashboard is available on Ceph, the new ceph dashboard composable service enables a user scenario in which the dashboard is deployed along with the other ceph components using TripleO. This feature is disabled by default and can be enabled by operators adding to the deployment the ceph-dashboard.yaml environment file included in tripleo-heat-templates.

  • Add support for the Multipathd service on nodes that access Block Storage (cinder) volumes. Multipathd is an optional service that can be enabled by including environments/multipathd.yaml in the deployment.

  • Introduce new tag into roles that will create external_bridge (usable only for multiple-nics).

  • When running config-download manually, fact gathering at the play level can now be controlled with the gather_facts Ansible boolean variable.

  • Add parameter NovaLiveMigrationWaitForVIFPlug, which sets live_migration_wait_for_vif_plug. This in turn controls whether to wait for network-vif-plugged events before starting guest transfer. The default value for the parameter is true.

  • Add ContainerNovaLibvirtUlimit to configure Ulimit for containerized Libvirt. Defaults to nofile=131072,nproc=126960.

  • Enables new Neutron “kill script” feature in order to avoid dangling containers when it kills an agent.

  • Add parameter NovaLibvirtMemStatsPeriodSeconds, which sets the libvirt/mem_stats_period_seconds parameter to the memory usage statistics period in seconds. A zero or negative value disables memory usage statistics. Default value for NovaLibvirtMemStatsPeriodSeconds is 10.

  • Add boolean parameter NovaSchedulerLimitTenantsToPlacementAggregate, which sets the scheduler/limit_tenants_to_placement_aggregate parameter to provide tenant isolation with placement. It ensures that hosts in tenant-isolated host aggregates and availability zones will only be available to a specific set of tenants. Default value for NovaSchedulerLimitTenantsToPlacementAggregate is false.

  • Parameter scheduler/query_placement_for_image_type_support is enabled by default for all deployments. Setting it causes the scheduler to ask Placement only for compute hosts that support the disk_format of the image used in the request. This is beneficial when, for example, the libvirt driver uses Ceph as an ephemeral backend and therefore does not support qcow2 images (without an expensive conversion step).

Upgrade Notes

  • During upgrade, the user will need to create a custom roles_data.yaml and remove external_bridge from the tags to be sure that the bridge will not be added.

  • Removes the environment for the deprecated non-config-download workflow. Specifying --no-config-download/--stack-only in the cli now creates/updates the heat stack but does not deploy configurations on the nodes.

  • The new role variable update_serial is introduced, allowing parallel update execution. The default value is 25, as that is the default value for parallel ansible execution used by tripleo. On the Controller role this variable defaults to 1, as pacemaker has to be taken down and up in a rolling fashion.

Deprecation Notes

  • The template aide-baremetal-puppet has been deprecated. This template has been replaced by aide-baremetal-ansible which provides for the same functionality and interfaces.

  • Support for the Midonet plugin has been removed from TripleO Train. The reason is the lack of maintainers and testing around this plugin.

  • The environments at environments/deployed-server-bootstrap-environment-centos.yaml and environments/deployed-server-bootstrap-environment-rhel.yaml are deprecated as the functionality they enabled in the bootstrap scripts has been moved to the tripleo-bootstrap ansible role provided by tripleo-common.

  • Deprecated environment files are removed. Removed environments/neutron-sriov.yaml, use environments/services/neutron-sriov.yaml file. Removed environments/neutron-ovs-dpdk.yaml, use environments/services/neutron-ovs-dpdk.yaml file. Removed environments/ovs-dpdk-permissions.yaml, as the required parameter is added to the OvS-DPDK roles.

  • The rhel-registration scripts support has been removed. It was replaced in Rocky by the Ansible RHSM role. Upgrades have been tested and the new configuration is well documented.

  • Support for the Cisco UCSM plugin has been removed from TripleO Train. The reason is the lack of maintainers and testing around this plugin.

Bug Fixes

  • When changing the name_lower of the InternalApi network and using the service_net_map_replace option in network data, the subnet referenced in VipSubnetMapDefaults did not take into account the custom lowercase name for the network, causing a deployment error. See bug: 1832461.

  • The passphrase for config option ‘server_certs_key_passphrase’, is used as a Fernet key in Octavia and thus must be 32 bytes long. In the case of an operator-provided passphrase, TripleO will validate that.

  • Certain nova containers require more locked memory than the default limit of 16KiB. Increase the default memlock to 64MiB via DockerNovaComputeUlimit.

    As this is only a maximum limit and not a pre-allocation, this will not increase the memory requirements for all nova containers. To date the only container to require this is nova_cell_v2_discover_hosts, which is short lived.

  • Recent changes, e.g. for edge scenarios, intentionally moved host discovery from the controller to the bootstrap compute node. The task is triggered by deploy-identifier to make sure it runs on any deploy, scale, … run. If a deploy run is triggered with the --skip-deploy-identifier flag, discovery is not triggered at all, causing failures in previously supported scenarios. This change moves the host discovery task to an ansible deploy_steps_tasks so that it gets triggered even if --skip-deploy-identifier is used or the compute bootstrap node is blacklisted.

  • Deployment with an NFS share enabled for nova ephemeral storage fails. Podman fails to relabel with NFS mounted in /var/lib/nova/instances and containers fail to start with “operation not supported”. This change only sets the z flag for /var/lib/nova when NFS is not enabled for the compute.
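
The 32-byte requirement on server_certs_key_passphrase mentioned above can be checked with a few lines of code. This is a sketch only: the function names are hypothetical, and the notes do not describe how TripleO itself validates or generates the value. The generator relies on the fact that 24 random bytes encode to exactly 32 URL-safe base64 characters.

```python
import secrets

REQUIRED_LENGTH = 32  # bytes, per the release note


def is_valid_passphrase(passphrase: str) -> bool:
    """Return True if the passphrase is exactly 32 bytes when UTF-8 encoded."""
    return len(passphrase.encode("utf-8")) == REQUIRED_LENGTH


def generate_passphrase() -> str:
    """Return a random 32-character ASCII passphrase.

    Hypothetical generator; not necessarily what TripleO does.
    """
    return secrets.token_urlsafe(24)  # 24 bytes -> 32 base64 characters
```

Checking the byte length rather than the character count matters for operator-provided values containing non-ASCII characters, which occupy more than one byte each.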

Other Notes

  • Services that were in extraconfig/services are now in deployment directory among other services.

  • The use of parameter EC2MetadataIp and the configuration of routes to metadata has been removed. Nothing is consuming metadata over the network anymore since config-drive is used as the data source.

  • The environment files to enable/disable config-download at environments/disable-config-download-environment.yaml and environments/config-download-environment.yaml are removed as disabling config-download was deprecated in Stein, and it’s enabled by default.

11.0.0

New Features

  • Allows a deployer to specify the IdM domain with --domain on the ipa-client-install invocation by providing the IdMDomain parameter.

  • Allows a deployer to direct the ipa-client-install to skip NTP setup by specifying the IdMNoNtpSetup parameter. This is useful if the ipa-client-install setup clobbers the NTP setup by puppet.

  • Add GlanceImageCacheDir parameter to set base directory location that the Image Cache uses. Add GlanceImageCacheMaxSize parameter to set the upper limit on cache size, in bytes, after which the cache-pruner cleans up the image cache. Add GlanceImageCacheStallTime parameter to set the amount of time to let an image remain in the cache without being accessed.

  • Bluestore replaces Filestore as the default Ceph backend.

  • New parameters, NovaCronDBArchivedMaxDelay and CinderCronDbPurgeMaxDelay, are introduced to configure max_delay parameter to calculate randomized sleep time before db archive/purge. This avoids db collisions when performing db archive/purge operations on multiple controller nodes.

  • The passphrase for config option ‘server_certs_key_passphrase’, which was recently added to Octavia, will now be auto-generated by TripleO. OctaviaServerCertsKeyPassphrase has been added to the list of parameters TripleO configures in Octavia.

  • To allow PAM to create a home directory for users who do not have one, ipa-client-install needs an option. This change allows enabling it.

  • IronicConductorGroup defines an Ironic Conductor Group so that the managed baremetal nodes may later be manually distributed by operators across multiple conductors. By default, IronicConductorGroup takes an empty value, which creates no conductor groups associated with the given Ironic Conductor service instance.

    Note

    There is the default Ironic conductor group named “’’”, but it cannot be re-defined with IronicConductorGroup because the empty value has been reserved for other purposes in t-h-t.

  • IronicRpcTransport controls the remote procedure call transport between Ironic Conductor and API processes. In some cases, like Edge DCN, this parameter may be set to ‘json-rpc’ when the messaging broker in use should not be stretched over WAN. For such cases, this option also plays nicely alongside the Ironic Conductor Groups feature. Defaults to an empty value, which leaves the corresponding service’s default value intact.

  • Neutron can be configured for using availability zones (AZs).

    Note

    OVN does not normally run Neutron agents and also does not yet support AZ-aware routing scheduling. Therefore, no effective AZ configurations can be applied for the network services for the NeutronMechanismDrivers: ovn case.

    NeutronDefaultAvailabilityZones, NeutronDhcpAgentAvailabilityZone, NeutronL3AgentAvailabilityZone, NeutronDhcpAgentsPerNetwork, NeutronNetworkSchedulerDriver, NeutronRouterSchedulerDriver and NeutronDhcpLoadType can be used to configure various AZ configurations.

    By default, Neutron*AvailabilityZone(s) takes an empty value, which defines no AZs associated with the given Neutron network service.

    Note

    The empty AZ name cannot be re-defined via Neutron*AvailabilityZone(s) because the empty value has been reserved for other purposes in t-h-t.

    For details, see Official Documentation.

  • Configure Neutron API for Nova Placement. When the Neutron Routed Provider Networks feature is used in the overcloud, the Networking service will use those credentials to communicate with the Compute scheduler’s placement API.

  • The parameters NovaNfsEnabled, NovaNfsShare, NovaNfsOptions, NovaNfsVersion are changed to be role specific. This requires the usage of host aggregates as otherwise it will break live migration of instances as we can not do this with different storage backends.

  • Add role parameter NovaLibvirtNumPciePorts, which sets libvirt/num_pcie_ports to specify the number of PCIe ports an instance will get. Libvirt allows a custom number of PCIe ports (pcie-root-port controllers) a target instance will get. Some will be used by default, and the rest will be available for hotplug use. When using the ‘q35’ machine type, only a single PCIe device can be hotplugged by default. Nova currently sets ‘num_pcie_ports’ to “0” (which means it defaults to libvirt’s “1”), which is not sufficient for hotplug use. Default for NovaLibvirtNumPciePorts is 16.

  • Added OVN-DPDK support

  • Introduced two new numeric parameters OvsRevalidatorCores and OvsHandlerCores to set values of n-revalidator-threads and n-handler-threads on openvswitch.

  • Composable service templates can now define scale_tasks. They are meant for scale down/up logic of services which need to be stopped/started during the scaling procedure. All happens within a single playbook and the down/up Ansible tags are required to differentiate them during the run.
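
The max_delay behaviour behind NovaCronDBArchivedMaxDelay and CinderCronDbPurgeMaxDelay described above amounts to each controller sleeping for a random time, bounded by max_delay, before starting its cron job. A minimal sketch of that idea follows. The function name is hypothetical, the unit is assumed to be seconds, and the real cron jobs may compute the delay differently.

```python
import random


def pick_start_delay(max_delay: int, rng: random.Random) -> int:
    """Pick a random delay between 0 and max_delay (inclusive).

    Hypothetical helper: each controller would sleep this long before
    running the archive/purge job, so the nodes rarely start together.
    """
    if max_delay < 0:
        raise ValueError("max_delay must not be negative")
    return rng.randint(0, max_delay)


# Example: three controllers each draw their own delay.
controller_delays = [pick_start_delay(3600, random.Random()) for _ in range(3)]
```

With max_delay set to 0 every node starts immediately, which matches running the job without any randomized sleep.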

Upgrade Notes

  • Removed the OS::TripleO::Services::Ntp service and related ntp files as chrony is the new default.

Deprecation Notes

  • OpenDaylight service is deprecated in Stein and will be disabled in future releases.

  • OS::TripleO::Services::SELinux has been deprecated. Management of selinux configuration is now handled via ansible during the deployment.

  • The following files are removed (environments/neutron-ml2-ovn-dvr-ha.yaml and environments/neutron-ml2-ovn-ha.yaml). The reason for this is that the maintained versions are kept under environment/services and to avoid confusion we remove the unmaintained ones.

  • The only OVN Tunnel Encap Type that we are supporting in OVN is Geneve and this is set by default in ovn puppet, so there is no need to set it in TripleO.

  • The Neutron LBaaS project was retired and support for it in TripleO removed.

  • The template tuned-baremetal-puppet has been deprecated. This template has been replaced by tuned-baremetal-ansible which provides for the same functionality and interfaces.

Bug Fixes

  • OpenDaylight inactivity probe for setting the OVSDB timeout now defaults to 180s. This helps fix scale issues for large numbers of compute nodes in OpenDaylight deployments.

  • Fixes an issue where deployment would fail if a non-default name_lower is used in network data for one of the networks: External, InternalApi or StorageMgmt. (See bug: 1830852.)

  • Fixed service auth URL in Octavia to use the Keystone v3 internal endpoint.

  • As of Rocky [1], the nova-consoleauth service has been deprecated and cell databases are used for storing token authorizations. All new consoles will be supported by the database backend and existing consoles will be reset. Console proxies must be run per cell because the new console token authorizations are stored in cell databases.

    nova-consoleauth was deprecated in tripleo with: I68485a6c4da4476d07ec0ab5e7b5a4c528820a4f

    This change now removes the NovaConsoleauth Service.

    [1] https://docs.openstack.org/releasenotes/nova/rocky.html

  • With 405366fa32583e88c34417e5f46fa574ed8f4e98 the parameters RpcPort, RpcUserName, RpcPassword and RpcUseSSL were deprecated and nova::rabbitmq_port was removed. As a result the healthcheck was called with a null parameter and failed. We now get the global_config_settings from RabbitMQService and use oslo_messaging_rpc_port for the healthcheck.

  • Change-Id: I1a159a7c2ac286373df2b7c566426b37b7734961 moved the discovery to run on a single compute host so that simultaneous nova-manage commands do not race. This change makes sure we run the discovery on every deploy run, which is required for scale-up events.

Other Notes

  • The EndpointMap parameter is now required by post_deploy templates. So if a user overrides OS::TripleO::NodeExtraConfigPost with another template, that template needs to have the EndpointMap parameter to work.