Victoria Series Release Notes


Security Issues

  • Kolla Ansible used to run Ironic’s tftpd as an (unprivileged) root user. Now, it will explicitly use the nobody user.

Bug Fixes

  • Fixes the copy job for the grafana custom home dashboard file. The copy job needs to run privileged, otherwise a permission denied error occurs. LP#1947710

  • Fixes bug #1987982, which caused the database log_bin_trust_function_creators variable not to be set back to “OFF” after a Keystone upgrade.

  • Fixes an issue where Ironic Inspector could be configured without authentication in a multi-region environment in a region without a local Keystone service.

  • Fixes an issue with ironic-neutron-agent using the wrong option to configure the interface used to communicate with the Ironic API. LP#1990675

  • Fixes an issue with Masakari instance monitor when libvirt SASL is enabled. libvirt SASL was enabled by default in a recent change to Kolla Ansible. LP#1965754

  • Fixes an issue where a failure of any Nova compute service to register itself would cause only the host querying the nova API to fail. Now, only hosts that fail to register will fail the Kolla Ansible run. Alternatively, to fail all hosts in a cell when any compute service fails to register, set nova_compute_registration_fatal to true. LP#1940119

  • The Prometheus OpenStack exporters are now behind HAProxy, which provides a unique time series in the Prometheus database and ensures that only one exporter queries the OpenStack APIs in any given scrape interval. Previously, every OpenStack exporter was scraped at the same time, so all exporters queried the OpenStack APIs simultaneously. This introduced unnecessary load, and produced duplicate time series in the Prometheus database because the instance label was unique for each exporter. LP#1972818
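
The compute registration fix above mentions an opt-in stricter mode. A minimal sketch of enabling it, assuming the override is placed in globals.yml (that note does not name the file):

```yaml
# globals.yml (assumed location for the override)
# Fail every host in a cell when any Nova compute service fails to register,
# instead of failing only the hosts that did not register.
nova_compute_registration_fatal: true
```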


New Features

  • Adds a tls_connect module to the Prometheus blackbox exporter. This can be used to test connectivity of TLS servers.

  • Implements container healthchecks for ironic-neutron-agent service. See blueprint

  • Adds support for libvirt SASL authentication. It is enabled by default. LP#1964013

Known Issues

  • Existing fluentd log rotation failed to delete old haproxy, swift, glance-tls-proxy and neutron-tls-proxy logs. These will not be deleted by the new logrotate config and will have to be removed manually.

Upgrade Notes

  • The addition of libvirt SASL authentication requires a new password in passwords.yml, libvirt_sasl_password. This may be generated using the existing kolla-genpwd and kolla-mergepwd tooling.

  • The addition of libvirt SASL authentication requires both the nova_libvirt and nova_compute containers to be updated simultaneously, using new images with the necessary Cyrus SASL dependencies, as well as configuration containing the SASL credentials.

  • Updates the default value of node_custom_config to {{ node_config }}/config when specified using --configdir.

Security Issues

  • Explicitly removes the net.ipv4.ip_forward sysctl from /etc/sysctl.conf on hosts with Neutron L3 Agent. In the absence of another source for this sysctl, it should revert to the default of 0 after the next reboot. This is a follow up to a previous change which stopped setting the sysctl, but leaves existing systems with the original value of 1 set.

    A deployer looking to more aggressively change the value may set neutron_l3_agent_host_ipv4_ip_forward to 0 using a Yoga release of Kolla Ansible. This option will be removed in future. Any deployments still relying on the previous value may set neutron_l3_agent_host_ipv4_ip_forward to 1. LP#1945453

  • Fixes an issue where the default configuration of libvirt did not use authentication for the API exposed over TCP on the internal API network. This allowed anyone with access to the internal API network read-write access to libvirt. While the internal API network is typically trusted, other services on this network generally at least require authentication.

    SASL authentication is now enabled for libvirt by default. Kolla Ansible has supported libvirt TLS since the Train release, and this is recommended to provide a higher level of security. LP#1964013
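
For the net.ipv4.ip_forward change described above, the following is a rough sketch of the two overrides that note mentions. It assumes the variable is set in globals.yml, and the availability of the 0 setting depends on the release, as the note itself states:

```yaml
# globals.yml (assumed location for the override)
# Deployments that still rely on the previous value can keep forwarding enabled:
neutron_l3_agent_host_ipv4_ip_forward: 1

# To change the value more aggressively instead of waiting for a reboot,
# the note suggests 0 (described there as requiring a Yoga release):
# neutron_l3_agent_host_ipv4_ip_forward: 0
```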

Bug Fixes

  • Continue to run all actions if one action failed in Elasticsearch curator. LP#1954720

  • Fixes Nova resize failing when migration_interface is customised. LP#1956976

  • Fixes Glance with Cinder iSCSI backend failing due to lack of lock_path setting. LP#1959663

  • Fixes logrotate config missing for openvswitch and prometheus services. LP#1961795

  • Fixes an issue with Ironic’s PXE components not getting updated on upgrade. LP#1963752

  • Fixes configuration of the Prometheus HTTP API URL when using the Prometheus collector in CloudKitty. LP#1961615

  • Fixes the baremetal role to avoid the error “Unable to remove "libvirtd"”. The symlink /etc/apparmor.d/disable/usr.sbin.libvirtd is now created by the role. LP#1960302

  • Existing fluentd log rotation failed to delete old haproxy, swift, glance-tls-proxy and neutron-tls-proxy logs. Standardise rotation and deletion of logs using logrotate.

  • Adds back the option to configure the RabbitMQ clustering interface via Kolla. LP#1900160

  • Fixes an issue where the Libvirt AppArmor profile is disabled and the bootstrap-servers process tries to remove it. See bug 1909874 for details.

  • Fixes an issue seen when using Jinja2 3.1.0.

  • Fixes the configuration option setting the type of endpoint used by Neutron to send requests to Placement. LP#1960503

  • Fixes a configuration issue with Node Exporter causing all file system metrics of a host to be identical. LP#1961438

  • Fixes an issue where RabbitMQ was configured to mirror classic transient queues for all services. According to the RabbitMQ documentation this is not a supported configuration, and contributed to numerous bug reports. In order to avoid making unexpected changes to the RabbitMQ cluster, it is necessary to set rabbitmq_remove_ha_all_policy to yes in order to apply this fix. This variable will be removed in the Yoga release. LP#1954925

  • Fixes an issue with Cinder upgrade where Cinder services would remain pinned to the previous release’s RPC & object versions. LP#1954932
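
The RabbitMQ mirroring fix above is opt-in. A minimal sketch of applying it, assuming the variable is set in globals.yml:

```yaml
# globals.yml (assumed location for the override)
# Opt in to removing the policy that mirrors classic transient queues.
# The note states this variable will be removed in the Yoga release.
rabbitmq_remove_ha_all_policy: yes
```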


Upgrade Notes

  • RabbitMQ’s Prometheus plugin is no longer enabled by default if Prometheus is not deployed. If an external Prometheus is used, you need to turn on rabbitmq_enable_prometheus_plugin to restore the old behaviour.
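
As an illustration of the upgrade note above, a deployment scraped by an external Prometheus might restore the plugin as follows. This assumes the variable takes a boolean and is set in globals.yml:

```yaml
# globals.yml (assumed location for the override)
# Re-enable the RabbitMQ Prometheus plugin even though Kolla Ansible
# does not deploy Prometheus itself.
rabbitmq_enable_prometheus_plugin: true
```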

Bug Fixes

  • Removes custom value of max_allowed_secret_in_bytes in barbican.conf. The default maximum size in Barbican was doubled to avoid issues with some certificates. LP #1957795

  • Fixes the deployment failure of outward_rabbitmq, which was caused by a port conflict, by customizing RabbitMQ’s prometheus.tcp.port. LP #1885106

  • Fixes Octavia’s “Connection refused” errors by adding ovn_sb_connection to octavia.conf. LP#195011

  • Ironic API and Ironic Inspector API use separate policy files. Ironic role was updated to be able to handle both policies separately. LP#1952948

  • Fixes missing logrotate configuration for Placement. LP#1954723

  • Fixes a failure to connect to the Zun console when kolla_enable_tls_external is true. Access to the console of any Zun container failed in that case. The fix sets the protocol for the wsproxy base_url in zun.conf according to the value of kolla_enable_tls_external. LP#1957117

  • Fixes the Apache WSGI configuration for the aodh service in Debuntu binary flavours. LP#1953059


New Features

  • Adds a new option, prometheus_openstack_exporter_timeout, to override the default scrape_timeout for the openstack exporter job.

  • Adds support for the elasticsearch storage backend with cloudkitty. This feature lets you store cloudkitty rating documents directly within your elasticsearch cluster.

    If you already have an elasticsearch cluster running for logging, it creates a new cloudkitty-specific index. This lets you use kibana, grafana or any other interface to browse your rating data and create appropriate dashboards, or build an appropriate billing service on top of it.

    Adds support for prometheus as a fetcher/collector for cloudkitty. This feature lets you use prometheus metrics as your source of rating. Using prometheus lets you rate pretty much any openstack object directly from the kolla-provided exporters (Openstack_exporter) or your own custom exporters.

  • Adds config parameter haproxy_nova_spicehtml5_proxy_tunnel_timeout to configure the Tunnel TimeOut directive for spicehtml5proxy haproxy service.

  • Adds a new variable, disable_firewall, which defaults to true. If set to false, then the host firewall will not be disabled during kolla-ansible bootstrap-servers.

  • Adds two new variables service_images_pull_retries and service_images_pull_delay which control the behaviour of image pulling tasks. These are useful if your registry is not 100% reliable (usually due to load). The defaults have been set to 3 retries and 5 seconds delay to ensure a better default experience (these are actually Ansible defaults when task retries are enabled).

  • It is now possible to use Neutron DHCP agent together with OVN networking. New variable is added to control this feature: neutron_ovn_dhcp_agent, defaulting to no.

  • Adds support for configuring the filter and gather_subset arguments for the setup module via kolla_ansible_setup_filter and kolla_ansible_setup_gather_subset respectively. These can be used to reduce the number of facts, which can have a significant effect on performance of Ansible.

  • Adds a new variable, ironic_enable_keystone_integration. It adds Keystone connection information to ironic.conf when connecting to an existing Keystone (one that is not being installed at the same time).
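
Several of the new variables in the list above can be combined in one override file. The sketch below assumes globals.yml as the location. The non-default values are illustrative choices only, not recommendations from these notes:

```yaml
# globals.yml (assumed location for the overrides)

# Leave the host firewall alone during kolla-ansible bootstrap-servers.
disable_firewall: false

# Be more patient with a busy registry (the stated defaults are 3 and 5).
service_images_pull_retries: 5   # illustrative value
service_images_pull_delay: 10    # seconds, illustrative value

# Run the Neutron DHCP agent alongside OVN networking (default: no).
neutron_ovn_dhcp_agent: yes

# Gather fewer facts. The value is passed to the setup module's
# gather_subset argument; this particular subset is an assumption.
kolla_ansible_setup_gather_subset: "all,!facter,!ohai"
```

Reducing the gathered facts too far may remove facts that the playbooks need, so a change like the last one is best tried on a test environment first.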

Upgrade Notes

  • Updates all references to Ansible facts within Kolla Ansible from using individual fact variables to using the items in the ansible_facts dictionary. This allows users to disable fact variable injection in their Ansible configuration, which may provide some performance improvement. Check for facts referenced in local configuration files, and update to use ansible_facts before disabling fact variable injection.

  • Modifies the default value of ceph_nova_user from nova to the value of ceph_cinder_user, in line with the default for ceph_nova_keyring. Users who have overridden ceph_nova_keyring to use separate keyrings for Nova and Cinder should also override ceph_nova_user to match the Nova keyring. LP#1934145

  • Modifies the default value of rabbitmq_server_additional_erl_args from an empty string to +S 2:2 +sbwt none +sbwtdcpu none +sbwtdio none.
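
The ansible_facts note above asks operators to update fact references in local configuration before disabling fact variable injection. A minimal before/after sketch follows. The variable name my_node_name is hypothetical and used only for illustration:

```yaml
# A local override that relies on an injected fact variable (old style):
# my_node_name: "{{ ansible_hostname }}"

# The same value read from the ansible_facts dictionary (new style).
# This form keeps working after fact variable injection is disabled.
my_node_name: "{{ ansible_facts.hostname }}"
```

Keys inside ansible_facts drop the ansible_ prefix, so ansible_hostname becomes ansible_facts.hostname.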

Security Issues

  • Fixes net.ipv4.ip_forward not to be enabled by Kolla Ansible on the default network namespace. It was enabled on hosts with Neutron L3 Agent (thus in most common setups with OVS and/or Linux Bridge, but not OVN) and allowed, unless users had extra iptables rules to avoid that, any traffic to be accepted for forwarding (as long as it was routable and passed other checks). Users of existing setups are advised to re-evaluate whether they need this sysctl enabled and disable if not necessary. Kolla Ansible will simply no longer try to set this sysctl at all. Neutron L3 Agent handles forwarding enablement per managed namespace. LP#1945453

  • Adds mitigation for the Apache Log4j2 Remote Code Execution (RCE) Vulnerability in Elasticsearch - CVE-2021-44228.

Bug Fixes

  • Fixes monasca-thresh to correctly submit the topology to Storm. The previous container ran the topology in local mode (within the container), and didn’t use the Storm cloud. The new container handles submitting the topology to Storm and also handles killing and replacing the topology when its configuration has changed. As a result, the monasca-thresh container is only used for submission, and exits after that’s completed. The logs for the topology will now be available in the storm worker-artifact logs. LP#1808805

  • Fixes an issue where configuration in containers could become stale. This prevented containers with updated configuration from being restarted, e.g., if the kolla-ansible genconfig and kolla-ansible deploy-containers commands were used together. LP#1848775

  • Fixes elasticsearch fluentd output being enabled when elasticsearch is not enabled. LP#1927880

  • Fixes an issue with timesync checks on deployment host. See bug 1933347 for details.

  • Fixes horizon’s healthcheck when SSL is turned on. LP#1933846

  • Fixes an issue seen when customising the Docker Yum repository URL on CentOS, where the docker_yum_gpgkey variable is not used consistently. LP#1934913

  • Fixes an issue where the SPICE console would freeze after a while. LP#1938549

  • Fixes Masakari in multi-region deployments to query Nova API in its own region. LP#1939291

  • Fixes nova’s healthchecks when upgrading from previous version. LP#1939679

  • Fixed broken kolla-toolbox container when RabbitMQ is disabled and IPv6 is used. LP#1939883

  • Fixes mariadb-clustercheck not to run when there is no HAProxy. LP#1944114

  • No longer creates directories for haproxy and swift logs where they are not needed. LP#1945070

  • Fixes an error in the placement role which prevented deployment of the placement service when a custom policy file is used. LP#1948835

  • Fixes missing current Ansible version in the error message. LP#1948979

  • Fixes the octavia role not setting the amphora network’s gateway_ip. LP#1949260

  • Fixes an issue where the Nova API logs were written to files ending with -wsgi.log which affected the processing of these logs in the Fluentd pipeline. LP#1950185

  • Fixes an issue with Cyborg deployment. LP#1937911

  • Fixes an issue with config.json for neutron-server when a VMware plugin agent is used.

  • On slower nodes, the initial grafana startup could experience a timeout failure when the migrations for setting up the database took longer than expected. This has been fixed by increasing the default timeout. The timeout settings can be changed via new parameters grafana_start_first_node_delay and grafana_start_first_node_retries for the grafana role. LP#1769962

  • Fixes an issue with Neutron linuxbridge ML2 agent when neutron_external_interface includes multiple interfaces. LP#1863935

  • Fixes an issue with Manila configuration which was missing a [glance] section, preventing some drivers from operating.

  • Fixes an issue with default Nova configuration for Ceph where the RBD user is set to nova, but only a cinder keyring is copied. The default value of ceph_nova_user is changed to the value of ceph_cinder_user, in line with the default for ceph_nova_keyring. LP#1934145

Other Notes

  • Optimised image pulling to avoid looping over disabled services.


New Features

  • Add octavia-driver-agent to Octavia deployments to allow for additional providers, e.g. ovn-octavia-provider. It is automatically deployed when Octavia is enabled and neutron_plugin_agent is set to ovn. It can be also enabled by setting enable_octavia_driver_agent to yes. Users need to update their inventory to include octavia-driver-agent Ansible group.

  • Adds a new flag, docker_disable_default_network, which defaults to no. By default, Docker sets up bridge networking on docker0, and this might cause routing problems for operator networks. Setting this flag to yes will disable Docker’s bridge networking. This feature will be enabled by default from the Wallaby 12.0.0 release.

  • Adds a new haproxy configuration variable, haproxy_host_ipv4_tcp_retries2, which allows users to modify this kernel option. This option sets the maximum number of times a TCP packet is retransmitted in the established state before giving up. The default kernel value is 15, which corresponds to a duration of between approximately 13 and 30 minutes, depending on the retransmission timeout. This variable can be used to mitigate an issue with stuck connections in case of VIP failover, see bug 1917068 for details.

  • Adds the ability to override the automatic detection of fluentd_version and fluentd_binary. These can now be defined as extra variables. This removes the dependency of having docker configured for config generation.

  • OVN deployment will now configure external_ids:ovn-chassis-mac-mappings to make DVR work on VLAN tenant networks.

  • Adds support for collecting Prometheus metrics from RabbitMQ. This is enabled by default when Prometheus and RabbitMQ are enabled, and may be disabled by setting enable_prometheus_rabbitmq_exporter to false.
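
The flags introduced in the list above could be set as in the following sketch. It assumes globals.yml as the location. The tcp_retries2 value is illustrative and not taken from these notes:

```yaml
# globals.yml (assumed location for the overrides)

# Disable Docker's default bridge networking on docker0 (default: no).
docker_disable_default_network: yes

# Give up on unacknowledged TCP packets sooner than the kernel default of 15,
# to limit stuck connections after a VIP failover.
haproxy_host_ipv4_tcp_retries2: 6   # illustrative value

# Deploy octavia-driver-agent even when neutron_plugin_agent is not ovn.
enable_octavia_driver_agent: yes

# Opt out of collecting Prometheus metrics from RabbitMQ.
enable_prometheus_rabbitmq_exporter: false
```

When octavia-driver-agent is enabled, the inventory also needs the octavia-driver-agent Ansible group, as the note above points out.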

Bug Fixes

  • Fixes an issue with kolla-ansible bootstrap-servers if Zun is enabled where Zun-specific configuration for Docker was applied to all nodes. LP#1914378

  • Fixes an issue when Swift is deployed with S3 Token Middleware enabled. LP#1862765

  • Fixes the Northbound and Southbound database socket paths in OVN.

  • Fixes a chronyd crash loop when the server is rebooted (Debian). LP#1915528

  • Fixes an issue preventing prechecks from succeeding when a “non-native” NTP daemon was used, such as ntpd as opposed to systemd-timesyncd on a Debian/Ubuntu system or to chronyd on a CentOS/RHEL system. LP#1922721

  • Fixed an issue when Docker was configured after startup on Debian/Ubuntu, which resulted in iptables rules being created - before they were disabled. LP#1923203

  • Fixes an issue with Octavia SSH key copying if the user disabled Octavia auto configuration. LP#1927727

  • Fixed an issue where docker python SDK 5.0.0 was failing due to missing six - introduced a constraint to install version lower than 5.x. LP#1928915

  • Fixes more-than-2-node RabbitMQ upgrade failing randomly. LP#1930293.

  • Fixes Swift deploy when TLS enabled. Added the missing handler and corrected the container name. LP#1931097

  • Fixes missing region_name in keystone_auth sections. See bug 1933025 for details.

  • Fixes iscsid failing in current CentOS 8 based images due to pid file being needlessly set. LP#1933033

  • Fixes host bootstrap on Debian not removing the conflicting packages. It now behaves in accordance with the docs. LP#1933122

  • Fixes an issue where kolla-ansible exits with a zero exit code when executed with a bogus command name. LP#1929397

  • Fixes a potential issue with Alertmanager in non-HA deployments. In this scenario, the peer gossip protocol is now disabled and Alertmanager won’t try to form a cluster with non-existent other instances. LP#1926463

  • Adds a new flag, docker_disable_ip_forward, which defaults to no and can be used (by setting yes) to disable docker’s ip-forward option which makes docker set net.ipv4.ip_forward sysctl to 1. This is to protect from creating all-forwarding hosts. LP#1931615

  • Fixes an issue when generating /etc/hosts during kolla-ansible bootstrap-servers when one or more hosts has an api_interface with dashes (-) in its name. LP#1927357

  • Fixes some configuration issues around Barbican logging. LP#1891343

  • Fixes some configuration issues around Cinder logging. LP#1916752

  • Fixes the Cyborg API not listening on the API interface by changing host to host_ip in cyborg.conf. See the cyborg documentation

  • Fixes the wrong configuration of the ovs-dpdk service, which broke deployment with kolla-ansible. For more details please see bug 1908850.

  • Fixes an issue with Magnum when TLS is enabled. LP#781062

  • Fixes an issue with executing kolla-ansible when installed via pip install --user. LP#1915527

  • Fixes an issue where masakari.conf was generated for the masakari-instancemonitor service but not used.

  • Fixes an issue where masakari-monitors.conf was generated for the masakari-api and masakari-engine services but not used.

  • Uses a consistent variable name for container dimensions for masakari-instancemonitor - masakari_instancemonitor_dimensions. The old name of masakari_monitors_dimensions is still supported.

  • Fixes an issue with Octavia deployment when using a custom service auth project. If octavia_service_auth_project is set to a project that does not exist, Octavia deployment would fail. The project is now created. LP#1922100

  • Fixes LP#1892376 by updating deprecated syntax in the Monasca Elasticsearch template.

  • Removes whitespace around equal signs in zookeeper.cfg which were preventing the script from running correctly.
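
One item in the list above, docker_disable_ip_forward, is a new opt-in flag rather than an automatic fix. A minimal sketch of enabling it, assuming globals.yml as the location:

```yaml
# globals.yml (assumed location for the override)
# Stop Docker from setting the net.ipv4.ip_forward sysctl to 1 (default: no),
# to avoid turning hosts into all-forwarding hosts.
docker_disable_ip_forward: yes
```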

Other Notes

  • Following Cinder upstream, support for using ZFSSA with Cinder has been removed. ZFSSA was unsupported in Train and later removed in Ussuri.

  • Updates the container image used by mariabackup. It was using the mariadb image, which was deprecated in Victoria and will be removed in Wallaby. The mariadb-server image is used instead. LP#1928129


New Features

  • Adds ability to provide a custom elasticsearch config.

  • Adds configuration parameter kolla_httpd_keep_alive to configure the keep-alive timeout for services that use httpd to handle HTTP requests. The default value is 60 seconds, overriding the httpd default of 5 seconds.

  • Adds support for multiple globals files. The main globals.yml file still exists. In addition to that, operators can now create a globals.d directory (next to globals.yml), where they can place any number of *.yml files, for example for specific services they want to add.

  • With the boolean parameter enable_neutron_trunk it is now possible to enable or disable the Neutron service plugin trunk.

  • Adds support for Octavia dev mode.

  • Adds the etcd_enable_tls configuration parameter, which can be used to enable TLS encryption for the etcd service. The default value of etcd_enable_tls is set by the value of kolla_enable_tls_backend.

  • Adds support for automatic registration of resources required for Octavia. This includes a management network and subnet, security groups, flavor and SSH key. The amphora image must currently be uploaded manually. BP#automatic-deploy-of-octavia

  • Adds support for automatic generation of certificates for Octavia via the kolla-ansible octavia-certificates command.

  • Kolla Ansible now checks that the local Ansible Python environment is coherent, i.e. that the Ansible in use can see Kolla Ansible. LP#1856346

  • Makes it possible to set HAProxy’s timeout http-keep-alive via haproxy_http_keep_alive_timeout. The default is 10s, like the previous default that applied via haproxy_http_request_timeout. LP#1892622

  • Adds the kolla_sysctl_conf_path variable, which allows customising the path to the sysctl.conf file that will be modified by Kolla Ansible plays. The default is /etc/sysctl.conf, as it was before.

  • Adds a new flag, docker_disable_default_iptables_rules, which defaults to no. Docker manipulates iptables rules by default to provide network isolation, and this might cause problems if the host already has an iptables based firewall. A common problem is that Docker sets the default policy of the FORWARD chain in the filter table to DROP. Setting docker_disable_default_iptables_rules to yes will disable Docker’s iptables manipulation. This feature will be enabled by default from the Victoria 11.0.0 release.

  • Adds configuration options to enable backend TLS encryption from HAProxy to the Nova, Ironic, and Neutron services. When used in conjunction with enabling TLS for service API endpoints, network communication will be encrypted end to end, from client through HAProxy to the backend service.

  • Adds support for completing the http-01 challenge of ACME (e.g. as provided by Let’s Encrypt) using an external ACME client (e.g. certbot). The relevant variable is acme_client_servers. Please read the docs for more info on this integration.

  • Improves performance of the common role by generating all fluentd configuration in a single file.

  • Self-signed TLS certificates can be used to test TLS in a development OpenStack environment. The kolla-ansible certificates command will generate the required self-signed TLS certificates. This command has been updated to first create a self-signed root certificate authority. The command then generates the internal and external facing certificates and signs them using the root CA. If backend TLS is enabled, the command will generate the backend certificate and sign it with the root CA.

  • (internal/advanced) Adds support for with_frontend and with_backend to haproxy service definitions. These new fields preserve the old logic by defaulting to true but can be set to false to make the selected service not configure the respective “end”. This requires haproxy_service_template to be set to haproxy_single_service_split.cfg.j2 which is the new default.

  • Implements container healthchecks for core OpenStack services. Docker healthchecks are scripts, called periodically, that check the health of a running service. They expose health information in docker ps output and trigger a health_status event. Healthchecks are now enabled by default and can be disabled by setting enable_container_healthchecks to no in globals.yml.

  • Improves performance of the common role by generating all logrotate configuration in a single file.

  • Enable Galera node state checking by using clustercheck script that is used by HAProxy to define node up/down state.

  • Kolla started building neutron-mlnx-agent in Ussuri cycle. Now those containers can be deployed too via Kolla-Ansible. Note that neutron-mlnx-agent image is also used to deploy neutron-eswitchd container.

  • Adds support for TLS encryption of RabbitMQ client-server communication. See blueprint for details.

  • Extracts the common role into a separate play. This provides a performance benefit at scale, since the role dependency mechanism used previously had an overhead. This change allows only the common role to be executed by specifying the common tag.

  • Adds a mechanism to customize skydive.conf.

  • Allows skipping and unsetting sysctl variables controlled by Kolla Ansible plays using the KOLLA_SKIP and KOLLA_UNSET values.

  • Adds timesync prechecks which run when containerised chrony is not enabled to ensure the host has its system clock synchronized.
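
A number of the features listed above are controlled by simple variables. The sketch below shows how they might be grouped in a file under the new globals.d directory. The file name is hypothetical, and the values shown are either the defaults stated in the notes or the documented opt-in or opt-out setting:

```yaml
# globals.d/victoria-features.yml
# (hypothetical file name; any *.yml file placed in globals.d,
#  next to globals.yml, is picked up)

# Keep-alive timeout in seconds for services that use httpd (default: 60).
kolla_httpd_keep_alive: 60

# HAProxy "timeout http-keep-alive" (default: 10s).
haproxy_http_keep_alive_timeout: "10s"

# Enable the Neutron trunk service plugin.
enable_neutron_trunk: yes

# Stop Docker from manipulating iptables rules (default: no).
docker_disable_default_iptables_rules: yes

# Healthchecks are on by default; set to no to disable them.
enable_container_healthchecks: no
```

Splitting overrides by topic into several files under globals.d keeps the main globals.yml small, which appears to be the intent of that feature.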

Known Issues

  • Since Ussuri, there is a bug in how Ceph (RBD) is handled with Cinder: the backend_host option is missing from the generated configuration for external Ceph. The symptoms are that volumes become unmanageable until extra admin action is taken. This does not affect the data plane - running virtual machines are not affected.

    There is a related issue regarding active-active cinder-volume services (single-host cinder-volume not affected), which is that they should not have been configured with backend_host in the first place but with cluster and proper coordination instead. Some users might have customised their config already to address this issue.

    The Kolla team is investigating the best way to address this for all its users. In the meantime, please ensure that, before upgrading to Ussuri, the backend_host option is set to its previous value (the default was rbd:volumes) via a config override.

    For more details please refer to the referenced bug. Do note this issue affects both new deployments and upgrades. LP#1904062

Upgrade Notes

  • Adds configuration parameter kolla_httpd_keep_alive to configure the keep-alive timeout for services that use httpd to handle HTTP requests. The default value is 60 seconds, overriding the httpd default of 5 seconds.

  • Resources required for Octavia are now registered automatically by default. Octavia users upgrading from Ussuri should set octavia_auto_configure to no in globals.yml to avoid registering conflicting resources.

  • When deploying Monasca with Logstash 6, any custom Logstash 2 configuration for Monasca will need to be updated to work with Logstash 6. Please consult the documentation.

  • HAProxy’s timeout http-keep-alive is now set via haproxy_http_keep_alive_timeout. The default is 10s, like the previous default that applied via haproxy_http_request_timeout, but the new variable does not follow the old one so might need customising if the former was customised for keep-alive purposes. Do note the two values do not have to have the same value and often do not. Please read the linked record for background. LP#1892622

  • No longer uses option http-tunnel for Neutron Server in HAProxy. Please amend manually if you relied on the quirky behaviour. LP#1892686

  • The default migration_interface is changed from network_interface to api_interface, which is treated as an internal and secure network plane in most cases.

  • Replaced kolla_external_fqdn_cacert and kolla_internal_fqdn_cacert with kolla_admin_openrc_cacert, which by default is not set. OS_CACERT is now set to the value of kolla_admin_openrc_cacert in the generated file.

  • The default of haproxy_service_template changed to haproxy_single_service_split.cfg.j2. This template allows more flexibility in service config. The previously-default haproxy_single_service_listen.cfg.j2 is now deprecated for removal. No action needs to be taken unless one relied on replacing the previous default template contents with one’s own (under the same name).

  • Changes the default value of kibana_elasticsearch_ssl_verify from false to true. LP#1885110

  • The mariadb role now uses the mariadb-server image by default (compared to mariadb previously), since the mariadb image will be deprecated in the Kolla Victoria release and removed in Wallaby.

  • Enabling multipathd will now configure Nova to use it.

  • The Prometheus OpenStack exporter now uses internal endpoints to communicate with OpenStack services, to match the configuration of other services deployed by Kolla Ansible. Using public endpoints can be retained by setting the prometheus_openstack_exporter_endpoint_type variable to public.

  • The congress project is no longer maintained. This has been retired since Victoria and has not been used by other OpenStack services since.

  • Support for deploying with Hyper-V integrations has been removed.

  • Customizing Neutron Linux bridge and Open vSwitch Agents config via ml2_conf.ini is removed. The config has been split out for these agents into linuxbridge_agent.ini and openvswitch_agent.ini respectively. The old behaviour was deprecated in Ussuri for removal in Victoria.

  • Support for deploying with mongodb integrations has been removed.

  • The neutron-fwaas project is no longer maintained. This has been retired and will be removed in the Victoria cycle.

  • Support for MongoDB as a backend for Panko has been removed.

  • Support for deploying with XenAPI integrations has been removed.

  • Kolla-Ansible now requires Ansible 2.9. Ansible 2.8 is not supported anymore.

  • The common role is now executed in a separate play. This introduces a few small changes in behaviour:

    • the common role is now run for all hosts at the beginning, rather than prior to their first enabled service

    • hosts must be in the necessary group for each of the common services (cron, fluentd, kolla-logs, kolla-toolbox) in order to have that service deployed

    • if tags are specified for another service e.g. nova, the common role will not automatically run for matching hosts. The common tag must be specified explicitly

  • Apache ZooKeeper will now be automatically deployed whenever Apache Storm is enabled.

  • The default value of REST_API_REQUIRED_SETTINGS was synchronized with Horizon. You may want to review settings exposed by the updated configuration.

  • Adds support for Ubuntu Focal 20.04 as a host operating system. Ubuntu users upgrading from Ussuri should first upgrade OpenStack containers to Victoria, which uses the Ubuntu Focal 20.04 base container image. Hosts should then be upgraded to Ubuntu Focal 20.04.

  • The Monasca Log API has been removed. All logs now go to the unified Monasca API when Monasca is enabled. Any custom Fluentd configuration and inventory files will need to be updated. Any monasca_log_api containers will be removed automatically.
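
Two of the upgrade notes above describe settings that preserve pre-upgrade behaviour. A minimal sketch of both, assuming they are placed in globals.yml before running the upgrade:

```yaml
# globals.yml

# Existing Octavia deployments upgrading from Ussuri: do not register
# resources automatically, to avoid conflicts with those already created.
octavia_auto_configure: no

# Keep the Prometheus OpenStack exporter on public endpoints
# instead of the new internal default.
prometheus_openstack_exporter_endpoint_type: public
```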

Deprecation Notes

  • The previous default of haproxy_service_template (haproxy_single_service_listen.cfg.j2) is now deprecated for removal, as haproxy_single_service_split.cfg.j2 takes its place to allow more flexibility in service config.

  • The variable kolla_internal_address is deprecated. This variable is used only as the default value for kolla_internal_vip_address, and is not documented.

  • VMware integration, which was deprecated in the Ussuri release, is no longer deprecated. Nova has reversed the deprecation of its VMware driver, and the Kolla community has shown interest in VMware.

Security Issues

  • The file generated by kolla-ansible post-deploy was previously created with root:root ownership and 644 permissions. This would allow anyone with access to the same directory to read the file, including the admin credentials. The ownership of the file is now set to the user executing kolla-ansible, and the file is assigned a mode of 600. This change can be applied by running kolla-ansible post-deploy.
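
The difference between the two file modes can be illustrated with a minimal Python sketch. The file name and contents used here are hypothetical stand-ins, not the actual file written by kolla-ansible post-deploy, and the sketch assumes a POSIX host. It only shows that mode 644 leaves a file readable by group and other users, while mode 600 does not.

```python
import os
import stat
import tempfile


def readable_by_others(path):
    """Return True if group or other users have read permission on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


def restrict_to_owner(path):
    """Set the file mode to 600 (read/write for the owner only)."""
    os.chmod(path, 0o600)


def demo():
    # Hypothetical credentials file, created in a temporary directory.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "example-credentials.sh")
        with open(path, "w") as f:
            f.write("export EXAMPLE_PASSWORD=secret\n")
        os.chmod(path, 0o644)  # the previous, overly permissive mode
        before = readable_by_others(path)
        restrict_to_owner(path)
        after = readable_by_others(path)
        return before, after
```

Calling demo() returns a pair of booleans: whether the file was readable by other users before and after the mode was tightened.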

Bug Fixes

  • Adds support for using bifrost-deploy behind a proxy. It uses the existing container_proxy variable.

  • Fixes handling of /dev/kvm permissions to be more robust against host-level actions. LP#1681461

  • Fixes Kibana deployment with the new E*K stack (6+). LP#1799689

  • IPv6 fully-routed topology (/128 addressing) is now allowed (where applicable). LP#1848941

  • Fixes prometheus-openstack-exporter to use the CA certificate.

  • Makes haproxy and keepalived restarts during Kolla-Ansible actions more robust, especially in multinode scenarios (HA).

  • Removes the chrony package and AppArmor profile from the Docker host if containerized chrony is enabled. LP#1882513

  • When deploying Elasticsearch 6, Logstash 2 was deployed by default, which is not compatible with Elasticsearch 6. Logstash 6 is now deployed by default.

  • Fixes an issue when using IP addresses instead of hostnames in the Ansible inventory. The OpenvSwitch role sets system-id based on inventory_hostname, which in the case of IP addresses is the first IP octet. Such a deployment would result in multiple OVN chassis with a duplicate name, e.g. “10”, connecting to the OVN Southbound database. This spawns high numbers of create/delete events in the Encap database table, leading to near 100% CPU usage of OVN/OVS/Neutron processes.

  • Fixes an issue with Manila deployment starting openvswitch and neutron-openvswitch-agent containers when enable_manila_backend_generic was set to False. LP#1884939

  • Fixes the Elasticsearch Curator cron schedule run. LP#1885732

  • Fixes an incorrect configuration for nova-conductor when a custom Nova policy was applied, preventing the nova_conductor container from starting successfully. LP#1886170

  • Fixes Castellan (Barbican client) when used with TLS enabled. LP#1886615

  • Fixes an incorrect Ceph keyring file configuration in gnocchi.conf, which prevented Gnocchi from connecting to Ceph. LP#1886711

  • Fixes --configdir parameter to apply to default passwords.yml location. LP#1887180

  • Fluentd now logs to /var/log/kolla/fluentd/fluentd.log instead of stdout. LP#1888852

  • Adds functionality to the glance role to add an extra config file for image property protection and interoperable image import. LP#1889272

  • Fixes deploy-containers action missing for the Masakari role. LP#1889611

  • No longer uses option http-tunnel for Neutron Server in HAProxy. This mode has long been discouraged and was recently deprecated entirely. LP#1892686

  • Fixes some Neutron subprojects not using the rolling upgrade scheme. SFC forcibly used the legacy scheme and dynamic routing was not migrated at all. LP#1894380

  • Fixes an issue where the keystone container would be stuck in a restart loop with a message that the fernet key is stale. LP#1895723

  • Fixes haproxy_single_service_split template to work with the default mode (http). LP#1896591

  • Fixes the invalid fernet cron file path on Debian/Ubuntu, changing it from /var/spool/cron/crontabs/root/fernet-cron to /var/spool/cron/crontabs/root. LP#1898765

  • Adds with_first_found to the placement role for the placement-api WSGI configuration, allowing users to override it. LP#1898766

  • OVN will no longer schedule SNAT routers on compute nodes when neutron_ovn_distributed_fip is enabled. LP#1901960

  • RabbitMQ services are now restarted serially to avoid a split brain. LP#1904702

  • Fixes LP#1906796 by adding notice and note loglevels to the monasca log-metrics drop configuration.

  • Fixes Swift’s stop action. It will no longer try to start swift-object-updater container again. LP#1906944

  • Fixes an issue with the kolla-ansible prechecks command with Docker 20.10. LP#1907436

  • Fixes an issue with kolla-ansible mariadb_recovery when the mariadb container does not exist on one or more hosts. LP#1907658

  • Fixes Freezer deployment failure when using kolla_dev_mod. LP#1888242

  • Adds missing “become: true” to some VMware-related tasks, namely Copying VMware vCenter CA file and Copying over nsx.ini.

  • Fixes Nova deployment failure when using kolla_dev_mod.

  • In line with clients for other services used by Magnum, Cinder and Octavia also use endpoint_type = internalURL. In the same vein, these services also use the globally defined openstack_region_name.

  • Fixes issues with some CloudKitty commands trying to connect to an external TLS endpoint using HTTP. LP#1888544

  • Fixes an issue where Docker may fail to start if iptables is not installed. LP#1899060

  • The file generated by kolla-ansible post-deploy was previously created with root:root ownership and 644 permissions. This would allow anyone with access to the same directory to read the file, including the admin credentials. The ownership of the file is now set to the user executing kolla-ansible, and the file is assigned a mode of 600. This change can be applied by running kolla-ansible post-deploy.

  • Fixes an issue with Cinder upgrades that would cause online schema migration to fail. LP#1880753

  • Fixes the configuration of the etcd service so that its protocol is independent of the value of the internal_protocol parameter. The etcd service is not load balanced by HAProxy, so there is no proxy layer to do TLS termination when internal_protocol is configured to be https.

  • Fixes an issue when deleting evacuated instances with encrypted block devices. LP#1891462

  • Fixes LP#1885885 where the default chunk size in the Monasca Fluentd output plugin increased from 8MB to 256MB for file buffering which exceeded the limit allowed by the Monasca Log / Unified API.

  • Fixes an issue where Keystone Fernet key rotation may fail due to a permission denied error if the Keystone rotation happens before the Keystone container starts. LP#1888512

  • Fixes an issue with Keystone startup when Fernet key rotation does not occur within the configured interval. This may happen due to one of the Keystone hosts being down at the scheduled time of rotation, or due to uneven intervals between cron jobs. LP#1895723

  • Reverts the arp_responder option setting to the default (‘False’) for the LinuxBridge agent, as this is known to cause problems with l2_population as well as other issues such as not being fully compatible with the allowed-address-pairs extension. LP#1892776

  • Fixes an issue with the Neutron Linux bridge ML2 driver where the firewall driver configuration was not applied. LP#1889455

  • Fixes an issue with Masakari and internal TLS where CA certificates were not copied into containers, and the path to the CA file was not configured. Depends on masakari bug 1873736 being fixed. LP#1888655

  • Fixes an issue where Grafana instances would race to bootstrap the Grafana DB. See LP#1888681.

  • Fixes LP#1892210 where the number of open connections to Memcached from neutron-server would grow over time until reaching the maximum set by memcached_connection_limit (5000 by default), at which point the Memcached instance would stop working.

  • Adds a new variable fluentd_elasticsearch_cacert, which defaults to the value of openstack_cacert. If set, this will be used to set the path of the CA certificate bundle used by Fluentd when communicating with Elasticsearch. LP#1885109

  • Fixes an issue where, when Kafka default topic creation was used to create a Kafka topic, no redundant replicas were created in a multi-node cluster. LP#1888522. This affects Monasca, which uses Kafka, and was previously masked by the legacy Kafka client used by Monasca, which has since been upgraded in Ussuri. Monasca users with multi-node Kafka clusters should consult the Kafka documentation to increase the number of replicas.

  • Improves error reporting in kolla-genpwd and kolla-mergepwd when input files are not in the expected format. LP#1880220.

  • Fixes an issue where the br_netfilter kernel module was not loaded on compute hosts. LP#1886796

  • Fixes Magnum trust operations in multi-region deployments.

  • The Prometheus OpenStack exporter now uses internal endpoints to communicate with OpenStack services, to match the configuration of other services deployed by Kolla Ansible.

  • Prevents adding a new Keystone host to an existing cluster when not targeting all Keystone hosts (e.g. due to --limit or --serial arguments), to avoid overwriting existing Fernet keys. LP#1891364

  • Reduces the use of SQLAlchemy connection pooling, to improve service reliability during a failover of the controller with the internal VIP. LP#1896635

  • No longer configures the Prometheus OpenStack exporter to use the prometheus Docker volume, which was never required.

  • Deploys Apache ZooKeeper if Apache Storm is enabled explicitly. Previously, ZooKeeper would only be deployed if Apache Kafka was also enabled, which is often done implicitly by enabling Monasca.
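
The duplicate OVN chassis name described in the fix for IP addresses in the Ansible inventory can be reproduced with a short Python sketch. It assumes the name was derived by taking the text before the first dot of the inventory hostname; that derivation, the function name, and the sample hosts are all hypothetical and are not taken from the role itself.

```python
def short_name(inventory_hostname):
    """Return the text before the first dot (an assumed derivation)."""
    return inventory_hostname.split(".", 1)[0]


# Hypothetical inventory entries: two named hosts and two hosts listed by IP.
hostnames = ["compute01.example.org", "compute02.example.org"]
addresses = ["10.0.0.11", "10.0.0.12"]

# Named hosts yield distinct short names.
names_from_hostnames = [short_name(h) for h in hostnames]

# IP addresses collapse to their first octet, so both hosts share one name.
names_from_addresses = [short_name(a) for a in addresses]
```

Under this assumption, every host in the same network would register with the same chassis name, which matches the “10” example given in the note.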

Other Notes

  • Adds trove-guestagent.conf for Trove.