2023.1 Series Release Notes¶
16.1.0-29¶
New Features¶
Added capability to specify custom kernel modules for Neutron. neutron_modules_default lists the default modules; neutron_modules_extra is for custom modules and parameters.
Supports Debian Bookworm (12) as host distribution.
In the configuration template of the Senlin service, the cafile parameter is now set by default in the authentication section. This allows the Senlin service to work with self-signed certificates on the internal Keystone endpoint.
Upgrade Notes¶
The ironic_tftp service no longer binds to 0.0.0.0; by default it uses the IP address of the api_interface. To revert to the old behaviour, set ironic_tftp_interface_address: 0.0.0.0 in globals.yml.
Configures Nova libvirt.num_pcie_ports to 16 by default. Nova currently sets ‘num_pcie_ports’ to “0” (which falls back to libvirt’s default of “1”), which is not sufficient for hotplug use with the ‘q35’ machine type.
Changes the default value of the Nova libvirt driver setting skip_cpu_compare_on_dest to true. With the libvirt driver, during live migration, skip comparing guest CPU with the destination host. When using QEMU >= 2.9 and libvirt >= 4.4.0, libvirt will do the correct thing with respect to checking CPU compatibility on the destination host during live migration.
Security Issues¶
Restricts access, by default, to the /server-status path exposed by the HTTP OpenStack services through HAProxy on the public endpoint. This fixes the issue for Ubuntu/Debian installations; Rocky Linux/CentOS installations are not affected. LP#1996913
Bug Fixes¶
Fixes issues with OVN NB/SB DB deployment, where first node needs to be rebootstrapped. LP#1875223
enable_keystone_federation and keystone_enable_federation_openid were previously not explicitly handled as bool in various templates in the keystone role. LP#2036390
Fixes an issue when Kolla is setting the producer tasks to None, and this disables all designate producer tasks. LP#1879557
Fixes ironic_tftp binding to all IP addresses on the system. Added the ironic_tftp_interface, ironic_tftp_address_family and ironic_tftp_interface_address parameters to set the address for the ironic_tftp service. LP#2024664
Fixes an issue where a Docker health check wasn’t configured for the OpenSearch Dashboards container. See bug 2028362.
Fixes an issue where ‘q35’ libvirt machine type VM could not hotplug more than one PCIe device at a time.
Fixes an issue where the keepalived track script fails in a single-controller environment and the keepalived VIP goes into BACKUP state. The keepalived_track_script_enabled variable has been introduced (default: true), which can be used to disable track scripts in the keepalived configuration. LP#2025219
Fixes an issue where an OVS-DPDK task had a different name from the one used to notify it.
16.1.0¶
Upgrade Notes¶
Removes the restriction on the maximum supported version of 2.14.2 for ansible-core. Any 2.14 series release is now supported.
Security Issues¶
The kolla-genpwd, kolla-mergepwd, kolla-readpwd and kolla-writepwd commands now create or update passwords.yml with correct permissions. They also display a warning message about incorrect permissions.
Bug Fixes¶
Sets correct permissions for the opensearch-dashboard data location. LP#2020152 https://bugs.launchpad.net/kolla-ansible/+bug/2020152
Fixes the incorrect endpoint URLs and service type information for the Cyborg service in Keystone. LP#2020080
Other Notes¶
Refactors the MariaDB and RabbitMQ restart procedures to be compatible with Ansible 2.14.3+. See Ansible issue 80848 for details.
16.0.0¶
New Features¶
Adds the flag om_enable_rabbitmq_high_availability. Setting this to true will enable both durable queues and classic mirrored queues in RabbitMQ. Note that classic queue mirroring and transient (aka non-durable) queues are deprecated and subject to removal in RabbitMQ version 4.0 (date of release unknown). Changes the pattern used in classic mirroring to exclude some queue types. This pattern is: ^(?!(amq\\.)|(.*_fanout_)|(reply_)).*
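To see which queue names the classic mirroring pattern selects, the following minimal sketch tests it with Python's re module. It assumes the doubled backslash in the note is JSON escaping, so the effective regex uses a single backslash. The queue names are hypothetical, and RabbitMQ evaluates the pattern with its own regex engine, so treat the result as an approximation.

```python
import re

# Effective pattern, assuming "\\." in the note is the JSON-escaped form of "\.".
MIRROR_PATTERN = re.compile(r"^(?!(amq\.)|(.*_fanout_)|(reply_)).*")


def is_mirrored(queue_name: str) -> bool:
    """Return True if the pattern would select this queue for mirroring."""
    return MIRROR_PATTERN.match(queue_name) is not None


# Hypothetical queue names, for illustration only.
for name in ("conductor", "amq.gen-123", "scheduler_fanout_abc", "reply_abc"):
    print(name, is_mirrored(name))
```

Only the first name is selected; the other three fall into the excluded queue types (names starting with amq., names containing _fanout_, and names starting with reply_).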
Since the fix for CVE-2022-29404, the default value for the LimitRequestBody directive in the Apache HTTP Server has changed from 0 (unlimited) to 1073741824 (1 GiB). This limits the size of, for example, images uploaded in Horizon. The limit can now be configured via horizon_httpd_limitrequestbody. LP#2012588
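The value is expressed in bytes, so a small conversion helps when picking a different limit. This is a minimal sketch; the helper name limit_request_body is hypothetical and not part of Kolla Ansible.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def limit_request_body(gib: int) -> int:
    """Return a LimitRequestBody value in bytes for a whole number of GiB."""
    if gib < 0:
        raise ValueError("size must not be negative")
    return gib * GIB


# 1 GiB reproduces the new Apache default quoted in the note: 1073741824.
print(limit_request_body(1))
```

The resulting number is what would be assigned to horizon_httpd_limitrequestbody; per the note, 0 means unlimited.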
Adds the skyline Ansible role.
Adds support for container state control through systemd in kolla_docker. Every container logs only to journald and has its own unit file in /etc/systemd/system named kolla-<container name>-container.service. Systemd control is implemented in the new file ansible/module_utils/kolla_systemd_worker.py.
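The unit file naming scheme can be illustrated with a short sketch. The helper unit_file_path and the container name nova_api are assumptions for illustration; this is not the code in kolla_systemd_worker.py.

```python
from pathlib import PurePosixPath

# Directory named in the release note.
UNIT_DIR = PurePosixPath("/etc/systemd/system")


def unit_file_path(container_name: str) -> PurePosixPath:
    """Build the unit file path following kolla-<container name>-container.service."""
    return UNIT_DIR / f"kolla-{container_name}-container.service"


# Hypothetical container name, for illustration only.
print(unit_file_path("nova_api"))
```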
etcd is now exposed internally via HAProxy on etcd_client_port.
Adds the command kolla-ansible validate-config. This runs oslo-config-validator against the configuration files present in the deployed OpenStack services. By default, results are saved to /var/log/kolla/config-validate.
With the parameter mariadb_datadir_volume it is possible to use a directory as volume for the mariadb service. By default, a volume named mariadb is used (the previous default).
Adds support for deploying neutron-ovn-agent. The agent is disabled by default and can be enabled using neutron_enable_ovn_agent. This new agent will run on a compute node using OVN as network backend, similar to other ML2 mechanism drivers such as ML2/OVS or ML2/SRIOV. This new agent will perform those actions that the ovn-controller service cannot execute. More details are in the RFE: https://bugs.launchpad.net/neutron/+bug/1998608
With the new neutron_ovn_availability_zones parameter it is possible to define network availability zones for OVN. Further details can be found in the Neutron OVN documentation: https://docs.openstack.org/neutron/latest/admin/ovn/availability_zones.html#how-to-configure-it
Masakari coordination backend can now be configured via masakari_coordination_backend variable. Coordination is optional and can now be set to either redis or etcd.
Adds the ovn-monitor-all variable, a boolean value that tells whether ovn-controller should unconditionally monitor all records in OVS databases. Setting ovn-monitor-all to ‘true’ will remove some CPU load from the OVN Southbound DB but will result in more updates coming to ovn-controller. This might be helpful in large deployments with many compute hosts.
Added two new flags to alter behaviour in RabbitMQ: rabbitmq_message_ttl_ms, which lets you set a TTL on messages, and rabbitmq_queue_expiry_ms, which lets you set an expiry time on queues. See https://www.rabbitmq.com/ttl.html for more information on both.
Adds the ability to configure RabbitMQ via rabbitmq_extra_config, which can be overridden in globals.yml.
The config option rabbitmq_ha_replica_count is added, to allow for changing the replication factor of mirrored queues in RabbitMQ. While the flag is unset, the queues are mirrored across all nodes using “ha-mode”:“all”. Note that this only has an effect if the flag om_enable_rabbitmq_high_availability is set to True, as otherwise queues are not mirrored.
The config option rabbitmq_ha_promote_on_shutdown has been added, which allows changing the RabbitMQ definition ha-promote-on-shutdown. By default ha-promote-on-shutdown is “when-synced”. We recommend changing this to be “always”. This basically means we don’t mind losing some messages, instead we give priority to rabbitmq availability. This is most relevant when restarting rabbitmq, such as when upgrading. Note that setting the value of this flag, even to the default value of “when-synced”, will cause RabbitMQ to be restarted on the next deploy. For more details please see: https://www.rabbitmq.com/ha.html#cluster-shutdown
When restarting a RabbitMQ container, the node is now first put into maintenance mode. This will make the node shutdown less disruptive. For details on what maintenance mode does, see: https://www.rabbitmq.com/upgrade.html#maintenance-mode
Switches trove-api to WSGI running under Apache.
Added configuration options to enable backend TLS encryption from HAProxy to the Trove service.
Services using etcd3gw via tooz now use etcd via haproxy. This removes a single point of failure, where we hardcoded the first etcd host for backend_url.
Upgrade Notes¶
Minimum supported Ansible version is now 6 (ansible-core 2.13) and maximum supported is 7 (ansible-core 2.14). Due to a regression in ansible-core, it must not be greater than 2.14.2.
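The supported ansible-core range for this release can be expressed as a simple tuple comparison. This is a minimal sketch that assumes plain numeric version strings (no release-candidate suffixes); the function names are hypothetical.

```python
# Bounds for the 16.0.0 release, taken from the note above.
MIN_ANSIBLE_CORE = (2, 13)
MAX_ANSIBLE_CORE = (2, 14, 2)


def parse_version(version: str) -> tuple:
    """Split a dotted numeric version such as '2.14.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_supported(version: str) -> bool:
    """Return True if the ansible-core version lies within the supported range."""
    return MIN_ANSIBLE_CORE <= parse_version(version) <= MAX_ANSIBLE_CORE


for candidate in ("2.12.10", "2.13.0", "2.14.2", "2.14.3"):
    print(candidate, is_supported(candidate))
```

Tuples compare element by element, so 2.14.2 is accepted while 2.14.3 is rejected.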
skydive service deployment support has been dropped, following removal of Kolla skydive images.
RabbitMQ replica count has changed from n to (n//2+1) where n is the number of RabbitMQ nodes. That is, for a 3 node cluster, we request exactly 2 replicas, for a 1 node cluster, we request 1 replica, and for a 5 node cluster, we request 3 replicas. This only has an effect if om_enable_rabbitmq_high_availability is set to True, otherwise queues are not replicated. The number of mirrored queues is not changed automatically, and instead requires the queues to be recreated (for example, by restarting RabbitMQ). This follows the good practice advice here: https://www.rabbitmq.com/ha.html#replication-factor A major motivation is to reduce the load on RabbitMQ in larger deployments. It is hoped that the improved performance will also help RabbitMQ recover more quickly from cluster issues. Note that the contents of the RabbitMQ definitions.json are now changed, meaning RabbitMQ containers will be restarted on next deploy/upgrade.
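The replica count formula can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function name replica_count is hypothetical and this is not how Kolla Ansible itself renders the value.

```python
def replica_count(node_count: int) -> int:
    """Return n // 2 + 1, a strict majority of the RabbitMQ nodes."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return node_count // 2 + 1


# Reproduces the cluster sizes quoted in the note.
for nodes in (1, 3, 5):
    print(nodes, replica_count(nodes))
```

For 1, 3 and 5 nodes this yields 1, 2 and 3 replicas respectively, matching the figures in the note.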
The RabbitMQ variable rabbitmq_ha_promote_on_shutdown now defaults to “always”. This only has an effect if om_enable_rabbitmq_high_availability is set to True. When ha-promote-on-shutdown is set to always, queue mirrors are promoted on shutdown even if they aren’t fully synced. This means that we value availability over the risk of losing some messages. Note that the contents of the RabbitMQ definitions.json are now changed, meaning RabbitMQ containers will be restarted on next deploy/upgrade.
In RabbitMQ, messages now have a TTL of 10 minutes and inactive queues will expire after 1 hour. These queue arguments can be changed dynamically at runtime [1], but it should be noted that applying a TTL to queues which already have messages will discard the messages when specific events occur. See [2] for more details. Note that the contents of the RabbitMQ definitions.json are now changed, meaning RabbitMQ containers will be restarted on next deploy/upgrade. [1] https://www.rabbitmq.com/queues.html#optional-arguments [2] https://www.rabbitmq.com/ttl.html#per-message-ttl-caveats
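The durations above translate to milliseconds as follows. This sketch assumes the defaults are expressed through the rabbitmq_message_ttl_ms and rabbitmq_queue_expiry_ms flags, which take millisecond values judging by their names; the helper to_milliseconds is hypothetical.

```python
from datetime import timedelta


def to_milliseconds(duration: timedelta) -> int:
    """Convert a timedelta to whole milliseconds."""
    return duration // timedelta(milliseconds=1)


# Durations quoted in the note: 10 minute message TTL, 1 hour queue expiry.
message_ttl_ms = to_milliseconds(timedelta(minutes=10))
queue_expiry_ms = to_milliseconds(timedelta(hours=1))

print(message_ttl_ms, queue_expiry_ms)
```

The same helper can be used to derive values for other durations when overriding either flag.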
Changes the rabbitmq upgrade procedure from a full stop of the cluster to a rolling upgrade, which has been supported since RabbitMQ 3.8.
OpenStack services (except Ironic and Keystone) stopped supporting the system scope in their API policy. Kolla, which started using the system scope token during the OpenStack Xena release, needs to revert this and use the project scope token to perform API operations for those services. The Ironic and Keystone operations are still performed using the system scope token.
Default tags of neutron_tls_proxy and glance_tls_proxy have been changed to haproxy_tag, as both services are using the haproxy container image. Any custom tag overrides for those services should be altered before upgrade.
Deprecation Notes¶
Deprecates support for deploying Sahara. Support for deploying Sahara will be removed from Kolla Ansible in the Bobcat Release.
Deprecates support for deploying Vitrage. Support for deploying Vitrage will be removed from Kolla Ansible in the Bobcat Release.
Bug Fixes¶
The precheck for RabbitMQ failed incorrectly when kolla_externally_managed_cert was set to true. LP#1999081
Fixes the kolla_docker module, which did not take into account the common_options parameter, so the module’s default values were always used. LP#2003079
Fixes a keystone task that connected via SSH instead of locally. LP#2004224
Fixes an issue where the SASL account was created before the config file was ready. LP#2015589
The flags --db-nb-pid and --db-sb-pid have been corrected to be --db-nb-pidfile and --db-sb-pidfile respectively. See here for reference: https://github.com/ovn-org/ovn/blob/6c6a7ad1c64a21923dc9b5bea7069fd88bcdd6a8/utilities/ovn-ctl#L1045 LP#2018436
Configuration of service user tokens for all Nova and Cinder services is now done automatically, to ensure security of block-storage volume data. See LP#2004555 for more details.
The value of [oslo_messaging_rabbit] heartbeat_in_pthread is explicitly set to either true for wsgi applications, or false otherwise.
Fixes deployment when using Ansible check mode. LP#2002661
Sets the etcd internal hostname and cacert for deployments with internal TLS enabled. This allows services to work with etcd when coordination is enabled for TLS internal deployments. Without this fix, the coordination backend fails to connect to etcd and the service itself crashes.
Fixes an issue with octavia config generation when using octavia_auto_configure and the genconfig command. Note that access to the OpenStack API is necessary for Octavia auto configuration to work, even when generating config. See LP#1987299 for more details.
Fixes OVN deployment order - as recommended in OVN docs. LP#1979329
When upgrading Nova to a new release, we use the tool nova-status upgrade check to make sure that there are no nova-compute services that are older than N-1 releases. This was performed using the current nova-api container, so computes which will be too old after the upgrade were not caught. Now the upgraded nova-api container image is used, so older computes are identified correctly. LP#1957080
Fixes an issue where some prechecks would fail or not run when running in check mode. LP#2002657
When upgrading or deploying RabbitMQ, the policy ha-all is cleared if om_enable_rabbitmq_high_availability is set to false.
In HA mode, parallel restart of neutron-l3-agent containers will cause a network outage. Adding routers increases the recovery time. This release makes restarts serial and adds a user-configurable delay to ensure each agent is returned to operation before the next one is restarted.
The default value is 0. A nonzero starting value would only result in outages if the failover time was greater than the delay, which would be more difficult to diagnose than consistent behaviour.
Prevent haproxy-config role from attempting to configure firewalld during a kolla-ansible genconfig. LP#2002522