Ussuri Series Release Notes
When using a distribution with a recent SELinux release such as CentOS 8 Stream, PING health-monitor does not work as shell_exec_t calls are denied by SELinux.
Fixed configuration issue which allowed authenticated and authorized users to inject code into HAProxy configuration using API requests. Octavia API no longer accepts unencoded whitespace characters in url_path values in update requests for healthmonitors.
The fix that updates the Netfilter Conntrack Sysfs variables requires rebuilding the amphora image in order to be effective.
Increased the TCP buffer memory maximum and enabled MTU ICMP black hole detection.
The generated RSyslog configuration on the amphora now supports RSyslog failover with TCP if multiple RSyslog servers are specified.
To avoid hitting the Neutron API hard when a batch update creates many new members, the subnet validation results are now cached within the batch update members API call. Only new members are validated during a batch update, since the subnet ID is immutable.
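The caching idea can be sketched as follows. This is a minimal illustration, not Octavia's actual code: the function name and the `subnet_exists` callback (standing in for a Neutron lookup) are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional


def validate_new_member_subnets(
    new_member_subnet_ids: Iterable[Optional[str]],
    subnet_exists: Callable[[str], bool],
) -> List[str]:
    """Return the subnet IDs that failed validation, looking each ID up once.

    `subnet_exists` is a hypothetical stand-in for the Neutron API call.
    """
    cache: Dict[str, bool] = {}
    invalid: List[str] = []
    for subnet_id in new_member_subnet_ids:
        if subnet_id is None:
            # Members without a subnet need no lookup.
            continue
        if subnet_id not in cache:
            # Only the first occurrence of a subnet triggers a lookup.
            cache[subnet_id] = subnet_exists(subnet_id)
        if not cache[subnet_id] and subnet_id not in invalid:
            invalid.append(subnet_id)
    return invalid
```

With many new members sharing a few subnets, the number of lookups is bounded by the number of distinct subnets rather than the number of members.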
Disable conntrack for TCP flows in the Amphora. This reduces memory usage for HAProxy-based listeners and prevents some kernel warnings about dropped packets.
Fix disabled UDP pools. Disabled UDP pools were marked as “OFFLINE” but the requests were still forwarded to the members of the pool.
Correctly detect the member operating status “drain” when querying status data from HAProxy.
Enable required SELinux booleans for CentOS or RHEL amphora image.
Fixed a backwards compatibility issue with the feature that preserves HAProxy server states between reloads. HAProxy versions 1.5 and below do not support this feature, so Octavia will not activate it on amphorae with those versions.
Fix a bug that prevented the provisioning_state of a health-monitor from being set to ERROR when an error occurred while creating, updating or deleting a health-monitor.
Fix an issue with amphorav2 and persistence: some long tasks executed by a controller might have been released in taskflow and rescheduled on another controller. Octavia now ensures that a task is never released early by using a keepalive mechanism to notify taskflow (and its redis backend) that a job is still running.
Fixed an issue with members in ERROR operating status that may have been updated briefly to ONLINE during a Load Balancer configuration change.
Netfilter Conntrack Sysfs variables net.netfilter.nf_conntrack_max and nf_conntrack_expect_max get set to sensible values on the amphora now. Previously, kernel default values were used which were much too low for the configured net.netfilter.nf_conntrack_buckets value. As a result packets could get dropped because the conntrack table got filled too quickly. Note that this affects only UDP and SCTP protocol listeners. Connection tracking is disabled for TCP-based connections on the amphora including HTTP(S).
Fix an issue where the provisioning status of a load balancer was set to ERROR too early when an error occurred, making the load balancer mutable before the tasks for this resource had finished executing.
Fix an issue that could set the provisioning status of a load balancer to a PENDING_UPDATE state when an error occurred in the amphora failover flow.
Fix a bug when updating a load balancer with a QoS policy after a failover: Octavia attempted to update the VRRP ports of the deleted amphorae, moving the provisioning status of the load balancer to ERROR.
Fix a potential race condition when updating a resource in the amphorav2 worker. The worker was not waiting for the resource to be set to PENDING_UPDATE, so the resource may have been updated with old data from the database, resulting in a no-op update.
Fix an issue when Octavia performs a failover of an ACTIVE-STANDBY load balancer that has both amphorae missing. Some tasks in the controller took too long to time out because the timeout value defined in [haproxy_amphora].active_connection_rety_interval was not used.
Fix a bug that could have triggered a race condition when configuring a member interface in the amphora. Due to a race condition, a network interface might have been deleted from the amphora, leading to a loss of connectivity.
Fixed “Could not retrieve certificate” error when updating/deleting the client_ca_tls_container_ref field of a listener after a CA/CRL was deleted.
Fixed validations in the L7 rule and session cookie APIs to prevent authenticated and authorized users from injecting code into the HAProxy configuration. CR and LF (\r and \n) are no longer allowed in L7 rule keys and values. Session persistence cookie names must follow the rules described in https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie.
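A rough sketch of the two checks described here, assuming the cookie-name rule is the token character set given on the linked MDN page (US-ASCII without control characters, whitespace or separator characters). The function names are hypothetical and do not reflect Octavia's own validators.

```python
import re

# Token characters permitted in a cookie name: US-ASCII excluding control
# characters, space, tab and the separators ( ) < > @ , ; : \ " / [ ] ? = { }
_COOKIE_NAME_RE = re.compile(r"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+")


def is_safe_l7rule_text(value: str) -> bool:
    """Reject L7 rule keys or values that contain CR or LF."""
    return "\r" not in value and "\n" not in value


def is_valid_cookie_name(name: str) -> bool:
    """Accept only non-empty names made entirely of token characters."""
    # fullmatch is used so that a trailing newline cannot slip past an anchor.
    return _COOKIE_NAME_RE.fullmatch(name) is not None
```

Rejecting line breaks matters because each line of the generated HAProxy configuration is a separate directive, so an embedded newline would let a value start a new directive.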
Fix load balancers stuck in PENDING_UPDATE issues for some API calls (POST /l7rule, PUT /pool) when a provider denied the call.
Validate that the creation of L7 policies is compatible with the protocol of the listener in the Amphora driver. L7 policies are allowed for Terminated HTTPS or HTTP protocol listeners, but not for HTTPS, TCP or UDP protocol listeners.
Fixes a load balancer creation failure when one of the listener ports matches the Octavia-generated peer ports and allowed_cidr is explicitly set to 0.0.0.0/0 on the listener. The failure was caused by the creation of two security group rules, one with remote_ip_prefix set to None and one with remote_ip_prefix set to 0.0.0.0/0; Neutron rejected the second request because the security group rule already exists.
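The compatibility rule above amounts to a simple membership test. The sketch below assumes the protocol names are spelled as in the Octavia API (`HTTP`, `TERMINATED_HTTPS`, and so on); the constant and function names are hypothetical.

```python
# Listener protocols for which L7 policies are allowed, per the note above.
L7_CAPABLE_LISTENER_PROTOCOLS = frozenset({"HTTP", "TERMINATED_HTTPS"})


def listener_supports_l7policies(listener_protocol: str) -> bool:
    """Return True if an L7 policy may be created on this listener."""
    return listener_protocol in L7_CAPABLE_LISTENER_PROTOCOLS
```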
Fix a serialization error when using host_routes in VIP subnets when persistence in the amphorav2 driver is enabled.
Fixed MAX_TIMEOUT for the timeout_client_data, timeout_member_connect, timeout_member_data and timeout_tcp_inspect listener API fields. The value was reduced from 365 days to 24 days, so it no longer exceeds the maximum value of the data type in the DB.
Increase the limit values for nr_open and file-max in the amphora. The new values are based on what HAProxy 2.x expects from the system with the greatest maxconn value that Octavia can set.
Fixed an issue where batch member updates that contain no changes did not properly roll back the update.
Fixed an issue where an amphorav2 load balancer could not be reached after a load balancer failover. The LB security group was not set on the amphora port.
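The arithmetic behind the 24-day figure can be checked as below. Two assumptions are made that the note itself does not state: that the timeouts are stored in milliseconds, and that the database column is a signed 32-bit integer.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 ms in one day
INT32_MAX = 2**31 - 1              # assumed column limit: 2,147,483,647

old_max_ms = 365 * MS_PER_DAY      # previous MAX_TIMEOUT (365 days)
new_max_ms = 24 * MS_PER_DAY       # new MAX_TIMEOUT (24 days)


def fits_int32(value: int) -> bool:
    """Return True if the value fits in a non-negative signed 32-bit integer."""
    return 0 <= value <= INT32_MAX
```

Under these assumptions, 365 days is 31,536,000,000 ms and overflows the column, 24 days is 2,073,600,000 ms and fits, and 25 days (2,160,000,000 ms) would already be too large.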
Fixes an issue where provider drivers may not decrement the load balancer objects quota on delete.
Fix an issue with the rsyslog configuration file in the Amphora when the log offloading feature and the local log storage feature are both disabled.
Some IPv6 UDP members were incorrectly marked in ERROR status, because of a formatting issue while generating the health message in the amphora.
Fixed an issue with the lo interface in the amphora-haproxy network namespace. The lo interface was down and prevented haproxy from communicating with other haproxy processes (for persistent stick tables) on configuration change. It delayed old haproxy worker cleanup and increased memory consumption after reloading the configuration.
Fix load balancers that use customized host_routes in the VIP or the member subnets in amphorav2.
Fix weighted round-robin for UDP listeners with keepalived and lvs. The algorithm must be specified as ‘wrr’ in order for weighted round-robin to work correctly, but was being set to ‘rr’.
Fixed the healthcheck endpoint always querying the backends by caching results for a configurable time. The default is five seconds.
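A time-based cache of this kind might look like the sketch below. The class, its parameters and the `check_backends` callback are hypothetical and only illustrate the technique; they are not the oslo.middleware or Octavia implementation.

```python
import time
from typing import Callable, Optional


class CachedHealthcheck:
    """Cache a backend check result for `ttl_seconds` before re-querying."""

    def __init__(
        self,
        check_backends: Callable[[], bool],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._check = check_backends
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._result: Optional[bool] = None
        self._expires_at = 0.0

    def status(self) -> bool:
        now = self._clock()
        if self._result is None or now >= self._expires_at:
            # Cache is empty or stale: query the backends and restart the TTL.
            self._result = self._check()
            self._expires_at = now + self._ttl
        return self._result
```

Frequent polling of the endpoint then costs at most one backend query per TTL window.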
Fix a bug that allowed a user to create a load balancer on a vip_subnet_id that belongs to another user using the subnet UUID.
Fixes an issue with load balancer failover, when the VIP subnet is out of IP addresses, that could lead to the VIP being deallocated.
Fixed an issue where members added to TLS-enabled pools would go to ERROR provisioning status.
Fix default value override for timeout values for listeners. Changing the default timeouts in the configuration file wasn’t correctly applied in the default listener parameters.
Fix operational status for disabled UDP listeners. The operating status of disabled UDP listeners is now OFFLINE instead of ONLINE; the behavior is now similar to that of HTTP/HTTPS/TCP/… listeners.
Fixed an issue that could cause load balancers, with multiple amphora in a failed state, to be unable to complete a failover.
Fix an incorrect operating_status with empty UDP pools. A UDP pool without any member is now
Add missing cloud-utils-growpart RPM to Red Hat based amphora images.
Add missing cronie RPM to Red Hat based amphora images.
Fix nf_conntrack_buckets sysctl in the Amphora, its value was incorrectly set.
Fixed an issue where updating a CRL or client certificate on a pool would cause the pool to go into ERROR.
Fixed an issue where TLS-enabled pools would fail to provision.
Fix a potential invalid DOWN operating status for members of a UDP pool. A race condition could have occurred when building the first heartbeat message after adding a new member to a pool; this recently added member could have been seen as DOWN.
Add a validation step in the Octavia Amphora driver to ensure that the port_security_enabled parameter is set on the VIP network.
Add a new configuration option to define the default connection_limit for new listeners that use the Amphora provider. The option is [haproxy_amphora].default_connection_limit and its default value is 50,000. This value is used when creating or setting a listener with -1 as connection_limit parameter, or when unsetting connection_limit parameter.
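The fallback behavior can be sketched as follows, modelling an unset connection_limit as `None` (an assumption made for illustration). The function name is hypothetical; only the option name and its default of 50,000 come from the note above.

```python
from typing import Optional

# Default of [haproxy_amphora].default_connection_limit, per the note above.
DEFAULT_CONNECTION_LIMIT = 50_000


def effective_connection_limit(
    requested: Optional[int],
    configured_default: int = DEFAULT_CONNECTION_LIMIT,
) -> int:
    """Resolve the connection limit applied to a listener.

    -1 and None (modelling an unset parameter) both fall back to the
    configured default; any other value is used as given.
    """
    if requested is None or requested == -1:
        return configured_default
    return requested
```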
The failover improvements do not require an updated amphora image, but updating existing amphora will minimize the failover outage time for standalone amphora on subsequent failovers.
If you are using the admin_or_owner-policy.yaml policy override file you should upgrade your API processes to include the unscoped token fix. The default policies are not affected by this issue.
Fixed an issue with failing over an amphora if the pair amphora in an active/standby pair had a missing VRRP port in neutron.
Fixed an issue where the setting of SNI containers was not being applied on listener update API calls.
Fixed an Octavia API validation on listener update where SNI containers could be set on non-TERMINATED_HTTPS listeners.
Fixed an issue where some columns could not be used for sort keys in API list calls.
Fixed an issue where listener creation failed when the Barbican service had TLS enabled.
Fixed an issue where amphora load balancers fail to create when Nova anti-affinity is enabled and topology is SINGLE.
Fixed an issue where listener “insert_headers” parameter was accepted for protocols that do not support header insertion.
Fixed an issue where UDP only load balancers would not bring up the VIP address.
Fixes an issue when using the admin_or_owner-policy.yaml policy override file and unscoped tokens.
With haproxy 1.8.x releases, haproxy consumes much more memory in the amphorae because of pre-allocated data structures. This amount of memory depends on the maxconn parameters in its configuration file (which is related to the connection_limit parameter in the Octavia API). In the Amphora provider, the default connection_limit value -1 is now converted to a maxconn of 50,000. It was previously 1,000,000 but that value triggered some memory allocation issues when quickly performing multiple configuration updates in a load balancer.
Significantly improved the reliability and performance of amphora and load balancer failovers. This is especially true when the Nova service is experiencing failures.
An amphora image update is recommended to pick up a workaround to an HAProxy issue where it would fail to reload on configuration change should the local peer name start with “-x”.
Fixed an issue where the Octavia Health Manager kept failing over the amphorae of a disabled load balancer.
Workaround an HAProxy issue where it would fail to reload on configuration change should the local peer name start with “-x”.
HTTPS-terminated listeners can now be individually configured with an OpenSSL cipher string. The default cipher string for new listeners can be specified in octavia.conf. The built-in default is OWASP’s “Suite B” recommendation. (https://cheatsheetseries.owasp.org/cheatsheets/TLS_Cipher_String_Cheat_Sheet.html) Existing listeners will be unaffected.
Added the oslo-middleware healthcheck app to the Octavia API. Hitting /healthcheck will return a 200. This is enabled via the [api_settings]healthcheck_enabled setting and is disabled by default.
Operators can now use the amphorav2 provider, which uses a jobboard-based controller. A jobboard controller solves the issue of resources stuck in PENDING_* states by writing information about task states to a persistent backend and monitoring job claims via the jobboard.
Add listener and pool protocol validation. The pool and listener can’t be combined arbitrarily. We need some constraints on the protocol side.
Added support for CentOS 8 amphora images.
Two new types of healthmonitoring are now valid for UDP listeners. Both
TCP check types can now be used.
Add an API for allowing administrators to manage Octavia Availability Zones and Availability Zone Profiles, which behave nearly identically to Flavors and Flavor Profiles.
Availability zone profiles can now override the
Added an option to the diskimage-create.sh script to specify the Octavia Git branch to build the image from.
TLS-enabled pools can now be individually configured with an OpenSSL cipher string. The default cipher string for new pools can be specified in octavia.conf. The built-in default is OWASP’s “Suite B” recommendation. (https://cheatsheetseries.owasp.org/cheatsheets/TLS_Cipher_String_Cheat_Sheet.html) Existing pools will be unaffected.
The load balancer create command now accepts an availability_zone argument. With the amphora driver this will create a load balancer in the targeted compute availability_zone in nova.
When using spare pools, it will create spares in each AZ. For the amphora driver, if no [nova] availability_zone is configured and availability zones are used, results may be slightly unpredictable.
Note (for the amphora driver): if it is possible for an amphora to change availability zone after initial creation (not typically possible without outside intervention) this may affect the ability of this feature to function properly.
After this upgrade, users will no longer be able to use network resources they cannot see or “show” on load balancers. Operators can revert this behavior by setting the “allow_invisible_resource_usage” configuration file setting to
Any amphorae running a py3 based image must be recycled or else they will eventually fail on certificate rotation.
Python 2.7 support has been dropped. The minimum version of Python now supported by Octavia is Python 3.6.
A new amphora image is required to fix the potential certs-ramfs race condition.
Previously, if a user knew or could guess the UUID for a network resource, they could use that UUID to create load balancer resources. Now the user must have permission to see or “show” the resource before it can be used with a load balancer. This will be the new default, but operators can disable this behavior by setting the configuration file setting “allow_invisible_resource_usage” to True. This issue falls under the “Class C1” security issue as the user would require a valid UUID.
Correctly require two-way certificate authentication to connect to the amphora agent API (CVE-2019-17134).
A race condition between the certs-ramfs and the amphora agent may lead to tenant TLS content being stored on the amphora filesystem instead of in the encrypted RAM filesystem.
Resolved broken certificate upload on py3 based amphora images. On a housekeeping certificate rotation event, the amphora would clear out its server certificate and return a 500, putting the amphora in ERROR status and breaking further communication. See upgrade notes.
Fixes an issue where load balancers with more than one TLS enabled listener, one or more SNI enabled, may load certificates from other TLS enabled listeners for SNI use.
Fixed a potential race condition with the certs-ramfs and amphora agent services.
Fixes an issue where load balancers with more than one TLS enabled listener, using client authentication and/or backend re-encryption, may load incorrect certificates for the listener.
Fix a bug that could interrupt resource creation when performing a graceful shutdown of the house keeping service and leave resources such as amphorae in a BOOTING status.
Fixed an issue where load balancers would go into ERROR when setting data not visible to providers (e.g. tags).
Fixes the ability to filter on the provider flavor capabilities API.
Fixed code that configured the CentOS/Red Hat amphora images to use the correct names for the network ‘ifcfg’ files for static routes and routing rules. It was using the wrong name for the routes file, and did not support IPv6 in either file. For more information, see https://storyboard.openstack.org/#!/story/2007051
Fix a bug that could interrupt resource creation when performing a graceful shutdown of the controller worker and leave resources in a PENDING_CREATE/PENDING_UPDATE/PENDING_DELETE provisioning status. If the duration of an Octavia flow is greater than the ‘graceful_shutdown_timeout’ configuration value, stopping the Octavia worker can still interrupt the creation of resources.
Delay between checks on UDP healthmonitors was using the incorrect config value timeout, when it should have been
Amphorae that are booting for a specific loadbalancer will now be linked to that loadbalancer immediately upon creation. Previously this would not happen until near the end of the process, leaving a gap during booting in which it was difficult to understand which booting amphora belonged to which loadbalancer. This was especially problematic when attempting to troubleshoot loadbalancers that entered ERROR status due to boot issues.