2023.2 Series Release Notes
Fixes a failure case where downloads would not be retried when the checksum fails verification. The agent now includes the checksum activity as part of the file download operation, and will automatically retry downloads when the checksum fails in accordance with the existing download retry logic. This is largely in response to what appear to be intermittent transport failures at lower levels which we cannot otherwise detect.
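The idea of treating checksum verification as part of the download attempt can be sketched as follows. This is a minimal illustration rather than the agent's actual code: the `fetch` callable, the `ChecksumMismatch` exception, and the retry parameters are hypothetical names introduced for this example.

```python
import hashlib
import time


class ChecksumMismatch(Exception):
    """Raised when downloaded bytes do not match the expected digest."""


def download_with_verification(fetch, expected_sha256, retries=3, delay=0.0):
    """Fetch data and verify it; a bad checksum counts as a failed attempt.

    `fetch` is a hypothetical callable that returns the downloaded bytes.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            data = fetch()
            actual = hashlib.sha256(data).hexdigest()
            if actual != expected_sha256:
                raise ChecksumMismatch(
                    f"attempt {attempt}: expected {expected_sha256}, "
                    f"got {actual}")
            return data
        except (OSError, ChecksumMismatch) as exc:
            # Transport errors and checksum failures share one retry path.
            last_error = exc
            if attempt < retries:
                time.sleep(delay)
    raise last_error
```

Because the verification happens inside the loop, a corrupted transfer is retried in the same way as a dropped connection.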
The default timeout value for the agent to look itself up in an Ironic deployment has been extended from 300 seconds to 600 seconds. This is to provide better stability for Ironic deployments under heavy load which may be unable to service new requests. This is particularly true when the backing database for Ironic is SQLite, due to the limited write concurrency of that database.
Fixes the error handling of the multipathd service startup/discovery process. IPA handles both the scenario where the multipathd service is already started and the scenario where it has not been started; in the second scenario IPA will try to start the service. IPA does not check beforehand whether multipathd is already running: it starts the multipathd service even if it is already running and expects a return code of 0. It has been noticed that with certain combinations of Linux distributions and multipathd versions the return code is not 0 when IPA tries to start multipathd while an instance is already running. A non-zero return code raised an exception, which caused multipath device discovery to terminate prematurely, and if the selected root device was a multipath device then IPA was unable to provision. This fix discards the exception caused by the non-zero return code from the multipathd startup process. If there is a genuine issue with the multipath service, it will be caught when the actual multipath device listing command (multipath -ll) is executed.
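The behaviour described above can be sketched roughly as below. This is an illustration under stated assumptions, not the agent's implementation: the function names and the injectable `run` parameter are hypothetical, and invoking `multipathd` with no arguments is assumed to be the start command.

```python
import subprocess


def start_multipathd(run=subprocess.run):
    """Try to start multipathd, ignoring a non-zero exit code.

    A non-zero code may only mean that an instance is already running.
    """
    try:
        run(["multipathd"], check=True, capture_output=True)
    except subprocess.CalledProcessError:
        # Deliberately discarded; a real fault shows up in the listing below.
        pass


def list_multipath_devices(run=subprocess.run):
    """Start the service if possible, then list multipath devices."""
    start_multipathd(run)
    # check=True here means a genuinely broken service still raises.
    result = run(["multipath", "-ll"], check=True,
                 capture_output=True, text=True)
    return result.stdout
```

The design choice is to defer error detection to the command whose result actually matters, rather than to the startup attempt.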
Fixes an issue with rebuilding instances on Software RAID with RAIDed ESP partitions.
Adds a new service extension which facilitates command handling for Ironic to retrieve a list of service steps.
Adds a new base method to the base HardwareManager, get_service_steps, which works the same as get_deploy_steps. These methods can be extended by hardware managers to permit them to signal what steps are permitted.
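A hardware manager advertising a service step might look roughly like the sketch below. To keep the example runnable on its own, `HardwareManager` here is a stand-in class and not an import from ironic_python_agent. The `check_fans` step and the exact keys of the step dictionary are assumptions, modelled on the shape commonly used for deploy steps.

```python
class HardwareManager:
    """Stand-in for the agent's base class so this sketch runs by itself."""

    def get_deploy_steps(self, node, ports):
        return []

    def get_service_steps(self, node, ports):
        # Same contract as get_deploy_steps: a list of step descriptions.
        return []


class ExampleHardwareManager(HardwareManager):
    """Hypothetical manager that permits one custom service step."""

    def get_service_steps(self, node, ports):
        return [{
            "step": "check_fans",        # hypothetical method name below
            "priority": 0,               # assumed: 0 means not run by default
            "interface": "deploy",       # assumed interface name
            "reboot_requested": False,
        }]

    def check_fans(self, node, ports):
        # Placeholder body; a real step would inspect the hardware.
        return {"fans_ok": True}
```

A caller would collect the advertised steps with `ExampleHardwareManager().get_service_steps(node, ports)` and dispatch to the method named in each `step` entry.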
Extends reasonable deploy/clean steps to also be service steps which are embedded in the Ironic agent. For example, CPU, Network, and Memory burnin steps are available as service steps, but not the disk burnin step as that would likely result in the existing disk contents being damaged.
Fixes a failure case where a deployed instance may be unable to access the configuration drive post-deployment. This can occur when block devices only support 4KB IO interactions. When 4KB block IO sizes are in use, the ISO9660 filesystem driver in Linux cannot be used, as it is modeled around a 2KB block. We now attempt to verify the supplied configuration drive, and rebuild it on a FAT filesystem when it cannot be mounted. Operators can force the agent to write configuration drives using the FAT filesystem using the
Fixes, or at least lessens, the case where a running Ironic agent can stack up numerous lookup requests against an Ironic deployment when a node is locked. In particular, this is because the lookup also drives generation of the agent token, which requires the conductor to allocate a worker, generate the token, and return the result to the API client. Ironic’s retry logic will now wait up to 60 seconds, and if an HTTP Conflict (409) message is received, the agent will automatically pause lookup operations for thirty seconds as opposed to continuing to attempt lookups, which could needlessly create more work for the Ironic deployment.
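The pause-on-conflict behaviour can be sketched as below. This is a simplified illustration, not the agent's lookup code: `do_lookup` is a hypothetical callable returning a `(status, body)` tuple, and `max_attempts` and the injectable `sleep` are assumptions made so the example is easy to exercise.

```python
import time

HTTP_OK = 200
HTTP_CONFLICT = 409


def lookup_with_backoff(do_lookup, max_attempts=5, conflict_pause=30,
                        sleep=time.sleep):
    """Repeat a lookup, pausing whenever the node is reported as locked."""
    for _ in range(max_attempts):
        status, body = do_lookup()
        if status == HTTP_OK:
            return body
        if status == HTTP_CONFLICT:
            # The node is locked; wait instead of queueing more requests.
            sleep(conflict_pause)
    return None
```

Pausing on a 409 keeps the agent from asking the conductor to allocate further workers while the earlier request is still holding the lock.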
Fixes timeout declarations for Bandit 1.7.5 rule additions.
Adds a new configuration option http_request_timeout to allow operators to set the amount of time a new request socket will wait. This helps prevent a possible hung connection should the initial packets be lost in transit.
Fixes the NVIDIA hardware manager firmware upgrade support to permit URLs with an “https” scheme.
Adds a new configuration option http_request_timeout which is also accessible utilizing the kernel command line option ipa-http-request-timeout. This option helps prevent failed connections from hanging the agent. The default is 30 seconds.
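As an illustration of how a kernel command line value could map onto this timeout, here is a small sketch. The function name is hypothetical and the agent's own option parsing is not shown; only the option name and the 30 second default come from the note above.

```python
def http_request_timeout_from_cmdline(cmdline, default=30):
    """Return the ipa-http-request-timeout value from a kernel command line.

    Falls back to `default` (30 seconds) when the option is absent.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "ipa-http-request-timeout" and sep:
            return int(value)
    return default
```

For example, a command line containing `ipa-http-request-timeout=60` would yield `60`, and the result could then be passed as the timeout of an HTTP client call.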
The agent now logs the size of data transferred when downloading images, which can be helpful in troubleshooting image download issues.
The hardware inventory now contains information about the system firmware: vendor, version and the build date.
ironic-python-agent will now attempt to determine a checksum type by evaluating the length of the supplied checksum. This allows SHA512 (SHA-2) and SHA256 (SHA-2) checksums to be identified and utilized without an explicit declaration of the checksum type utilizing the
Improved parsing of checksum files.
Added support for the ALGORITHM (FILENAME) = CHECKSUM format used by CentOS Stream.
Lines starting with # are ignored as comments.
If a checksum file contains only the checksum itself, the content is validated to ensure it is one of the known checksum types.
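The checksum handling described in these notes (inferring the type from the length, skipping comment lines, and accepting the ALGORITHM (FILENAME) = CHECKSUM layout) can be sketched as follows. This is an illustrative parser and not the agent's own: the function names and the regular expression are assumptions, and only SHA256 and SHA512 lengths are recognized here.

```python
import re

# Hex digest lengths for the two SHA-2 variants named in the notes.
_LENGTH_TO_ALGORITHM = {64: "sha256", 128: "sha512"}
_BSD_LINE = re.compile(
    r"^(?P<algo>[\w-]+)\s*\((?P<name>.+)\)\s*=\s*(?P<sum>[0-9a-fA-F]+)$")


def guess_algorithm(checksum):
    """Infer the hash algorithm from the length of a hex digest."""
    algo = _LENGTH_TO_ALGORITHM.get(len(checksum))
    if algo is None or not re.fullmatch(r"[0-9a-fA-F]+", checksum):
        raise ValueError("unrecognized checksum: %r" % checksum)
    return algo


def find_checksum(content, filename):
    """Return the checksum for `filename` from checksum file content."""
    lines = [ln.strip() for ln in content.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if len(lines) == 1 and " " not in lines[0]:
        # The file holds only the checksum itself; validate its type.
        guess_algorithm(lines[0])
        return lines[0]
    for line in lines:
        match = _BSD_LINE.match(line)
        if match:
            if match.group("name") == filename:
                return match.group("sum")
            continue
        # Traditional "CHECKSUM  FILENAME" layout.
        checksum, _, name = line.partition(" ")
        if name.strip().lstrip("*") == filename:
            return checksum
    raise ValueError("no checksum found for %s" % filename)
```

Calling `find_checksum(text, "image.qcow2")` returns the matching digest, which can then be passed to `guess_algorithm` to choose the hash function.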
Adds a new inspection collector lldp that collects LLDP information into the
The hardware inventory now contains supported network interface speed in Mbit/s.
Support for MD5 checksums has been deprecated and will be removed after the 2024 Release.
The LLDP information as part of the general inventory is deprecated. Use the new lldp inspection collector to retrieve it.
The ipa-collect-lldp kernel parameter and the corresponding option are now deprecated.
Fixes UEFI NVRAM record handling with efibootmgr so we can accept and handle UTF-16 encoded data, which is to be expected as UEFI NVRAM records are UTF-16 encoded.
Fixes handling of UEFI NVRAM records to allow for unexpected characters in the response, so it is non-fatal to Ironic.
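The two UEFI NVRAM fixes above amount to decoding the raw command output as UTF-16 while tolerating bytes that do not decode. A minimal sketch, assuming the output is available as raw bytes; the function name is hypothetical and this is not the agent's own code:

```python
def decode_efibootmgr_output(raw):
    """Decode UTF-16 bytes, dropping sequences that cannot be decoded.

    errors="ignore" keeps an unexpected byte from raising an exception.
    """
    return raw.decode("utf-16", errors="ignore")
```

With `errors="ignore"`, a stray or truncated byte in a record is skipped, so the remaining records can still be parsed.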