Victoria Series Release Notes

6.4.1

Bug Fixes

  • Fixes the write_image deploy step to actually check and return any errors during its execution.

  • Avoids a traceback when using install_bootloader with whole disk images. If the root UUID cannot be detected, the agent no longer tries to call grub.

6.4.0

New Features

  • Enables support in IPA for hosting the API server over TLS. Using this support requires setting [DEFAULT]listen_tls to True, and then setting [ssl]cert_file, [ssl]key_file, and optionally [ssl]ca_file to files embedded in the ramdisk IPA runs inside.

  • When a recent enough version of ironic is detected and listen_tls is False, the agent will now generate a self-signed TLS certificate and send it to ironic on heartbeat. This ensures encrypted communication from ironic to the agent. Set enable_auto_tls to False to disable this behavior.

  • The logs inspection collector is now enabled by default. Change ipa-inspection-collectors to disable it.
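
As a rough illustration of the TLS options named above, the following sketch parses a configuration fragment and checks that the required files are set. The file paths are placeholder assumptions and the tls_settings helper is hypothetical; the agent reads these options through its own configuration machinery, not through this code.

```python
import configparser

# Example configuration fragment. The option names come from the notes
# above; the file paths are placeholders, not defaults shipped with IPA.
SAMPLE_CONFIG = """
[DEFAULT]
listen_tls = True

[ssl]
cert_file = /etc/ironic-python-agent/agent.crt
key_file = /etc/ironic-python-agent/agent.key
"""


def tls_settings(text):
    """Return the TLS file settings, or None when listen_tls is off.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.getboolean("DEFAULT", "listen_tls", fallback=False):
        return None
    if not parser.has_section("ssl"):
        raise ValueError("listen_tls is True but no [ssl] section is present")
    settings = {
        "cert_file": parser.get("ssl", "cert_file", fallback=None),
        "key_file": parser.get("ssl", "key_file", fallback=None),
        # ca_file is optional according to the note above.
        "ca_file": parser.get("ssl", "ca_file", fallback=None),
    }
    missing = [k for k in ("cert_file", "key_file") if not settings[k]]
    if missing:
        raise ValueError("missing required [ssl] options: " + ", ".join(missing))
    return settings
```

Because the files must be embedded in the ramdisk, a check like this would only be meaningful when run against the configuration that is actually baked into the image.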

Upgrade Notes

  • IPA heartbeat intervals now rely on accurate clock time. Any clean or deploy steps which attempt to sync the clock may cause heartbeats to not be emitted. IPA syncs time at startup and shutdown, so these steps should not be required.
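
To see why a clock change can suppress heartbeats, consider a minimal sketch that assumes the next heartbeat is due at a wall-clock deadline. The FakeClock and Heartbeater classes are hypothetical stand-ins and not the agent's implementation, and the 30-second interval is an arbitrary example value.

```python
class FakeClock:
    """Hypothetical wall clock that can be moved forward or backward."""

    def __init__(self, now=1000.0):
        self.now = now

    def time(self):
        return self.now


class Heartbeater:
    """Illustrative scheduler: a beat is due once the clock passes a deadline."""

    def __init__(self, clock, interval=30.0):
        self.clock = clock
        self.interval = interval
        self.next_due = clock.time() + interval

    def poll(self):
        """Return True and schedule the next beat if one is due now."""
        if self.clock.time() < self.next_due:
            return False
        self.next_due = self.clock.time() + self.interval
        return True


clock = FakeClock()
beater = Heartbeater(clock)

clock.now += 30           # normal passage of time
first = beater.poll()     # heartbeat is emitted

clock.now -= 3600         # a step sets the clock back by an hour
clock.now += 30           # another interval elapses in real time
second = beater.poll()    # not due: the deadline is now about an hour away
```

In this model, a step that moves the clock backward leaves the agent silent until the wall clock catches up with the old deadline.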

Bug Fixes

  • Fixes an issue with nodes undergoing fast-track from introspection to deployment where the agent's internal cache of the node may be stale. In particular, this can be observed if the node does not honor a root device hint which was saved to Ironic’s API after the agent was started. More information can be found in story 2008039.

  • Fixes an incorrect keyword argument that matched between the method caller and the unit test but did not match the actual method. This was a non-fatal issue. With the fix, the agent can attempt to look up the node one last time before deploying the instance image in order to pick up a root device hint.

  • Fixes an issue with the IntelCnaHardwareManager which prevented hardware managers with lower priority from being executed and therefore may have blocked the initialization and collection of hardware these managers are supposed to take care of.

  • Fixes a bug where the partitions created during software RAID setup are cleaned too early and therefore may prevent the proper cleaning of the md superblocks. Leaving superblocks behind will impact the creation of new md devices later on.

  • Detects md component devices by their UUID, rather than by scanning the output of mdadm. This prevents devices from missing md superblock cleanup when they are not currently part of an array.

Other Notes

  • Adds an explicit capture of connectivity failures in the heartbeat process to provide a more verbose error message in line with what is occurring, as opposed to just indicating that an error occurred. This new exception is called HeartbeatConnectionError and is likely only going to be visible if there is a local connectivity failure such as a router failure, a switchport in a blocking state, or a connection-centered transient failure.
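
The pattern described above can be sketched as follows. The exception class defined here is a local stand-in that only borrows the name from the note, its message text is illustrative, and send_heartbeat with its post argument is hypothetical; the agent's real exception hierarchy and HTTP client are not shown.

```python
class HeartbeatConnectionError(Exception):
    """Local stand-in for the exception named in the note above."""

    def __init__(self, details):
        # Illustrative message, not the agent's actual wording.
        super().__init__(
            "Connection failure while attempting to heartbeat: %s" % details)


def send_heartbeat(post):
    """Call post() and translate connectivity failures.

    `post` is a hypothetical callable that performs the HTTP request.
    """
    try:
        return post()
    except ConnectionError as exc:
        # The built-in ConnectionError covers refused, reset, and aborted
        # connections; re-raise with a more specific type and message.
        raise HeartbeatConnectionError(str(exc)) from exc


def failing_post():
    """Simulate a local connectivity failure."""
    raise ConnectionRefusedError("connection refused by 192.0.2.1")
```

Calling send_heartbeat(failing_post) raises HeartbeatConnectionError with the underlying cause attached, so a log reader can tell a connectivity problem apart from other heartbeat errors.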

6.3.0

New Features

  • The new kernel parameter ipa-advertise-protocol can be used to change the protocol of the callback URL to https.

  • The deploy.erase_devices_metadata clean step can now also be used as a deploy step.

  • Introspection of PCI devices now collects PCI class, revision and PCI bus.

  • Adds a Poll extension which provides the ability to retrieve hardware information as well as set node data from the API. This feature is required for poll mode deployment driven by ironic.
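
As an illustration of the ipa-advertise-protocol parameter, this sketch reads the value from a kernel command line and builds a callback URL from it. The helper names, the "http" default, and the host and port values are assumptions made for the example and are not taken from the agent's code.

```python
def advertise_protocol(cmdline, default="http"):
    """Return the value of ipa-advertise-protocol from a kernel command line.

    Hypothetical helper; the "http" default is an assumption.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "ipa-advertise-protocol" and sep:
            return value
    return default


def callback_url(cmdline, host, port):
    """Build a callback URL using the advertised protocol."""
    return "%s://%s:%d" % (advertise_protocol(cmdline), host, port)


# Example command line; the other parameters and the address are placeholders.
CMDLINE = "nofb nomodeset ipa-advertise-protocol=https"
url = callback_url(CMDLINE, "192.0.2.10", 9999)
```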

Bug Fixes

  • Fixes the return value of the apply_configuration deploy step: the agent RAID interface expects the final RAID configuration to be returned.

  • Fixes an issue where the bootloader installation can fail on a software RAID volume when no root_device hint is set. See story 2007905.

  • Fixes retry logic issues with the Agent Lookup which can result in the lookup failing prematurely before being completed, typically resulting in an abrupt end to the agent logging and potentially weird errors like TypeError being reported on the agent process standard error output. For more information see bug 2007968.

  • Fixes an issue with the ironic-python-agent where we would call to set up the bootloader, which is necessary with software RAID, but also attempt to clean up iSCSI. This can cause issues when using the direct deploy_interface. Now the agent will only clean up iSCSI connections if iSCSI was explicitly started. For more information, please see story 2007937.
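
The iSCSI cleanup behavior described above amounts to remembering whether iSCSI was started before tearing it down. A minimal sketch, using a hypothetical DeployState class that records actions instead of running real commands:

```python
class DeployState:
    """Hypothetical tracker; the real agent extensions are not shown."""

    def __init__(self):
        self.iscsi_started = False
        self.actions = []

    def start_iscsi_target(self):
        self.iscsi_started = True
        self.actions.append("start iscsi")

    def install_bootloader(self):
        self.actions.append("install bootloader")
        # Only tear down iSCSI if this agent actually started it.
        if self.iscsi_started:
            self.actions.append("clean up iscsi")
            self.iscsi_started = False


# Direct deploy: iSCSI is never started, so nothing is cleaned up.
direct = DeployState()
direct.install_bootloader()

# iSCSI deploy: the cleanup runs because the target was started.
iscsi = DeployState()
iscsi.start_iscsi_target()
iscsi.install_bootloader()
```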

6.2.0

Bug Fixes

  • Fixes deployment failures when the image download is interrupted mid-stream while the contents are being downloaded. Previously retries were limited to only opening the initial connection.

  • Fixes the retry interval, which at 5 seconds was too short, so that the agent can retry after a network interruption. The time between retries is now 10 seconds, and the number of retries is set to 9 to help ensure intermittent network outages do not cause otherwise recoverable failures.

  • Fixes an issue with high CPU usage caused by the ironic-python-agent greenthread eventlet implementation.

    Using eventlet.sleep(0.1) instead of eventlet.sleep(0) gives other processes of IPA more CPU time to run.

  • Speeds up going from inspection to cleaning with fast-track enabled by caching hardware information between the steps.

  • Fixes serializing exceptions originating from ironic-lib. Previously an attempt to do so would result in a TypeError, for example: Object of type ‘InstanceDeployFailure’ is not JSON serializable.

  • Fixes failure to detect a hung file download connection in the event that the kernel has not rapidly detected that the remote server has hung up the socket. This can happen when there are intermittent and transient connectivity issues such as those that can occur due to LACP failure response hold-down timers in switching fabrics.
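
The download retry behavior in the notes above can be sketched as a loop that treats a mid-stream interruption like any other connection failure. The interval and retry count come from the notes; the download_with_retries helper, its fetch argument, and the reading of "9 retries" as nine retries after the first attempt are assumptions.

```python
import time

RETRY_INTERVAL = 10  # seconds between retries, per the note above
MAX_RETRIES = 9      # retries after the first attempt (an assumption)


def download_with_retries(fetch, sleep=time.sleep):
    """Call fetch() until it succeeds or the retries are exhausted.

    `fetch` is a hypothetical callable that downloads the whole image and
    raises OSError on any connection failure, including a mid-stream one.
    """
    for attempt in range(MAX_RETRIES + 1):
        try:
            return fetch()
        except OSError:
            if attempt == MAX_RETRIES:
                raise
            sleep(RETRY_INTERVAL)


# Simulate two mid-stream interruptions followed by a success.
outcomes = [ConnectionResetError("reset"), TimeoutError("stalled"), b"image"]
waits = []


def flaky_fetch():
    result = outcomes.pop(0)
    if isinstance(result, Exception):
        raise result
    return result


# Record the waits instead of sleeping so the example runs instantly.
data = download_with_retries(flaky_fetch, sleep=waits.append)
```

Passing the sleep function as a parameter keeps the example fast to run; a real caller would leave the default in place.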