Ussuri Series Release Notes (6.0.0 - 6.1.x)

6.1.0-18

Bug Fixes

  • Fixes deployment failures when the image download is interrupted mid-stream. Previously, retries were limited to opening the initial connection.

  • Fixes the return value of the apply_configuration deploy step: the agent RAID interface expects the final RAID configuration to be returned.

  • Lengthens the retry interval, which was previously 5 seconds, so that the agent can retry after a network interruption. The time between retries is now 10 seconds, and the number of retries is set to 9 to help ensure that intermittent network outages do not cause failures that would otherwise be recoverable.

  • Fixes an issue with high CPU usage caused by the ironic-python-agent greenthread eventlet implementation.

    Using eventlet.sleep(0.1) instead of eventlet.sleep(0) gives other processes of IPA more CPU time to run.

  • Speeds up going from inspection to cleaning with fast-track enabled by caching hardware information between the steps.

  • Fixes serializing exceptions originating from ironic-lib. Previously an attempt to do so would result in a TypeError, for example: Object of type ‘InstanceDeployFailure’ is not JSON serializable.

  • Fixes an issue where the ironic-python-agent would set up the bootloader, which is necessary with software RAID, but would also attempt to clean up iSCSI. This can cause issues when using the direct deploy_interface. Now the agent will only clean up iSCSI connections if iSCSI was explicitly started. For more information, please see story 2007937.

  • Fixes a failure to detect a hung file download connection in the event that the kernel has not rapidly detected that the remote server has hung up the socket. This can happen when there are intermittent and transient connectivity issues, such as those that can occur due to LACP failure response hold-down timers in switching fabrics.
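
The retry behavior described in the notes above (10 seconds between retries, 9 retries) can be pictured with a minimal sketch. This is not the agent's actual code: the function name retry_operation, its parameters, and the reading of "9 retries" as nine attempts after the first one are assumptions made for illustration.

```python
import time


def retry_operation(operation, retries=9, interval=10, sleep=time.sleep,
                    errors=(OSError,)):
    """Call operation(), retrying on the given errors.

    Hypothetical helper: assumes "9 retries" means up to nine further
    attempts after the first, with a fixed pause between attempts.
    """
    attempts = retries + 1
    for attempt in range(attempts):
        try:
            return operation()
        except errors:
            # Re-raise once the last allowed attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(interval)


# Usage: an operation that fails twice before succeeding.
calls = {"count": 0}
pauses = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("simulated network interruption")
    return "ok"


# pauses.append stands in for time.sleep so the example runs instantly.
result = retry_operation(flaky, sleep=pauses.append)
```

Passing the sleep function as a parameter keeps the sketch testable without real waiting; with the defaults above, the worst case would pause nine times for a total of 90 seconds.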

6.1.0

New Features

  • Adds support for the agent to receive, store, and return an agent token from the Ironic deployment to help secure use of the ironic API /v1/heartbeat endpoint, as well as the API of the ironic-python-agent ramdisk.

  • Target devices for software RAID can now be specified in the form of device hints (same as for root devices) in the physical_disks parameter of a logical disk configuration.

  • Adds a feature where IPA will utilize a [DEFAULT]ntp_server or ipa-ntp-server kernel command line argument to cause the agent to attempt to sync the clock to the NTP source. The agent also attempts to sync the software clock to the NTP time source, and assert an update to the hardware clock prior to powering the machine off. Please note, if your system clock is set to local time as opposed to UTC, this may result in undesirable behavior.

  • Adds UEFI boot support for Software RAID, and for partition table creation based upon boot mode in use.
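
As an illustration of device hints in the physical_disks parameter mentioned above, the following sketch builds a logical disk configuration as a Python dictionary. The surrounding field names (logical_disks, size_gb, raid_level, controller) are assumed to follow the usual Ironic target RAID configuration layout, and the hint values are invented for the example; check them against your own deployment.

```python
import json

# Hypothetical target RAID configuration. Field names are assumed to match
# Ironic's usual target_raid_config layout; hint values are illustrative.
target_raid_config = {
    "logical_disks": [
        {
            "size_gb": "MAX",
            "raid_level": "1",
            "controller": "software",
            # One device hint per member disk, written in the same form
            # as a root device hint.
            "physical_disks": [
                {"size": "> 100"},
                {"size": "> 100"},
            ],
        }
    ]
}

# Render the configuration as JSON for inspection.
rendered = json.dumps(target_raid_config, indent=2)
print(rendered)
```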

Upgrade Notes

  • The minimum supported version of the ironic API is now 1.31, corresponding to the latest available in the Ocata release. Versions before that one are no longer supported.

  • The type of the partition table created for Software RAID is now based upon the boot mode in use (GPT for UEFI or if explicitly passed via the instance’s capabilities or the node’s properties, otherwise MSDOS). The amount of reserved space on the drives now also depends on the boot mode (128MiB for UEFI/GPT, 8MiB for BIOS/GPT, and one sector otherwise).
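
The selection rule in the last note can be summarized in a short sketch. The function name, its arguments, and the 512-byte sector size are assumptions for illustration only; the agent's real implementation may differ.

```python
MIB = 1024 * 1024


def software_raid_layout(boot_mode, requested_label=None, sector_size=512):
    """Return (partition_table_type, reserved_bytes) for Software RAID.

    Hypothetical helper. requested_label stands for a disk label passed
    explicitly via the instance's capabilities or the node's properties.
    sector_size defaults to 512 bytes, which is an assumption.
    """
    if boot_mode == "uefi" or requested_label == "gpt":
        table = "gpt"
    else:
        table = "msdos"

    if table == "gpt" and boot_mode == "uefi":
        reserved = 128 * MIB  # UEFI/GPT
    elif table == "gpt":
        reserved = 8 * MIB    # BIOS/GPT
    else:
        reserved = sector_size  # one sector otherwise
    return table, reserved


uefi_layout = software_raid_layout("uefi")
bios_gpt_layout = software_raid_layout("bios", requested_label="gpt")
bios_layout = software_raid_layout("bios")
```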

Security Issues

  • The salt was previously generated using the random module, which is not compliant with FIPS 140-2. The salt is now generated automatically by the crypt function, which uses the strongest method available.

Bug Fixes

  • Fixes an issue with deployment ramdisks running in UEFI boot mode where dual-boot images may cause the logic to prematurely exit before UEFI parameters can be updated. Internal checks for a BIOS bootloader will always return False now when the machine is in UEFI mode.

  • Fixes an issue where secondary GPT partition tables were not being updated after the ironic-python-agent wrote the disk image to the target disk. The agent now unconditionally attempts to repair the secondary partition table. Previously, software RAID volumes would report errors upon restart.

  • Fixes error handling if efibootmgr is not present in the ramdisk. See story for more details.

  • Provides timeout and retries when establishing a connection to download an image in the standby extension. Reduces probability of an image download getting stuck in the event of network problems.

    The default timeout is 60 seconds and can be set via the ipa-image-download-connection-timeout kernel parameter. The default number of retries is 2 and can be set via the ipa-image-download-connection-retries parameter.

  • Fixes the risk of a potential active node thundering herd by introducing jitter handling into ironic-collect-introspection-data. By default, the jitter causes the time value based on the introspection_daemon_post_interval configuration parameter to be honored within a range of 70% to 120% of the desired time window.

    Should failures occur after the initial connection and start of the daemon mode for introspection data collection, the fallback is a maximum of 400% of the introspection daemon post interval.

  • The salt is now automatically generated by the crypt function.

  • Rescans partitions on a software RAID device that gets restarted when installing boot loader.

  • Fixes an issue where the agent was failing to rescan the device deployed upon before checking UEFI contents. This would occur with an iSCSI based deployment, as partition management operations are performed by the conductor, and not locally.

  • No longer tries to use GRUB2 for configuring boot for whole disk images with an EFI partition present but only marked as boot (not esp).
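
The jitter applied to the introspection data posting interval, described in the notes above, can be sketched as follows. The function name and the handling of the failure case are assumptions: the notes only state a 400% maximum after failures, so keeping the 70% lower bound there is a guess made for illustration.

```python
import random


def jittered_interval(post_interval, after_failure=False):
    """Return a randomized wait time in seconds.

    Hypothetical helper: normally 70% to 120% of post_interval. After a
    failure the upper bound is raised to 400%; the 70% lower bound in
    that case is an assumption, not something the notes specify.
    """
    upper = 4.0 if after_failure else 1.2
    return post_interval * random.uniform(0.7, upper)


# Usage: with a 300 second interval, normal waits fall roughly between
# 210 and 360 seconds, so agents started together drift apart over time.
normal_waits = [jittered_interval(300) for _ in range(1000)]
failure_waits = [jittered_interval(300, after_failure=True)
                 for _ in range(1000)]
```

Spreading the waits this way means that many nodes booted at the same moment do not all post to the inspection service in the same instant on every cycle.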

6.0.0

New Features

  • Allows reading the root_device from instance_info, overriding the value in properties.

  • Adds a new field configuration to the introspection data collected by the default collector. It contains two fields:

    • collectors - list of the enabled inspection collectors.

    • managers - list of the enabled hardware managers in their priority order.

  • Adds support for selecting RAID 5 and RAID 6 protection levels for software RAID. These levels may only be used for the secondary volume, as software RAID volumes of these types cannot be used to directly boot an operating system.

Upgrade Notes

  • Python 2.7 support has been dropped. The last release of ironic-python-agent to support Python 2.7 is OpenStack Train. The minimum version of Python now supported by ironic-python-agent is Python 3.6.

  • For Software RAID, the IPA will no longer assume that the root file system is in the first partition of the deployed image. Instead, the IPA will use the UUID passed from the conductor to identify the root partition. Before upgrading, the root file system UUID therefore needs to be set as part of the node’s driver_internal_info or as rootfs_uuid image metadata.

Security Issues

  • Enables pre-hashed passwords to be supplied to the rescue extension. See story 2006777 for more information.

Bug Fixes

  • Fixes the workflow for wholedisk images when using UEFI boot mode: when possible, efibootmgr is used instead of grub2 to update the NVRAM.

  • Fixes an issue with the tinyIPA CI testing image by providing a fallback root volume UUID detection method via the findfs utility, which is also already packaged in most distributions with lsblk.

    This fallback was necessary as the lsblk command in TinyCore Linux, upon which TinyIPA is built, does not return data as expected for volume UUID values.

  • Fixes an issue where metadata erasure cleaning would fail on devices that are read-only at the hardware level. Typically these are virtual devices being offered to the operating system for purposes like OS self-installation.

    In the case of full device erasure, this is explicitly raised as a hard failure requiring operator intervention.

  • Fixes an issue in fallback error handling where native iSCSI controls are unavailable due to the composition of the IPA ramdisk and where direct tgtadm commands also fail.

    Before fallback error handling was added, the teardown was skipped completely in the event of the native iSCSI controls being unavailable. The end user behavior is now the same as it was before the fallback error handling was added, but IPA will still continue to attempt to clean up the iSCSI session.

  • Skips NIC numa_node discovery if the NIC is not assigned to a numa_node. In some rare cases, such as a VM with a virtual NUMA node, NICs might not be in a NUMA node, and this breaks numa-topology discovery.

  • Fixes the numa-topology inspection collector to be compatible with Pint < 0.5.2.

  • Fixes an issue where wholedisk images are requested for deployment and the bootloader is overridden. IPA now explicitly looks for the boot partition, and examines the contents if the disk appears to be MBR bootable. The override/skip of bootloader installation does not apply if UEFI or PREP boot partitions are present on the disk.

Other Notes

  • Increases the default value for the ipa-ip-lookup-attempts kernel argument to 6, adding extra time for networking to be set up before giving up.

  • The output of lsblk and the contents of /proc/mdstat are now collected with the ramdisk logs for debugging.

  • The sample configuration file etc/ironic_python_agent/ironic_python_agent.conf.sample is no longer shipped with the source code. It can be generated locally with:

    tox -egenconfig