Wallaby Series (6.5.0 - 7.0.x) Release Notes
Heartbeats to the conductor are grouped when they are scheduled or requested within a time interval of five seconds to avoid sending them in quick succession.
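The grouping described above can be pictured with a small sketch. This is not the agent's actual implementation: the class name `HeartbeatCoalescer`, its `request` method, and the injectable `clock` are hypothetical, and the sketch assumes that a heartbeat arriving less than five seconds after the previous one is simply folded into it.

```python
import time


class HeartbeatCoalescer:
    """Hypothetical helper: folds heartbeats that arrive within one interval."""

    def __init__(self, send, interval=5.0, clock=time.monotonic):
        self._send = send          # callable that actually contacts the conductor
        self._interval = interval  # grouping window in seconds
        self._clock = clock        # injectable so the logic can be tested
        self._last_sent = None

    def request(self):
        """Send a heartbeat unless one was already sent within the interval.

        Returns True if a heartbeat was sent, False if it was grouped
        with the previous one.
        """
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < self._interval:
            return False
        self._send()
        self._last_sent = now
        return True
```

With this shape, a scheduled heartbeat followed two seconds later by an explicitly requested one results in a single call to the conductor.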
Adds the capability for the agent to read and act upon bootloader CSV files, which serve as authoritative indicators of which bootloader to load, instead of leaning towards the default.
If multiple bootloader CSV files are present on the EFI filesystem, the first CSV file discovered will be utilized. The Ironic team considers multiple files to be a defect in the image being deployed. This may be changed in the future.
Fixes an issue with bootloader installation on a software RAID by checking if the ESP is already mounted.
Fixes an issue where a quick succession of heartbeats exposes a race condition in the conductor’s RPC handling.
Fixes fall-back to sysrq when powering off or rebooting the node from inside a container.
Fixes an error with UEFI-based deployments where deploying a partition image to an NVMe device previously failed due to the different device name pattern.
Fixes an issue where the NTP time sync at the IPA startup via chronyd is not immediate (which can break time sensitive components such as the generation of a TLS certificate).
Fixes failures with disk image conversions which result in memory allocation or input/output errors due to memory limitations. The number of available memory allocation pools is now limited to a fixed, reasonable number which should not exceed the available system memory.
The lshw package version B.02.19.2-5 on CentOS 8.4 and 8.5 contains a bug that prevents the size of individual memory banks from being reported, with the result that the total memory size would be reported as 0 in some places. The total memory size is now taken from lshw’s total memory size output (which does not suffer from the same problem) when available.
Mirrors the previously disconnected EFI system partitions (ESPs) in UEFI software RAID setups. Disconnected ESPs can lead to nodes booting with outdated kernel parameters or the UEFI firmware not finding bootable kernels at all.
Fixes nodes failing after deployment completes due to issues in the Grub2 EFI loader entry addition where a BOOT.CSV file provides the authoritative pointer to the bootloader to be used for booting the OS. The base issue with Grub2 is that it would update the UEFI bootloader NVRAM entries with whatever is present in a vendor specific BOOTX64.CSV file. In some cases, a baremetal machine can crash when this occurs. More information can be found at story 2008962.
Fixes initial logging so that messages recorded before the configuration is loaded are not lost for troubleshooting purposes. This is necessary as systemd does not report stdout from a process launch as part of the process’s logging. Messages are now re-logged once the configuration has been loaded.
No longer crashes if MAC address cannot be determined for one of the network interfaces.
Adds a call to “udevadm settle” in write_image.sh. After GPT and MBR are destroyed, systemd-udevd gets triggered, which may hold /dev/sda open, preventing qemu-img from writing its image.
Adds support for NVMe-specific storage cleaning to IPA. Currently this is implemented by using nvme-cli format functionality. Crypto Erase is used if supported by the device, otherwise the code falls back to User Data Erase. Operators can control NVMe cleaning by using the deploy.enable_nvme_erase config option, which controls the agent_enable_nvme_erase internal setting in driver_internal_info.
Adds a new deploy step deploy.inject_files to inject arbitrary files into the instance. See the hardware managers documentation for details.
Logic around virtual media device validation is now much more strict, and may not work in all cases. Should you discover such a case, please provide the Ironic development community, via Storyboard, with the output of lsblk -P -O taken while a virtual media device is attached.
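For anyone inspecting that output before filing a report, the `KEY="value"` pairs that `lsblk -P` prints can be split with the standard library. The following is only a minimal sketch: the function name `parse_lsblk_pairs` is hypothetical, and the sample text in the usage note is made up for illustration rather than taken from a real machine.

```python
import shlex


def parse_lsblk_pairs(output):
    """Turn `lsblk -P` style output into a list of dicts, one per device line."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        device = {}
        # shlex handles the quoting, so values containing spaces stay intact.
        for token in shlex.split(line):
            key, _, value = token.partition("=")
            device[key] = value
        devices.append(device)
    return devices
```

For example, feeding it the illustrative line `NAME="sr0" TYPE="rom" MODEL="Virtual CD"` yields one dict whose `MODEL` value keeps the embedded space.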
Internal logic to copy configuration data from virtual media now requires the boot_method=vmedia flag to be set on the kernel command line of the bootloader for the virtual media. Operators crafting custom boot ISOs should ensure that the appropriate command line is being added in any custom build processes.
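Operators validating a custom boot ISO may want to confirm that the flag actually reaches the kernel. A minimal sketch of such a check follows; the helper names `parse_kernel_cmdline` and `booted_from_vmedia` are hypothetical and are not the agent's own functions.

```python
def parse_kernel_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def booted_from_vmedia(cmdline):
    """Return True only when boot_method=vmedia is present."""
    return parse_kernel_cmdline(cmdline).get("boot_method") == "vmedia"


def read_running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read().strip()
```

On a booted ramdisk, `booted_from_vmedia(read_running_cmdline())` would report whether the flag was passed through by the ISO's bootloader configuration.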
It is no longer possible to enable the so-called standalone mode, in which the agent does not communicate with ironic. This mode is only useful for local testing; enabling it in production is always wrong. The ironic team does not support using ironic-python-agent as a standalone application outside of the normal workflow.
Addresses a potential vector in which a system-authenticated malicious actor could leverage data left on disk in some limited cases to make the API of the ironic-python-agent attackable, or possibly break cleaning processes to prevent the machine from being able to be returned to the available pool. Please see story 2008749 for more information.
Adds validation of Virtual Media devices in order to prevent existing partitions on the system from being considered as potential sources of IPA configuration data.
Adds a check to the configuration load from virtual media to ensure it only occurs when the machine booted from virtual media.
IPA will now successfully clean configuration when it encounters a software RAID array that was previously created using entire devices instead of partitions.
IPA now properly checks if the root partition is already mounted. See Story 2008631 for details.
Fixes an issue where metadata erasure cleaning fails for partitions because the read-only file isn’t found, while it is available at the base device. Adds a check for the base device file on failure. See story 2008696.
Fixes incorrect root partition UUID after streaming a raw partition image.
Increases the memory usage limit for the qemu-img convert command to 2 GiB. See Story 2008667 for details.
The kernel parameter lldp-timeout (deprecated during the Newton development cycle) has been removed, please use
Fixes UEFI boot entry creation for aarch64 when using diskimage-builder created whole disk images.
Provides a more specific error message if a UEFI-incompatible image is used in UEFI mode.
Adds UUID of the disks to the inventory of block devices that is collected during inspection.
Adds the ability to bring up VLAN interfaces and include them in the introspection report. A new kernel parameter, ipa-enable-vlan-interfaces, is added, which defines either the VLAN interface to enable, the interface to use, or ‘all’ - which indicates all interfaces. If the particular VLAN is not provided, IPA will use the LLDP information for the interface to determine which VLANs should be enabled. See story 2008298.
Adds a clean step to erase the Linux kernel’s pstore. The step is disabled by default.
Adds a configuration option, which can be encoded into the ramdisk itself or supplied via the PXE parameters, to instruct the agent to ignore bootloader installation or configuration failures. This functionality is useful to work around well-intentioned hardware which auto-populates all possible devices into the UEFI firmware NVRAM in order to help ensure the machine boots. However, this can also mean that any explicit configuration attempt will fail. Operators needing this bypass can use the ipa-ignore-bootloader-failure configuration option on the PXE command line or utilize the ignore_bootloader_failure option for the ramdisk configuration. In a future version of ironic, this setting may be able to be overridden by ironic node level configuration.
Deployers in highly-secure environments can now manually set the Ironic API version via the ipa-ironic-api-version parameter on the kernel command line, instead of relying on unauthenticated autodetection. This is not a recommended configuration.
For Software RAID, the IPA will use partition LABEL along with UUID and PARTUUID passed from the conductor to identify the root partition. The root file system LABEL can be set as the value of the rootfs_uuid image metadata property.
If enabled, the new clean step ‘erase_pstore’ removes all pstore entries (the oops/panic logs from a failing kernel) upon cleaning. This is to reduce the risk that potentially sensitive data is preserved across instantiations (and therefore different users) of a bare metal node.
Fixes an issue where intermittent or transitory connection issues can cause inspection to fail. The ramdisk now retries to report to inspector a total of five times.
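The retry behaviour can be sketched as a bounded loop. This is an illustration only, not the ramdisk's code: the function name `report_with_retries`, the one-second delay, and the choice of `ConnectionError` as the transient failure are all assumptions; only the total of five attempts comes from the note above.

```python
import time


def report_with_retries(report, attempts=5, delay=1.0, sleep=time.sleep):
    """Call `report` until it succeeds, up to `attempts` times in total.

    `report` is any callable that raises ConnectionError on a transient
    failure. The last error is re-raised once all attempts are used up.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return report()
        except ConnectionError as exc:
            last_error = exc
            # Do not sleep after the final attempt; just give up.
            if attempt < attempts:
                sleep(delay)
    raise last_error
```

A short network blip that clears by the third call therefore no longer fails inspection, while a persistent outage still surfaces as an error after the fifth attempt.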
The file system configuration file for Linux machines, /etc/fstab, is now updated to include a reference to the EFI partition in the case of a partition image based deployment. Without this reference, images deployed using partition images could end up in situations where upgrading the bootloader could fail.
Automatically generated TLS certificates now have their validity starting in the past (1 hour by default) to allow for clock skew.
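The effect of backdating is easiest to see as plain date arithmetic. The sketch below only computes the validity window and does not generate a certificate; the function name `validity_window` and the 90-day lifetime are hypothetical, while the one-hour allowance mirrors the default mentioned above.

```python
from datetime import datetime, timedelta, timezone


def validity_window(now, skew_allowance=timedelta(hours=1),
                    lifetime=timedelta(days=90)):
    """Return (not_before, not_after) for a certificate issued at `now`.

    Starting the window `skew_allowance` in the past means a peer whose
    clock runs slightly behind still sees the certificate as valid.
    """
    not_before = now - skew_allowance
    not_after = now + lifetime
    return not_before, not_after


issued = datetime(2021, 4, 1, 12, 0, tzinfo=timezone.utc)
not_before, not_after = validity_window(issued)
# A peer whose clock is 30 minutes behind still falls inside the window.
peer_time = issued - timedelta(minutes=30)
peer_accepts = not_before <= peer_time <= not_after
```

Without the allowance, `not_before` would equal the issue time and the lagging peer in this example would reject the certificate as not yet valid.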
Fixes the agent process for determining what partition label type to utilize when writing partition images. In many cases, this could fall back to msdos if the instance flavor was not properly labeled.
Fixes an issue where the running system operating mode was not taken into account when writing partition images. The agent now utilises a helper instead of explicitly expecting the flavor derived information to supply all deployment context.
Fixes an issue where deployments of Fedora or CentOS can hang when using grub2, with the execution of the grub2-mkconfig command not returning before the deployment process times out. This is because os-prober can take an extended period of time to evaluate additional unrelated devices for dual-boot scenarios. Since operators are not dual booting their machines enrolled in ironic, this scan is unnecessary and has thus been disabled.
Correctly decodes error messages from ironic API.
The mdadm utility is no longer a hard requirement. It’s still required if software RAID is used (even when not managed by ironic).
Fixes the write_image deploy step to actually check and return any errors during its execution.
Fixes the agent’s EFI boot handling such that EFI assets from a partition image are preserved and used instead of overridden. This should permit operators to use Secure Boot with partition images IF the assets are already present in the partition image.
Upon the creation of Software RAID devices, component devices are sometimes kicked out immediately (for no apparent reason). This fix re-adds devices in such cases to prevent the component from being missing the next time the device is assembled, which, for instance, may prevent the UEFI ESPs from being installed properly.
Avoids a traceback when using install_bootloader with whole disk images. If the root UUID cannot be detected, don’t try to call grub.
Agent configuration files found on attached virtual media or config drive devices are now copied to the ramdisk and loaded on start up.