Victoria Series (15.1.0 - 16.0.x) Release Notes¶
16.0.5-15¶
Upgrade Notes¶
In the Victoria release, using a certificate file for HTTPS connections with the iRMC driver requires python-scciclient to be in one of the ranges >=0.8.2,<0.9.0 or >=0.9.5,<0.10.0, and packaging >=16.5.
Security Issues¶
Modifies the irmc hardware type to include a capability to control enforcement of HTTPS certificate verification. By default this is enforced. The python-scciclient version must be in one of the ranges >=0.8.2,<0.9.0 or >=0.9.5,<0.10.0; otherwise certificate verification will not occur.
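The version constraint above can be checked with a short script. This is a minimal sketch using only the standard library; the function names are hypothetical and are not part of Ironic or python-scciclient, and only the two ranges come from the note itself.

```python
import re

# Ranges quoted in the note: >=0.8.2,<0.9.0 or >=0.9.5,<0.10.0
SUPPORTED_RANGES = (((0, 8, 2), (0, 9, 0)), ((0, 9, 5), (0, 10, 0)))


def parse_version(text):
    """Turn '0.9.5' into (0, 9, 5); any suffix after the digits is ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version: {text!r}")
    return tuple(int(part) for part in match.groups())


def scciclient_verifies_certificates(version_text):
    """Return True if the version falls inside either supported range."""
    version = parse_version(version_text)
    return any(low <= version < high for low, high in SUPPORTED_RANGES)
```

The two ranges are alternatives, so the check passes when either one matches.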
Bug Fixes¶
Fixes Ironic integration with Cinder, which was broken by changes made as part of the recent security fix in bug 2004555. The work in Ironic to track this fix was logged in bug 2019892. Ironic now sends a service token to Cinder, which allows the access restrictions added as part of the original CVE-2023-2088 fix to be appropriately bypassed. Ironic was not vulnerable, but the restrictions added as a result did impact Ironic’s usage. This is because Ironic volume attachments are not on a shared “compute node”, but are instead mapped to the physical machines, and Ironic handles the attachment life-cycle after initial attachment.
Fixes rebooting into the agent after changing BIOS settings in fast-track mode with the redfish-virtual-media boot interface. Previously, the ISO would not be configured.
Adds the driver_info/irmc_verify_ca option to specify a certificate file. The default value of driver_info/irmc_verify_ca is True.
16.0.5¶
Bug Fixes¶
Fixes connection caching issues with Redfish BMCs where AccessErrors were previously not disqualifying the cached connection from being re-used. Ironic will now explicitly open a new connection instead of using the previous connection in the cache. Under normal circumstances, the sushy redfish library would detect and refresh sessions; however, a case exists where it may not detect a failure and may retain cached session credential data which is ultimately invalid, blocking future access to the BMC via Redfish until the cache entry expired or the ironic-conductor service was restarted. For more information please see story 2009719.
16.0.4¶
Security Issues¶
Fixes an issue with the /v1/nodes/detail endpoint where an authenticated user could explicitly ask for an instance_uuid lookup and the associated node would be returned to the user with sensitive fields redacted in the result payload if the user did not explicitly have owner or lessee permissions over the node. This is considered a low-impact low-risk issue as it requires the API consumer to already know the UUID value of the associated instance, and the returned information is mainly metadata in nature. More information can be found in Storyboard story 2008976.
Bug Fixes¶
If the agent accepts a command but is unable to reply to Ironic (which sporadically happens because of eventlet’s TLS implementation), Ironic previously retried the request and failed because the command was already executing. Ironic now detects this situation by checking the list of executing commands after receiving a connection error. If the requested command is the last one, we assume that the command request succeeded.
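The recovery logic described above can be sketched as follows. This is an illustrative outline only, not the conductor's code: the function names, the callables passed in, and the shape of the command list (dicts with a command_name key, oldest first) are assumptions made for the example.

```python
def command_was_accepted(requested_name, executing_commands):
    """Return True if the most recent agent command matches the request.

    executing_commands is assumed to be ordered oldest first, each entry
    being a dict with a 'command_name' key (hypothetical shape).
    """
    if not executing_commands:
        return False
    return executing_commands[-1].get("command_name") == requested_name


def send_with_recovery(send, list_commands, name):
    """Send a command; on a connection error, check whether it started anyway."""
    try:
        return send(name)
    except ConnectionError:
        if command_was_accepted(name, list_commands()):
            # The agent accepted the command but its reply was lost.
            return None
        raise
```

Only the last command in the list is compared, mirroring the note: an older command with the same name does not count as proof that the current request was accepted.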
Fixes fast-track to prevent marking the agent as alive if trying to rebuild a node before the fast-track timeout has expired.
Fixes potential cache coherency issues by caching the AgentClient per task, rather than globally.
Fixes the [deploy]configdrive_use_object_store option that was broken during the Python 3 transition.
Fixes an issue with the /v1/nodes/detail endpoint where requests for an explicit instance_uuid match would not follow the standard query handling path and thus not be filtered based on policy determined access level and node level owner or lessee fields appropriately. Additional information can be found in story 2008976.
Fixes recognition of a busy agent to also handle recognition during deployment steps by more uniformly detecting and identifying when the ironic-python-agent service is busy.
Fixes a problem with the grub2 config file location. Some newer versions of grub2 (e.g. 2.05 or 2.06-rc1) use grub.cfg-01-MAC, while older versions (e.g. 2.04) use MAC.conf, so Ironic now generates both paths in order to be compatible with both.
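As an illustration of generating both file names, here is a minimal sketch. The function name, the /tftpboot root directory, and the exact MAC formatting (lower case, dash-separated in the first form and colon-separated in the second) are assumptions made for the example rather than details confirmed by this note.

```python
import posixpath


def grub_config_paths(mac, root_dir="/tftpboot"):
    """Return both candidate grub2 config paths for one MAC address."""
    normalized = mac.lower()
    return [
        # Form looked up by newer grub2 versions: grub.cfg-01-MAC
        posixpath.join(root_dir, "grub.cfg-01-" + normalized.replace(":", "-")),
        # Form looked up by older grub2 versions: MAC.conf
        posixpath.join(root_dir, normalized + ".conf"),
    ]
```

Writing the same configuration to both paths lets either grub2 generation find its file.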
Fixes the idrac-wsman management interface set_boot_device method, which would fail deployment when existing jobs were present, with the error “Failed to change power state to ‘’power on’’ by ‘’rebooting’’. Error: DRAC operation failed. Reason: Unfinished config jobs found: <list of existing jobs>. Make sure they are completed before retrying.”. Now there can be non-BIOS jobs present during deployment. This will still fail for cases when there are BIOS jobs present. In such cases, consider moving to idrac-redfish, which does not have this limitation when setting the boot device.
Fixed an issue where provisioning/cleaning would fail on IPv6 routed provider networks. See bug: 2009773.
Fixes the idrac-wsman BIOS apply_configuration and factory_reset clean and deploy steps to fail correctly in case of error when checking completed jobs. Before the fix, when a BIOS job failed, node cleaning or deployment failed with a timeout instead of the actual error in the clean or deploy step.
Fixes redfish firmware update for iLO5 based hardware by checking whether sushy_task.messages is present, since in the case of iLO the task data does not contain a messages attribute. Also fixes the firmware update not calling prepare_ramdisk() before rebooting the system.
Fixes the idrac-wsman power interface to wait for the hardware to reach the target state before returning. For systems where soft power off at the end of deployment to boot to instance failed and forced hard power off was used, this left the node successfully deployed in the off state without any errors. This broke other workflows expecting the node to be powered on and booted into the OS at the end of deployment. Additional information can be found in story 2009204.
Correctly wipes agent token on inspection start and abort.
Calculating the ipmitool -N and -R arguments from ironic.conf [ipmi] command_retry_timeout and min_command_interval now takes into account the 1 second interval increment that ipmitool adds on each retry event.
Failure-path ipmitool run duration will now be just less than command_retry_timeout instead of much longer.
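The retry arithmetic can be illustrated with a short sketch. This is an assumed reconstruction of the calculation described above, not a copy of Ironic's code: the function name is hypothetical, while -R (retry count) and -N (seconds between retries) are the ipmitool options named in the note.

```python
def ipmitool_timing_args(command_retry_timeout, min_command_interval):
    """Pick a retry count whose total duration stays within the timeout.

    ipmitool is described as adding one second to the interval on each
    retry, so the waits are N, N+1, N+2, ... rather than a constant N.
    """
    interval = min_command_interval
    elapsed = 0
    retries = 0
    while elapsed + interval <= command_retry_timeout:
        elapsed += interval
        interval += 1  # the per-retry increment applied by ipmitool
        retries += 1
    return ["-R", str(retries), "-N", str(min_command_interval)]
```

With a timeout of 60 seconds and an interval of 5 seconds, the waits are 5, 6, 7, 8, 9, 10 and 11 seconds (56 in total), so the sketch yields -R 7 -N 5. Dividing 60 by 5 without the increment would give 12 retries and a failure path far longer than the configured timeout.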
Adds handling of Redfish BMCs which lack a BootSourceOverrideMode flag, such that it is no longer a fatal error for a deployment if the BMC does not support this field. This is most common on BMCs which feature only a partial implementation of the ComputerSystem resource boot, but may also be observable on some older generations of BMCs which received updates to have partial Redfish support.
The redfish-virtual-media boot interface no longer passes validation for Dell nodes. The idrac-redfish-virtual-media boot interface must be used for these nodes instead.
The fix for story 2008252 synced the boot mode after changing the boot device because Supermicro nodes reset the boot mode if not included in the boot device set. However, this can cause a problem on Dell nodes when changing the mode uefi->bios or bios->uefi, see story 2008712 for details. The syncing of the boot mode is now restricted to Supermicro.
Retries virtual media insert on failure to allow for an eject that may not have finished. https://storyboard.openstack.org/#!/story/2008504
Fixes a bug where a conductor could fail to complete a deployment if there was contention on a shared lock. This would manifest as an instance being stuck in the “deploying” state, though the node had in fact started or even completed its final boot.
When Ironic configures the BootSourceOverrideTarget setting via Redfish, on Supermicro BMCs it must always configure BootSourceOverrideEnabled or that will revert to default (Once) on the BMC, see story 2008547 for details. This is different than what is currently implemented for other BMCs in which the BootSourceOverrideEnabled is not configured if it matches the current setting (see story 2007355).
This requires that node.properties[‘vendor’] be ‘supermicro’ which will be set by Ironic from the Redfish system response or can be set manually.
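A minimal sketch of the vendor-specific behaviour described above follows. The function name and arguments are hypothetical, and the dictionary is only an illustration of a Redfish Boot settings payload, not the request Ironic or sushy actually builds.

```python
def boot_override_patch(target, enabled, current_enabled, vendor):
    """Build an illustrative Redfish Boot payload for a boot device change.

    For Supermicro, BootSourceOverrideEnabled is always included; for other
    vendors it is only included when it differs from the current setting.
    """
    boot = {"BootSourceOverrideTarget": target}
    always_send = (vendor or "").lower() == "supermicro"
    if always_send or enabled != current_enabled:
        boot["BootSourceOverrideEnabled"] = enabled
    return {"Boot": boot}
```

The vendor value corresponds to node.properties[‘vendor’] mentioned above; when it is unset, the sketch falls back to the behaviour used for other BMCs.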
Introduces lazy-loading of ports, portgroups, volume connections and volume targets in task manager to fix performance issues. For periodic tasks which create a task manager object but don’t require the aforementioned data (e.g. power sync), this change should reduce the number of database interactions by around two thirds, speeding up overall execution.
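The idea behind lazy-loading can be shown with a small sketch. The class below is hypothetical and much simpler than the real task manager; it only demonstrates that the database is not queried until the attribute is first read, and is queried at most once.

```python
from functools import cached_property


class LazyTask:
    """Toy task object that defers loading ports until first access."""

    def __init__(self, node_id, load_ports):
        self.node_id = node_id
        # load_ports stands in for a database query (an assumption here).
        self._load_ports = load_ports

    @cached_property
    def ports(self):
        # Runs on first access only; the result is then cached on the instance.
        return self._load_ports(self.node_id)
```

A periodic task that only reads the power state would never touch ports, so the corresponding query is skipped entirely.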
Fixes an issue of powering off with the idrac-wsman management interface while the execution of a clear job queue cleaning step is proceeding. Prior to this fix, the clean step would fail when powering off a node.
16.0.3¶
Upgrade Notes¶
An automated detection of an IPMI BMC hardware vendor has been added to appropriately handle IPMI BMC variations. Ironic will now query this and save this value if not already set in order to avoid querying for every single operation. Operators upgrading should expect an elongated first power state synchronization for nodes with the ipmi hardware type.
Bug Fixes¶
Fixes the idrac-wsman RAID create_configuration clean step, apply_configuration deploy step and delete_configuration clean and deploy step to fail correctly in case of error when checking completed jobs. Before the fix, when a RAID job failed, node cleaning or deployment failed with a timeout instead of the actual error in the clean or deploy step.
Fixes issues when UEFI boot mode has been requested with persistent boot to DISK where some versions of ipmitool do not properly handle multiple options being set at the same time. While some of this logic was addressed in upstream ipmitool development, new versions are not released and vendors maintain downstream forks of the ipmitool utility. When considering vendor specific selector differences along with the current stance of new versions from the upstream ipmitool community, it only made sense to handle this logic within Ironic. In part this was because if already set the selector value would not be updated. Now ironic always transmits the selector value for UEFI.
Fixes handling of Supermicro UEFI supporting BMCs with the ipmi hardware type such that an appropriate boot device selector value is sent to the remote BMC to indicate boot from local storage. This is available for both persistent and one-time boot applications. For more information, please consult story 2008241.
Fixes handling of the ipmi hardware type where UEFI boot mode and “one-time” boot to PXE has been requested. As Ironic now specifically transmits the raw commands, this setting should be properly applied where previously PXE boot operations may have occurred in Legacy BIOS mode.
Fixes cleaning with the ramdisk deploy interface by reusing the same procedure as for the direct deploy interface.
Boot mode is now correctly handled when using redfish-virtual-media boot with locally booted images.
Failed cleaning no longer results in maintenance mode if no clean step is running, e.g. on PXE timeout or failed clean steps validation.
Fixes permission issues when injecting network data into a virtual media.
Other Notes¶
Adds a detect_vendor management interface method to the ipmi hardware type. This method is being promoted as a higher level interface as the fundamental need to be able to have logic aware of the hardware vendor is necessary with vendor agnostic drivers where slight differences require slightly different behaviour.
The configdrive argument to some utils in ironic.common.images and ironic.drivers.modules.image_utils has been replaced with a new inject_files argument. The previous approach did not really work in all situations and we don’t expect 3rd party drivers to use it.
16.0.2¶
Known Issues¶
When redfish-virtual-media is used, fast-track mode will not work as expected: nodes will be rebooted between operations.
Upgrade Notes¶
The default value of [api]api_workers is now limited to 4. Set it explicitly if you need a higher value.
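The effect of the cap can be sketched as below. The function is hypothetical and assumes the uncapped default is derived from the CPU count; the note only states that the default is now limited to 4.

```python
import os


def default_api_workers(cpu_count=None, cap=4):
    """Return a worker count based on CPU cores, limited to the cap."""
    if cpu_count is None:
        # os.cpu_count() may return None on some platforms.
        cpu_count = os.cpu_count() or 1
    return max(1, min(cpu_count, cap))
```

On a 64-core host this yields 4 workers instead of 64; an explicit [api]api_workers setting is what operators would use to go higher.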
Bug Fixes¶
No longer launches too many API workers on systems with a lot of CPU cores by default.
Correctly handles the node’s custom network data when the noop network interface is used. Previously it was ignored.
Fixes incorrect injected network data location when using virtual media.
Fixes the redfish BIOS apply_configuration clean and deploy step to fail correctly in case of error when checking if BIOS updates are successfully applied. Before the fix, when BIOS updates were unsuccessful, node cleaning or deployment failed with a timeout instead of the actual error in the clean or deploy step.
When configured to use json-rpc, the [DEFAULT].host configuration option to ironic-conductor can now be set to an IPv6 address. Previously it could only be an IPv4 address or a DNS name.
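One detail that IPv6 hosts introduce is that a literal address must be wrapped in square brackets when placed in a URL. The sketch below illustrates that rule only; the function name, scheme, and port are placeholders and do not describe how Ironic builds its json-rpc URLs.

```python
import ipaddress


def rpc_url(host, port, scheme="http"):
    """Build a URL, bracketing the host when it is an IPv6 literal."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_v6 = False  # not an IP literal, so treat it as a DNS name
    netloc = f"[{host}]" if is_v6 else host
    return f"{scheme}://{netloc}:{port}"
```

IPv4 addresses and DNS names pass through unchanged, so the same helper covers all three forms of the host option.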
No longer tries to pass BOOTIF=None as a kernel parameter when using virtual media. This could break inspection.
Fixes an issue where, when the MAC address of a port group is not set and the port group has been attached to an instance, the resulting bond port cannot get an IP address due to an inconsistent MAC address between the tenant port and the initially allocated one in the config drive.
After changing the boot device via Redfish, check that the boot mode being reported matches what is configured and, if not, set it to the configured value. Some BMCs change the boot mode when the device is set via Redfish, see story 2008252 for details.
The virtual media ISO image building process now respects the default_boot_mode configuration option.
Fixes timeout in fast-track mode with redfish-virtual-media when running one operation after another (e.g. cleaning after inspection).
16.0.1¶
Bug Fixes¶
Fixes an issue when using idrac with vmedia and trying to inspect a node.
Fixes wiping agent token on rebooting via API.
16.0.0¶
Prelude¶
The Ironic team is proud to announce the release of Ironic 16.0.
For over six years, the contributors to this project have continued to drive forth and provide what we collectively feel is the best platform for managing and deploying bare metal hardware.
The innovation, the drive, and the pursuit of improving infrastructure operators’ lives has yet to cease, and has no signs of stopping anytime soon.
As with any release, we have some things we are particularly proud of:
Support for TLS encryption of Agent communications.
Support for in-band deployment steps enabling software RAID to be configured at deployment time.
Ramdisk/Virtual Media pass-through of ISO images.
BMC-less agent power control, so BMCs are not required for deployments.
Network configuration injection with virtual media based ramdisks.
Integrated basic authentication for standalone Ironic operators.
And with any major release, a number of bugs have been fixed. Cross-vendor features see increased parity. Every contributor has something to be proud of in this release. And with that, we hope you enjoy it!
New Features¶
Adds the ilo-uefi-https boot interface to the ilo5 hardware type. This boot interface leverages the iLO UEFI firmware capability to boot from given URLs hosted securely on an HTTPS webserver with standard or custom certificates.
Adds functionality to the ilo and ilo5 hardware types by enabling virtual media boot without user-built deploy/rescue/boot ISO images. Instead, ironic will build the necessary images out of a common kernel/ramdisk pair (though the user needs to provide an ESP image). User provided deploy/rescue/boot ISO images are also supported.
Adds support for DHCP-less deploy to the ilo and ilo5 hardware types. Using the network_data property on the node field, operators can now apply network configuration to be embedded in iLO based Virtual Media based deployment ramdisks which include networking configuration, enabling the deployment to operate without the use of DHCP.
Adds an ability to accept a custom TLS certificate in the heartbeat API.
Adds a configuration option webserver_verify_ca to support custom certificates to validate URLs hosted on an HTTPS webserver.
Using the network_data property on the node field, operators can now apply network configuration to be embedded in Redfish based Virtual Media based deployment ramdisks which include networking configuration, enabling the deployment to operate without the use of DHCP. See Redfish driver documentation for more information.
file:// images are now supported in the direct deploy interface.
Adds a new possible value for image_download_source: local. When used, even http:// images are downloaded, converted to RAW if needed and served from the conductor’s HTTP server. This feature targets primarily nodes with low RAM.
Adds support in the idrac-wsman inspect hardware interface for reporting the number of GPU devices connected to a system. This information is advertised through the capability pci_gpu_devices, which can be used to make scheduling decisions for the node. Currently, NVIDIA Tesla T4 GPU devices are reported.
Adds support for managing BIOS settings via the Redfish out-of-band (OOB) management protocol to the idrac hardware type. The new hardware BIOS interface implementation which offers it is named idrac-redfish.
The idrac hardware type declares support for that new interface implementation, in addition to all BIOS interface implementations it has been supporting. The highest priority BIOS interface remains the same, the one which relies on the Web Services Management (WS-Man) OOB management protocol. The new idrac-redfish immediately follows it. It now supports the following BIOS interface implementations, listed in priority order from highest to lowest: idrac-wsman, idrac-redfish, and no-bios.
For more information, see story 2008100.
Adds a new configuration option [ilo]verify_ca and a new driver_info parameter ilo_verify_ca to enhance certificate verification for the ilo and ilo5 hardware types. Both can take directory and boolean values in addition to a file.
Adds functionality to perform out-of-band one button secure erase operation for iLO5 based HPE Proliant servers as a management clean step one_button_secure_erase for the ilo5 hardware type.
The image_download_source configuration option can now also be set per node in the instance_info or driver_info (the former having the highest priority).
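The lookup order described above can be sketched as follows. The function is hypothetical and treats the node fields as plain dictionaries; only the priority order (instance_info, then driver_info, then the configuration option) comes from the note.

```python
def resolve_image_download_source(instance_info, driver_info, config_default):
    """Return the effective image_download_source for one node."""
    return (
        instance_info.get("image_download_source")
        or driver_info.get("image_download_source")
        or config_default
    )
```

For example, a node with local in its instance_info uses local even if its driver_info says swift and the configuration default is http.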
Allows configuring IPMI cipher suite via the new driver_info parameter ipmi_cipher_suite.
Adds the driver_internal_info field to the node-related notification baremetal.node.provision_set.*, new payload version 1.16.
Adds support for performing firmware updates using the redfish and idrac hardware types.
A new firmware update cleaning step has been added to the redfish hardware type. The idrac hardware type also automatically gains this capability through inheritance.
A new configuration option [agent]require_tls allows rejecting ramdisk callback URLs that don’t use the https:// scheme.
Supports the Fujitsu irmc hardware type again. The Third Party CI for the driver started working correctly in September 2020.
Upgrade Notes¶
The one_button_secure_erase clean step in the ilo5 hardware type requires proliantutils version 2.10.0. Please upgrade this library to leverage this feature.
The default value of the configuration option [agent]image_download_source has been changed to http to simplify transition from the iscsi deploy interface. Set it to swift explicitly to maintain the previous behavior.
The deprecated iscsi deploy interface is no longer enabled by default; set enabled_deploy_interfaces to override. It is also no longer the first in the list of deploy interface priorities, so it has to be requested explicitly if the direct deploy is also enabled.
Since the direct deploy interface is now used by default, you need to configure [deploy]http_url and [deploy]http_root to point at a local HTTP server or configure access to Swift.
Support for token-less agents has been removed as the token-less agent support was deprecated in the Ussuri development cycle. The ironic-python-agent must be updated to 6.1.0 or higher to support communicating with the Ironic deployment after upgrade. This will generally require deployment, cleaning, and rescue kernels and ramdisks to be updated. If this is not done, actions such as cleaning and deployment will time out as the agent will be unable to record heartbeats with Ironic. For more information, please see the agent token documentation.
The redfish-virtual-media boot interface is now the last in the list of priorities from the redfish hardware type. This means that new nodes will be created with ipxe or pxe boot if they are enabled. The reason for this change is limited support for pure Redfish virtual media from hardware vendors.
To use virtual media with Redfish, please provide an explicit boot_interface parameter when creating nodes. If you enable only the redfish hardware type, you can also set the default_boot_interface configuration option to redfish-virtual-media.
Deprecation Notes¶
The [ilo]ca_file configuration option is deprecated for removal. Please use [ilo]verify_ca instead, which can take directory and boolean values in addition to a file for certificate verification.
The iscsi deploy interface is now deprecated; direct or ansible deploy should be used instead. We expect the complete removal of the iscsi deploy code to happen in the “X” release.
With the switch from neutronclient to openstacksdk, the [neutron]/retries option has been deprecated; use [neutron]/status_code_retries and [neutron]/status_code_retry_delay instead.
Security Issues¶
Ramdisks supporting agent token are now globally required by Ironic. As this is a core security mechanism, it cannot be disabled and support for the [DEFAULT]require_agent_token configuration parameter has been removed as tokens are now always required by Ironic. For more information, please see the agent token documentation.
Bug Fixes¶
Fixes compatibility with some hardware that requires the file name of any virtual media to end with the suffix “.iso” when Ironic generates a virtual media image. We recommend operators generating their own virtual media files to name the files with proper extensions.
Fixes the deployment failure with Ussuri (and older) ramdisks that happens when another IPA command runs after prepare_image.
Fixes an issue with the ansible deployment interface where automatic root device selection would accidentally choose the system CD-ROM device, which was likely to occur when the ansible deployment interface was used with virtual media boot. The ansible deployment interface now ignores all Ramdisks, Loopbacks, CD-ROMs, and floppy disk devices.
Fixes an issue that caused in-band deploy steps inserted before write_image to be skipped when fast-track is used.
Fixes an issue where in-band deploy and clean steps were being cached across reboots of the agent.
Fixed iRMC inspection for getting MAC address.
Fixes an issue with agent token handling where the agent has not been upgraded resulting in an AgentAPIError, when the token is not required. The conductor now retries without sending an agent token.
Fixes a potential race in the hash ring code that could result in the hash rings never being updated after their initial load.
Fixes the deprecated idrac hardware interface implementation __init__ methods to call their base class __init__ methods before emitting a log message warning about their deprecation. For more information, see story 2008197.
Fixes an issue where agent heartbeats would be queued if a pre-existing lock was being held for the node which performed a heartbeat operation. The agent heartbeat implementation will no longer retry attempts to acquire an exclusive lock.
Prevents a take over from happening in the middle of deploy step processing. This could happen if the RPC call continue_node_deploy is routed to a different conductor.
Fixes wiping the agent secret token on manual power off or reboot. Also makes sure to remove the agent URL since it may potentially change.
Fixes HTTP 500 when trying to unset the protected attribute via the CLI.
Fixes cleaning and managed inspection not respecting the default_boot_mode configuration option.
Fixes cleaning and managed inspection not following the standard boot mode handling logic, particularly, not trying to assert the requested boot mode if the driver allows it.
Fixes the redfish BIOS interface apply_configuration cleaning/deploy step to work with Redfish Services that must be supplied the Distributed Management Task Force (DMTF) Redfish standard @Redfish.SettingsApplyTime annotation [1] to specify when to apply the requested settings, such as the Dell EMC integrated Dell Remote Access Controller (iDRAC).
For more information, see story 2008163.
[1] http://redfish.dmtf.org/schemas/DSP0266_1.11.0.html#settings-resource
No longer silently ignores exceptions that happen when trying to run the next clean or deploy step.
Other Notes¶
The ironic conductor internal logic has been updated to return an error if no agent version has been submitted during a heartbeat. This is because versions have been transmitted by the agents for quite some time and support for the default use of agent token forces all agents to be updated. As such, redundant code has been removed and tests updated accordingly.
Communication with neutron is now using openstacksdk, removing the dependency on neutronclient.