Current Series Release Notes
http boot interface, based upon the pxe boot interface, which informs the DHCP server of an HTTP URL to boot the machine from, and then requests that the BMC boot the machine in UEFI HTTP mode.
http-ipxe boot interface, based upon the ipxe boot interface, which informs the DHCP server of an HTTP URL to boot the machine from, and then requests that the BMC boot the machine in UEFI HTTP mode.
Adds node auto-discovery support to the
Adds support for OVN VTEP switches. Operators will be able to use logical and physical switches. This has been minimally tested in production.
Testing of the http boot interface with the GRUB2 provided by Ubuntu 22.04 yielded some intermittent failures which appear to be environmental in nature: the signed Shim loader would start and load the GRUB loader, and then some of the expected file accesses would be attempted and fail due to an apparent transfer timeout. Some GRUB developers who were consulted concur that this is likely environmental, meaning related to the specific GRUB build or to CI performance. If you encounter any issues, please do not hesitate to reach out to the Ironic developer community.
Adds an online migration to the new inspection interface. If the agent inspection is enabled and the inspector inspection is disabled, the inspect_interface field will be updated for all nodes that use inspector and are currently not on inspection (i.e. not in the
If some nodes may be inspecting during the upgrade, you may want to run the online migrations several times with a delay to finish migrating all nodes.
Provides a fix for service role support to enable the use case where a dedicated service project is used for cloud service operation, to facilitate actions as part of the operation of the cloud infrastructure.
OpenStack clouds can take a variety of configuration models for service accounts. It is now possible to utilize the [DEFAULT] rbac_service_role_elevated_access setting to enable users with a service role in a dedicated service project to act upon the API similar to a “System” scoped “Member” where resources regardless of lessee settings are available. This is needed to enable synchronization processes, such as the networking-baremetal ML2 plugin, to perform actions across the whole of an Ironic deployment, if desirable, where a “System” scoped user is also undesirable.
This functionality can be tuned to utilize a customized project name aside from the default convention service, for example admin, utilizing the
Operators can alternatively override the service_role RBAC policy rule entirely, if so desired; however, the Ironic project considers the default both reasonable and sufficiently delineated for the variety of Role Based Access Control use cases which can exist with a running Ironic deployment.
Adds the capability to define a default_conductor_group setting which allows operators to assign a default conductor group to new nodes created in Ironic if they do not otherwise have a conductor_group set upon creation. By default, this setting has no value.
Adds support for Redfish based HTTPBoot, which leverages the DMTF Redfish ComputerSystem resource in a BMC to assert the URL for the next boot operation. This requires Sushy 4.7.0 as the minimum version.
Adds a new capability to attach or detach generic ISO images as virtual media devices after a node has been provisioned.
Previously the key for building temporary URLs from Swift was taken from the x-account-meta-temp-url-key header in the object store account. Now the header x-account-meta-temp-url-key-2 is also checked, which allows password rotation to occur without breaking old URLs.
This applies to the following temporary URL scenarios:
Temp URL image transfer from Glance (when [glance]swift_temp_url_key is not set)
Publishing an image with the Swift publisher ([redfish]use_swift=True or [ilo]use_web_server_for_images=False)
Storing the config drive in Swift ([deploy]configdrive_use_object_store=True)
Fetching Swift stored firmware update payloads.
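The two-header lookup described above can be sketched roughly as follows. This is an illustration only, not Ironic's implementation: the helper names, the preference for the first header over the second, and the use of SHA-256 are assumptions. The signed string (method, expiry and path joined by newlines) follows the format used by Swift's TempURL middleware.

```python
import hmac
from hashlib import sha256

# Account metadata headers that may carry a temporary URL key.
KEY_HEADERS = (
    "x-account-meta-temp-url-key",
    "x-account-meta-temp-url-key-2",
)


def pick_temp_url_key(account_headers):
    """Return the first temp URL key found in the account headers, or None.

    The lookup order is an assumption made for this sketch.
    """
    lowered = {name.lower(): value for name, value in account_headers.items()}
    for header in KEY_HEADERS:
        if lowered.get(header):
            return lowered[header]
    return None


def sign_temp_url(key, method, expires, path):
    """Compute a TempURL-style signature over method, expiry and object path."""
    body = f"{method}\n{expires}\n{path}"
    return hmac.new(key.encode(), body.encode(), sha256).hexdigest()
```

With both headers set, URLs signed with either key stay valid on the Swift side, so one key can be replaced while URLs signed with the other are still in use.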
Introduces basic authentication and configurable authentication strategy support for image and image checksum download processes. This feature introduces 3 new configuration variables that can be used to select the authentication strategy and provide credentials for authentication strategies. The 3 variables are structured in a way that 1 of them, [deploy]image_server_auth_strategy (string), provides the ability to select between authentication strategies by specifying the name of the authentication strategy.
Currently the only supported authentication strategy is http-basic, which makes IPA use HTTP(S) basic authentication, also known as the RFC 7617 standard. The other 2 variables are
[deploy]image_server_user provide username and password credentials for image download processes. The
[deploy]image_server_user are not strategy specific and could be reused for any username + password based authentication strategy, but for the moment these 2 variables are only used for the
[deploy]image_server_auth_strategy doesn’t just enable the feature but enforces checks on the values of the 2 related credentials. When the http-basic strategy is enabled for the image server download workflow, the download logic raises an exception if any of the credentials are None or an empty string.
An example of activating the http-basic strategy can be found in the HTTP(s) Authentication strategy for user image servers section of the admin guide.
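For readers unfamiliar with RFC 7617, a minimal sketch of what an http-basic strategy amounts to is shown below. The function name is hypothetical and this is not the code used by Ironic or IPA; it only illustrates the header construction and the kind of credential check described above.

```python
import base64


def basic_auth_header(user, password):
    """Build an RFC 7617 Authorization header value from the credentials."""
    # Mirror the described check: reject None or empty credentials.
    if not user or not password:
        raise ValueError("image server user and password must both be set")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")
```

The resulting value would be sent as the Authorization header of the image download request. Since base64 is an encoding and not encryption, HTTPS is the sensible transport for it.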
The Ironic service API Role Based Access Control policy has been updated to disable the legacy RBAC policy by default. The effect of this is that deprecated legacy roles of
baremetal_observer are no longer functional by default, and policy checks may prevent actions such as viewing nodes when access rights do not exist by default.
This change is a result of the new policy which was introduced as part of the Secure Role Based Access Control effort along with the Consistent and Secure RBAC community goal and the underlying [oslo_policy] enforce_new_defaults settings being changed to
The Ironic project believes most operators will observe no direct impact from this change, unless they are specifically running legacy access configurations utilizing the legacy roles for access.
Operators who are suddenly unable to list or deploy nodes may have a misconfiguration in credentials, or need to allow the user’s project the ability to view and act upon the node through the node
lessee fields. By default, the Ironic API policy permits authenticated requests with a system scoped token to access all resources, and applies a finer grained access model across the API for project scoped users.
Ironic users who have not already changed their nova-compute service settings for connecting to Ironic may also have issues scheduling Bare Metal nodes. Use of a system scoped user is available by setting [ironic] system_scope to a value of all in your nova-compute service configuration, which can be done independently of other services, as long as the credentials supplied are also valid with Keystone for system scoped authentication.
Heat users who encounter any issues after this upgrade should check their user’s roles. Heat’s execution model is entirely project scoped, which means users will need to have access granted through the lessee field to work with a node.
Operators wishing to revert to the old policy configuration may do so by setting the following values in
[oslo_policy]
enforce_new_defaults=False
enforce_scope=False
Operators who revert the configuration are encouraged to make the necessary changes to their configuration, as the legacy RBAC policy will be removed at some point in the future in alignment with the 2024.1-Release Timeline. Failure to do so may force operators to craft custom policy override configuration.
Removes the sphinxcontrib-seqdiag dependency, as the Pillow upgrade to version 10.x (from OpenStack upper constraints) breaks its usage. seqdiag has not been maintained for the last 3 years, hence the breakage. The Ironic docs RST source files now reference SVG files, which are kept in the doc/source/images/ directory alongside their associated .diag files as backup.
The default value of the configuration option
True for the newer agent inspect interface. The older inspector implementation is not affected. Operators with deployments that support unmanaged inspection must set this value to
python-swiftclient is no longer a dependency; all OpenStack Swift operations are now done using openstacksdk.
Configuration option [swift]swift_max_retries has been removed and any custom value will no longer have any effect on failed object-store operations.
rescue_ramdisk configuration options, incorrectly deprecated in the 2023.2 release series, are no longer deprecated.
idrac hardware type management interface steps
export_configuration steps are deprecated, and will be removed once a formalized generic step templating mechanism has been created within Ironic. The Ironic community is open to reconsidering this decision should the overall bulk configuration reset/templating model become adopted by DMTF Redfish as a standardized cross-vendor feature.
The ibmc hardware type is deprecated due to a lack of upstream communication, driver maintenance, and a recognition that the Redfish hardware type likely works for the users at this point. This driver is expected to be removed during the
The xclarity hardware type is deprecated due to a lack of upstream communication, driver maintenance, and a recognition that the Redfish hardware type is suitable for Lenovo hardware users moving forward. This driver is expected to be removed during the
idrac-wsman interfaces on the idrac hardware type are deprecated due to a lack of upstream communication, and the decision of the driver’s maintainer in the past to move in the direction of using Redfish for driver interactions. These driver interfaces are expected to be removed during the
Rootwrap support is deprecated since Ironic no longer runs any commands as root. Files
ironic-rootwrap command will be removed in a future release.
Firmware components are now also cached on the transition to the manageable state in addition to cleaning. This is consistent with how BIOS settings, vendor and boot mode are cached.
Fixes the behavior of file:/// image URLs pointing at a symlink. Ironic no longer creates a hard link to the symlink, which could cause a confusing FileNotFoundError if the symlink is relative.
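The failure mode is easy to reproduce outside Ironic: a hard link to a relative symlink, placed in another directory, resolves relative to its new location and dangles. A minimal sketch of the safer approach, resolving the path first, is shown below. The function name is hypothetical and the code is not taken from Ironic.

```python
import os


def hard_link_image(source, dest):
    """Hard-link the file that `source` finally resolves to.

    Resolving the path first means `dest` never ends up as a copy of a
    relative symlink that dangles in its new directory.
    """
    os.link(os.path.realpath(source), dest)
```

As with any hard link, the resolved file and the destination have to live on the same filesystem.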
Nodes no longer get stuck in cleaning when the firmware components caching code raises an unexpected exception.
Prevents a database constraints error on caching firmware components when a supported component does not have the current version.
Fixes an issue where listing allocations as a project scoped user, with the legacy RBAC policies disabled, erroneously raised an HTTP 406 error. Users attempting to list allocations with a specific owner, different from their own, will now receive an HTTP 403 error.
If the raw LLDP data collected by the inspection process includes non UTF-8 information, the parser fails, breaking the inspection process. This patch works around that by excluding the malformed data and adding a log entry with information on the failed TLV.
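The workaround pattern can be sketched as below. The function name and the (type, bytes) input shape are assumptions made for illustration, and the real parser is considerably more involved.

```python
import logging

LOG = logging.getLogger(__name__)


def decode_tlvs(raw_tlvs):
    """Decode (type, bytes) TLV pairs, skipping values that are not UTF-8."""
    decoded = []
    for tlv_type, value in raw_tlvs:
        try:
            decoded.append((tlv_type, value.decode("utf-8")))
        except UnicodeDecodeError:
            # Log and move on so one bad TLV does not abort inspection.
            LOG.warning(
                "Skipping LLDP TLV type %s: value is not valid UTF-8", tlv_type
            )
    return decoded
```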
Fixes an issue where a System Scoped user could not trigger a node into a manageable state with cleaning enabled, as the Neutron client would attempt to utilize their user’s token to create the Neutron port for the cleaning operation, as designed. This is because with requests made in the system scope, there is no associated project and the request fails.
Ironic now checks if the request has been made with a system scope, and if so it utilizes the internal credential configuration to communicate with Neutron.
When configured to listen on a unix socket, Ironic will now properly clean up the unix socket on a clean service stop.
The idrac hardware type is now compatible with the redfish firmware interface. The link between them was missing initially.
Fixes the inspection lookup to consider all nodes with the same BMC hostname, as can happen with Redfish. In this case, the nodes are distinguished by MAC addresses.
Fixes getting details of a conductor if it uses a non-standard JSON RPC port or an IPv6 address as the name, e.g. GET /v1/conductors/[2001:db8::1]:8090. Previously, it would result in an HTTP error 400.
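Names of this shape mix a bracketed IPv6 literal with a port, which naive splitting on a colon gets wrong. The sketch below shows one way to take such a name apart using only the standard library. The function name is hypothetical and this is not how the Ironic API validates conductor names.

```python
from urllib.parse import urlsplit


def split_host_port(name):
    """Split 'host', 'host:port' or '[v6]:port' into (host, port or None)."""
    # Prefixing "//" makes urlsplit treat the whole name as a network location.
    parts = urlsplit("//" + name)
    return parts.hostname, parts.port
```

For the example above, `split_host_port("[2001:db8::1]:8090")` yields the host `2001:db8::1` and the port `8090`. Note that `hostname` is returned lowercased.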
enable_netboot_fallback to write out PXE config on adopt.
When configuring secure boot via Redfish, internal server errors are now retried for a longer period than by default, accounting for the SecureBoot resource unavailability during configuration on some hardware.
Fixes a RAID creation issue in iLO6 and other BMCs with the latest schema by removing ‘VolumeType’ and ‘Encrypted’ and moving ‘Drives’ inside ‘Links’.
Fixes the payload format required to query physical storage drives when configuring RAID using Redfish.
Uses the volume_name provided in the target_raid_config field of a node to set the storage volume name when configuring RAID with the redfish driver (instead of discarding the volume_name given in target_raid_config)
Use the ‘volume_name’ field from the logical_disk in the target_raid_config field of a node, instead of just ‘name’ (which is incorrect as per the Ironic API expectation), to create the RAID volume using the Redfish driver
ilo hardware types may be deprecated in the future for removal or major changes; however, our last communication with the maintainers as of the 2024.1 Project Teams Gathering sessions indicated they were still working to determine their own forward path with a strong emphasis on the use of Redfish.
Sending SIGUSR2 to a conductor process will now trigger a drain shutdown. This is similar to a SIGTERM graceful shutdown but the timeout is determined by [DEFAULT]drain_shutdown_timeout which defaults to 1800 seconds. This is enough time for running tasks on existing reserved nodes to either complete or reach their own failure timeout.
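As a rough mental model of a drain shutdown, consider the sketch below. It is not Ironic's conductor code: the function names, the polling approach and the `has_reserved_nodes` callback are all assumptions, and only the 1800 second default comes from the note above.

```python
import signal
import time


def drain(has_reserved_nodes, timeout=1800, poll_interval=1.0,
          clock=time.monotonic, sleep=time.sleep):
    """Wait until no reserved nodes remain or the timeout expires.

    Returns True if the work drained cleanly, False on timeout.
    """
    deadline = clock() + timeout
    while has_reserved_nodes():
        if clock() >= deadline:
            return False
        sleep(poll_interval)
    return True


def install_drain_handler(on_drain):
    """Call on_drain() when the process receives SIGUSR2 (POSIX only)."""
    signal.signal(signal.SIGUSR2, lambda signum, frame: on_drain())
```

An operator would trigger the real behavior with something like `kill -USR2` against the conductor process ID.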
During the drain period the conductor will be removed from the hash ring to prevent new tasks from starting. Other conductors will no longer fail reserved nodes on the draining conductor, which previously appeared to be orphaned. This is achieved by running the conductor keepalive heartbeat for this period, but setting the
While Ironic has not explicitly added support for OVN, because that is in theory a Neutron implementation detail, we have added some basic testing and are pleased to announce that you can use OVN’s DHCP service for IPv4 based provisioning with OVN v23.06.00 and beyond. This is not without issues, and we’ve added ovn documentation as a result to help provide as much Ironic operator clarity as possible.
Use of OVN may require disabling SNAT for provisioning with IPv4 when using TFTP. This is due to the Linux Kernel, and how IP packet handling occurs with OVN. No solution to this issue is known, and use of provisioning technologies which do not use TFTP is also advisable.
Use of OVN may require careful attention to the MTUs of networks, as oversized packets may be dropped. That being said, this is more likely an issue for testing than with actual physical baremetal in a production deployment.
Use of OVN for IPv6 based PXE/iPXE is not supported by Neutron. The Ironic project expects this to be addressed during the Caracal (2024.1) development cycle.
When configuring a single-conductor environment, make sure the worker pool size ([conductor]worker_pool_size) is larger than the maximum number of parallel deployments ([conductor]max_concurrent_deploy). This was not the case by default previously (the options used to be set to 100 and 250 respectively).
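A quick sanity check of this relationship could look like the sketch below. It parses the configuration text with the standard library instead of oslo.config, so it is only an approximation. The fallback values of 300 and 250 are assumptions inferred from the numbers quoted in these notes, and the function name is hypothetical.

```python
import configparser

# Fallbacks assumed from the notes: pool size 300, concurrent deploys 250.
DEFAULTS = {"worker_pool_size": "300", "max_concurrent_deploy": "250"}


def pool_is_large_enough(conf_text):
    """Return True if worker_pool_size exceeds max_concurrent_deploy."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    section = parser["conductor"] if parser.has_section("conductor") else {}
    pool = int(section.get("worker_pool_size", DEFAULTS["worker_pool_size"]))
    deploys = int(
        section.get("max_concurrent_deploy", DEFAULTS["max_concurrent_deploy"])
    )
    return pool > deploys
```

Passing the contents of a configuration file that still carries the old values of 100 and 250 returns False, which is the situation this note warns about.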
Because of a fix in the internal worker pool handling, you may now start seeing requests rejected with HTTP 503 under a very high load earlier than before. In this case, try increasing the [conductor]worker_pool_size option or consider adding more conductors.
The default worker pool size (the [conductor]worker_pool_size option) has been increased from 100 to 300. You may want to consider increasing it even further if your environment allows that.
The parent_node field, a newly added API field, has been constrained to store UUIDs over the names of nodes. When names are used, the value is changed to the UUID of the node.
Properly eject the virtual media from a DVD device in case this is the only MediaType available from the Hardware, and Ironic requested CD as the device to be used. See bug 2039042 for details.
When Ironic hits the limit on the number of concurrent deploys (specified in the [conductor]max_concurrent_deploy option), the resulting HTTP code is now 503 instead of the more generic 500.
The external_http_url setting in the driver info is now used for a boot ISO. Previously this setting was only used for a config floppy.
Fixes an issue with changing or getting the state of the indicator LED of an attached disk, caused by the mistaken assumption that the SimpleStorage resource provides this functionality when the Storage resource actually does.
Fixes handling new requests when the maximum number of internal workers is reached. Previously, after reaching the maximum number of workers (100 by default), we would queue the same number of requests (100 again). This was not intentional, and now Ironic no longer queues requests if there are no free threads to run them.
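The difference between queueing and rejecting can be illustrated with a small sketch. It uses the standard library thread pool rather than Ironic's own executor, and the class and exception names are hypothetical: a request is only accepted if a worker is free at that moment, otherwise the caller gets an error that an API layer could map to HTTP 503.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class NoFreeWorker(Exception):
    """Raised when every worker thread is busy (hypothetical name)."""


class NonQueueingPool:
    """Thread pool that rejects work instead of queueing it when full."""

    def __init__(self, size):
        self._executor = ThreadPoolExecutor(max_workers=size)
        self._free = threading.BoundedSemaphore(size)

    def submit(self, func, *args, **kwargs):
        # Claim a worker slot without blocking; fail fast if none is free.
        if not self._free.acquire(blocking=False):
            raise NoFreeWorker("no free worker threads")
        try:
            future = self._executor.submit(func, *args, **kwargs)
        except Exception:
            self._free.release()
            raise
        # Give the slot back once the task finishes, whatever its outcome.
        future.add_done_callback(lambda _f: self._free.release())
        return future
```

With a pool of size 1, a second `submit` made while the first task is still running raises `NoFreeWorker` instead of waiting in a queue.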