Yoga Series (19.0.0 - 20.1.x) Release Notes
20.1.2
Upgrade Notes
Adds sha256, sha384 and sha512 as supported SNMPv3 authentication protocols to iRMC driver.
Bug Fixes
Fixes Ironic’s integration with Cinder, which broke as a result of the recent security fix for bug 2004555. The corresponding work in Ironic was tracked in bug 2019892. Ironic now sends a service token to Cinder, which allows the access restrictions added as part of the original CVE-2023-2088 fix to be appropriately bypassed. Ironic was not vulnerable, but the added restrictions did impact Ironic’s usage. This is because Ironic volume attachments are not on a shared “compute node”, but are instead mapped to the physical machines, and Ironic handles the attachment life-cycle after the initial attachment.
When aborting cleaning, the last_error field is no longer initially empty. It is now populated on the state transition to clean failed.
When cleaning or deployment fails, the last_error field is no longer temporarily set to None while the power off action is running.
Fixes an issue where, if SELinux is enabled and enforcing and the published image is a hardlink, the source SELinux context is preserved, causing access to be denied when retrieving the image using the hardlink URL.
Fixes a bug in the iRMC driver’s parse_driver_info where, if FIPS is enabled, the SNMP version was always required to be version 3 even when the iRMC driver’s xxx_interface does not actually use SNMP.
Fixes 'NoneType' object is not iterable in conductor logs for redfish and idrac-redfish RAID clean and deploy steps. The message should no longer appear. For affected nodes, re-create the node or delete the raid_configs entry from the driver_internal_info field.
Fixes an issue in the online upgrade logic where database models for Node Traits and BIOS Settings resulted in an error when performing the online data migration. This was because these tables were originally created as extensions of the Nodes database table, and the schema of the database was different enough to result in an error if there was data to migrate in these tables upon upgrade, which would have occurred if an early BIOS Setting adopter had data in the database prior to upgrading to the Yoga release of Ironic.
The online upgrade parameter now substitutes an alternate primary key name when applicable.
Fixes SNMPv3 message authentication and encryption functionality of iRMC driver. The SNMPv3 authentication between iRMC driver and iRMC was only by the security name with no passwords and encryption. To increase security, the following parameters are now added to the node’s driver_info, and can be used for authentication:
irmc_snmp_user
irmc_snmp_auth_password
irmc_snmp_priv_password
irmc_snmp_auth_proto (Optional, defaults to sha)
irmc_snmp_priv_proto (Optional, defaults to aes)
irmc_snmp_user replaces irmc_snmp_security. irmc_snmp_security will be ignored if irmc_snmp_user is set.
irmc_snmp_auth_proto and irmc_snmp_priv_proto can also be set through the following options in the [irmc] section of /etc/ironic/ironic.conf:
snmp_auth_proto
snmp_priv_proto
Fixes a race condition in PXE initialization where the logic that retries suspected failed PXE boot operations did not check whether an agent token had been established, which is the very first step in agent initialization.
Fixes an issue where an agent token was being orphaned if a baremetal node timed out during cleaning operations, which in some cases left the node unable to establish a new token with Ironic in the future. We now always wipe the token in this case.
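The precedence rules for the iRMC SNMPv3 parameters listed in this section can be illustrated with a minimal sketch. The helper name resolve_snmpv3_settings is hypothetical and this is not the iRMC driver's code; it also assumes that irmc_snmp_security is still consulted when irmc_snmp_user is absent, and it ignores the [irmc] configuration options.

```python
# Hypothetical helper illustrating the documented precedence rules.
# It is NOT the iRMC driver's implementation.

DEFAULT_AUTH_PROTO = "sha"  # default stated in the release note
DEFAULT_PRIV_PROTO = "aes"  # default stated in the release note


def resolve_snmpv3_settings(driver_info):
    """Return the effective SNMPv3 settings for a node's driver_info dict."""
    # irmc_snmp_user replaces irmc_snmp_security; the old key is only
    # consulted (an assumption of this sketch) when the new one is absent.
    user = driver_info.get("irmc_snmp_user") or driver_info.get("irmc_snmp_security")
    return {
        "user": user,
        "auth_password": driver_info.get("irmc_snmp_auth_password"),
        "priv_password": driver_info.get("irmc_snmp_priv_password"),
        "auth_proto": driver_info.get("irmc_snmp_auth_proto", DEFAULT_AUTH_PROTO),
        "priv_proto": driver_info.get("irmc_snmp_priv_proto", DEFAULT_PRIV_PROTO),
    }


settings = resolve_snmpv3_settings({
    "irmc_snmp_user": "admin",
    "irmc_snmp_security": "legacy-name",  # ignored because irmc_snmp_user is set
    "irmc_snmp_auth_password": "auth-secret",
    "irmc_snmp_priv_password": "priv-secret",
})
print(settings["user"], settings["auth_proto"], settings["priv_proto"])
```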
Other Notes
Updates the minimum version of the python-scciclient library to 0.12.2.
20.1.1
Known Issues
When using jsonschema 4.0.0 or newer, make sure to include a proper $schema field in your custom network data or RAID schemas.
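As a rough illustration of the $schema requirement, the sketch below adds an explicit schema version to a custom schema that lacks one. The helper name and the tiny RAID schema are hypothetical; Draft-07 is used because the bug fix notes for this release name it as the version Ironic currently provides.

```python
import copy

# Standard identifier of JSON Schema Draft-07.
DRAFT_07 = "http://json-schema.org/draft-07/schema#"


def with_schema_version(schema, version=DRAFT_07):
    """Return a copy of a custom schema with an explicit $schema field."""
    result = copy.deepcopy(schema)
    # Keep an existing $schema untouched; only fill it in when missing.
    result.setdefault("$schema", version)
    return result


# Hypothetical, deliberately tiny custom RAID schema.
custom_raid_schema = {
    "type": "object",
    "properties": {"logical_disks": {"type": "array"}},
    "required": ["logical_disks"],
}

fixed_schema = with_schema_version(custom_raid_schema)
print(fixed_schema["$schema"])
```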
Security Issues
Modifies the irmc hardware type to include a capability to control enforcement of HTTPS certificate verification. By default this is enforced. The python-scciclient version must be one of >=0.8.2,<0.9.0, >=0.9.4,<0.10.0, >=0.10.1,<0.11.0, >=0.11.3,<0.12.0 or >=0.12.0,<0.13.0; otherwise certificate verification will not occur.
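The version ranges above are easy to misread, so here is a small sketch that checks whether a given python-scciclient version string falls inside one of them. The function names are hypothetical, and the parser only handles plain X.Y.Z versions without pre-release tags.

```python
# Supported ranges copied from the note above: (inclusive lower, exclusive upper).
SUPPORTED_RANGES = [
    ((0, 8, 2), (0, 9, 0)),
    ((0, 9, 4), (0, 10, 0)),
    ((0, 10, 1), (0, 11, 0)),
    ((0, 11, 3), (0, 12, 0)),
    ((0, 12, 0), (0, 13, 0)),
]


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def verifies_certificates(version_text):
    """True if the version lies in a range where verification occurs."""
    version = parse_version(version_text)
    return any(low <= version < high for low, high in SUPPORTED_RANGES)


for candidate in ("0.8.2", "0.9.1", "0.12.2", "0.13.0"):
    print(candidate, verifies_certificates(candidate))
```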
Bug Fixes
Fixes detecting of allowable values for a BIOS settings enumeration in the redfish BIOS interface when only ValueDisplayName is provided.
The combined ironic executable now starts the API only after the built-in conductor starts. This avoids error 500 on requests while the conductor is starting.
Fixes an issue where a conductor would attempt local takeover. In case of heartbeat failure due to resource starvation, the current conductor was detected as offline when querying the database. In this scenario the conductor would forcibly remove its own reservations and initiate takeover. The current conductor is now excluded from the list of offline conductors, so that local takeover does not occur in this case. A warning is logged to highlight the potential resource starvation issue. See bug: 2010016.
Fixes rebooting into the agent after changing BIOS settings in fast-track mode with the redfish-virtual-media boot interface. Previously, the ISO would not be configured.
Fixes OSError: [Errno 36] File name too long when building a virtual media ISO from a long kernel, ramdisk or ESP URL.
Fixes redfish and idrac-redfish RAID create_configuration, apply_configuration and delete_configuration clean and deploy steps to update the node’s raid_config field at the end of the steps.
Fixes the redfish-virtual-media boot interface to allow it with iDRAC firmware from 6.00.00.00 (released June 2022), as that firmware fixes the virtual media boot issue that previously prevented iDRAC from working with redfish-virtual-media. Consider upgrading the iDRAC firmware if not done already; otherwise you will still get an error when trying to use redfish-virtual-media with iDRAC.
Adds the driver_info/irmc_verify_ca option to specify a certificate file. The default value of driver_info/irmc_verify_ca is True.
Fixes a bug when configuring RAID, caused by the port value not being converted to int type when the node is managed by the irmc hardware type.
Fixes API error messages with jsonschema>=4.8. A possible root cause is now detected for generic schema errors.
Fixes compatibility with jsonschema package version 4.0.0 or newer by providing a proper schema version (Draft-07 currently).
When the ramdisk deploy interface is used and automated cleaning is disabled, the pxe, ipxe and redfish-virtual-media boot interfaces no longer require a deploy kernel/ramdisk to be provided.
Fixes an issue where the Redfish session cache would continue using an old session when a password for a Redfish BMC was changed. Now the old session will not be found in this case, and a new session will be created with the latest credential information available.
Resolves clear_job_queue and reset_idrac verify step failures which occur when the functionality is not supported by the iDRAC. When this condition is detected, the step handles the exception and logs a warning. It then completes successfully when run as a verify step, but fails when run as a clean step.
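The Redfish session cache fix in this section says that an old session is no longer found after a password change. One way to get that behaviour, shown here purely as a sketch and not as Ironic's actual implementation, is to make a digest of the password part of the cache key. The SessionCache class and fake_connect function are hypothetical, and this toy version never evicts stale entries.

```python
import hashlib


class SessionCache:
    """Toy cache keyed on address, username and a password digest."""

    def __init__(self, connect):
        self._connect = connect  # callable that opens a new session
        self._sessions = {}

    @staticmethod
    def _key(address, username, password):
        digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
        return (address, username, digest)

    def get(self, address, username, password):
        key = self._key(address, username, password)
        if key not in self._sessions:
            # A changed password yields a different key, so the stale
            # session is not found and a fresh one is created.
            self._sessions[key] = self._connect(address, username, password)
        return self._sessions[key]


opened = []


def fake_connect(address, username, password):
    """Stand-in for a real BMC connection; records each new session."""
    opened.append(password)
    return {"address": address, "session_no": len(opened)}


cache = SessionCache(fake_connect)
first = cache.get("bmc.example.com", "admin", "old-secret")
again = cache.get("bmc.example.com", "admin", "old-secret")
rotated = cache.get("bmc.example.com", "admin", "new-secret")
print(len(opened))
```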
Other Notes
The known issue when using iDRAC with Swift to stage firmware update files in the management interface firmware_update clean step of the redfish or idrac hardware type has been fixed in iDRAC firmware 6.00.00.00. Upgrade when possible or use HTTP service to stage firmware files for iDRAC.
20.1.0
Prelude
The Ironic community is pleased to announce the release of Ironic 20.1.
During the Yoga cycle, we had forty-three contributors. They are responsible for more than 35,000 lines of code and more than twenty new features that will improve the experience of our end-users! Please reach out to our community if you have any questions or feedback!
New Features
For the redfish and idrac-redfish management interface firmware_update clean step, adds Swift, HTTP service and file system support to serve firmware files, and Ironic’s HTTP and Swift services to stage them. Also adds the mandatory parameter checksum for file checksum verification.
Adds support for idrac-wsman RAID, BIOS and management clean steps to be run without IPA when disabling ramdisk during cleaning.
Supports listening on a Unix socket instead of a normal TCP socket. This is useful with an HTTP server such as nginx in proxy mode.
Known Issues
When using iDRAC with Swift to stage firmware update files in the management interface firmware_update clean step of the redfish or idrac hardware type, the cleaning fails with error “An internal error occurred. Unable to complete the specified operation.” in iDRAC job. Until this is fixed, use HTTP service to stage firmware files for iDRAC.
Upgrade Notes
For the redfish and idrac-redfish management interface firmware_update clean step, the checksum parameter is now mandatory. Update existing clean steps to include it, otherwise the clean step will fail with error “‘checksum’ is a required property”.
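A clean step updated for the mandatory checksum might be assembled as in the sketch below. Several details are assumptions rather than facts from these notes: the url key, the use of SHA-1 as the digest algorithm, and the overall step layout. The step name update_firmware and the firmware_images and wait fields are spelled as in the 20.0.0 bug fix notes. Check the driver documentation for your release before relying on any of this.

```python
import hashlib
import io
import json


def stream_checksum(stream, algorithm="sha1"):
    """Hex digest of a binary stream, read in chunks.

    The default algorithm is an assumption of this sketch.
    """
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(65536), b""):
        digest.update(chunk)
    return digest.hexdigest()


def firmware_clean_step(url, checksum, wait=None):
    """Build a clean step dict that carries the mandatory checksum."""
    image = {"url": url, "checksum": checksum}  # "url" key is an assumption
    if wait is not None:
        image["wait"] = wait  # optional field
    return {
        "interface": "management",
        "step": "update_firmware",
        "args": {"firmware_images": [image]},
    }


# In practice the stream would be the firmware file opened in binary mode.
checksum = stream_checksum(io.BytesIO(b"placeholder firmware bytes"))
step = firmware_clean_step("https://example.com/firmware.bin", checksum, wait=300)
print(json.dumps([step], indent=2))
```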
Deprecation Notes
Booting final instances via network (as opposed to via a local bootloader) is now deprecated, except for the cases of booting from volume or the ramdisk deploy interface.
Network boot for whole disk images only works reliably for legacy (BIOS) boot. In case of partition images, there is no way to update the kernel, which makes this approach insecure.
Users of partition images must ensure that they contain either the grub-install binary, enough EFI artifacts to boot the operating system, or a legacy boot partition.
Bug Fixes
The anaconda deploy interface was treating the config drive as a dict, whereas it could be a dict or in iso9660 format, gzipped and base64-encoded. This has been fixed.
The anaconda deploy interface was adding commands that deal with the config drive to the end of the kickstart config file. As a result, they were handled after an ironic API request was sent (to the conductor) to indicate that the node had been provisioned and was ready to be rebooted. This created a possible race condition between these commands completing and the node being powered off. A sync is now added to ensure that all modifications have been written to disk before the API request is sent as the last step.
Extra newlines (‘\n’) were incorrectly added to the user data content. This broke the content-type decoding and cloud-init was unable to process them. The extra newlines have been removed.
Fixes the logic for the anaconda deploy interface. If the ironic node’s instance_info doesn’t have both ‘stage2’ and ‘ks_template’ specified, we weren’t using the instance_info at all. This has been fixed to use the instance_info if it was specified. Otherwise, ‘stage2’ is taken from the image’s properties (assumed that it is set there). ‘ks_template’ value is from the image properties if specified there (since it is optional); else we use the config setting ‘[anaconda] default_ks_template’.
For the anaconda deploy interface, the ‘stage2’ directory was incorrectly being created using the full path of the stage2 file; this has been fixed.
The anaconda deploy interface expects the node’s instance_info to be populated with the ‘image_url’; this is now populated (via PXEAnacondaDeploy’s prepare() method).
For the anaconda deploy interface, when the deploy was finished and the bare metal node was being rebooted, the node’s provision state was incorrectly being set to ‘active’. The provisioning state-machine mechanism now handles that.
For the anaconda deploy interface, the code that was doing the validation of the kickstart file was incorrect and resulted in errors; this has been addressed.
For the anaconda deploy interface, the ‘%traceback’ section in the packaged ‘ks.cfg.template’ file is deprecated and fails validation, so it has been removed.
The anaconda deploy interface was saving internal information in the node’s instance_info, in the user-facing stage2 and ks_template fields. This broke rebuilds using a different image with different stage2 or template specified in the image properties. This has been fixed by saving the information in the node’s driver_internal_info instead.
Fixes the redfish hardware type RAID device creation and deletion when creating or deleting more than 1 logical disk on RAID controllers that require rebooting and do not allow more than 1 running task per RAID controller. Before this fix, the 2nd logical disk would fail to be created or deleted. With this change it is now possible to use the redfish raid interface on iDRAC systems.
20.0.0
New Features
The configuration options enabled_power_interfaces and enabled_management_interfaces are now empty by default. If left empty, their values will be calculated based on enabled_hardware_types.
The Bare Metal service is now capable of calculating the default value for any enabled_***_interfaces based on enabled_hardware_types.
Fast track mode can now be enabled or disabled per node:
baremetal node set <node> --driver-info fast_track=true
Adds support for idrac-redfish RAID and management clean steps to be run without IPA when disabling ramdisk during cleaning.
Introduces a new explicit instance_info parameter image_type, which can be used to distinguish between partition and whole disk images instead of a kernel/ramdisk pair.
Adding kernel and ramdisk is no longer necessary for partition images if image_type is set to partition and local boot is used.
The corresponding Image service property is called img_type.
Manually copying the initial grub config for grub network boot is no longer necessary, as this file is now written to the TFTP root directory on conductor startup. A custom template can be used to generate this file with config option [pxe]initial_grub_template.
The new [glance] swift_account_prefix parameter has been added. This parameter should be set according to the reseller_prefix parameter in proxy-server.conf of Swift.
The export configuration step from the idrac-redfish management interface no longer exports iDRAC BMC connection settings, to avoid overwriting those in another system when an unmodified configuration mold is used in the import step. For the import step it is still possible to add these settings back manually.
Upgrade Notes
Bootloader installation failures are now fatal for whole disk images. Previously these failures were ignored to facilitate backwards compatibility with older Ironic Python Agents, however we can now rely on having a sufficiently modern IPA.
The configuration option [inspector]power_off is now ignored for nodes that have fast-track enabled. These nodes are never powered off.
Foreign keys are now enabled when SQLite is used as a database.
Bug Fixes
No longer validates boot interface parameters when adopting a node that uses local boot.
Fixes pagination for the following collections:
/v1/allocations
/v1/chassis
/v1/conductors
/v1/deploy_templates
/v1/nodes/{node}/history
The next link now contains a valid URL.
Fixes duplicate SOL sessions for the node console. A “sol deactivate” action is now run before starting the node console, to make sure the current console connection always succeeds. See story 2009762.
Fixed an issue where duplicate extra DHCP options were passed in the port update request to the Networking service. The duplicate DHCP options caused an error in the Networking service and node provisioning would fail. See bug: 2009774.
Fixed an issue where provisioning/cleaning would fail on IPv6 routed provider networks. See bug: 2009773.
Fixes validation of input argument firmware_images of redfish hardware type clean step update_firmware. Now it validates the argument at the beginning of clean step. Prior to this fix issues were determined at the time of executing firmware update or not at all (for example, mistyping optional field ‘wait’).
Fixes redfish hardware type update_firmware cleaning step to work with Sushy version 4.0.0 or greater.
Fixes hardware type redfish RAID interface deploy steps when completion requires rebooting system for non-immediate configuration application. Prior to this fix, such nodes would remain in wait call-back state indefinitely.
Fixes the determination of a failed RAID configuration task in the redfish hardware type. Prior to this fix, the tasks that have failed were reported as successful.
Fixes an issue where clients would get a 404 due to the node pagination breaking at max_limit due to an uninitialised resource_url.
Fixes an issue where clients would get a 404 due to the port and portgroups pagination breaking at max_limit due to an uninitialised resource_url.
Fixes the initrd kernel parameter when booting ramdisk directly from Swift/RadosGW using iPXE. Previously it was always deploy_ramdisk, even when the actual file name is different.
Inspection no longer fails when one of the NICs reports NIC address that is not a valid MAC (e.g. a WWN).
The image cache now respects the Cache-Control: no-store header for HTTP(s) images.
File images are no longer cached in the image cache to avoid unnecessary consumption of the disk space.
Services (ironic, ironic-api, ironic-conductor) now correctly return a non-zero exit code on start-up failures.
The ironic and ironic-conductor services now wait for the conductor manager to start before notifying systemd about the successful start-up.
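With the pagination fix in this section, a client can walk a whole collection by following the next link until it is absent. The sketch below shows that loop with an injected fetch callable, so it runs without a live API. It assumes that each page is a dict holding the items under a key named after the collection plus an optional top-level next URL. The function name and the fake pages are hypothetical.

```python
def iterate_collection(fetch, first_url, items_key):
    """Yield every item of a paginated collection by following 'next' links."""
    url = first_url
    while url:
        page = fetch(url)
        for item in page.get(items_key, []):
            yield item
        # The last page carries no 'next' link, which ends the loop.
        url = page.get("next")


# Fake pages standing in for HTTP responses; a real client would pass a
# function that performs an authenticated GET and decodes the JSON body.
fake_pages = {
    "/v1/chassis": {
        "chassis": [{"uuid": "a"}, {"uuid": "b"}],
        "next": "/v1/chassis?marker=b",
    },
    "/v1/chassis?marker=b": {"chassis": [{"uuid": "c"}]},
}

uuids = [
    item["uuid"]
    for item in iterate_collection(fake_pages.__getitem__, "/v1/chassis", "chassis")
]
print(uuids)  # ['a', 'b', 'c']
```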
Other Notes
The fake-hardware hardware type now explicitly declares support for all generic interface implementations, so that they can be used in the defaults calculation.
19.0.0
New Features
Adds support for running management.clear_job_queue, management.reset_idrac and management.known_good_state methods as verify steps on iDRAC hardware type, for both idrac-wsman and idrac-redfish interfaces. In order to use this feature, [conductor]verify_step_priority_override needs to be used to set non-zero step priorities for the desired verify steps.
Adds support for verify steps, a mechanism for running optional actions, pre-defined in the driver, while the node is in transition from enroll to manageable state, prior to inspection.
Adds a new executable ironic that starts both API and conductor in the same process. Calls between the API and conductor instances in the same process are not routed through the RPC.
The ipxe boot interface is now enabled by default.
Adds a new configuration option [pxe]ipxe_fallback_script which allows iPXE boot to fall back to e.g. ironic-inspector iPXE script.
ISO images provided via instance_info/boot_iso or instance_info/deploy_iso are now cached in a similar way to normal instance images. Set [deploy]iso_master_path to an empty string to disable.
All image caches are now cleaned up periodically, not only when used. Set [conductor]cache_clean_up_interval to tune the interval or disable.
The redfish hardware type is now enabled by default along with all its supported hardware interfaces.
Adds a new none RPC transport that can be used together with the combined ironic executable to completely disable the RPC bus.
Adds new configuration option: [snmp]power_action_delay
This option adds a delay, in seconds, before an SNMP power on and after a power off. This may be needed with some PDUs, as they may not honor toggling a specific power port in rapid succession without a delay. This option may be useful if the attached physical machine has a substantial power supply to hold it over in the event of a brownout.
The default deployment boot mode is now UEFI. Legacy BIOS is still supported, however operators who require BIOS nodes will need to set their nodes, or deployment, appropriately.
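The ordering described for [snmp]power_action_delay in this section (wait before power on, wait after power off) can be sketched as follows. The function names and the injected sleep callable are hypothetical and exist only to make the ordering visible; this is not the SNMP driver's code.

```python
import time


def power_on(switch_on, delay, sleep=time.sleep):
    """Wait `delay` seconds, then switch the outlet on."""
    if delay > 0:
        sleep(delay)
    switch_on()


def power_off(switch_off, delay, sleep=time.sleep):
    """Switch the outlet off, then wait `delay` seconds."""
    switch_off()
    if delay > 0:
        sleep(delay)


# Record the order of events instead of really sleeping or toggling a PDU.
events = []


def record_sleep(seconds):
    events.append(("sleep", seconds))


power_off(lambda: events.append("off"), 2, sleep=record_sleep)
power_on(lambda: events.append("on"), 2, sleep=record_sleep)
print(events)
```

A power cycle therefore sees both delays back to back between the off and on actions, which is what gives a slow PDU time to settle.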
Upgrade Notes
The ipxe boot interface is now enabled and will have priority over pxe by default. If you rely on the default value of the enabled_boot_interfaces option to not contain ipxe, you need to set it explicitly.
The default boot mode has been changed and is now UEFI. Operators who were explicitly relying upon BIOS based deployments in the past may wish to consider setting an explicit node level override for the node to only utilize BIOS mode. This can be configured at a conductor level with the [deploy]default_boot_mode option. Options to set this at a node level can be found in the Ironic Installation guide - Advanced features documentation.
Bug Fixes
Fixes a bug in the anaconda deploy interface where the ks_options key was not found when rendering the default kickstart template.
Fixes an issue where PXEAnacondaDeploy interface’s deploy() method did not return states.DEPLOYWAIT so the instance went straight to active instead of wait call-back.
Fixes an issue where the anaconda deploy interface mistakenly expected squashfs_id instead of stage2_id property on the image.
Fixes the heartbeat mechanism in the default kickstart template ks.cfg.template as the heartbeat API only accepts POST and expects a mandatory callback_url parameter.
Fixes handling of tarball images in anaconda deploy interface. Allows user specified file extensions to be appended to the disk image symlink. Users can now set the file extensions by setting the disk_file_extension property on the OS image. This enables users to deploy tarballs with anaconda deploy interface.
Fixes an issue where automated cleaning was not supported when anaconda deploy interface is used.
Fixes idrac-wsman management interface set_boot_device method that would fail deployment when there are existing jobs present with the error: "Failed to change power state to ''power on'' by ''rebooting''. Error: DRAC operation failed. Reason: Unfinished config jobs found: <list of existing jobs>. Make sure they are completed before retrying."
Now there can be non-BIOS jobs present during deployment. This will still fail for cases when there are BIOS jobs present. In such cases it’s advised to use the idrac-redfish interface that does not have this limitation when setting boot device.
Fixes File name too long in the image caching code when a URL contains a long query string.
Fixes an issue with a node’s instance_info interface override caused when Ironic uses the interface attribute directly. Does so by adding a get_interface method to a node, and updating the Ironic code to use it where needed.
Fixes an issue where the port value is not converted to int type when nodes are managed by the irmc hardware type.
Fixes an issue where cleaning continuously repeats due to the value of fgi_status not being updated correctly when obtaining the RAID configuration status of nodes managed by the irmc hardware type.
When configuring RAID on iRMC machines through Ironic, polling was not set up when the RAID was created. Polling is now set up after creating the RAID, so that Ironic waits for the RAID configuration to complete before proceeding to the next step, instead of checking IPA.
Fixes connection caching issues with Redfish BMCs where AccessErrors were previously not disqualifying the cached connection from being re-used. Ironic will now explicitly open a new connection instead of using the previous connection in the cache. Under normal circumstances, the sushy redfish library would detect and refresh sessions, however a prior case exists where it may not detect a failure and contain cached session credential data which is ultimately invalid, blocking future access to the BMC via Redfish until the cache entry expired or the ironic-conductor service was restarted. For more information please see story 2009719.
Removes the ?filename=file.iso suffix from the virtual media image URL when the image is a regular file, due to incompatibility with SuperMicro X12 machines which do not accept special characters such as = or ? in the URL. Historically, this suffix was added to improve compatibility with those BMCs which require an .iso suffix in the URL while using swift as the image store. The old behaviour remains for swift backed images.
Fixes restricted allocation creation for old policy defaults. This involves a check that ensures that the user is not trying to create an allocation with an owner other than themselves. This check is updated to also see if the user is actually trying to set an allocation owner.
Other Notes
The agent deploy and cleaning code no longer uses an RPC call to the same conductor when proceeding to the next deploy or clean step.