Xena Series (18.0.0 - 18.2.x) Release Notes

18.2.0-10

Bug Fixes

  • Fixes a “File name too long” error in the image caching code when a URL contains a long query string.

  • Fixes repeated resumption of cleaning caused by the fgi_status value not being updated correctly when obtaining the RAID configuration status of a node managed by the irmc hardware type.

  • Fixes RAID configuration on iRMC machines through ironic, where polling was not set up when the RAID was created. Polling is now set up after the RAID is created, so ironic waits for the RAID configuration to complete before proceeding to the next step instead of checking IPA.

  • Removes the ?filename=file.iso suffix from the virtual media image URL when the image is a regular file, because SuperMicro X12 machines do not accept special characters such as = or ? in the URL. Historically, this suffix was added to improve compatibility with BMCs that require an .iso suffix in the URL when swift is used as the image store. The old behaviour remains for swift-backed images.
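
The virtual media URL change above can be pictured with a small sketch. This is not Ironic's code: the function name, the example hosts and the idea of appending the suffix as a query parameter are assumptions made for illustration only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def vmedia_image_url(url: str, swift_backed: bool,
                     filename: str = "file.iso") -> str:
    """Return the URL handed to the BMC (hypothetical helper).

    Regular files are returned untouched so that the URL carries no
    '?' or '=' characters; swift-backed images keep the filename hint.
    """
    if not swift_backed:
        return url
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("filename", filename))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URLs, used only to show the two cases.
plain = vmedia_image_url("http://conductor.example/redfish/boot.iso", False)
swift = vmedia_image_url("http://swift.example/v1/c/o?temp_url_sig=abc", True)
print(plain)
print(swift)
```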

18.2.0

Prelude

The Ironic team hereby announces the release of Ironic 18.2.

During the Xena development cycle, thirty-eight contributors collaborated with each other and with our adjacent communities to support the needs of our end users in all the many forms they take. Over 48,000 lines of code were modified, and twenty-two new features made it into Ironic along with a number of bug fixes. We sincerely hope you enjoy!

New Features

  • Adds support for a fields selector in the driver API. See story 1674775.

    • GET /v1/drivers?fields=...

    • GET /v1/drivers/{driver_name}?fields=...

  • Adds API version 1.78, which provides the capability to retrieve node history events that may have been recorded while managing the node. These events may aid in troubleshooting or in identifying a problem area with a specific node or with the configuration that has been supplied.

  • Adds a capability to allow bootloaders to be copied into the configured network boot path. To opt in, set [pxe]loader_file_paths to a comma-separated list of destination file name and source file path pairs, each written as destination:source.

    [pxe]
    loader_file_paths = bootx64.efi:/path/to/shimx64.efi,grubx64.efi:/path/to/grubx64.efi
    
  • Manual clean step clear_ca_certificates is added to remove the CA certificates from iLO.

  • Adds endpoints to change the boot mode and secure boot state of a node.

    • PUT /v1/nodes/{node_ident}/states/boot_mode

    • PUT /v1/nodes/{node_ident}/states/secure_boot

    The API responds with 202 (Accepted) once the request has been validated and accepted for processing. Changes occur asynchronously in a background task. The user can then poll the states endpoint /v1/nodes/{node_ident}/states to observe the current status of the requested change.

  • Allows limiting the number of parallel downloads for cached images (instance and TFTP images currently).

  • Adds support to specify HttpHeaders when creating a subscription via redfish vendor passthru.
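
The boot mode and secure boot endpoints listed above can be exercised roughly as follows. This is a minimal sketch that only builds the requests without sending them. The base URL, the node name and the {"target": ...} body shape are assumptions for illustration and should be checked against the API reference.

```python
import json
from urllib.request import Request

# Hypothetical endpoint; replace with your deployment's Bare Metal API URL.
BASE_URL = "http://ironic.example.com:6385"


def build_state_change(node_ident: str, state: str, target) -> Request:
    """Build (but do not send) a PUT request for boot_mode or secure_boot."""
    if state not in ("boot_mode", "secure_boot"):
        raise ValueError(f"unsupported state: {state}")
    # Assumption: the desired value is carried under the "target" key.
    body = json.dumps({"target": target}).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/v1/nodes/{node_ident}/states/{state}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def states_url(node_ident: str) -> str:
    """URL to poll after the API has answered 202 (Accepted)."""
    return f"{BASE_URL}/v1/nodes/{node_ident}/states"


req = build_state_change("node-0", "boot_mode", "uefi")
print(req.get_method(), req.full_url)
print("poll:", states_url("node-0"))
```

A real client would also need authentication and a sufficiently recent API version header; both are left out here to keep the sketch short.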

Upgrade Notes

  • The parallel_image_downloads option is now set to True by default. Use the new image_download_concurrency option to tune the behavior; the default concurrency is 20.

  • In-band cleaning has been fixed for ramdisk and anaconda deploy interfaces. If you rely on actual clean steps not running, you need to disable cleaning instead for the relevant nodes:

    baremetal node set <node> --no-automated-clean
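
The effect of image_download_concurrency can be pictured as a cap on simultaneous downloads. The sketch below is not Ironic's implementation: the semaphore, the fake download function and the small limit of 3 are hypothetical stand-ins chosen to show that no more than the configured number of downloads run at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

CONCURRENCY = 3  # stand-in for image_download_concurrency

_slots = threading.Semaphore(CONCURRENCY)
_lock = threading.Lock()
_active = 0
peak = 0


def fake_download(image: str) -> str:
    """Pretend to download an image while holding one concurrency slot."""
    global _active, peak
    with _slots:
        with _lock:
            _active += 1
            peak = max(peak, _active)
        time.sleep(0.01)  # simulated transfer time
        with _lock:
            _active -= 1
    return image


images = [f"image-{i}" for i in range(10)]
# More worker threads than slots, so the semaphore is what limits the work.
with ThreadPoolExecutor(max_workers=len(images)) as pool:
    results = list(pool.map(fake_download, images))

print(f"downloaded {len(results)} images, peak concurrency {peak}")
```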
    

Deprecation Notes

  • Ironic previously announced that the default for [deploy]default_boot_mode would be changing “in a future release”. This was announced during the Stein development cycle. Ironic will change this default to uefi during the Yoga development cycle.

  • The parallel_image_downloads option is deprecated in favour of the new image_download_concurrency option that allows more precise tuning.

Bug Fixes

  • Fixes a regression in the ramdisk deploy where custom kernel parameters were not used during inspection and cleaning.

  • Resolves an issue where [conductor]clean_step_priority_override values were applied too late, after disabled steps had already been filtered out. With this change, priority overrides are applied prior to filtering out disabled steps, so that this configuration option can be used to enable or disable steps (in particular clean steps) in addition to changing the priorities they are run with.

  • The validation for create_subscription now uses the default values from Redfish for Context and Protocol to avoid None. The fields returned by create_subscription and get_subscription are now filtered by the common fields between vendors. Deleting a subscription that doesn’t exist will return 404 instead of 500.

  • Fixes an issue in DB schema version testing: objects with an initial version, e.g. “1.0”, are now allowed to not have their DB tables pre-exist when the pre-upgrade compatibility check for the database is performed. This allows the upgrade to proceed and update the database schema without an explicit known list having to be maintained in Ironic.

  • Handles excessively long errors when the status upgrade check is executed. The check now simply indicates if a table is missing and suggests updating the database schema before proceeding.

  • Fixes issue in idrac-redfish clean/deploy step import_configuration where partially successful jobs were treated as fully successful. Such jobs, completed with errors, are now treated as failures.

  • Fixes the idrac-redfish clean/deploy step import_configuration to handle completed import configuration tasks that are deleted by iDRAC before Ironic has checked the task’s status. Prior to iDRAC firmware version 5.00.00.00, completed tasks are deleted after 1 minute in iDRAC Redfish. That is not always sufficient to check their status in the periodic check, which runs every minute by default. Before this fix, the node got stuck in the wait state forever. Now the step fails with an error advising to decrease the periodic check interval or to upgrade the iDRAC firmware if that has not been done already.

  • Fixes idrac-redfish RAID interface delete_configuration clean/deploy step for controllers having foreign physical disks. Now foreign configuration is cleared after deleting virtual disks.

  • Fixes idrac-redfish RAID interface in create_configuration clean step and apply_configuration deploy step when there are drives in non-RAID mode. With this fix, non-RAID drives are converted to RAID mode before creating virtual disks.

  • Fixes idrac-wsman BIOS and RAID interface steps to correctly check the status of an iDRAC job that completed with errors. These jobs are now treated as failures. Before this fix, the node stayed in the wait state because only the “Completed” and “Failed” job statuses were checked, but not “Completed with Errors”.

  • Fixes the idrac-wsman power interface to wait for the hardware to reach the target state before returning. On systems where the soft power off at the end of deployment failed and a forced hard power off was used, the node was left successfully deployed but powered off, without any errors. This broke other workflows that expect the node to be powered on and booted into the OS at the end of deployment. Additional information can be found in story 2009204.

  • When an http(s):// image is used, the cached copy of the image will always be updated if the HTTP server does not provide the last modification date and time. Previously the cached image would be considered up-to-date, which could cause invalid behavior if the image is generated on the fly or was modified while being served.

  • Fixes the pattern of execution for periodic tasks such that the majority of drivers now evaluate if work needs to be performed in advance of creating a node task. Depending on the individual driver query pattern, this prevents excess database queries from being triggered with every task execution.

  • Fixes in-band cleaning for the ramdisk and anaconda deploy interfaces. Previously no in-band steps were fetched from the ramdisk.

  • Retries ssl.SSLError when connecting to the agent.
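
The ordering change for [conductor]clean_step_priority_override described above can be sketched as follows. This is an illustration rather than Ironic's code: the interface.step:priority string format, the step list and its priorities are assumptions, with priority 0 meaning a disabled step.

```python
def parse_overrides(raw: str) -> dict:
    """Parse comma-separated 'interface.step:priority' pairs (assumed format)."""
    overrides = {}
    for item in raw.split(","):
        name, _, priority = item.strip().rpartition(":")
        overrides[name] = int(priority)
    return overrides


def effective_steps(steps, overrides):
    """Apply overrides first, then drop disabled (priority 0) steps."""
    result = []
    for step in steps:
        key = f"{step['interface']}.{step['step']}"
        priority = overrides.get(key, step["priority"])
        if priority > 0:
            result.append({**step, "priority": priority})
    # Steps with a higher priority run first.
    return sorted(result, key=lambda s: s["priority"], reverse=True)


# Made-up starting point: one step disabled, one enabled.
steps = [
    {"interface": "deploy", "step": "erase_devices", "priority": 0},
    {"interface": "deploy", "step": "erase_devices_metadata", "priority": 99},
]
overrides = parse_overrides(
    "deploy.erase_devices:50,deploy.erase_devices_metadata:0")
selected = effective_steps(steps, overrides)
for step in selected:
    print(step["step"], step["priority"])
```

Because the override is applied before the filter, the previously disabled step is enabled and the previously enabled one is dropped. Filtering first would have discarded the disabled step before its override could take effect.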

Other Notes

  • Removes a NEW_MODELS internal list from the dbsync utility which helped the tool navigate new models, however it was never used. Instead the tool now utilizes the database version and appropriate base version to make the appropriate decision in pre-upgrade checks.

  • The cleaning code has been moved from AgentDeployMixin to AgentBaseMixin. Most third-party deploy interfaces will need to include both anyway.

18.1.0

New Features

  • The anaconda deploy interface now handles config drive. The config drive contents are written to the disk in the /var/lib/cloud/seed/config_drive directory by the driver via the kickstart file’s %post section. cloud-init picks up the config drive information and processes it. Because the config drive is extracted onto disk as plain text files, tools like glean do not work with this deploy interface.

  • The configuration option [deploy]ramdisk_image_download_source now supports a new value swift. If set, and if boot_iso or deploy_iso is a Glance image, the image is exposed via a Swift temporary URL. For other types of images the new value works the same way as the existing http value.

  • The ilo-virtual-media deploy interface now supports file:/// URLs for boot_iso and deploy_iso.

  • The ilo-virtual-media deploy interface now supports the [deploy]ramdisk_image_download_source configuration option.

  • Adds boot_mode and secure_boot fields to the node. These indicate the boot mode (bios/uefi) and secure boot state (True/False) detected in the most recent power sync or during transition to the manageable state. If the underlying driver does not support detecting these, they are populated with null values. These fields are also available under a node’s states endpoint: /v1/nodes/{node_ident}/states

  • For the ramdisk deploy interface, the ramdisk_image_download_source option can now be provided in the node’s instance_info in addition to the global configuration.

  • Provides new vendor passthru methods for Redfish to create, delete and get subscriptions for BMC events.

Upgrade Notes

  • The query pattern for the database when lists of nodes are retrieved has been changed to a pattern that is more efficient at scale: a list of nodes is generated, and then additional queries are executed to composite this data together. In the previous model, the database client in the conductor had to deduplicate the resulting data set, which is less efficient overall.

  • The default UEFI iPXE bootloader, [pxe]uefi_ipxe_bootfile_name, used by the ipxe boot interface, has been changed from ipxe.efi to snponly.efi. Most deployments actually need snponly.efi because it supports the UEFI integrated network stack, whereas ipxe.efi does not and only contains compiled-in network drivers. Few such drivers exist for UEFI, as the UEFI standard requires networking to be handled by the UEFI firmware.

  • Since the ilo-virtual-media deploy interface now respects the [deploy]ramdisk_image_download_source configuration option, its default caching behavior has changed. HTTP boot_iso/deploy_iso images are now cached locally and served from the conductor’s HTTP server instead of being passed directly to the BMC. Glance images are also cached locally. To revert to the previous behavior, set the [deploy]ramdisk_image_download_source option to swift.

  • The minimum requirement for the oslo.db library is now version 9.1.0 to address duplicate key error changes in MySQL 8.0.19. oslo.db version 9.1.0 fixes node inspection for the idrac driver when the inspect interface is idrac-redfish. The issue was introduced in MySQL 8.0.19, where duplicate key error information was extended to include the table name in the key.

    For more information, see story 2008901.

Deprecation Notes

  • The driver_info property ilo_boot_iso has been renamed to just boot_iso.

  • The following driver_info parameters have been renamed with deprecation:

    • ilo_deploy_kernel -> deploy_kernel

    • ilo_deploy_ramdisk -> deploy_ramdisk

    • ilo_deploy_iso -> deploy_iso

    • ilo_rescue_kernel -> rescue_kernel

    • ilo_rescue_ramdisk -> rescue_ramdisk

    • ilo_rescue_iso -> rescue_iso

    • ilo_bootloader -> bootloader

  • The driver_info properties irmc_deploy_iso and irmc_rescue_iso have been renamed to just deploy_iso and rescue_iso.

  • The instance_info property irmc_boot_iso has been renamed to just boot_iso.

  • The [pxe]ip_version setting has been deprecated and is anticipated to be removed in the Y* release of OpenStack. This option effectively has had no operational impact since the Ussuri release of OpenStack where dual stack IPv4 and IPv6 support was added to Ironic.
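
The driver_info renames listed above can be applied to a stored record along these lines. This is a hedged sketch: the helper name is hypothetical, only the mappings come from the notes above, and it operates on a plain dictionary rather than on a real node.

```python
# Old name -> new name, taken from the deprecation notes above.
RENAMED = {
    "ilo_deploy_kernel": "deploy_kernel",
    "ilo_deploy_ramdisk": "deploy_ramdisk",
    "ilo_deploy_iso": "deploy_iso",
    "ilo_rescue_kernel": "rescue_kernel",
    "ilo_rescue_ramdisk": "rescue_ramdisk",
    "ilo_rescue_iso": "rescue_iso",
    "ilo_bootloader": "bootloader",
    "irmc_deploy_iso": "deploy_iso",
    "irmc_rescue_iso": "rescue_iso",
}


def migrate_driver_info(driver_info: dict) -> dict:
    """Return a copy of driver_info that uses the new parameter names."""
    migrated = {}
    for key, value in driver_info.items():
        new_key = RENAMED.get(key, key)
        # If the new name is already set, keep it and drop the old one.
        if new_key != key and new_key in driver_info:
            continue
        migrated[new_key] = value
    return migrated


# Hypothetical record used only to show the transformation.
old = {"ilo_address": "10.0.0.1", "ilo_deploy_iso": "http://example/ipa.iso"}
print(migrate_driver_info(old))
```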

Critical Issues

  • Fixes upgrade failure caused by the missing version of BIOSSetting database objects.

Security Issues

  • Fixes an issue with the /v1/nodes/detail endpoint where an authenticated user could explicitly ask for an instance_uuid lookup and the associated node would be returned to the user with sensitive fields redacted in the result payload if the user did not explicitly have owner or lessee permissions over the node. This is considered a low-impact low-risk issue as it requires the API consumer to already know the UUID value of the associated instance, and the returned information is mainly metadata in nature. More information can be found in Storyboard story 2008976.

Bug Fixes

  • Skips port creation during redfish inspect for devices reported without a MAC address.

  • Fixes potential cache coherency issues by caching the AgentClient per task, rather than globally.

  • Fixes an issue with the /v1/nodes/detail endpoint where requests for an explicit instance_uuid match would not follow the standard query handling path and thus not be filtered based on policy determined access level and node level owner or lessee fields appropriately. Additional information can be found in story 2008976.

  • Slow database retrieval of nodes has been addressed at the lower layer by explicitly passing and handling only the requested fields. As a result, work that would later be discarded is no longer performed, making the overall process more efficient. This is particularly beneficial for OpenStack Nova’s synchronization with Ironic.

  • Fixes configuring Redfish RAID using interface_type when error “failed to find matching physical disks for all logical disks” occurs.

  • The ilo-virtual-media deploy interface no longer requires the Image service backend to be Swift for Glance images in boot_iso and deploy_iso.

  • Improves record retrieval performance for baremetal nodes by enabling ironic to not make redundant calls as part of generating API result sets for the baremetal nodes endpoint.

  • Fixes handling of single-value (non-key-value) parameters in the [inspector]extra_kernel_params configuration options.

  • The ramdisk deploy interface no longer requires a fake image_source value to be provided when boot_iso is not used.

  • Removes unused local images after ejecting a virtual media device via the eject_vmedia vendor passthru call of the redfish vendor interface.

  • Redfish RAID clean and deploy steps now skip non-RAID storage controllers for RAID operations. On Redfish systems that do not implement SupportedRAIDTypes, such controllers are still processed, which could result in unexpected errors.

  • Fixes an issue of powering off with the idrac-wsman management interface while the execution of a clear job queue cleaning step is proceeding. Prior to this fix, the clean step would fail when powering off a node.

  • Fixes an issue that arose during inspection of an iDRAC node with the inspect interface set to idrac-redfish. Previously, inspection of the node failed with a “port already exists” error. The issue has occurred since MySQL 8.0.19, where duplicate key error information was extended to include the table name in the key. Earlier MySQL versions included only the key value and key name.

    For more information, see story 2008901.

  • Fixes overriding agent_verify_ca with False via driver_info.
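
The distinction between key-value and single-value kernel parameters, relevant to the [inspector]extra_kernel_params fix above, can be illustrated with a short sketch. It is not the code Ironic uses, and the parameter values are made-up examples.

```python
def parse_kernel_params(raw: str) -> dict:
    """Split a space-separated parameter string into a dict.

    Single-value parameters (no '=') map to None.
    """
    params = {}
    for token in raw.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def render_kernel_params(params: dict) -> str:
    """Turn the dict back into a kernel command line fragment."""
    return " ".join(
        key if value is None else f"{key}={value}"
        for key, value in params.items()
    )


raw = "ipa-debug=1 nofb console=ttyS0,115200"
parsed = parse_kernel_params(raw)
print(parsed)
print(render_kernel_params(parsed))
```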

Other Notes

  • The default database query pattern has been changed which will result in additional database queries when compositing lists of nodes by separately querying traits and tags. Previously this was a joined query which requires deduplication of the result set before building composite objects.

  • Deprecation warnings for the legacy RBAC policies are now suppressed, as the OpenStack community is coalescing around what appears to be a longer deprecation cycle and process than would typically be undertaken, due to the nature and impact of policy changes. The community as a whole is expecting to make RBAC policy work a community goal during the Y* release development cycle, which means the earliest that legacy policy support may be removed is now likely the Z* development cycle.
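
The difference between the joined and the split query patterns mentioned above can be seen on a toy database. The sketch uses an in-memory SQLite database with a simplified, hypothetical schema; it is not Ironic's data model or query code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE node_tags (node_id INTEGER, tag TEXT);
    CREATE TABLE node_traits (node_id INTEGER, trait TEXT);
    INSERT INTO nodes VALUES (1, 'node-0');
    INSERT INTO node_tags VALUES (1, 'rack-a'), (1, 'gpu');
    INSERT INTO node_traits VALUES
        (1, 'CUSTOM_FAST'), (1, 'CUSTOM_BIG'), (1, 'CUSTOM_NEW');
""")

# Joined pattern: a single node comes back as (tags x traits) rows,
# which the client then has to deduplicate.
joined = conn.execute("""
    SELECT n.id, t.tag, r.trait
    FROM nodes n
    LEFT JOIN node_tags t ON t.node_id = n.id
    LEFT JOIN node_traits r ON r.node_id = n.id
""").fetchall()

# Split pattern: list the nodes, then fetch tags and traits separately
# and composite the results in the client.
nodes = {
    row[0]: {"name": row[1], "tags": [], "traits": []}
    for row in conn.execute("SELECT id, name FROM nodes")
}
for node_id, tag in conn.execute(
        "SELECT node_id, tag FROM node_tags ORDER BY tag"):
    nodes[node_id]["tags"].append(tag)
for node_id, trait in conn.execute(
        "SELECT node_id, trait FROM node_traits ORDER BY trait"):
    nodes[node_id]["traits"].append(trait)

print(len(joined), "joined rows for", len(nodes), "node")
print(nodes[1])
```

The split pattern issues three queries instead of one, but each returns only as many rows as there are records, so no deduplication is needed.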

18.0.0

New Features

  • Adds new clean steps to the ilo and ilo5 hardware types: security_parameters_update, update_minimum_password_length, and update_auth_failure_logging_threshold. These allow users to modify iLO system security settings.

  • Provides the registry fields in the BIOS setting API and in the BIOS setting list when detail is requested. Also adds fields selector to query API. See story 2008571.

  • Adds a new deploy verb as an alias to active and undeploy verb as an alias to deleted. See story 2007551.

  • Adds a new deploy interface custom-agent that can be used when all necessary deploy steps to provision an image are provided in the agent ramdisk. The default write_image deploy step is not present.

  • Gets the BIOS Registry from Sushy and stores the fields in the Ironic DB with the BIOS setting. See story 2008571.

  • The irmc-virtual-media boot interface now supports setting kernel parameters via the kernel_append_params option in both the node’s driver_info and instance_info. This only applies when an image is built from a kernel and an initramfs, not when a pre-built ISO is used.

  • Adds support for setting kernel parameters for PXE and iPXE boot through the new kernel_append_params setting in the node’s driver_info or instance_info.

  • The redfish-virtual-media, ilo-virtual-media and idrac-redfish-virtual-media boot interfaces now support kernel_append_params not only in the node’s instance_info, but also driver_info. This only applies when the boot image is built from a kernel and an initramfs, not when a pre-built ISO is used.
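
With kernel_append_params available in instance_info, driver_info and the configuration, a lookup order is needed. The sketch below assumes that instance_info takes precedence over driver_info, which in turn takes precedence over the configuration default. That order and the function name are assumptions for illustration, not a statement of Ironic's implementation.

```python
def resolve_kernel_append_params(instance_info: dict,
                                 driver_info: dict,
                                 config_default: str) -> str:
    """Pick kernel parameters from the most specific source that sets them."""
    # Assumed precedence: instance_info, then driver_info, then config.
    for source in (instance_info, driver_info):
        value = source.get("kernel_append_params")
        if value is not None:
            return value
    return config_default


# Made-up values for the three possible sources.
default = "nofb nomodeset"
print(resolve_kernel_append_params({}, {}, default))
print(resolve_kernel_append_params(
    {}, {"kernel_append_params": "console=ttyS0"}, default))
print(resolve_kernel_append_params(
    {"kernel_append_params": "console=ttyS1"},
    {"kernel_append_params": "console=ttyS0"}, default))
```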

Upgrade Notes

  • The deprecated iscsi deploy interface has been removed. Please update to a different deploy interface before upgrading.

  • The ironic-dbsync upgrade command for this version of ironic will add additional database indexes on the nodes table columns below. Depending on database size and complexity, each index will take time to create. On MySQL or MariaDB, these indexes will only be created if an index does not already exist matching the field name with “_idx” appended:

    • owner

    • lessee

    • driver

    • provision_state

    • reservation

    • conductor_group

    • resource_class

    An example of the SQL commands to generate these indexes can be found in the tuning documentation.

    In testing with mock data, each column took approximately 4 seconds to be indexed on a database with 115,000 rows. The existing database size and underlying server load will cause this time to vary. Sample queries also reduced result generation from an average of 0.40 seconds to an average of 0.02 seconds with a test data set.

  • Removes support for deploy interfaces that do not use deploy steps and rely on the monolithic deploy call instead.

  • Removes support for ironic-python-agent Victoria or older.
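
The index naming rule from the upgrade notes above (the column name with “_idx” appended, created only if missing) can be tried out on a scratch database. The sketch uses an in-memory SQLite database as a stand-in for MySQL or MariaDB, with a simplified nodes table. It is not the migration that ironic-dbsync runs, and the tuning documentation remains the reference for the real SQL.

```python
import sqlite3

INDEXED_COLUMNS = [
    "owner", "lessee", "driver", "provision_state",
    "reservation", "conductor_group", "resource_class",
]

conn = sqlite3.connect(":memory:")
columns_sql = ", ".join(f"{column} TEXT" for column in INDEXED_COLUMNS)
conn.execute(f"CREATE TABLE nodes (id INTEGER PRIMARY KEY, {columns_sql})")


def ensure_indexes(connection) -> list:
    """Create '<column>_idx' for each column unless it already exists."""
    # In SQLite, the second field of each index_list row is the index name.
    existing = {
        row[1] for row in connection.execute("PRAGMA index_list('nodes')")
    }
    created = []
    for column in INDEXED_COLUMNS:
        name = f"{column}_idx"
        if name not in existing:
            connection.execute(f"CREATE INDEX {name} ON nodes ({column})")
            created.append(name)
    return created


first_run = ensure_indexes(conn)
second_run = ensure_indexes(conn)
print("created on first run:", first_run)
print("created on second run:", second_run)
```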

Deprecation Notes

  • Using [pxe]kernel_append_params for the iRMC boot interface is now deprecated, please use [irmc]kernel_append_params.

  • The configuration option [pxe]pxe_append_params has been renamed to [pxe]kernel_append_params. The old name is now deprecated.

  • The ilo-virtual-media boot interface has previously declared support for the ilo_kernel_append_params option in driver_info. It has never been implemented in reality and has been replaced by the new kernel_append_params.

  • The node’s driver_info parameters redfish_deploy_iso and redfish_rescue_iso have been renamed to deploy_iso and rescue_iso accordingly. The old names are deprecated.

Bug Fixes

  • If the agent accepts a command but is unable to reply to Ironic (which sporadically happens because of eventlet’s TLS implementation), Ironic previously retried the request and failed because the command was already executing. Ironic now detects this situation by checking the list of executing commands after receiving a connection error. If the requested command is the last one, it assumes that the command request succeeded.

  • Adds bios_interface to the node list and node show api-ref.

  • Adds bios_interface to the node validate api-ref.

  • When local boot is used (e.g., by default), the instance image validation now happens only in the deploy interface, not in the boot interface (as before). This means that the boot interface validation will now pass in many cases where it would previously fail.

  • Fixes the idrac-wsman BIOS factory_reset clean and deploy step to indicate success and update the cached BIOS settings to their defaults only when the BIOS settings have actually been reset. See story 2008058 for more details.

  • Removes temporary cleaning information on starting or restarting cleaning.

  • Improves lower-level performance issues with database activity where some often queried columns were not indexed when the database model was created, or as the model evolved. Operators seeking to pre-create these indexes may do so prior to upgrading. Please consult the tuning documentation in the Administrator’s guide for the queries to leverage.

  • No longer masks configdrive when sending the node’s record to in-band deploy steps.

  • Removes unnecessary delay before the start of the cleaning process when fast-track is used.

  • Correctly processes in-band deploy steps on fast-track deployment.

  • Correctly wipes agent token on inspection start and abort.

  • The behavior when a bootable iso ramdisk is provided behind an http server is to download and serve the image from the conductor; the image is removed only when the node is undeployed. In certain cases, for example on large deployments, this could cause undesired behaviors, like the conductor nodes running out of disk storage. To avoid this event we provide an option [deploy]ramdisk_image_download_source to be able to tell the ramdisk interface to directly use the bootable iso url from its original source instead of downloading it and serving it from the conductor node. The default behavior is unchanged.

  • Fixes providing agent tokens with pre-built ISO images and the redfish-virtual-media boot interface.

  • Fixes sub-optimal Ironic API performance where Secure RBAC related field level policy checks were executing without first checking if there were field results. This helps improve API performance when only specific columns have been requested by the API consumer.
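
The lost-reply handling for agent commands described above can be sketched as follows. The FakeAgent class, the command name and the return values are hypothetical; the sketch only mirrors the described logic of checking the list of commands after a connection error.

```python
class FakeAgent:
    """Hypothetical agent that may accept a command but drop the reply."""

    def __init__(self, records_command: bool):
        self.records_command = records_command
        self.commands = []

    def execute(self, name: str) -> None:
        if self.records_command:
            self.commands.append(name)
        raise ConnectionError("reply lost")

    def list_commands(self) -> list:
        return list(self.commands)


def send_command(agent, name: str) -> str:
    """Send a command and treat a lost reply as success if it was accepted."""
    try:
        agent.execute(name)
    except ConnectionError:
        commands = agent.list_commands()
        # The requested command being the last one means it was accepted.
        if commands and commands[-1] == name:
            return "accepted"
        raise
    return "accepted"


print(send_command(FakeAgent(records_command=True), "write_image"))
try:
    send_command(FakeAgent(records_command=False), "write_image")
except ConnectionError as exc:
    print("still failing:", exc)
```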

Other Notes

  • Configuration drives are now stored in their JSON representation and only rendered when needed. This allows deploy steps to access the original JSON representation rather than only the rendered ISO image.

  • A new class ironic.drivers.modules.agent.CustomAgentDeploy can be used as a base class for deploy interfaces based on ironic-python-agent.