Stein Series (12.0.0 - 12.1.x) Release Notes

12.1.6-3

Bug Fixes

  • Introduces lazy-loading of ports, portgroups, volume connections and volume targets in task manager to fix performance issues. For periodic tasks which create a task manager object but don’t require the aforementioned data (e.g. power sync), this change should reduce the number of database interactions by around two thirds, speeding up overall execution.
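
The lazy-loading idea can be illustrated with a minimal sketch. The LazyTask and CountingDB classes below are hypothetical stand-ins, not ironic's task manager or database API; they only show how deferring a query until first access avoids database work for callers that never read the data.

```python
class CountingDB:
    """Hypothetical database layer that counts the queries it serves."""

    def __init__(self):
        self.queries = 0

    def get_ports(self, node_id):
        self.queries += 1
        return [f"port-of-{node_id}"]


class LazyTask:
    """Toy stand-in for a task object; not ironic's actual class."""

    def __init__(self, node_id, db):
        self.node_id = node_id
        self._db = db
        self._ports = None  # nothing is fetched at construction time

    @property
    def ports(self):
        # Query the database on first access only, then reuse the result.
        if self._ports is None:
            self._ports = self._db.get_ports(self.node_id)
        return self._ports


db = CountingDB()
task = LazyTask("node-1", db)
queries_before_access = db.queries  # a power-sync style caller stops here
first = task.ports
second = task.ports
queries_after_access = db.queries
```

A caller that never touches task.ports issues no port query at all, and repeated access costs a single query.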

12.1.6

Bug Fixes

  • Cleans up nodes stuck in the deleting state on conductor restart.

  • Fixes an issue where ironic-conductor could return a NodeNotLocked error for requests requiring locks while the conductor was starting. This was due to the conductor removing locks after it had begun accepting new work. The lock removal now happens after database connectivity has been established but before the RPC bus is initialized.

12.1.5

New Features

  • Adds a new [ipmi]debug option that allows users to explicitly turn IPMI command debugging on, as opposed to relying upon the system debug setting [DEFAULT]debug. Users wishing to continue to log this output should set [ipmi]debug to True in their ironic.conf.

Upgrade Notes

  • Control of ipmitool debug logging has moved from the “conductor” [DEFAULT]debug setting to the [ipmi]debug configuration setting, because the existing ipmitool output can be extremely misleading for users. Operators who wish to keep verbose ipmitool output in their logs should explicitly set the [ipmi]debug option to True.
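
As a rough illustration of the resulting configuration, the sketch below embeds an ironic.conf fragment in which service debugging is off and IPMI command debugging is on. The option names come from the notes above; the use of Python's configparser is only an assumption made to show that the fragment parses as intended, since ironic itself reads its configuration through its own option handling.

```python
import configparser

# Illustrative ironic.conf fragment: only the two debug options are shown.
IRONIC_CONF_SNIPPET = """
[DEFAULT]
debug = False

[ipmi]
debug = True
"""

parser = configparser.ConfigParser()
parser.read_string(IRONIC_CONF_SNIPPET)

# The [ipmi] value is independent of the service-wide [DEFAULT] value.
service_debug = parser.getboolean("DEFAULT", "debug")
ipmi_debug = parser.getboolean("ipmi", "debug")
```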

Bug Fixes

  • Fixes an issue with the agent client code where checks of the agent command status had no logic to prevent an intermittent or transient connection failure from causing the entire operation to fail.

  • Fixes ‘Invalid parameter value for SpanLength’ when configuring RAID using Python 3. An incorrect data type was passed to iDRAC, e.g. 2.0 instead of 2. See story 2004265.

  • Fixes vague node last_error field reporting upon deploy step failure by providing the exception error message in addition to the step that failed.

  • Fixes a bug where rebooting a node managed by the idrac hardware type with the WS-MAN power interface sometimes failed with a “The command failed to set RequestedState” error. See bug 2007487 for details.
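
The SpanLength fix above concerns a value arriving as 2.0 rather than 2. One common way such a value appears when code moves to Python 3 is true division, which always yields a float. Whether that was the exact cause in the driver is an assumption; the variable names below are illustrative only.

```python
# Illustrative values; not taken from the driver code.
physical_disks = 4
span_depth = 2

# In Python 3 the "/" operator always produces a float.
span_length_true_div = physical_disks / span_depth

# Floor division (or an explicit int() conversion) keeps an integer.
span_length_floor_div = physical_disks // span_depth

# Serialized forms differ, which matters to a consumer expecting an integer.
as_text_float = str(span_length_true_div)
as_text_int = str(span_length_floor_div)
```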

12.1.4

Bug Fixes

  • Passes the proper flags during cleanup of iPXE boot environments, so that nothing is left behind after node tear down.

  • Adds a timeout when querying the agent for command status. Without it, a node could remain locked for a long time, during which ironic would not allow any operations on it.

  • When installing a whole disk image using iscsi, set up the bootloader even if a root partition cannot be found. The bootloaders will be located on the disk.

12.1.3

Security Issues

  • Node secrets (such as BMC credentials) are no longer logged when JSON RPC is used and DEBUG logging is enabled.

Bug Fixes

  • Fixes an issue regarding the ansible deploy interface. The discovery playbook used to gather WWNs and serial numbers was broken on Python 3 because dict().keys() does not return a list in Python 3.

  • Fixes an issue with using serial number as root device hints with the ansible deploy interface.

  • Fixes an issue regarding the ansible deploy interface. Node deployment was broken for any image that was not public because the original request context was not available anymore at the time some image information was fetched.

  • Fixes spurious deployment warnings being logged by the ironic-conductor service indicating that the heartbeats from the deployment ramdisk could not be processed in DEPLOYWAIT state.

  • Fixes an issue where the resource list API honored user-requested fields only up to the API MAX_LIMIT. Once MAX_LIMIT was reached, the API ignored the requested fields. The next URL generated by the pagination code now includes the user-requested fields as a query parameter.

  • Fixes an issue where baremetal node deployment would fail on clouds with a high number of security groups. Listing the security groups took too long. Instead of listing all security groups, a query filter was added to list only the security groups to be used for the network. (See bug 2006256.)

  • Fixes an issue where a node stayed locked for longer than [console]subprocess_timeout seconds when the shellinabox process failed to start before the specified timeout elapsed.

  • Fixes a possible console lockup when the PID file has not yet been created although the daemon start call has already returned a success code.

  • Fixes an issue wherein asynchronous out-of-band deploy steps in a deployment template fail to execute. See story 2006342 for details.

  • Fixes a bug in the grub ramdisk boot template handling, so that the template now properly references the user-provided kernel and ramdisk. Previously the deployment kernel and ramdisk were referenced in the template.

  • Fixes an issue where clean steps of redfish BIOS interface do not boot up the IPA ramdisk after cleaning reboot. See story 2006217 for details.

  • Fixes an issue where updating firmware using the update_firmware_sum clean step of the management interface of the ilo hardware type failed with an error reporting a failure to connect to the iLO address due to authentication failure. See story 2006223 for details.

  • Fixes an issue in powering on a server with the ilo hardware type. The server failed to report success for the power-on operation if no bootable device was found. See story 2006288 for details.

  • Fixes an issue in RAID creation with the ilo5 RAID interface wherein creating RAID a second time fails. See story 2006321 for details.
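
The pagination fix listed above (requested fields being dropped from the next URL) can be sketched as follows. The next_link helper, the host name and the marker value are hypothetical and do not reflect ironic's internal pagination code; the sketch only shows the requested fields being carried into the generated link as a query parameter.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def next_link(base_url, resource, limit, marker, fields=None):
    """Build a 'next' URL that keeps the caller's requested fields."""
    params = {"limit": limit, "marker": marker}
    if fields:
        # Carry the requested fields over so the next page honors them too.
        params["fields"] = ",".join(fields)
    return f"{base_url}/v1/{resource}?{urlencode(params)}"


link = next_link("http://ironic.example", "nodes", 2, "uuid-2",
                 fields=["uuid", "name"])
query = parse_qs(urlsplit(link).query)

link_without_fields = next_link("http://ironic.example", "nodes", 2, "uuid-2")
query_without_fields = parse_qs(urlsplit(link_without_fields).query)
```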

12.1.2

Upgrade Notes

  • The iRMC hardware type now handles the iPXE boot interface incompatibility. To boot with the ipxe boot interface, (1) add ipxe to enabled_boot_interfaces in ironic.conf, (2) set up TFTP and HTTP servers following the Ironic documentation on iPXE boot configuration <https://docs.openstack.org/ironic/latest/install/configure-pxe.html>, then (3) create or set the baremetal node with --boot-interface ipxe.

Bug Fixes

  • Fixes an issue encountered during deployment, more precisely during the configdrive partition creation step. On some devices, such as NVMe drives, the created configdrive partition could not be correctly identified, which is required to write data onto it afterward. See https://storyboard.openstack.org/#!/story/2005764

  • Fixes traceback on cleaning of nodes with the redfish hardware type if their BMC does not support BIOS settings.

  • The iRMC hardware type now handles the iPXE boot interface incompatibility. As of Stein, the [pxe]ipxe_enabled option is deprecated in favor of the ipxe boot interface and will be removed in the Train cycle. Until then, the iRMC hardware type continues to support iPXE boot through the [pxe]ipxe_enabled option. To bridge this incompatibility, the iRMC hardware type also supports the ipxe boot interface.

  • Fixes the duplication of the “ipxe” tag when using IPv6, which leads to the dhcp server possibly returning an incorrect response to the DHCPv6 client.

12.1.1

Bug Fixes

  • Fixes an issue regarding the ansible deployment interface cleaning workflow. Handling the error in the driver and returning nothing caused the manager to consider the step done and go to the next one instead of interrupting the cleaning workflow.

  • Fixes an issue with the ansible deployment interface where raw images could not be streamed correctly to the host.

  • Fixes deployment with the ansible deploy interface and instance images with GPT partition table.

  • Fixes an issue where the sensor data parsing method for the ipmitool interface lacked the ability to handle the automatically included ipmitool debugging information when the debug option is set to True in the ironic.conf file. As such, extra debugging information supplied by the underlying ipmitool command is disregarded. More information can be found in story 2005331.

  • Fixes an issue where deploy fails during node preparation if the node capabilities are passed as string.

  • The internal JSON RPC server now binds to :: by default, allowing it to work correctly with IPv6.

  • No longer tries to create a temporary URL with zero lifetime if the deploy_callback_timeout option is set to zero. The default of 1800 seconds is used in that case. Use the new configdrive_swift_temp_url_duration option to override.
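
The fallback described in the last item can be sketched as a simple chain of preferences. The function below is a hypothetical illustration built from the option names in the note; it assumes that a non-zero deploy_callback_timeout is still used when no override is set, which the note implies but does not state outright.

```python
DEFAULT_TEMP_URL_DURATION = 1800  # seconds, the default quoted in the note


def temp_url_duration(configdrive_swift_temp_url_duration,
                      deploy_callback_timeout):
    """Pick a non-zero lifetime for the temporary URL (illustrative only)."""
    # An explicit override wins; otherwise fall back to the callback timeout,
    # and finally to the default so the lifetime is never zero.
    return (configdrive_swift_temp_url_duration
            or deploy_callback_timeout
            or DEFAULT_TEMP_URL_DURATION)


zero_timeout = temp_url_duration(None, 0)
custom_timeout = temp_url_duration(None, 900)
explicit_override = temp_url_duration(600, 0)
```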

12.1.0

Prelude

The Bare Metal as a Service team joyfully announces our OpenStack Stein release of ironic 12.1.0. While no steins nor speakers were harmed during the development of this release, we might have suffered some hearing damage after we learned that we could increase the volume well past eleven!

Notable items include:

  • Increased parallelism of power synchronization to improve overall conductor efficiency.

  • API fields to support node description and owner values.

  • HPE iLO ilo5 and Huawei ibmc hardware types.

  • Allocations API interface to enable operators to find and select bare metal nodes for deployment.

  • JSON-RPC can now be used for ironic-api to ironic-conductor communication as opposed to using an AMQP messaging provider.

  • Support for customizable PXE templates and streamlined deployment sequences.

  • Initial support for the definition of “deployment templates” to enable operators to define and match customized deployment sequences.

  • Initial work for supporting SmartNIC configuration is included, however the Networking Service changes required are not anticipated until sometime during the Train development cycle.

  • And numerous bug fixes, including ones for IPv6 and IPMI.

This release includes the changes in ironic’s 12.0.0 release, which was also released during the Stein development cycle and includes a number of improvements for Bare Metal infrastructure operators. More about our earlier Stein release can be found in our release notes.

New Features

  • Adds option [ansible]default_python_interpreter to choose the python interpreter that ansible uses on managed machines. By default, ansible uses /usr/bin/python as the interpreter, assuming that this path is always present on remote managed systems. This might not always be the case, for example in custom-built images or Python 3 native distributions. With this option the operator can set the absolute path of the python interpreter on the remote machines, for example /usr/bin/python3. The same interpreter will be used in all operations that use the ansible deploy interface. It is also possible to override the value set in the configuration for a node by passing ansible_python_interpreter in its driver_info.

  • Adds currently used boot mode into node properties/capabilities upon redfish inspect interface run. The idea behind this change is to align with the in-band inspector behavior.

  • Adds a description field to the node object to enable operators to store any information related to the node. The field is up to 4096 UTF-8 characters.

  • Adds capability to control the persistency of boot order changes during instance deployment via (i)PXE on a per-node level. The option ‘force_persistent_boot_device’ in the node’s driver info for the (i)PXE drivers is extended to allow the values ‘Default’ (make all changes but the last one upon deployment non-persistent), ‘Always’ (make all changes persistent), and ‘Never’ (make all boot order changes non-persistent).

  • Adds API version 1.50 which allows for the storage of an owner field on node objects. This is intended for either storage of human parsable information or the storage of a tenant UUID which could be leveraged in a future version of the Bare Metal as a Service API.

  • Parallelizes periodic power sync calls by running up to [conductor]/sync_power_state_workers workers simultaneously. The default is to run up to 8 workers. This change should let larger-scale setups run power syncs more frequently and make the whole power sync procedure more resilient to slow or dead BMCs.

  • Adds an is_smartnic field to the port object in REST API version 1.53.

    The is_smartnic field indicates whether this port is a Smart NIC port; it is False by default. Operators may set this field to use baremetal nodes with Smart NICs as ironic nodes.

    The REST API endpoints related to ports provide support for the is_smartnic field. The ironic admin documentation provides information on how to configure and use Smart NIC ports.

  • Adds a new field pxe_template that can be set at the driver-info level. It specifies a path to a custom PXE boot template. If present, this template is read and takes priority over the per-arch and general PXE templates.

  • Adds support to enable deployment workflow changes necessary to support the use of Smart NICs in the ansible, direct, iscsi and ramdisk deployment interfaces. Networking service integration for this functionality is not anticipated until the Train release of the Networking service.

  • Introduces allocation API. This API allows finding and reserving a node by its resource class, traits and optional list of candidate nodes. Introduces new API endpoints:

    • GET/POST /v1/allocations

    • GET/DELETE /v1/allocations/<ID or name>

    • GET/DELETE /v1/nodes/<ID or name>/allocation

  • Adds support for building config drives. Starting with API version 1.56, the configdrive parameter of /v1/nodes/<node>/states/provision can be a JSON object with optional keys meta_data (JSON object), network_data (JSON object) and user_data (JSON object, array or string). See story 2005083 for more details.

  • Allows the user to supply an EFI system partition image to ironic, for building UEFI-bootable ISO images, in the form of a local file, a UUID or a URI reference. The new [conductor]esp_image option can be used to configure ironic to use a local file.

  • Adds the deploy templates API. Deploy templates can be used to customise the node deployment process, each specifying a list of deploy steps to execute with configurable priority and arguments.

    Introduces the following new API endpoints, available from Bare Metal API version 1.55:

    • GET /v1/deploy_templates

    • GET /v1/deploy_templates/<deploy template identifier>

    • POST /v1/deploy_templates

    • PATCH /v1/deploy_templates/<deploy template identifier>

    • DELETE /v1/deploy_templates/<deploy template identifier>

  • Adds a new feature called fast-track which allows an operator to optionally configure the Bare Metal API Service and the Bare Metal conductor service to permit lookup and heartbeat for nodes that are in the process of being enrolled and created.

    These nodes can be left online, from a process such as discovery. If ironic-python-agent has communicated with the Bare Metal Service API endpoint within the last 300 seconds, then the setup steps normally involved in preparing to launch a ramdisk on the node are skipped, along with power operations, so that a baremetal node can go from discovery through to deployment with a single power cycle. Fast track functionality may be enabled through the [deploy]fast_track option.

  • Adds a new hardware type ibmc for HUAWEI 2288H V5, CH121 V5 series servers. This hardware type supports PXE based boot using HUAWEI iBMC RESTful APIs. The following driver interfaces are supported:

    • management: ibmc

    • power: ibmc

    • vendor: ibmc

  • Adds a new hardware type ilo5. In addition to all the hardware interfaces that the ilo hardware type supports, it has one new RAID interface, ilo5.

  • Adds functionality to perform out-of-band RAID operations for iLO5 based HPE ProLiant servers.

  • Adds a new property ipmi_hex_kg_key for the ipmi based interfaces. The property enables users to set the Kg key for IPMIv2 authentication in hexadecimal format. This value is provided to ipmitool as the -y argument.

  • Adds the ability to use JSON RPC for communication between API and conductor services. To use it, set the new rpc_transport configuration option to json-rpc and configure the credentials and the host_ip in the json_rpc section. Hostnames of all conductors must be resolvable for this implementation to work.

  • Adds a [DEFAULT]/versioned_notifications_topics configuration option. This enables operators to configure the topics used for versioned notifications.

  • Notification events for metrics data now contain a node_name field to assist operators with relating metrics data being transmitted by the conductor service.

  • Sets boot_mode in node properties during OOB Introspection for the idrac hardware type.
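
For the config drive feature listed above (API version 1.56), a request body might be assembled as in the sketch below. The key names meta_data, network_data and user_data come from the note; the build_provision_request helper, the example values, and the "target": "active" key are assumptions made for illustration, so the Bare Metal API reference should be consulted for the authoritative request format.

```python
import json


def build_provision_request(meta_data=None, network_data=None, user_data=None):
    """Assemble an illustrative body for /v1/nodes/<node>/states/provision."""
    configdrive = {}
    # All three keys are optional, so only include the ones that were given.
    if meta_data is not None:
        configdrive["meta_data"] = meta_data
    if network_data is not None:
        configdrive["network_data"] = network_data
    if user_data is not None:
        configdrive["user_data"] = user_data
    # "target" is assumed here; it is not described in the note above.
    return {"target": "active", "configdrive": configdrive}


body = build_provision_request(
    meta_data={"hostname": "node-1"},   # example value only
    user_data="#cloud-config\n",        # user_data may also be an object or array
)
payload = json.dumps(body, sort_keys=True)
decoded = json.loads(payload)
```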

Known Issues

  • As good security practice[0], in Ubuntu Bionic the nf_conntrack_helper is disabled. This causes an issue when using the pxe boot interface with the PXE environment that breaks some of the Ironic CI tests, since Ironic needs conntrack for TFTP traffic. It’s still possible to use Ironic with PXE on Ubuntu Xenial, and it’s also possible to use Ironic with PXE on Ubuntu Bionic using a workaround based on custom firewall rules as shown in [0].

    [0] https://home.regit.org/netfilter-en/secure-use-of-helpers/

Upgrade Notes

  • Adds an is_smartnic field to the port object in REST API version 1.53.

    Upgrading to this release will set is_smartnic to False for all ports.

  • Adds a check to the ironic-status upgrade check command, to check for compatibility of the object versions with the release of ironic.

  • The create_raid_configuration, delete_raid_configuration and read_raid_configuration interfaces of the ‘proliantutils’ library have been enhanced to support out-of-band RAID operations for the ilo5 hardware type. To leverage this feature, the ‘proliantutils’ library needs to be upgraded to version ‘2.7.0’.

  • Removes deprecated driver_info["drac_host"] property for idrac hardware type that was marked for removal in Pike. Please use driver_info["drac_address"] instead.

Deprecation Notes

  • The values ‘True’/’False’ for the option ‘force_persistent_boot_device’ in the node’s driver info for the (i)PXE drivers are deprecated and support for them may be removed in a future release. The former default value ‘False’ is replaced by the new value ‘Default’, the value ‘True’ is replaced by ‘Always’.

  • The Cisco cisco-ucs-managed and cisco-ucs-standalone drivers have been deprecated due to a lack of reporting third-party CI and vendor maintenance of the driver code. In the present state of these drivers, they would have been removed as part of the eventual removal of support for Python2. These drivers should be anticipated to be removed prior to the final Train release of the Bare Metal service. More information can be found here.

  • The “hash_distribution_replicas” configuration option is now deprecated. If specified in the config file, a warning is logged.
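
The replacement values for ‘force_persistent_boot_device’ described in the first deprecation note can be expressed as a small mapping. The helper below is a hypothetical migration aid, not part of ironic; it only encodes the two substitutions the note states and leaves every other value untouched.

```python
# Deprecated value (lower-cased) -> replacement named in the note.
_LEGACY_TO_NEW = {"true": "Always", "false": "Default"}


def migrate_force_persistent_boot_device(value):
    """Translate a deprecated True/False setting to its new equivalent."""
    key = str(value).strip().lower()
    # Values such as 'Never' are already in the new vocabulary.
    return _LEGACY_TO_NEW.get(key, value)


from_true = migrate_force_persistent_boot_device("True")
from_false = migrate_force_persistent_boot_device(False)
unchanged = migrate_force_persistent_boot_device("Never")
```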

Bug Fixes

  • A bug has been fixed in the node update code that could cause the nodes to become not updatable if their driver is no longer available.

  • Fixes an issue where setting the conductor_group for a node was not entirely case-insensitive, in that this could fail if the case did not match between the conductor configuration and the API request.

  • Makes ironic build a UEFI-only bootable ISO image (when asked to build a UEFI-bootable image) rather than a hybrid BIOS/UEFI-bootable ISO.

  • Fixes an issue where listing nodes with conductor information fails if any of the nodes has an invalid hardware type, which may happen when a conductor is out of service.

  • Fixes an issue in the idrac RAID interface seen when creating RAID configurations using python-dracclient version 2.0.0 or higher.

  • Fixes an issue where the master TFTP image cache could not be disabled. The configuration option [pxe]/tftp_master_path may now be set to the empty string to disable the cache. For more information, see story 2004608.

  • Fixes an issue where xclarity management interface fails to get boot order. Now the driver correctly gets boot device and this has been verified in the 3rd party CI. See story 2004576 for details.

  • Advances required python-dracclient version to 1.5.0 and later. That version is required by the fix to the idrac hardware type’s bug 2004340.

  • Makes all ilo driver BIOS interface clean steps asynchronous. This is required to ensure the settings on the baremetal node are consistent with the settings stored in the database irrespective of the node clean step status. Refer to bug 2004066 for details.

  • Fixes the IPMI console implementation to respect all supported IPMI driver_info and configuration options, particularly ipmi_port.

  • Fixes an issue where hosts executing iPXE to boot would fail with an error indicating that no configuration was found for networks where IPv6 is in use. This has been remedied through a minor addition to the Networking service in the Stein development cycle. For more information please see story 2004502.

  • Notification event types now include the hardware type name string as opposed to a static string of “ipmi”. This allows event processors and operators to understand what the actual notification event data source is as opposed to having to rely upon fingerprints of the data to make such determinations.

  • Fixes a bug where cinder block storage service volumes fail to attach because a mountpoint is expected to be a valid string. See story 2004864 for additional information.

  • Fixes an issue where the socat process would exit on client disconnect, which would (a) leave a zombie socat process in the process table and (b) disable any subsequent serial console connections. This issue was addressed by updating ironic to call socat with the fork,max-children=1 options, which makes socat persist and accept multiple connections (but only one at a time). Please see story 2005024 for additional information.

  • Fixes an issue with the ipmi hardware type where node['driver_info']['ipmi_force_boot_device'] could be interpreted as True when set to values such as “False”.

  • Returns the correct error message on providing an invalid reference to image_source. Previously an internal error was raised.

  • The instance_info[root_gb] property is no longer required for whole-disk images. It has always been ignored for them, but the validation code still expected it to be present.
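
The ipmi_force_boot_device fix above addresses a classic pitfall: in Python any non-empty string, including “False”, is truthy. The parse_bool helper below is a hypothetical sketch of strict parsing and is not the function ironic uses.

```python
def parse_bool(value, default=False):
    """Interpret common textual booleans strictly (illustrative helper)."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "1", "yes", "on"):
        return True
    if text in ("false", "0", "no", "off"):
        return False
    return default


driver_info = {"ipmi_force_boot_device": "False"}  # example node data

# Naive truthiness: a non-empty string is always True.
naive = bool(driver_info["ipmi_force_boot_device"])

# Strict parsing yields the value the operator intended.
strict = parse_bool(driver_info["ipmi_force_boot_device"])
```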

Other Notes

  • The Bare Metal service now builds a UEFI-only bootable ISO image (when asked to build a UEFI-bootable image) rather than a hybrid BIOS/UEFI-bootable ISO.