Rocky Series (11.0.0 - 11.1.x) Release Notes¶
11.1.4-10¶
Bug Fixes¶
Fixes an ‘Invalid parameter value for SpanLength’ error when configuring RAID using Python 3. The wrong data type was passed to iDRAC: for example, 2.0 was passed instead of 2. See story 2004265.
Cleans up nodes stuck in the deleting state on conductor restart.
Fixes vague node last_error field reporting upon deploy step failure by providing the exception error message in addition to the step that failed.
Kills the ipmitool process invoked by ironic to read a node’s power state if the ipmitool process does not exit after the configured timeout expires. It appears pretty common for ipmitool to run for five minutes (with current ironic defaults) once it hits a non-responsive bare metal node. This could slow down the management of other nodes due to exhaustion of periodic task slots. The new behaviour is enabled by default, but can be disabled via the [ipmi]kill_on_timeout ironic configuration option.
Fixes a bug where rebooting a node managed by the idrac hardware type when using the WS-MAN power interface sometimes fails with a ‘The command failed to set RequestedState’ error. See bug 2007487 for details.
Adds command_timeout and max_command_attempts configuration options to IPA, so when connection errors occur the command will be executed again.
Fixes an issue where ironic-conductor initialization could return a NodeNotLocked error for requests requiring locks when the conductor was starting. This was due to the conductor removing locks after it had begun accepting new work. The lock removal has been moved to after the database connectivity has been established but before the RPC bus is initialized.
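The ipmitool timeout behaviour described in this section can be illustrated with a minimal Python sketch. It is not ironic’s implementation: the helper name run_with_kill_on_timeout is hypothetical, and short Python child processes stand in for ipmitool calls. The sketch relies on the standard library’s subprocess.run, which kills the child and waits for it when the timeout expires.

```python
import subprocess
import sys


def run_with_kill_on_timeout(cmd, timeout):
    """Run cmd and return its stdout, or None if it was killed on timeout.

    Hypothetical helper for illustration only; not part of ironic.
    """
    try:
        # subprocess.run() kills the child and reaps it once the timeout
        # expires, then raises TimeoutExpired.
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return result.stdout


# Stand-in for a hung ipmitool call: a child that sleeps for 30 seconds.
hung = [sys.executable, "-c", "import time; time.sleep(30)"]
# Stand-in for a responsive BMC: a child that answers immediately.
quick = [sys.executable, "-c", "print('Chassis Power is on')"]

hung_result = run_with_kill_on_timeout(hung, timeout=1)
quick_result = run_with_kill_on_timeout(quick, timeout=30)
```

With a bounded timeout, a non-responsive node costs the caller roughly the timeout rather than several minutes, which is the effect the fix is after.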
11.1.4¶
Bug Fixes¶
Fixes an issue encountered during deployment, more precisely during the configdrive partition creation step. On some specific devices like NVMe drives, the created configdrive partition could not be correctly identified (required to dump data onto it afterward). See https://storyboard.openstack.org/#!/story/2005764
Fixes an issue with using serial number as root device hints with the ansible deploy interface.
Fixes an issue regarding the ansible deploy interface. Node deployment was broken for any image that was not public because the original request context was not available anymore at the time some image information was fetched.
Fixes an issue where the resource list API returned results with the requested fields only until the API MAX_LIMIT was reached. After that, the API started ignoring user-requested fields. This fix makes sure that the next URL generated by the pagination code includes the user-requested fields as a query parameter.
Fixes an issue where the pagination marker was not being set if uuid was not in the list of requested fields when executing a list query. The affected API endpoints were: port, portgroup, volume_target, volume_connector, node and chassis. See story 2003192 for more details.
Fixes an issue where baremetal node deployment would fail on clouds with a high number of security groups. Listing the security groups took too long. Instead of listing all security groups, a query filter was added to list only the security groups to be used for the network. (See bug 2006256.)
Fixes a bug with the grub ramdisk boot template handling, such that the template now properly references the user-provided kernel and ramdisk. Previously the deployment ramdisk and kernel were referenced in the template.
Fixes an issue in updating firmware using the update_firmware_sum clean step from the management interface of the ilo hardware type, which failed with an error stating that it was unable to connect to the iLO address due to authentication failure. See story 2006223 for details.
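The pagination fix in this section amounts to carrying the requested fields over into the generated next link. A minimal sketch of that idea follows; the function build_next_url, the host name and the UUID are hypothetical and only illustrate the shape of such a link, not ironic’s actual pagination code.

```python
from urllib.parse import urlencode


def build_next_url(base_url, resource, limit, marker, fields=None):
    """Build a 'next' link that keeps the user-requested fields.

    Hypothetical helper for illustration only; not part of ironic.
    """
    params = {"limit": limit, "marker": marker}
    if fields:
        # Without this, the following page would fall back to default fields.
        params["fields"] = ",".join(fields)
    # safe="," keeps the comma-separated field list readable in the URL.
    return "%s/v1/%s?%s" % (base_url, resource, urlencode(params, safe=","))


next_url = build_next_url(
    "http://ironic.example.com:6385",            # assumed endpoint
    "nodes",
    limit=50,
    marker="1be26c0b-03f2-4d2e-ae87-c02d7f33c123",  # uuid of the last item
    fields=["name", "provision_state"],
)
```

Note that the marker is the uuid of the last returned item even though uuid is not among the requested fields, which is the case the second pagination fix covers.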
11.1.3¶
Deprecation Notes¶
Using the fake management interface with the manual-management hardware type is deprecated, please use noop instead. Existing nodes will have to be updated after the upgrade.
Bug Fixes¶
Fixes an issue regarding the ansible deployment interface cleaning workflow. Handling the error in the driver and returning nothing caused the manager to consider the step done and go to the next one instead of interrupting the cleaning workflow.
Fixes an issue with the ansible deployment interface where raw images could not be streamed correctly to the host.
Fixes deployment with the ansible deploy interface and instance images with GPT partition table.
Fixes an issue where the sensor data parsing method for the ipmitool interface lacked the ability to handle the automatically included ipmitool debugging information when the debug option is set to True in the ironic.conf file. As such, extra debugging information supplied by the underlying ipmitool command is disregarded. More information can be found in story 2005331.
Fixes an issue where deploy fails during node preparation if the node capabilities are passed as a string.
Fixes an issue where checksum validation failed with a UnicodeDecodeError while calculating the actual checksum. The fix uses the oslo_utils library for calculating the actual checksum.
The manual-management hardware type now defaults to the noop management interface. Unlike the fake management interface, it does not fail on an attempt to set the boot device to the local disk.
Fixes a bug where cinder block storage service volumes fail to attach, expecting a mountpoint to be a valid string. See story 2004864 for additional information.
Returns the correct error message on providing an invalid reference to image_source. Previously an internal error was raised.
Reverts the fix to the idrac hardware type creating port objects during inspection with pxe_enabled fields not set to reflect the configuration of the physical ports. It is inconsistent with the stable branch policy [1]. It requires python-dracclient version 1.5.0 and greater; however, driver-requirements.txt specifies version 1.3.0 and greater can be used on this branch.
[1] https://docs.openstack.org/project-team-guide/stable-branches.html
11.1.2¶
Bug Fixes¶
A bug has been fixed in the node update code that could cause the nodes to become not updatable if their driver is no longer available.
Fixes an issue where the master instance image cache could not be disabled. The configuration option [pxe]/instance_master_path may now be set to the empty string to disable the cache.
Fixes an issue where the master TFTP image cache could not be disabled. The configuration option [pxe]/tftp_master_path may now be set to the empty string to disable the cache. For more information, see story 2004608.
Fixes a bug where the ironic port is not updated during node introspection as per the PXE enabled setting for the idrac hardware type. See bug 2004340 for details.
11.1.1¶
New Features¶
Setting these configuration options to 0 will disable the periodic tasks:
[conductor]sync_power_state_interval: sync power states for the nodes
[conductor]check_provision_state_interval:
    check deployments and time out if the deployment takes too long
    check the status of cleaning a node and time out if it takes too long
    check the status of inspecting a node and time out if it takes too long
    check for and handle nodes that are taken over by new conductors (if an old conductor disappeared)
[conductor]send_sensor_data_interval: send sensor data to ceilometer
[conductor]sync_local_state_interval: refresh a conductor’s copy of the consistent hash ring. If any mappings have changed, determines which, if any, nodes need to be “taken over”. The ensuing actions could include preparing a PXE environment, updating the DHCP server, and so on.
[oneview]periodic_check_interval:
    check for nodes taken over by OneView users
    check for nodes freed by OneView users
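As a rough illustration of the convention above, the following sketch parses a configuration fragment and treats an interval of 0 as "task disabled". It uses the standard library’s configparser as a stand-in (ironic itself reads its configuration through oslo.config); the helper periodic_task_enabled and the sample values are hypothetical.

```python
import configparser

# Illustrative fragment of an ironic.conf; the values are examples only.
SAMPLE_CONF = """
[conductor]
sync_power_state_interval = 0
send_sensor_data_interval = 600
"""


def periodic_task_enabled(conf, section, option):
    """Return True if the periodic task behind the option would run.

    Hypothetical helper: an interval of 0 means the task is disabled.
    """
    return conf.getint(section, option) > 0


conf = configparser.ConfigParser()
conf.read_string(SAMPLE_CONF)

power_sync_on = periodic_task_enabled(
    conf, "conductor", "sync_power_state_interval")
sensor_data_on = periodic_task_enabled(
    conf, "conductor", "send_sensor_data_interval")
```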
Known Issues¶
Building RAID1 is known to not work with Dell BOSS cards using python-dracclient 1.4.0 or earlier. Upgrade to python-dracclient 1.5.0 to use this feature.
Upgrade Notes¶
The hash_ring_reset_interval configuration option was changed from 180 to 15 seconds. Previously, this option was essentially ignored on the API side, because the hash ring was reset on each API access. The lower value minimizes the probability of a request being routed to a wrong conductor when the ring needs rebalancing.
If you are doing a minor version upgrade, please re-run the ironic-dbsync online_data_migrations command to properly update the versions of the Objects in the database. Otherwise, the next major upgrade may fail.
Critical Issues¶
The ironic-dbsync online_data_migrations command was not updating the objects to their latest versions, which could prevent upgrades from working (i.e. when running the next release’s ironic-dbsync upgrade). Objects are updated to their latest versions now when running that command. See story 2004174 for more information.
Bug Fixes¶
Fixes an issue with a baremetal node that times out during cleaning. The ironic-conductor was attempting to change the node’s provision state to ‘clean failed’ twice, resulting in the node’s last_error being set incorrectly. This no longer happens. For more information, see story 2004299.
Fixes an issue where setting these configuration options to 0 caused a ValueError exception to be raised. You can now set them to 0 to disable the associated periodic tasks (for more information, see story 2002059):
[conductor]sync_power_state_interval: sync power states for the nodes
[conductor]check_provision_state_interval:
    check deployments and time out if the deployment takes too long
    check the status of cleaning a node and time out if it takes too long
    check the status of inspecting a node and time out if it takes too long
    check for and handle nodes that are taken over by new conductors (if an old conductor disappeared)
[conductor]send_sensor_data_interval: send sensor data to ceilometer
[conductor]sync_local_state_interval: refresh a conductor’s copy of the consistent hash ring. If any mappings have changed, determines which, if any, nodes need to be “taken over”. The ensuing actions could include preparing a PXE environment, updating the DHCP server, and so on.
[oneview]periodic_check_interval:
    check for nodes taken over by OneView users
    check for nodes freed by OneView users
Fixes an issue where Neutron ports would be left with a baremetal MAC address associated after an instance is deleted from a baremetal host. This caused problems with MAC address conflicts in follow-up deployments to the same baremetal host. See bug 2004428.
Fixes an issue where a flat Neutron port would be left with a host ID associated with it after an instance is deleted from a baremetal host. This caused problems with reusing the same port for a new instance as it is already bound to the old instance.
Fixes a bug where the number of CPU sockets was being returned by the idrac hardware type during introspection, instead of the number of virtual CPUs. See bug 2004155 for details.
Fixes a race condition in the hash ring implementation that could cause an internal server error on any request. See story 2003966 for details.
Properly reports an error when the image cache and the image HTTP or TFTP location are on different file systems, causing hard links to fail.
Fixes an issue where iSCSI based deployments fail if the cpu_arch property is not specified on a node.
Fixes the redfish hardware type to reuse HTTP session tokens when talking to the BMC using session authentication. Prior to this fix, the redfish hardware type never tried to reuse the session token given out by the BMC during a previous connection, which could sometimes lead to session pool exhaustion with some BMC implementations.
Fixes an issue wherein provisioning fails if an ironic node is configured with the ramdisk deploy interface. See bug 2003532 for more details.
The IPMI hardware type unconditionally instructed the BMC to not automatically clear the boot flag valid bit if a Chassis Control command is not received within the 60-second timeout (the countdown restarts when a Chassis Control command is received). Some BMCs do not support setting this; if sent, it causes the boot to be aborted instead. For the IPMI hardware type a new driver option node['driver_info']['ipmi_disable_boot_timeout'] can be specified. It is True by default; set it to False to bypass sending this command. See story 2004266 for additional information.
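For the ipmi_disable_boot_timeout option above, a minimal sketch of a node update body is shown below. It assumes the node is updated with a JSON PATCH document against the node resource; the helper name boot_timeout_patch is hypothetical, and the sketch only builds the document without sending it.

```python
import json


def boot_timeout_patch(send_command):
    """Build a JSON PATCH document setting ipmi_disable_boot_timeout.

    Hypothetical helper for illustration only. Passing False bypasses
    sending the boot-timeout command to the BMC.
    """
    return [{
        "op": "add",
        "path": "/driver_info/ipmi_disable_boot_timeout",
        "value": send_command,
    }]


# Body for a BMC that aborts the boot when it receives the command.
patch_body = json.dumps(boot_timeout_patch(False))
```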
11.1.0¶
Prelude¶
Ironic 11.1… Where the volume dial turned more!
While Pixie Boots has rocked out to Rock and Roll, the Bare Metal as a Service team has wrapped up our Rocky release with 11.1. This new release contains a number of major features that we hope will improve the lives of bare metal operators everywhere!
Conductor grouping enabling nodes to be assigned to groups of different conductors.
Deployment steps framework enabling greater flexibility for deployers to request specific steps.
BIOS setting interfaces for the ilo and irmc hardware types.
Ramdisk deployment interface for disk-less deployments.
Capability to reset nodes to their default interfaces via the API when resetting the node’s driver.
New Features¶
Added support for local booting a partition image for ppc64* hardware. If a PReP partition is detected when deploying to a ppc64* machine, the partition will be specified to IPA causing the bootloader to be installed there directly. This feature requires an ironic-python-agent ramdisk with ironic-lib >=2.14.
Adds new optional snmp_community_read and snmp_community_write properties to the snmp driver configuration (specified via a node’s driver_info field). If present, the value(s) will be used respectively for SNMP reads and/or writes to the PDU. When not present, the snmp_community value will be used instead.
The iRMC driver can now automatically update the node.traits field with CUSTOM_CPU_FPGA value based on information provided by the node during node inspection.
Adds a ramdisk deploy interface for deployments that wish to network boot to a ramdisk, as opposed to performing a complete traditional deployment to physical media. This may be useful in scientific use cases or where ephemeral baremetal machines are desired.
The ramdisk deploy interface is intended for advanced users and has some particular operational caveats that the users should be aware of prior to use, such as network access list requirements, configuration drive architectural restrictions, and the inability to leverage configuration drives.
Adds a new configuration option [pxe]pxe_config_subdir to allow operators to define the specific directory that may be used inside of /tftpboot or /httpboot for a boot loader to locate the configuration file for the node. This option defaults to pxelinux.cfg, which is the directory that the Syslinux pxelinux.0 bootloader utilizes. Operators may wish to change the directory name if they are using other boot loaders such as GRUB or iPXE.
Conductors and nodes may be arbitrarily grouped to provide a basic level of affinity between conductors and nodes. Conductors use the [conductor]/conductor_group configuration option to set the group which they belong to. The same value may be set on one or more nodes in the conductor_group field (available in API version 1.46), and these will be matched such that only conductors with a given group will manage nodes with the same group.
A group name may be up to 255 characters long and may contain the characters a-z, 0-9, _, -, and . (period). The group is case-insensitive. The default group is the empty string ("").
The “node list” API endpoint (GET /v1/nodes) may also be filtered by conductor group in API version 1.46.
The framework for deployment steps is in place. All in-tree drivers (DeployInterfaces) have one (big) deploy step; the conductor executes this step when deploying a node.
Starting with the Bare Metal REST API version 1.44, the current deploy step (if any) being executed is available in a node’s deploy_step field in the responses for the following queries:
GET /v1/nodes/<node identifier>
GET /v1/nodes/detail
GET /v1/nodes?fields=deploy_step,...
Implements the bios interface for the ilo hardware type. Adds the list of supported bios interfaces for the ilo hardware type. Adds manual cleaning steps apply_configuration and factory_reset which support managing the BIOS settings for the iLO servers using the ilo hardware type.
Adds support for the new noop interface to the ipmi hardware type. This interface targets hardware that does not correctly change boot mode via the IPMI protocol. Using it requires pre-configuring the boot order on a node to try PXE, then fall back to local booting.
Adds a new bios interface to the irmc hardware type. This provides an out-of-band BIOS configuration solution for the iRMC driver which makes the functionality available via manual cleaning.
Adds out-of-band RAID configuration solution for the iRMC driver which makes the functionality available via manual cleaning. See iRMC hardware type documentation for more details.
Starting with API version 1.45, PATCH requests to /v1/nodes/<NODE> accept the new query parameter reset_interfaces. It can be provided whenever the driver field is updated. If set to ‘true’, all hardware interfaces will be reset to their defaults, except for ones updated in the same request.
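A request using reset_interfaces could look roughly like the sketch below. It only constructs the request and does not send it. The endpoint, node name and helper build_driver_update are hypothetical, authentication is omitted, and the sketch assumes the API version is selected with the X-OpenStack-Ironic-API-Version header and that the driver is changed with a JSON PATCH "replace" operation.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_driver_update(endpoint, node, new_driver, reset_interfaces=True):
    """Build (but do not send) a PATCH request that changes a node's driver.

    Hypothetical helper for illustration only; authentication is omitted.
    """
    query = urlencode(
        {"reset_interfaces": "true" if reset_interfaces else "false"})
    url = "%s/v1/nodes/%s?%s" % (endpoint, node, query)
    body = json.dumps(
        [{"op": "replace", "path": "/driver", "value": new_driver}]
    ).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # reset_interfaces is accepted starting with API version 1.45.
        "X-OpenStack-Ironic-API-Version": "1.45",
    }
    return urllib.request.Request(
        url, data=body, headers=headers, method="PATCH")


request = build_driver_update(
    "http://ironic.example.com:6385", "node-1", "ipmi")
```

Interfaces named explicitly in the same PATCH document would be kept as given, per the note above; only the remaining ones are reset to their defaults.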
Upgrade Notes¶
Operators utilizing grub for PXE booting, typically with UEFI, should change their deployed master PXE configuration file provided for nodes PXE booting using grub. Ironic 11.1 now writes both MAC address and IP address based PXE configuration links for network booting via grub. The grub variable should be changed from $net_default_ip to $net_default_mac. IP address support is deprecated and will be removed in the Stein release.
The minimum required version of pysnmp has been bumped to 4.3. This pysnmp version introduces a simpler, faster and more functional high-level SNMP API, to which the ironic snmp driver has been migrated.
The minimum required version of the osprofiler library is now 1.5.0. This is not a new dependency; ironic has not been able to start with 1.4.0 since the Pike release when this dependency was introduced.
The swift/endpoint_type configuration option is now removed. python-swiftclient 3.2.0 (Ocata) and above removed support for the native URL type used by radosgw. Since using a swift/endpoint_type value of radosgw would fail anyway, it is removed. Deployers must now configure ceph with rgw swift account in url = True. This must be set before upgrading to this release.
The snmp hardware type now uses the noop management interface instead of fake used previously. Support for fake is left for backward compatibility.
Deprecation Notes¶
All drivers must implement their deployment process using deploy steps. Out-of-tree drivers without deploy steps will be supported until the Stein release. For more details, see story 1753128.
The xclarity hardware type, as well as the supporting driver interfaces have been deprecated and are scheduled to be removed from ironic in the Stein development cycle. This is due to the lack of operational Third Party testing to help ensure that the support for Lenovo XClarity is functional.
The xclarity hardware type was introduced at the end of the Queens development cycle. During implementation of Third Party CI, the Lenovo team encountered some unforeseen delays. Lenovo is continuing to work towards Third Party CI, and upon establishment and verification of functional Third Party CI, this deprecation will be rescinded.
Support for ironic to link PXE boot configuration files via the assigned interface IP address has been deprecated. This was only the case when [pxe]ipxe_enabled was set to false and the node was being deployed using UEFI.
Using the fake management interface with the snmp hardware type is now deprecated, please use noop instead.
Bug Fixes¶
Better handles the case when an operator attempts to perform an upgrade from a release older than Pike, directly to a release newer than Pike, skipping one or more releases in between (i.e. a “skip version upgrade”). Instead of crashing, the operator will be informed that upgrading from a version older than the previous release is not supported (skip version upgrades) and that (as of Pike) all database migrations need to be performed using the previous releases for a fast-forward upgrade. [Bug 2002558]
Fixes support for grub based UEFI PXE booting by enabling links to the PXE configuration files to be written using the MAC address of the node in addition to the interface IP address. If the [dhcp]dhcp_provider option is set to none, only the MAC based links will be created.
Fixes an issue that caused the integrated Dell Remote Access Controller (iDRAC) management hardware interface implementation, idrac, to fail to boot nodes in Unified Extensible Firmware Interface (UEFI) boot mode. That interface is supported by the idrac hardware type. The issue is resolved for Dell EMC PowerEdge 13th and 14th generation servers. It is not resolved for PowerEdge 12th generation and earlier servers. For more information, see story 1656841.
If a node gets stuck in one of the states deploying, cleaning, verifying, inspecting, adopting, rescuing, unrescuing for some reason (e.g. the conductor goes down when executing a task), it will be moved to an appropriate failure state the next time the conductor starts.
Changes the iPXE behavior to retry a total of 10 times with an increasing backoff time between each retry in order to not create a Denial of Service situation with the iPXE HTTP server. Should the retries fail, the node will be powered-off after a warning is displayed on the console for 30 seconds. For more information, see story.
The cleaning operation could fail if an in-band clean step executed after the completion of an out-of-band clean step that reboots the node. The failure was caused by a race condition in which cleaning was resumed before the Ironic Python Agent (IPA) was ready to execute clean steps. This has been fixed. For more information, see bug 2002731.
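The iPXE retry behaviour noted in this section (a bounded number of retries with an increasing wait between them) follows a general pattern that can be sketched in Python. This is not the iPXE script itself: the function fetch_with_backoff is hypothetical, and the linear schedule of 1, 2, 3, ... seconds is an assumption, since the note does not state the actual delays.

```python
import time


def fetch_with_backoff(fetch, attempts=10, step=1, sleep=time.sleep):
    """Call fetch() up to `attempts` times, waiting longer after each failure.

    Returns the fetched value, or None once every attempt has failed
    (the point at which the node would be warned about and powered off).
    Hypothetical helper for illustration only.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError:
            # Linear backoff (step, 2 * step, ...) is an assumption here.
            sleep(step * attempt)
    return None


waits = []  # records the requested delays instead of really sleeping


def always_down():
    raise OSError("HTTP server unreachable")


result = fetch_with_backoff(always_down, sleep=waits.append)
```

Spacing the retries out keeps many nodes that boot at once from hammering the HTTP server at a fixed rate.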
Other Notes¶
The deprecated configuration option [ipmi]retry_timeout was removed, use [ipmi]command_retry_timeout instead.
11.0.0¶
Prelude¶
I R O N I C turns the dial to 11
In preparation for the OpenStack Rocky development cycle release, the “ironic” Bare Metal as a Service team announces the release of version 11.0. While it is not quite like a volume knob, this release lays the foundation for features coming in future releases and user experience enhancements.
Some of these include the BIOS configuration framework, power fault recovery, additional error handling, refactoring, removal of classic drivers, and many bug fixes.
New Features¶
Adds the healthcheck middleware from oslo, configurable via the [healthcheck]/enabled option. This middleware adds a status check at /healthcheck. This is useful for load balancers to determine if a service is up (and add or remove it from rotation), or for monitoring tools to see the health of the server. This endpoint is unauthenticated, as not all load balancers or monitoring tools support authenticating with a health check endpoint.
Adds support to abort the inspection of a node in the inspect wait state, as long as this operation is supported by the inspect interface in use. A node in the inspect wait state accepts the abort provisioning verb to initiate the abort process. This feature is supported by the inspector inspect interface and is available starting with API version 1.41.
Adds support for reading and changing the node’s bios_interface field and enables the GET endpoints to check BIOS settings, if they have already been cached. This requires a compatible bios_interface to be set. This feature is available starting with API version 1.40.
The new ironic configuration setting [deploy]/default_boot_mode allows the operator to set the default boot mode when ironic can’t pick boot mode automatically based on node configuration, hardware capabilities, or bare-metal machine configuration.
Adds support to the redfish management interface for reading and setting bare metal node’s boot mode.
Adds new Power Distribution Unit (PDU) snmp driver type - BayTech MRP27.
Adds new auto type of the driver_info/snmp_driver setting which makes ironic automatically select a suitable SNMP driver type based on the SNMPv2-MIB::sysObjectID value as reported by the PDU being managed.
Adds SNMPv3 message authentication and encryption features to the ironic snmp hardware type. To enable these features, the following parameters should be used in the node’s driver_info:
snmp_user
snmp_auth_protocol
snmp_auth_key
snmp_priv_protocol
snmp_priv_key
Also adds support for the context_engine_id and context_name parameters of SNMPv3 messages to the ironic snmp hardware type. They can be configured in the node’s driver_info.
Adds a ?detail= boolean query to the API list endpoints to provide a more RESTful alternative to the existing /nodes/detail and similar endpoints. The default is False. Now these API requests are possible:
/nodes?detail=True
/ports?detail=True
/chassis?detail=True
/portgroups?detail=True
Adds the external storage interface, which is short for “externally managed”. This adds logic to allow the Bare Metal service to identify when a boot from volume (BFV) scenario is being requested based upon the configuration set for volume targets.
The user must create the entry, and no synchronization with a Block Storage service will occur. Documentation has been updated to reflect how to use this interface.
Adds the [deploy]enable_ata_secure_erase option which allows an operator to disable ATA Secure Erase for all nodes being managed by the conductor. This setting defaults to True which aligns with the prior behavior of the Bare Metal service.
Adds new parameter fields to driver_info, which will become mandatory in the Stein release:
xclarity_manager_ip: IP address of the XClarity Controller.
xclarity_username: Username for the XClarity Controller.
xclarity_password: Password for XClarity Controller username.
xclarity_port: Port to be used for XClarity Controller connection.
Adds support for the ipmitool power interface to the irmc hardware type.
Adds support for the fault field in the node, beginning with API version 1.42. This field records the fault, if any, detected by ironic for a node. If no fault is detected, the fault is None. The fault field value is set to one of the following values according to different circumstances:
power failure: when a node is put into maintenance due to power sync failures that exceed max retries.
clean failure: when a node is put into maintenance due to failure of a cleaning operation.
rescue abort failure: when a node is put into maintenance due to failure of cleaning up during rescue abort.
The fault field will be set to None if an operator manually sets maintenance to False. The fault field can be used as a filter for querying nodes.
Adds power failure recovery to ironic. For nodes that ironic had put into maintenance mode due to power failure, ironic periodically checks their power state, and moves them out of maintenance mode when the power state can be retrieved. The interval of this check is configured via the [conductor]power_failure_recovery_interval configuration option; the default value is 300 (seconds). Set to 0 to disable this behavior.
Adds support for RAID 1 creation on Dell Boot Optimized Storage Solution (BOSS).
Adds support for rescue interface agent for the ilo hardware type when the corresponding boot interface being used is ilo-virtual-media. The supported values of the rescue interface for the ilo hardware type are agent and no-rescue. The default value is no-rescue.
Adds support for rescue interface agent for the irmc hardware type when the corresponding boot interface is irmc-virtual-media. The supported values of rescue interface for irmc hardware type are agent and no-rescue. The default value is no-rescue.
Issuing a SIGHUP (e.g. pkill -HUP ironic) to an ironic-api or ironic-conductor service will cause the service to reload and use any changed values for mutable configuration options. The mutable configuration options are:
[DEFAULT]/debug
[DEFAULT]/log_config_append
[DEFAULT]/pin_release_version
Mutable configuration options are indicated as such in the sample configuration file by Note: This option can be changed without restarting.
A warning is logged for any changes to immutable configuration options.
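The reload-on-SIGHUP behaviour can be pictured with the following POSIX-only sketch. It is a stand-in rather than the services’ actual mechanism (they rely on oslo libraries): plain dictionaries play the role of the loaded and the edited configuration, the handler name reload_mutable is hypothetical, and host is used here merely as an example of an option assumed to be immutable.

```python
import os
import signal

# The three options listed above as mutable.
MUTABLE = {"debug", "log_config_append", "pin_release_version"}

current = {"debug": False, "host": "conductor-1"}   # values in use
on_disk = {"debug": True, "host": "conductor-2"}    # edited config file
warnings = []


def reload_mutable(signum, frame):
    """Apply changed mutable options; only warn about immutable ones."""
    for name, value in on_disk.items():
        if current.get(name) == value:
            continue
        if name in MUTABLE:
            current[name] = value
        else:
            warnings.append(
                "ignoring change to immutable option %s" % name)


signal.signal(signal.SIGHUP, reload_mutable)

# Equivalent of running `pkill -HUP` against this process.
os.kill(os.getpid(), signal.SIGHUP)
```

After the signal is handled, debug reflects the new value while host keeps the value the process started with.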
Upgrade Notes¶
Adds an inspect wait state to handle asynchronous hardware introspection. Caution should be taken because the timeout monitoring is shifted from inspecting to inspect wait; please stop all running asynchronous hardware inspection or wait until it is finished before upgrading to the Rocky release. Otherwise nodes in asynchronous inspection will be left in the inspecting state forever unless the database is manually updated.
Extends the instance_info column in the nodes table for MySQL/MariaDB from up to 64KiB to up to 4GiB (type is changed from TEXT to LONGTEXT). This upgrade will not be executed on PostgreSQL as its TEXT is unlimited.
To use CoreOS based deploy/cleaning ramdisk built using Ironic Python Agent from the Rocky release, Ironic should be upgraded to the Rocky release if PXE is used. Otherwise, a node cannot be deployed or cleaned because the IPA fails to boot due to an unsupported parameter passed via PXE. See bug 2002093 for details.
With the deploy ramdisk based on Ironic Python Agent version 3.1.0 and beyond, the drivers using the direct deploy interface perform netboot or local boot for whole disk images based on the value of the boot option setting. When you upgrade Ironic Python Agent in your deploy ramdisk, ensure that the boot option is set appropriately for the node. The boot option can be set using the configuration option [deploy]/default_boot_option or as a boot_option capability in the node’s properties['capabilities']. Also please note that this functionality requires the hexdump command in the ramdisk.
ironic-dbsync online_data_migrations will migrate any port’s and port group’s extra['vif_port_id'] value to their internal_info['tenant_vif_port_id']. For API versions >= 1.28, the ability to attach/detach the VIF via the port’s or port group’s extra['vif_port_id'] will not be supported starting with the Stein release.
Any out-of-tree network interface implementations that had a different behavior in support of attach/detach VIFs via the port or port group’s extra['vif_port_id'] must be updated appropriately.
It is no longer possible to load a classic driver. Only hardware types are supported from now on.
The /v1/drivers/?type=classic API always returns an empty list since classic drivers can no longer be loaded.
The deprecated iDRAC classic drivers pxe_drac and pxe_drac_inspector have been removed. Please use the idrac hardware type.
The deprecated iLO classic drivers pxe_ilo, iscsi_ilo and agent_ilo have been removed. Please use the ilo hardware type.
The deprecated classic drivers pxe_ipmitool and agent_ipmitool have been removed. Please use the ipmi hardware type instead.
The deprecated classic drivers pxe_irmc, agent_irmc and iscsi_irmc have been removed. Please use the irmc hardware type.
The deprecated classic drivers iscsi_pxe_oneview and agent_pxe_oneview have been removed. Please use the oneview hardware type.
The deprecated pxe_snmp classic driver has been removed. Please use the snmp hardware type instead.
The deprecated classic drivers pxe_ucs and agent_ucs have been removed. Please use the cisco-ucs-managed hardware type.
The deprecated classic drivers pxe_iscsi_cimc and pxe_agent_cimc have been removed. Please use the cisco-ucs-standalone hardware type.
All fake classic drivers, deprecated in the Queens release, have been removed. This includes:
fake
fake_agent
fake_cimc
fake_drac
fake_ilo
fake_inspector
fake_ipmitool
fake_ipmitool_socat
fake_irmc
fake_oneview
fake_pxe
fake_snmp
fake_soft_power
fake_ucs
Please use the
fake-hardware
hardware type instead (you can combine it with any other interfaces, fake or real).
Adds a new configuration option
[disk_utils]partprobe_attempts
which defaults to 10. This is the maximum number of times to try to read a partition (if creating a config drive) via apartprobe
command. Set it to 1 if you want the previous behavior, where no retries were done.
Power failure recovery introduces a new configuration option [conductor]power_failure_recovery_interval, which is enabled and set to 300 seconds by default. If the default value does not suit the needs or scale of a deployment, adjust it or turn it off during the upgrade.
Power failure recovery does not apply to nodes that were in maintenance mode due to power failure before the upgrade. These nodes have to be manually moved out of maintenance mode.
Deprecated options ansible_deploy_username and ansible_deploy_key_file in node driver_info for the ansible deploy interface were removed and will be ignored. Use the ansible_username and ansible_key_file options in the node driver_info, respectively.
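Tooling that stores or generates driver_info can apply the rename mechanically. A minimal sketch follows, assuming driver_info is handled as a plain dictionary; migrate_driver_info and RENAMED_ANSIBLE_OPTIONS are hypothetical names, not an ironic API.

```python
# Removed option -> replacement, as named in the note above.
RENAMED_ANSIBLE_OPTIONS = {
    "ansible_deploy_username": "ansible_username",
    "ansible_deploy_key_file": "ansible_key_file",
}


def migrate_driver_info(driver_info):
    """Return a copy of driver_info with the removed keys renamed.

    If the new key is already present, its value is kept and the
    value stored under the removed key is dropped.
    """
    migrated = dict(driver_info)
    for old, new in RENAMED_ANSIBLE_OPTIONS.items():
        if old in migrated:
            value = migrated.pop(old)
            migrated.setdefault(new, value)
    return migrated
```

The function leaves the input dictionary untouched, so the result can be compared with the original before any update is sent to the API.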
The behavior for retention of VIF interface attachments has changed.
If your use of the Bare Metal service is reliant upon the behavior of the VIFs being retained, which was introduced as a behavior change during the Ocata cycle, then you must update your tooling to explicitly re-add the VIF attachments prior to deployment.
Deprecated option [keystone]region_name was removed and will be ignored. Instead, use the region_name option in other sections related to contacting other services ([service_catalog], [cinder], [glance], [neutron], [swift] and [inspector]). As the option [keystone]region_name was the only option in the [keystone] section of the ironic configuration file, this section was removed as well.
Deprecation Notes¶
Adds an inspect wait state to handle asynchronous hardware introspection. The [conductor]inspect_timeout configuration option is deprecated for removal. Please use [conductor]inspect_wait_timeout instead to specify the timeout of the inspection process.
Deprecates the snmp_security field in driver_info for the ironic snmp hardware type. It will be removed in the Stein release. Please use the snmp_user field instead.
The [inspector]enabled configuration option is deprecated. It only affected classic drivers, and with their removal it no longer has any effect. Use the enabled_inspect_interfaces option to enable/disable support for ironic-inspector.
The oneview hardware type, as well as the supporting driver interfaces, have been deprecated and are scheduled to be removed from ironic in the Stein development cycle. This is due to the lack of operational Third Party testing to help ensure that the support for Oneview is functional. Oneview Third Party CI was shut down just prior to the start of the Rocky development cycle, and at the time of this deprecation the Ironic community has no indication that testing will be reestablished. Should testing be reestablished, this deprecation shall be rescinded.
Configuration options [xclarity]/manager_ip, [xclarity]/username, and [xclarity]/password are deprecated and will be removed in the Stein release.
The enabled_drivers option is now deprecated. Since classic drivers can no longer be loaded, setting this option to anything non-empty will result in the conductor failing to start.
Security Issues¶
Fixes an issue where an enabled console could be left running after a node was unprovisioned. This allowed a user to view the console even after the instance was gone. Ironic now stops the console during unprovisioning to block this.
Xclarity password specified in configuration file is now properly masked during logging.
Bug Fixes¶
Fixes bug 1749755 causing timeouts to not work properly because an unsupported SQLAlchemy filter was being used.
Adds more ipmitool error messages to be treated as retryable by the ipmitool interfaces (such as power and management hardware interfaces). Specifically, Node busy, Timeout, Out of space and BMC initialization in progress reporting emitted by ipmitool will cause ironic to retry the IPMI command. This change should improve the reliability of IPMI-based communication with the BMC.
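Conceptually, the classification amounts to checking the ipmitool output against a list of known transient messages. The sketch below is an assumption-laden illustration rather than ironic's code: is_retryable and RETRYABLE_MESSAGES are hypothetical names, the list holds only the four messages named in this note (ironic treats others as retryable too), and substring matching is a simplification chosen for the example.

```python
# Messages named in the note above; matching by substring is an
# assumption made for this sketch.
RETRYABLE_MESSAGES = (
    "Node busy",
    "Timeout",
    "Out of space",
    "BMC initialization in progress",
)


def is_retryable(ipmitool_output):
    """Return True if the output contains one of the retryable messages."""
    return any(message in ipmitool_output for message in RETRYABLE_MESSAGES)
```

A caller would typically combine such a check with a bounded number of retries so that a permanently failing BMC still surfaces an error.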
If the bare metal machine’s boot mode differs from the requested one, ironic will now attempt to set requested boot mode on the bare metal machine and fail explicitly if the driver does not support setting boot mode on the node.
The config drive passed to the node can now contain more than 64KiB in case of MySQL/MariaDB. For more details see bug 1596421.
Fixes a bug preventing a node from booting into the user instance after unrescuing if instance netboot is used. See bug 1749433 for details.
Fixes rescue timeout due to incorrect kernel parameter in the iPXE script. See bug 1749860 for details.
Fixes a bug where a node’s hardware type cannot be changed to another hardware type which doesn’t support any hardware interface currently used. See bug 2001832 for details.
Fixes a bug that exposes an internal node ID in an error message when requested to delete a trait which doesn’t exist. See bug 2002062 for details.
When a conductor managing a node dies mid-cleaning the node would get stuck in the CLEANING state. Now upon conductor startup nodes in the CLEANING state will be moved to the CLEANFAIL state.
Fixes an issue where parameters required in driver_info and descriptions in documentation are different.
Fixes an issue with validation of Infiniband ports. Infiniband ports do not require the local_link_connection field to be populated as the network topology is discoverable by the Infiniband Subnet Manager. See bug 1753222 for details.
Fixes an issue where RAID 10 creation fails with greater than 16 drives when using the idrac hardware type. See bug 2002771 for details.
Adds missing noop implementations (e.g. no-inspect) to the fake-hardware hardware type. This fixes enabling this hardware type without enabling all (even optional) fake interfaces.
Fixes an issue seen during cleaning when the node being cleaned has one or more traits assigned. This issue caused cleaning to fail, and the node to enter the clean failed state. See bug 1750027 for details.
Fixes an issue with iPXE where the incorrect iscsi volume authentication data was being used with boot from volume when multi-attach volumes were present.
Fixes the direct deploy interface to invoke boot.prepare_instance irrespective of the image type being provisioned. Previously it called boot.prepare_instance only if the image being provisioned was a partition image. See bugs 1713916 and 1750958 for details.
Fixes the HTTP response code for a validation failure when attempting to move an ironic node to the active state. Validation failure in this scenario now responds with a 400 status code, correctly indicating a user input error.
Fixes an issue where node ramdisk heartbeat operations would collide with conductor locks and erroneously record an error in the node’s last_error field.
Fixes collection of periodic tasks from hardware interfaces that are not used in any enabled classic drivers. See bug 2001884 for details.
The periodic tasks for the inspector inspect interface are no longer disabled if the [inspector]enabled option is not set to True. The help string of this option claims that it does not apply to hardware types. In any case, the periodic tasks are only run if any enabled classic driver or hardware interface requires them.
Fixes a compatibility issue where the iPXE kernel command line was no longer compatible with dracut. The ip parameter has been removed as it is incompatible with the BOOTIF and missing autoconf parameters when dracut is used. Further details can be found in storyboard.
Fixes empty last_error field on cleaning failures.
Fixes an issue where only nodes in the DEPLOYING state would have their locks cleared. Now upon node take over, any locks that are left from the old conductor are cleared by the new one.
Adds a new configuration option [disk_utils]partprobe_attempts which defaults to 10. This is the maximum number of times to try to read a partition (if creating a config drive) via a partprobe command. Previously, no retries were done, which caused failures. This addresses bug 1756760.
Fixes a rare race condition which resulted in the port list API returning HTTP 400 (bad request) if some nodes were being removed in parallel. See bug 1748893 for details.
Fixes an issue where no error was raised if there were no PXE-enabled ports available for the node, when creating a neutron port. See bug 2001811 for more details.
Fixes a potential case of VIF records being orphaned: the service now removes all records of VIF attachments upon the teardown of a deployed node. This resolves cases where it is operationally impossible to remove a VIF attachment while a node is being undeployed, because the Compute service only attempts to remove the VIF for five minutes.
See bug 1743652 for more details.
The Ironic API now returns 503 Service Unavailable for actions requiring a conductor when no conductors are online. Bug: 2002600.
Fixes an issue seen during node tear down where a port being deleted by the Bare Metal service could be deleted by the Compute service, leading to an unhandled error from the Networking service. See story 2002637 for further details.
Fixes an issue where the ilo hardware type would not properly update the boot mode on the bare metal machine for cleaning as per the given boot_mode in the node’s properties/capabilities. See bug 1559835 for more details.
During node cleaning, the conductor was using a cached copy of the node’s driver_internal_info field. It is possible that the copy is outdated, which would cause issues with the state of the node. This has been fixed. For more information, see bug 2002688.
Fixes an issue where a node’s instance_info.traits field could be incorrectly formatted, or contain traits that are not traits of the node. When validating drivers and prior to deployment, the Bare Metal service now validates that a node’s traits include all the traits in its instance_info.traits field. See bug 1755146 for details.
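The check described here is essentially a format test followed by a subset test. A minimal sketch follows, assuming traits are plain strings and instance_info is a dictionary; validate_instance_traits is a hypothetical name and the error messages are invented for illustration, not taken from ironic.

```python
def validate_instance_traits(node_traits, instance_info):
    """Raise ValueError unless instance_info['traits'] is valid.

    Valid means: absent, or a list of strings that are all traits
    of the node.
    """
    traits = instance_info.get("traits")
    if traits is None:
        return
    # Format check: the field must be a list of strings.
    if not isinstance(traits, list) or not all(
            isinstance(t, str) for t in traits):
        raise ValueError("instance_info.traits must be a list of strings")
    # Subset check: every requested trait must be set on the node.
    missing = sorted(set(traits) - set(node_traits))
    if missing:
        raise ValueError(
            "traits not set on the node: %s" % ", ".join(missing))
```

For example, a node with traits CUSTOM_GPU and HW_CPU_X86_VMX passes when instance_info requests only CUSTOM_GPU, and fails when it requests a trait the node does not have.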
Reverts the fix for orphaned VIF records from the previous release, as it causes a regression. See bug 1750785 for details.
Other Notes¶
Adds an inspect wait state to handle asynchronous hardware introspection. Returning INSPECTING from the inspect_hardware method of the inspect interface is deprecated. INSPECTWAIT should be returned instead.
Adds get_boot_mode, set_boot_mode and get_supported_boot_modes methods to the driver management interface. Drivers can override these methods, implementing boot mode management calls to the BMC of the bare metal nodes being managed.
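To show the shape of such an override, here is a self-contained sketch. Only the three method names come from this note. The ManagementInterface class below is a stand-in rather than ironic's base class, and the task and mode arguments, the "bios" and "uefi" values, and the in-memory storage are all assumptions made for the example.

```python
class ManagementInterface:
    """Stand-in for the driver management interface (not ironic's class)."""

    def get_supported_boot_modes(self, task):
        raise NotImplementedError

    def set_boot_mode(self, task, mode):
        raise NotImplementedError

    def get_boot_mode(self, task):
        raise NotImplementedError


class ExampleManagement(ManagementInterface):
    """Hypothetical driver that keeps the boot mode in memory.

    A real driver would talk to the node's BMC instead.
    """

    def __init__(self):
        self._mode = "bios"

    def get_supported_boot_modes(self, task):
        return ["bios", "uefi"]

    def set_boot_mode(self, task, mode):
        # Reject modes the (pretend) hardware cannot handle.
        if mode not in self.get_supported_boot_modes(task):
            raise ValueError("unsupported boot mode: %s" % mode)
        self._mode = mode

    def get_boot_mode(self, task):
        return self._mode
```

Keeping validation inside set_boot_mode means callers get an explicit failure for unsupported modes instead of a silently ignored request.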
Adds a new method validate_rescue() to the boot interface to validate the node’s properties related to the rescue operation. This method is called by the validate() method of the rescue interface.
For out-of-tree drivers that have vendor passthru methods: the async parameter of the passthru and driver_passthru decorators is deprecated and will be removed in the Stein cycle. Please use its replacement, the async_call parameter, instead. For more information, see bug 1751306.
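A practical reason to move away from the old name is that async is a reserved keyword from Python 3.7 onward, so it can no longer be written as a keyword argument. The sketch below shows one way a decorator can accept the new name while still tolerating the old one. The passthru function here is a stand-in written for this example, not ironic's decorator, and its default value, metadata attribute and warning text are assumptions.

```python
import warnings


def passthru(http_methods, async_call=True, **kwargs):
    """Stand-in for a passthru-style decorator (not ironic's code).

    Accepts the new async_call parameter and, with a warning, the
    deprecated async one. Since async is a reserved word in
    Python 3.7+, the old name can only arrive through **kwargs.
    """
    if "async" in kwargs:
        warnings.warn("'async' is deprecated, use 'async_call'",
                      DeprecationWarning, stacklevel=2)
        async_call = kwargs.pop("async")
    if kwargs:
        raise TypeError("unexpected arguments: %s" % sorted(kwargs))

    def decorator(func):
        # Record the settings on the function for later inspection.
        func.passthru_metadata = {
            "http_methods": list(http_methods),
            "async_call": async_call,
        }
        return func

    return decorator


@passthru(["POST"], async_call=False)
def my_vendor_method(task, **params):
    """Hypothetical vendor passthru method."""
    return params
```

Out-of-tree drivers only need to change the keyword at the decoration site. The decorated method itself stays the same.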
The conductor no longer tries to collect or report sensors data for nodes in maintenance mode. See bug 1652741.
On taking over nodes in the CLEANING state, the new conductor moves them to the CLEAN FAIL state and sets maintenance.
Removes the software metric named validate_boot_option_for_trusted_boot. This was the timing for a short-lived, internal function that is already included in the PXEBoot.validate metric.