Current Series Release Notes

15.0.0-46

The Applier service now supports running in native threading mode instead of using the Eventlet library. Native threading is still experimental and is disabled by default; it should not be used in production. To switch from Eventlet to native threading mode, add the environment variable OS_WATCHER_DISABLE_EVENTLET_PATCHING=true to the applier service configuration. When running in native threading mode, the default workflow engine (Taskflow) is configured with a serial engine, which executes the actions sequentially, due to a limitation of the current implementation of the Watcher services. For more information, see the eventlet removal documentation.
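As an illustration of the switch described above, here is a minimal sketch of reading the variable from the environment. The helper name eventlet_patching_disabled is hypothetical, and the exact set of values Watcher accepts is an assumption: this sketch only recognizes "true" (case-insensitive), the value given in the note.

```python
import os
from typing import Mapping


def eventlet_patching_disabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when native threading mode is requested.

    Hypothetical helper, not Watcher's actual code. Only the value "true"
    (in any letter case) is treated as enabling native threading here.
    """
    value = environ.get("OS_WATCHER_DISABLE_EVENTLET_PATCHING", "")
    return value.strip().lower() == "true"
```

With the variable unset, the helper returns False, which matches the documented default of Eventlet mode.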

A leader election mechanism has been added to the ServiceMonitoringService so that multiple copies of the decision-engine can be executed simultaneously.

New parameters migration_max_retries and migration_interval have been added to the nova section to define the maximum number of retries and the polling interval for VM migrations. The defaults are 180 retries and 5 seconds, respectively.

The timeout of VM resize operations can be configured with the same migration_max_retries and migration_interval options in the nova section that are used for migrations. The defaults are 180 retries and 5 seconds, respectively.
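The overall timeout implied by these two options is their product: 180 retries at 5 seconds each gives 900 seconds. A minimal sketch of that arithmetic and of a polling loop built on it follows. The helpers effective_timeout and wait_for_status and the check callable are hypothetical names for illustration, not Watcher's implementation.

```python
import time
from typing import Callable


def effective_timeout(max_retries: int = 180, interval: float = 5.0) -> float:
    """Upper bound, in seconds, on the time spent sleeping between polls."""
    return max_retries * interval


def wait_for_status(check: Callable[[], bool],
                    max_retries: int = 180,
                    interval: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `check` up to `max_retries` times, sleeping `interval` after a miss.

    Hypothetical helper: returns True as soon as `check` succeeds and
    False once all retries are exhausted.
    """
    for _ in range(max_retries):
        if check():
            return True
        sleep(interval)
    return False
```

Passing the sleep function as a parameter keeps the loop testable without real delays.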

The Zone Migration Strategy now accepts audits with either ‘dst_type’ or ‘dst_pool’ as input, but not with both. The strategy won’t support retyping and migrating a volume in the same action plan, so the user should choose the input according to the desired operation.
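The either-or rule can be sketched as a small validation step. This is an illustration under assumptions: the function name and the dict-shaped input are hypothetical, the mapping of dst_type to a retype and dst_pool to a migration is inferred from the note, and the note does not say how an entry with neither key is handled (the sketch returns None for it).

```python
from typing import Mapping, Optional


def requested_volume_operation(pool_entry: Mapping[str, str]) -> Optional[str]:
    """Classify one storage pool entry of the audit input.

    Hypothetical helper. Raises ValueError when both keys are present,
    because retype and migrate cannot share an action plan.
    """
    has_type = "dst_type" in pool_entry
    has_pool = "dst_pool" in pool_entry
    if has_type and has_pool:
        raise ValueError("specify either 'dst_type' or 'dst_pool', not both")
    if has_type:
        return "retype"
    if has_pool:
        return "migrate"
    # Neither key present: behavior is not covered by the release note.
    return None
```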

The APISchedulingService has been removed from the Watcher API service. It is replaced functionally by the ServiceMonitoringService included in the watcher-decision-engine.

Resolves failures caused by temporary connection errors to Nova by adding configurable retries to Nova API calls when connection-related errors occur. The retries can be configured via [nova] http_retries (default: 3 retries) and [nova] http_retry_interval (default: 2 seconds).
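The retry behavior can be pictured with a generic wrapper like the one below. It is a sketch, not Watcher's code: call_with_retries is a hypothetical name, Python's built-in ConnectionError stands in for whatever connection-related exceptions Watcher actually catches, and only the parameter names and defaults come from the note.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T],
                      http_retries: int = 3,
                      http_retry_interval: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `func`, retrying on ConnectionError.

    Makes at most `http_retries` retries after the first attempt and
    waits `http_retry_interval` seconds before each retry. The last
    error is re-raised once the retries are exhausted.
    """
    attempt = 0
    while True:
        try:
            return func()
        except ConnectionError:
            if attempt >= http_retries:
                raise
            attempt += 1
            sleep(http_retry_interval)
```

With the defaults, a call is attempted up to four times in total: the initial attempt plus three retries.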

Fixes a bug in the zone migration strategy where audits would fail with an unhandled exception when trying to plan migrations for instances that exist in Nova but not in Watcher’s compute model. The strategy now filters out elements that are not found in the model, allowing the audit to complete successfully. For more details, please see Bug #2098984.
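The filtering idea is simple enough to sketch in a few lines. The function name and the use of plain instance ID strings are assumptions for illustration; Watcher's compute model has its own object types.

```python
from typing import Iterable, List


def filter_known_instances(nova_instance_ids: Iterable[str],
                           model_instance_ids: Iterable[str]) -> List[str]:
    """Keep only the Nova instances that are also present in the model.

    Hypothetical helper: unknown instances are skipped instead of
    raising, so planning can continue with the rest.
    """
    known = set(model_instance_ids)
    return [iid for iid in nova_instance_ids if iid in known]
```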

Previously, when an audit was created with the zone_migration strategy and both the storage_pools and compute_nodes parameters were passed, the audit did not create the required instance migration actions if any volume migration action was created.

Now, in that situation the audit will create both instance and volume migrations according to the expected behavior and the limits defined by the parallelization parameters.

For more information: https://bugs.launchpad.net/watcher/+bug/2109722

Fixed the source_node parameter value in host maintenance strategy migrate actions to use node hostname instead of node UUID. This bug only appeared during revert of migrate actions, where the revert operation would fail because the migration action expected a hostname for the destination parameter but received a UUID instead.

When running watcher-api as a WSGI server, the decision-engine monitor service was not executed. As a result, in a decision-engine failure scenario, the continuous audits running on the failed decision-engine were not reassigned and stopped running. This patch moves the monitor service to the decision-engine, so it is executed as long as any decision-engine is running.

Previously, the APISchedulingService did not migrate ongoing continuous audits from a failed decision-engine to an alive one when the decision-engine died while the watcher-api was not running.

This patch fixes those cases by migrating the audits found on a failed decision-engine when the watcher-api is started in addition to when a change in a decision-engine status is detected.

For more details: https://bugs.launchpad.net/watcher/+bug/2127777

Fixed an issue where migrate actions failed when the migration took more than 120 seconds. After this patch, the default timeout is 900 seconds (15 minutes), which should be a reasonable value for most OpenStack installations.

The CORS middleware has been added to the API pipeline to support Cross-Origin Resource Sharing.

The http_proxy_to_wsgi middleware has been added to the API pipeline. Setting the [oslo_middleware] enable_proxy_headers_parsing option to true now enables parsing of the HTTP headers set by forwarders, to detect the endpoint URLs that clients actually use.

The request ID is now returned by the Watcher API in the X-OpenStack-Request-ID response header.

Fixed an issue where resize actions failed when the resize took more than 120 seconds. After this patch, the default timeout is 900 seconds (15 minutes), which should be a reasonable value for most OpenStack installations.

The zone migration strategy no longer fails when an audit is created with storage_pools defined, compute_nodes not provided, and with_attached_volume set to True. The strategy now creates the required volume migrations, but no instance migrations. Both volumes and instances will only be migrated if the audit parameters include both compute_nodes and storage_pools.

See: https://bugs.launchpad.net/watcher/+bug/2111429 for more details.

Previously, the src_type parameter in the storage_pools input of the zone migration strategy was ignored, even though it was required whenever storage_pools was defined.

This patch makes the src_type parameter optional in the zone migration strategy. When the user passes it, the strategy uses its value to filter the volumes that can be migrated.
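A minimal sketch of an optional filter of this kind is shown below. The function name, the dict-shaped volumes, and the "volume_type" key are assumptions made for illustration and do not reflect Watcher's storage model classes.

```python
from typing import Dict, List, Optional


def filter_volumes_by_src_type(volumes: List[Dict[str, str]],
                               src_type: Optional[str] = None) -> List[Dict[str, str]]:
    """Return the volumes eligible for migration.

    Hypothetical helper: when `src_type` is omitted, every volume is
    eligible; otherwise only volumes of that type are kept.
    """
    if src_type is None:
        return list(volumes)
    return [v for v in volumes if v.get("volume_type") == src_type]
```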

For more details: https://launchpad.net/bugs/2111507

Seven unused methods have been removed from the NovaHelper class. These methods had zero production usage and were either remnants from old workflows or never integrated into production code. The removed methods include delete_instance() (dead since the 2018 workflow refactor in commit 4179c352), swap_volume() (removed in 2025 due to security concerns), wait_for_instance_status() (unused since 2015), get_availability_zone_list(), get_instance_by_name(), get_instances_by_node(), and get_service(). This removal eliminates approximately 139 lines of dead code and reduces maintenance burden with no user-facing impact.

The experimental glance client integration has been removed from Watcher. The glance client and create_image_from_instance method became dead code after the legacy workflow removal in commit 4179c3527c, which replaced the snapshot-based cold migration approach with Nova’s native cold migration support. This removal eliminates the python-glanceclient dependency and glance_client configuration options with no user-facing impact.

The experimental neutron client integration has been removed from Watcher. The neutron client and related methods (create_instance, get_security_group_id_from_name, get_network_id_from_name) became dead code after the legacy workflow removal in commit 4179c3527c. This removal eliminates the python-neutronclient dependency and neutron_client configuration options with no user-facing impact.