Rocky Series Release Notes

6.0.0

New Features

  • Added a cluster entity refresh to the cluster action execute wrapper to ensure that the state of the action does not become stale while it is queued.

  • Added a configuration option for the scheduler thread pool size.

  • Added a new boolean cluster config option to stop a node before deleting it, applicable to all clusters.

  • All REST calls that involve a DB interaction are now automatically retried upon deadlock exceptions.

  • Added operation support to start a docker container.

  • Added support for the update-name operation in the docker profile.

  • The engine has been augmented to send event notifications only when a node is active and has a physical ID associated with it. This mainly targets lifecycle hooks and possibly other notifications.

  • Health policy now contains the NODE_STATUS_POLL_URL detection type. This detection type queries the URL specified in the health policy for node health status, allowing users to integrate Senlin health checks with an external health service. A sample policy spec is sketched after this list.

  • Added a new detection type that actively polls the node health using a URL specified in the health policy. That way the user can integrate Senlin’s health policy with another custom or 3rd-party health check service.

  • Added a dependency relationship between the master cluster and the worker cluster created for Kubernetes.

  • A new version of the deletion policy (v1.1) is implemented which supports the specification of lifecycle hooks to be invoked before shrinking the size of a cluster. For details, please check the policy documentation; a sample spec is also sketched after this list.

  • A new configuration option is exposed for the message topic to use when sending event notifications.

  • New configuration option “database_retry_limit” is added for customizing the maximum retries for failed operations on the database. The default value is 10.

  • New configuration option “database_retry_interval” is added for specifying the number of seconds between database operation retries. The default value is 0.1.

  • New configuration option “database_max_retry_interval” is added for users to specify the maximum number of seconds between database operation retries. The default value is 2. A sample configuration snippet covering these retry options appears after this list.

  • Added retry logic to post_lifecycle_hook_message when posting a lifecycle hook to Zaqar.

  • The policy attach and detach actions are improved to automatically retry on failed attempts.

  • The action scheduler has been refactored so that no premature sleeping will be performed and no unwanted exceptions will be thrown when shutting down workers.

  • The lifecycle hooks feature added during the Queens cycle is improved to handle cases where a node no longer exists. The lifecycle hook is only effective when the target node exists and is active.

  • Added support to lock and unlock a nova server node.

  • Added operation support to migrate a nova server node.

  • Added operation support to pause and unpause a nova server node.

  • Added operation support to rescue and unrescue a nova server node.

  • Added operation support to start and stop a nova server node.

  • Added operation support for suspending and resuming a nova server node. An example of invoking these node operations is sketched after this list.
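
The URL-based health detection described above is configured through a health policy spec. The following is only a minimal sketch: it assumes the v1.0 schema, and the option names under "options" are illustrative assumptions, so the health policy documentation remains the authoritative reference.

      # Hypothetical health policy spec using the new detection type.
      # Field names under "options" are assumptions, not a confirmed schema.
      type: senlin.policy.health
      version: 1.0
      properties:
        detection:
          type: NODE_STATUS_POLL_URL
          options:
            interval: 60
            poll_url: "http://health.example.com/nodes/{nodename}"
            poll_url_healthy_response: "passing"
        recovery:
          actions:
            - name: RECREATE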
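
The lifecycle hooks in deletion policy v1.1 are likewise declared in the policy spec. A minimal sketch, assuming a Zaqar-based hook; the property names and values are illustrative and should be verified against the deletion policy documentation.

      # Hypothetical deletion policy spec with a lifecycle hook (names are assumptions).
      type: senlin.policy.deletion
      version: 1.1
      properties:
        criteria: OLDEST_FIRST
        hooks:
          type: zaqar
          timeout: 120
          params:
            queue: lifecycle-hook-queue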
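
The database retry options listed above are set in senlin.conf. A sketch with the documented defaults; placing them in the [DEFAULT] group is an assumption, and the options for the event notification topic and the scheduler thread pool size are only mentioned in a comment because their exact names are not given in these notes.

      [DEFAULT]
      # Retry behaviour for failed database operations (defaults from the notes above).
      database_retry_limit = 10
      database_retry_interval = 0.1
      database_max_retry_interval = 2
      # The event notification topic and the scheduler thread pool size are also
      # configurable; consult the generated sample senlin.conf for their option names.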
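
The nova server node operations above (lock, migrate, pause, rescue, start/stop, suspend, and so on) are exposed through the node operation API. A hedged sketch of such a request; the URL layout follows the generic node-operation pattern, the "pause" payload shown here is illustrative, and a sufficiently recent API microversion may need to be requested.

      POST /v1/nodes/{node_id}/ops
      Content-Type: application/json

      {"pause": {}}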

Known Issues

  • There are cases where the event listener based health management cannot successfully stop all listeners.

Upgrade Notes

  • The API microversion 1.10 has fixed the webhook trigger API for easier integration with Aodh. In previous microversions, only the query parameters were used as action inputs. Starting from 1.10, key-value pairs in the request body are also treated as action inputs, as illustrated below.
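
For example, a webhook receiver can now be triggered with its inputs carried in the JSON body instead of only in the query string. A hedged sketch, where {webhook_url} is the channel URL returned when the receiver was created (it already carries the required ‘V’ version parameter), and the "count" input is illustrative, meaningful only for receivers bound to a scaling action.

      POST {webhook_url}
      Content-Type: application/json

      {"count": 2}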

Bug Fixes

  • The UUID used in block_device_mapping_v2 in the nova.server profile is now validated.

  • Fixed cluster lock primary key conflict problem.

  • Fixed the example of “aodh alarm create” command.

  • Senlin API, functional, and integration tests were previously moved to the senlin-tempest-plugin project; the documentation has been fixed to reflect this change.

  • Fixed an error when restarting a docker container node.

  • Fixed a bug that occurred when node deletion resulted in an error.

  • Fixed a bug in health checking which was introduced by oslo.context changes.

  • Fixed a bug in checking whether a health policy is already attached.

  • In the openstacksdk 0.14.0 release, a bug related to SDK exceptions was fixed (https://review.openstack.org/#/c/571101/). With that change, an SDK exception will contain the detailed message only if the message string is equal to ‘Error’. Fixed test_parse_exception_http_exception_no_details to use ‘Error’ as the exception message so that the test case passes.

  • Enabled old versions of built-in policy types to be listed and used.

  • Fixed openstack-tox-cover which was broken as part of the switch to stestr.

  • Fixed the error in token generation for kubeadm.

  • Fixed cluster and node lock management so that failed lock acquire operations are automatically retried. This is an important fix for running multiple service engines.

  • A node creation request that would break cluster size constraints now results in the node being set to ERROR status.

  • Added exception handling for node-join and node-leave operations.

  • Fixed the return value from a node operation call.

  • Fixed defects in node recover operation to ensure node status is properly handled.

  • Improved logic in rebooting and rebuilding nova server nodes so that exceptions are caught and handled.

  • Fixed the “role” field used when creating/updating a node.

  • Fixed nova profile logic when updating the image. The current image is now always used as the effective one.

  • Added scheduler thread pool size configuration value and changed default thread pool size for scheduler from 10 to 1000. This fix prevents problems when a large number of cluster operations are executed simultaneously.

  • Added exception handling for service status updates. This makes service management more stable.

  • The data type problem related to action start time and end time is fixed. We now use decimal type instead of float for these columns.

  • The ‘V’ query parameter is now strictly required when triggering a webhook receiver.

  • Fixed a bug where API version negotiation is not effective when invoked via OpenStack SDK. The API impacted is limited to webhook triggering.

Other Notes

  • Health policy v1.0 was moved from EXPERIMENTAL to SUPPORTED status.

5.0.0

New Features

  • Added support for forced deletion of clusters and nodes.

  • Added support for Octavia as the load-balancer driver.

  • Node details view now includes attached_volumes.

  • Added the cluster config property “node.name.format” where users can specify how cluster nodes are automatically named. Users can use placeholders like “$nI” for the node index padded with 0s to the left, or “$nR” for a random string of length n. An example is sketched after this list.

  • Senlin now supports policy-in-code, which means that if users do not modify any of the policy rules, they can leave the policy file (in JSON or YAML format) empty or not deploy it at all, because Senlin now keeps all default policies in the senlin/common/policies module. Users can modify or generate a policy.yaml file, and any rules it contains will override the in-code policy rules. Users can also still use a policy.json file, but the oslo team recommends using the newer YAML format instead. An example override is sketched after this list.

  • Added support for Unicode availability zone names.

  • Added support for Unicode strings as cluster names.
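
The node naming format is supplied through the cluster's config property. A minimal sketch, assuming the placeholder semantics described above; how the config is passed (for example at cluster creation time) depends on the client being used.

      # Hypothetical cluster config snippet: nodes named "web-0001", "web-0002", ...
      config:
        node.name.format: "web-$4I"
      # Or a random 8-character suffix instead:
      #   node.name.format: "web-$8R"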
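
With policy-in-code, a policy.yaml file only needs to list the rules being overridden; everything else falls back to the in-code defaults. A minimal sketch; the rule target and value below are illustrative, and the full set of rule names can be generated with the oslo.policy sample generator.

      # Hypothetical policy.yaml override; only overridden rules need to appear here.
      "clusters:delete": "role:admin"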

Upgrade Notes

  • The Octavia service must be properly installed and configured to enable load-balancing policy.

Bug Fixes

  • Fixed a bug related to an oslo.versionedobjects change that prevented cluster actions from being properly encoded in JSON requests.

  • Fixed bug related to reacting to nova vm lifecycle event notifications. The recover flow is no longer called twice when a VM is deleted.

  • Fixed various defects in managing node pools for the load-balancer policy.

  • DB lock contentions are alleviated by allowing lock retries.

  • Fixed a bug related to force-deleting nodes.

  • Fixed an error where the action name was not passed to the backend service.

  • Fixed an error introduced by an oslo.versionedobjects change that led to failures when creating a receiver.

Other Notes

  • Improved the Nova VM server health check for cases where the physical ID is invalid.

  • The default policy.json file has been removed, as Senlin now generates the default policies from code. Please be aware of this if you use that file in your environment.