Pike Series Release Notes

4.0.0

New Features

  • When a cluster or a node is deleted, the action records associated with it are now automatically deleted from the database.

  • A new configuration option check_interval_max is added (default=3600) for cluster health check intervals.

  • The health manager has been improved to use dynamic timers instead of fixed-interval timers when polling a cluster’s status.

  • New logic has been added to the event-list operation so that users can specify the name or short-id of a cluster for filtering.

  • An event_purge subcommand has been added to the senlin-manage tool for purging events generated in a specific project.

  • When a node cannot be added to a load-balancer as desired, or cannot be removed from a load-balancer when requested, the node is now marked as being in WARNING status.

  • When an engine is detected to be dead, the actions (and the clusters/nodes locked by those actions) are now unlocked. Such clusters and nodes can be operated again.

  • A new recovery action “REBOOT” has been added to the health policy.

  • Added support for listening to Heat event notifications for stack failure detection.

  • The load-balancing policy now properly supports the CLUSTER_RECOVER action and NODE_RECOVER action.

  • Added support for adopting an existing object as a Senlin node, given its UUID and the profile type to use.

  • API microversion 1.6 comes with an optional parameter “check” that tells the engine to perform a health check before doing actual recovery. This applies to both clusters and nodes.

  • Relaxed the constraint on the node physical_id property. Any string value is now treated as a valid value even if it is not a UUID.

  • A new feature is introduced in API microversion 1.6 which permits a cluster update operation to change the profile used by the cluster only without actually updating the existing nodes (if any). The new profile will be used when new nodes are created as members of the cluster.

  • New operation introduced for updating the parameters of a receiver.

  • The numeric properties in the spec for a scaling policy now have stricter validations.

  • New API introduced to list the running service engines.

  • The setup-service script now supports the customization of service project name and service role name.
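
As an illustration of the “check” parameter introduced with API microversion 1.6, the following is a minimal sketch of how a client might assemble a cluster recover request. The helper name build_recover_request is hypothetical, and the URL path, the OpenStack-API-Version header value and the layout of the request body are assumptions based on common Senlin API conventions, not details taken from these notes. The sketch only builds the request; it does not send it.

```python
import json


def build_recover_request(cluster_id, check=True, operation=None):
    """Return (path, headers, body) for a cluster recover call.

    Hypothetical helper: the path, header and body layout are assumptions.
    """
    params = {"check": check}
    if operation is not None:
        # e.g. "REBOOT", the recovery action added to the health policy
        params["operation"] = operation

    headers = {
        "Content-Type": "application/json",
        # Ask for microversion 1.6 so that "check" is recognized (assumed header).
        "OpenStack-API-Version": "clustering 1.6",
    }
    path = "/v1/clusters/%s/actions" % cluster_id
    body = json.dumps({"recover": params})
    return path, headers, body


if __name__ == "__main__":
    path, headers, body = build_recover_request("my-cluster", check=True)
    print(path)
    print(headers)
    print(body)
```

The same body shape would presumably apply to a node recover request, since the note states that the parameter applies to both clusters and nodes.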

Deprecation Notes

  • Support for the CLUSTER_DELETE action in the experimental batch policy has been dropped due to cluster locking issues. It could be resurrected in the future once a proper workaround is identified.

  • Support for py3.4 has been dropped. Please use py3.5 instead.

Bug Fixes

  • The bug where the availability zone info from a nova server deployment was not available has been fixed.

  • Fixed cluster-recover operation in engine so that it accepts parameters from API requests in addition to policy decision (if any).

  • Fixed an error in the built-in deletion policy which failed to process NODE_DELETE action.

  • Various bug fixes to the user manual and sample profiles/policies.

  • When an action is marked as RETRY, its status is reset to READY so that it can be rescheduled. A bug related to this behavior is now fixed.

  • Fixed a premature return from the policy cooldown check.

  • Fixed a bug related to desired_capacity when creating a cluster. Previously it defaulted to 1; it now defaults to min_size when min_size is provided.

  • Fixed a problem related to duplicated event dumps during action execution.

  • Fixed an error in the return value of node-check that prevented node-recover from being triggered.

  • Fixed a problem when claiming a cluster from the health registry if the service engine is stopped (killed) and restarted quickly.

  • Fixed an error in updating stack tags when the stack joins or leaves a cluster.

  • Fixed an error in the built-in load-balancing policy that was caused by a regression in getting node details for IP addresses.

  • Fixed various problems in load-balancer policy so that it can handle node-recover and cluster-recover operations properly.

  • Fixed an error in parameter checking logic for node-recover operation which prevented valid parameters from being accepted.

  • Fixed an error that was raised when no operation was provided during node health recovery.

  • Fixed an error introduced by openstacksdk when checking/setting the availability zone of a nova server.

  • The parameter checking for the cluster update operation could incorrectly parse the provided value(s). This bug has been fixed.

  • When attaching a policy (especially a health policy) to a cluster, users may choose to keep the policy disabled. The health manager and other components now take this into account.

  • Fixed bugs in deletion zone policy and region policy which were not able to correctly parse node reference.

  • Fixed a bug related to the webhook ID in the channel info of a receiver. The channel info now always contains a valid webhook ID.

  • A nova server, if booted from volume, will not return a valid image ID. This situation is now taken care of.
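
The desired_capacity fix listed above can be sketched as a small defaulting rule. This is an illustration only, not the engine's actual code: the function name is hypothetical, and the fallback used when neither value is supplied is an assumption exposed as a parameter, since the notes do not say what happens in that case.

```python
def default_desired_capacity(desired_capacity=None, min_size=None, fallback=0):
    """Pick the desired capacity for a new cluster (hypothetical helper).

    An explicit desired_capacity always wins. Otherwise min_size is used
    when provided. `fallback` is an assumed value for the remaining case.
    """
    if desired_capacity is not None:
        return desired_capacity
    if min_size is not None:
        return min_size
    return fallback


if __name__ == "__main__":
    print(default_desired_capacity(min_size=3))
    print(default_desired_capacity(desired_capacity=5, min_size=3))
```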

Other Notes

  • DB layer operations are now retried when transient errors occur.

  • The sample health policy file used 60 seconds as the interval, which could be misleading. This has been changed to 600 seconds.

  • Built-in policies have been optimized to reduce DB transactions.

  • The parameter checking for the cluster-resize operation has been revised so that min_step is ignored if the adjustment type is not CHANGE_IN_PERCENTAGE.
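
The min_step rule for cluster-resize can be illustrated with a minimal sketch. The function name new_capacity is hypothetical, and the rounding rule for percentage changes (round the magnitude up to a whole node) is an assumption made for the example. The point shown is only that min_step influences the result for CHANGE_IN_PERCENTAGE and is ignored for the other adjustment types.

```python
import math

EXACT_CAPACITY = "EXACT_CAPACITY"
CHANGE_IN_CAPACITY = "CHANGE_IN_CAPACITY"
CHANGE_IN_PERCENTAGE = "CHANGE_IN_PERCENTAGE"


def new_capacity(current, adjustment_type, number, min_step=None):
    """Compute the target capacity of a resize (hypothetical helper)."""
    if adjustment_type == EXACT_CAPACITY:
        # min_step is ignored here.
        return int(number)
    if adjustment_type == CHANGE_IN_CAPACITY:
        # min_step is ignored here as well.
        return current + int(number)
    if adjustment_type == CHANGE_IN_PERCENTAGE:
        if number == 0:
            return current
        # Assumed rounding: round the magnitude of the change up.
        step = math.ceil(abs(current * number / 100.0))
        if min_step is not None:
            # min_step sets a lower bound on the size of the change.
            step = max(step, min_step)
        sign = 1 if number > 0 else -1
        return current + sign * step
    raise ValueError("unknown adjustment type: %s" % adjustment_type)


if __name__ == "__main__":
    print(new_capacity(10, CHANGE_IN_PERCENTAGE, 5, min_step=2))
    print(new_capacity(10, CHANGE_IN_CAPACITY, 1, min_step=5))
```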