Affinity policy violated with parallel requests
Parallel server create requests for affinity or anti-affinity land on the same host and servers go to the ACTIVE state even though the affinity or anti-affinity policy was violated.
There are two ways to avoid anti-/affinity policy violations among multiple server create requests.
Create multiple servers as a single request
Creating the servers in a single request works because, when the batch of requests is visible to nova-scheduler at the same time as a group, it can choose compute hosts that satisfy the anti-/affinity constraint and send the servers to the same host or to different hosts accordingly.
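As a rough illustration of what a single batched request can look like, the sketch below builds one `POST /servers` body that asks for several servers at once. It relies on the Compute API fields `min_count`, `max_count`, and the `os:scheduler_hints` `group` hint; the helper name `build_multi_create_body` and all IDs are hypothetical placeholders. The body is trimmed to the fields relevant here, so a real request may need more (for example, network settings).

```python
import json


def build_multi_create_body(name, image_ref, flavor_ref, group_id, count):
    """Build one server create body that requests `count` servers together.

    Hypothetical helper for illustration; it only assembles the JSON body
    and does not talk to any API.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "server": {
            "name": name,
            "imageRef": image_ref,
            "flavorRef": flavor_ref,
            # Setting both to the same value asks for exactly `count` servers.
            "min_count": count,
            "max_count": count,
        },
        # The server group is passed as a scheduler hint next to "server".
        "os:scheduler_hints": {"group": group_id},
    }


# Placeholder identifiers, not real resources.
body = build_multi_create_body(
    name="web",
    image_ref="IMAGE_UUID",
    flavor_ref="FLAVOR_ID",
    group_id="SERVER_GROUP_UUID",
    count=3,
)
print(json.dumps(body, indent=2))
```

With the OpenStack CLI, a comparable request would be along the lines of `openstack server create --min 3 --max 3 --hint group=<server-group-uuid> ...`, with the remaining options depending on the deployment.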
Adjust Nova configuration settings
When requests are made separately and the scheduler cannot consider the batch of requests at the same time as a group, anti-/affinity races are handled by what is called the “late affinity check” in nova-compute. Once a server lands on a compute host, if the request involves a server group, nova-compute contacts the API database (via nova-conductor) to retrieve the server group and then checks whether the affinity policy has been violated. If the policy has been violated, nova-compute initiates a reschedule of the server create request. Note that this means the deployment must have scheduler.max_attempts set greater than 1 (default is 3) to handle races.
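The interaction between the late affinity check and the attempt limit can be modelled with a small toy simulation. This is only a sketch under simplifying assumptions, not Nova's implementation: the function names `violates_policy` and `build_with_late_check` are hypothetical, hosts are plain strings, and the candidate list stands in for the hosts the scheduler would pick on successive attempts.

```python
def violates_policy(policy, host, member_hosts):
    """Return True if placing a server on `host` breaks the group policy.

    `member_hosts` is the set of hosts already used by other group members.
    """
    if policy == "anti-affinity":
        return host in member_hosts
    if policy == "affinity":
        return any(other != host for other in member_hosts)
    raise ValueError(f"unknown policy: {policy}")


def build_with_late_check(policy, candidate_hosts, member_hosts, max_attempts=3):
    """Try candidate hosts in order, rescheduling after each violation.

    Returns the host that passed the check, or None once `max_attempts`
    scheduling attempts have been used up.
    """
    for attempt, host in enumerate(candidate_hosts, start=1):
        if attempt > max_attempts:
            break
        if not violates_policy(policy, host, member_hosts):
            return host
    return None


# A racing request first lands on host1, which another member already uses.
print(build_with_late_check("anti-affinity", ["host1", "host2"], {"host1"}))
# With a single attempt there is no room for a reschedule.
print(build_with_late_check("anti-affinity", ["host1", "host2"], {"host1"},
                            max_attempts=1))
```

In this model the first call recovers by moving to the second candidate, while the second call fails because an attempt limit of 1 leaves no reschedule available.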
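An ideal configuration for multiple cells will minimize upcalls from the cells to the API database. This is how devstack, for example, is configured in the CI gate. The cell conductors do not set api_database.connection and nova-compute sets workarounds.disable_group_policy_check_upcall to True.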
However, if a deployment needs to handle racing affinity requests, it needs to configure cell conductors to have access to the API database, for example:
[api_database]
connection = mysql+pymysql://root:<password>@<host>/nova_api?charset=utf8
The deployment also needs to configure nova-compute services not to disable the group policy check upcall, either by not setting workarounds.disable_group_policy_check_upcall (so the default is used) or by setting it to False, for example:
[workarounds]
disable_group_policy_check_upcall = False
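One way to sanity-check the two settings together is a small script that reads the configuration text of a cell conductor and of a compute service. The sketch below uses Python's standard `configparser`, which is an assumption made for illustration: Nova itself loads configuration through oslo.config, whose parsing rules may differ in detail. The function name `can_run_late_affinity_check` and the sample values are hypothetical.

```python
import configparser


def can_run_late_affinity_check(conductor_conf_text, compute_conf_text):
    """Return True if both settings allow the late affinity check upcall."""
    # Interpolation is disabled so '%' characters in URLs are read literally.
    conductor = configparser.ConfigParser(interpolation=None)
    conductor.read_string(conductor_conf_text)
    compute = configparser.ConfigParser(interpolation=None)
    compute.read_string(compute_conf_text)

    # The cell conductor needs a connection to the API database.
    connection = conductor.get("api_database", "connection", fallback="")
    has_api_db = bool(connection.strip())

    # An unset option falls back to False, matching the documented default.
    upcall_disabled = compute.getboolean(
        "workarounds", "disable_group_policy_check_upcall", fallback=False
    )
    return has_api_db and not upcall_disabled


# Sample configuration text with placeholder credentials and host name.
conductor_conf = """
[api_database]
connection = mysql+pymysql://user:pw@dbhost/nova_api?charset=utf8
"""
compute_conf = """
[workarounds]
disable_group_policy_check_upcall = False
"""
print(can_run_late_affinity_check(conductor_conf, compute_conf))
```

A check like this only inspects configuration text; it does not confirm that the conductor can actually reach the database.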
With these settings, anti-/affinity policy should not be violated even when parallel server create requests are racing.
Future work is needed to add anti-/affinity support to the placement service in order to eliminate the need for the late affinity check in nova-compute.