Distributed OVSDB events handler

This document presents the problem and proposes a solution for handling OVSDB events in a distributed fashion in networking-ovn.

Problem description

In networking-ovn, the OVSDB Monitor class is responsible for listening to OVSDB events and acting on them. It is used extensively for various tasks, including critical ones such as monitoring port binding events (in order to notify Neutron/Nova that a port has been bound to a certain chassis). Currently, this class uses a distributed OVSDB lock to ensure that only one instance handles those events at a time.

The problem with this approach is that it creates a bottleneck: even if multiple Neutron Workers are running, only one of them is actively handling events at any given time. The problem is even more pronounced with technologies such as containers, which rely on creating many ports at once and waiting for them to be bound.

Proposed change

In order to fix this problem, this document proposes using a Consistent Hash Ring to split the load of handling events across multiple Neutron Workers.

A new table called ovn_hash_ring will be created in the Neutron Database where the Neutron Workers capable of handling OVSDB events will be registered. The table will use the following schema:

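===========  ========  ==================================================================================================
Column name  Type      Description
===========  ========  ==================================================================================================
node_uuid    String    Primary key. The unique identification of a Neutron Worker.
hostname     String    The hostname of the machine this Node is running on.
created_at   DateTime  The time that the entry was created. For troubleshooting purposes.
updated_at   DateTime  The time that the entry was updated. Used as a heartbeat to indicate that the Node is still alive.
===========  ========  ==================================================================================================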
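As a rough illustration of the schema above, the sketch below creates the table in an in-memory SQLite database and registers two workers on the same host. The SQLite column types, the register_node helper and the hostname "controller-0" are assumptions made for this example; the real table would live in the Neutron database and be defined through Neutron's own model and migration tooling.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Hypothetical DDL mirroring the columns described above. The SQLite
# types are an assumption for illustration only.
SCHEMA = """
CREATE TABLE ovn_hash_ring (
    node_uuid  TEXT PRIMARY KEY,
    hostname   TEXT NOT NULL,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
)
"""


def register_node(conn, hostname):
    """Insert a row for a new worker and return its generated node_uuid."""
    node_uuid = str(uuid.uuid4())
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO ovn_hash_ring VALUES (?, ?, ?, ?)",
        (node_uuid, hostname, now, now),
    )
    conn.commit()
    return node_uuid


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Two workers on the same machine get distinct node_uuid values.
first = register_node(conn, "controller-0")
second = register_node(conn, "controller-0")
```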

This table will be used to form the Consistent Hash Ring. Fortunately, an implementation already exists in the tooz library of OpenStack. It was contributed by the Ironic team, which also uses this data structure to spread the API request load across multiple Ironic Conductors.

Here’s how a Consistent Hash Ring from tooz works:

from tooz import hashring

hring = hashring.HashRing({'worker1', 'worker2', 'worker3'})

# Returns set(['worker3'])
hring[b'event-id-1']

# Returns set(['worker1'])
hring[b'event-id-2']

How OVSDB Monitor will use the Ring

Every instance of the OVSDB Monitor class will listen to a series of events from the OVSDB database. Each instance will have a unique ID registered in the database, and that ID will be part of the Consistent Hash Ring.

When an event arrives, each OVSDB Monitor instance will hash the event UUID, and the ring will return one instance ID. Each instance compares that ID with its own; the instance whose ID matches processes the event, and all others ignore it.
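The ownership check described above can be sketched as follows. To keep the example free of external dependencies it uses a toy ring, SimpleHashRing, as a stand-in for the tooz implementation; the class, the should_handle helper and the replica count are hypothetical names and values chosen for illustration, not part of tooz or networking-ovn.

```python
import bisect
import hashlib


class SimpleHashRing:
    """Toy consistent hash ring; a stand-in for tooz's HashRing, not a copy of it."""

    def __init__(self, nodes, replicas=32):
        # Place several points per node on the ring to even out the load.
        self._ring = {}
        for node in nodes:
            for i in range(replicas):
                self._ring[self._hash(f"{node}-{i}")] = node
        self._keys = sorted(self._ring)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def get_node(self, key):
        """Return the node owning `key`: the first ring point at or after
        the key's hash, wrapping around to the start of the ring."""
        idx = bisect.bisect_left(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[self._keys[idx]]


def should_handle(ring, my_node_id, event_uuid):
    """True if this instance is the one the ring selects for the event."""
    return ring.get_node(event_uuid) == my_node_id
```

Every instance builds the same ring from the same table contents, so all of them agree on the owner of a given event without talking to each other.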

Verifying status of OVSDB Monitor instance

A new maintenance task will be created in networking-ovn to update the updated_at column of the ovn_hash_ring table for the entries matching its hostname, indicating that all Neutron Workers running on that hostname are alive.

Note that only a single maintenance instance runs on each machine, so writes to the Neutron database are kept to a minimum.

When forming the ring, the code should only include entries whose updated_at value falls within a given timeout of the current time. Entries that haven’t been updated within that period won’t be part of the ring. If the ring already exists, it will be re-balanced.
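A minimal sketch of such a heartbeat, assuming an SQLite stand-in for the Neutron database: one UPDATE statement refreshes every worker registered under a hostname. The touch_nodes helper, the sample rows and the timestamps are hypothetical.

```python
import sqlite3


def touch_nodes(conn, hostname, now):
    """Refresh updated_at for every node registered under `hostname`.

    Returns the number of rows touched by the single UPDATE statement.
    """
    cur = conn.execute(
        "UPDATE ovn_hash_ring SET updated_at = ? WHERE hostname = ?",
        (now, hostname),
    )
    conn.commit()
    return cur.rowcount


# In-memory stand-in for the Neutron database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ovn_hash_ring ("
    "node_uuid TEXT PRIMARY KEY, hostname TEXT, "
    "created_at TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO ovn_hash_ring VALUES (?, ?, ?, ?)",
    [
        ("node-a", "controller-0", "2019-01-01T00:00:00", "2019-01-01T00:00:00"),
        ("node-b", "controller-0", "2019-01-01T00:00:00", "2019-01-01T00:00:00"),
        ("node-c", "controller-1", "2019-01-01T00:00:00", "2019-01-01T00:00:00"),
    ],
)

# One write covers both workers running on controller-0.
touched = touch_nodes(conn, "controller-0", "2019-01-01T00:01:00")
```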
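The liveness filter can be sketched as below. The live_node_ids helper, the row layout (node_uuid, updated_at) and the 60-second timeout are assumptions for illustration; the document does not specify the timeout value.

```python
from datetime import datetime, timedelta


def live_node_ids(rows, now, timeout):
    """Return the node IDs whose heartbeat is no older than `timeout`.

    `rows` is an iterable of (node_uuid, updated_at) pairs, as they might
    be read from the ovn_hash_ring table.
    """
    cutoff = now - timeout
    return {node_uuid for node_uuid, updated_at in rows if updated_at >= cutoff}


now = datetime(2019, 1, 1, 12, 0, 0)
rows = [
    ("node-a", datetime(2019, 1, 1, 11, 59, 30)),  # 30 seconds old
    ("node-b", datetime(2019, 1, 1, 11, 58, 0)),   # 2 minutes old
    ("node-c", datetime(2019, 1, 1, 12, 0, 0)),    # just updated
]

# Only the returned IDs would be used to build (or re-balance) the ring.
alive = live_node_ids(rows, now, timedelta(seconds=60))
```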

Clean up and minimizing downtime window

Apart from heartbeating, we need to make sure that we remove the Nodes from the ring when the service is stopped or killed.

When the neutron-server service is stopped, all Nodes sharing the hostname of the machine where the service is running will be removed from the ovn_hash_ring table. This is done by handling the SIGTERM signal: when it arrives, networking-ovn should invoke the clean-up method and then let the process halt.

Unfortunately, nothing can be done in case of a SIGKILL. The nodes will be left in the database and will remain part of the ring until the timeout is reached or the service is restarted. This can introduce a window of time during which some events are lost. The current implementation shares the same problem: if the instance holding the OVSDB lock is killed abruptly, events will be lost until the lock moves on to the next instance that is alive. One could argue that the current implementation aggravates the problem, because all events are lost during that window, whereas with the distributed mechanism only the events assigned to the dead nodes are lost. As far as distributed systems go, that’s a normal scenario and things are soon corrected.
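The SIGTERM handling could look roughly like the sketch below. A plain dict mapping node_uuid to hostname stands in for the ovn_hash_ring table, and remove_nodes_for_host, make_sigterm_handler and install_handler are hypothetical names; how the real service hooks into its process manager is not covered here.

```python
import signal
import sys


def remove_nodes_for_host(registry, hostname):
    """Drop every entry registered under `hostname`; return how many were removed.

    `registry` maps node_uuid -> hostname and stands in for the table.
    """
    stale = [node for node, host in registry.items() if host == hostname]
    for node in stale:
        del registry[node]
    return len(stale)


def make_sigterm_handler(registry, hostname):
    """Build a signal handler that cleans up and then lets the process exit."""

    def handler(signum, frame):
        remove_nodes_for_host(registry, hostname)
        sys.exit(0)

    return handler


def install_handler(registry, hostname):
    """Register the clean-up handler for SIGTERM (call from the main thread)."""
    signal.signal(signal.SIGTERM, make_sigterm_handler(registry, hostname))
```

A service would call install_handler once at start-up. SIGKILL cannot be caught, so no equivalent handler exists for it.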

Ideas for future improvements

This section contains some ideas that can be added on top of this work to further improve it:

  • Listen to changes to the Chassis table in the OVSDB and force a ring re-balance when a Chassis is added to or removed from it.

  • Cache the ring for a short while to minimize the database reads when the service is under heavy load.

  • To further reduce or avoid event losses, the last X events could be cached and reprocessed in case a node times out and the ring re-balances.
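As one possible shape for the last idea, the sketch below keeps the most recent X events in a bounded buffer and, after a re-balance, returns the ones the local instance now owns. The RecentEvents class and the owns_event callback are hypothetical, and the sketch does not address how to avoid processing an event twice.

```python
from collections import deque


class RecentEvents:
    """Bounded buffer of the last `max_events` (event_uuid, payload) pairs."""

    def __init__(self, max_events):
        # A deque with maxlen silently discards the oldest entry when full.
        self._events = deque(maxlen=max_events)

    def record(self, event_uuid, payload):
        self._events.append((event_uuid, payload))

    def replay(self, owns_event):
        """Return buffered events for which `owns_event(event_uuid)` is true.

        After a re-balance, `owns_event` would ask the new ring whether this
        instance is now responsible for the event.
        """
        return [(uuid, payload) for uuid, payload in self._events if owns_event(uuid)]


cache = RecentEvents(max_events=3)
for n in range(5):
    cache.record(f"event-{n}", {"seq": n})

# Only the three newest events are still buffered: event-2, event-3, event-4.
to_reprocess = cache.replay(lambda uuid: uuid in {"event-1", "event-3"})
```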