Upgrade OVN to 22.03 on Focal¶
This page has been identified as being affected by the breaking changes introduced between versions 2.9.x and 3.x of the Juju client. Read support note Breaking changes between Juju 2.9.x and 3.x before continuing.
Charmed OpenStack supports OVN version 22.03 starting with OpenStack Ussuri. Clouds running on Focal nodes that are not using this version are strongly recommended to upgrade to it in order to benefit from important bug fixes and software enhancements.
In particular, the procedure described on this page aims to prevent OVN data plane downtime during the upgrade to 22.03. The downtime stems from an upstream OVN issue and can disrupt networking for all cloud VMs.
Read this entire document before making any changes to your cloud.
It is recommended that the upgrade be tested in a staged environment prior to applying the steps to a production cloud.
Ensure the following prerequisites are satisfied before making any changes.
Juju must be running the latest stable version of its major and minor release (e.g. 2.9.x). This pertains to all three Juju components: client, controller model, and workload model. Juju upgrade documentation is available, but quick guidance is also given below.
First ensure that the client context is the cloud’s controller and model (check with command juju whoami). The essential commands are then:
sudo snap refresh juju
juju upgrade-controller
juju upgrade-model
OVN must be managed by channel charms (charms ovn-central and ovn-chassis). If it is not, perform the migration away from legacy charms by applying special procedure All charms: migration to channels to those two charms.
As the above migration document states, when performing the migration to channel charms, ensure that the currently running OVN version does not change.
Updated neutron-api-plugin-ovn charm¶
The neutron-api-plugin-ovn charm must support the ovn-source option. This works for the
To verify whether the option is available, query for its value:
juju config neutron-api-plugin-ovn ovn-source
Upgrade the charm if the option is not available:
juju refresh neutron-api-plugin-ovn
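The check and the conditional refresh above can be combined into one step. The following is a minimal sketch rather than part of the official procedure: the function name ensure_ovn_source_option is hypothetical, and the sketch assumes that juju config exits with a non-zero status when the charm does not know the requested option.

```shell
# Hypothetical helper: refresh the charm only if the ovn-source option is missing.
# Assumption: `juju config <app> <key>` fails when the key is unknown.
ensure_ovn_source_option() {
    app="${1:-neutron-api-plugin-ovn}"
    if juju config "$app" ovn-source >/dev/null 2>&1; then
        echo "ovn-source is available on $app"
    else
        echo "ovn-source is missing on $app; refreshing the charm"
        juju refresh "$app"
    fi
}
```

Calling ensure_ovn_source_option with no argument targets the neutron-api-plugin-ovn application; pass a different name if the application was deployed under another one.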
Set the election timer¶
To minimise timeout issues during the upgrade, set the OVS database server election timer to its maximum value:
juju config ovn-central ovsdb-server-election-timer=30
For background information, see section Raft leader election timeout of this document.
Allow the ovn-central application to settle - use the juju status ovn-central command.
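Waiting for the application to settle can be scripted instead of watched by hand. The sketch below is illustrative only: the function names units_settled and wait_for_settle are hypothetical, and it assumes that juju status with --format oneline prints one line per unit containing text of the form (agent:idle, workload:active). Check that assumption against your Juju version before relying on it.

```shell
# Reads `juju status <app> --format oneline` text on stdin. Succeeds only if
# there is at least one unit line and every unit line reports agent:idle and
# workload:active. The line shape is an assumption about Juju's output.
units_settled() {
    awk '
        /^ *- / { total++; if ($0 ~ /agent:idle/ && $0 ~ /workload:active/) ok++ }
        END { exit !(total > 0 && ok == total) }
    '
}

# Polls every 10 seconds, up to the given number of attempts (default 60).
wait_for_settle() {
    app="$1"; tries="${2:-60}"
    while [ "$tries" -gt 0 ]; do
        if juju status "$app" --format oneline | units_settled; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 10
    done
    return 1
}
```

For example, wait_for_settle ovn-central returns once all units are idle and active, or fails after roughly ten minutes.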
Ensure OVN package requirements¶
Ensure that select packages are up to date on the cloud’s OVN units.
Perform the package upgrades on all OVN units by running commands across the ovn-chassis and ovn-central applications:
juju exec -a ovn-chassis 'apt update && apt -y install \
    --only-upgrade openvswitch-common ovn-common'
juju exec -a ovn-central 'apt update && apt -y install \
    --only-upgrade openvswitch-common ovn-common'
Some clouds may be running ovn-dedicated-chassis as opposed to ovn-chassis.
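For clouds with a different set of OVN applications, the same package upgrade can be looped over whichever application names apply. This is a small sketch under the assumption that the applications are deployed under their charm names; the function name upgrade_ovn_packages is hypothetical.

```shell
# Hypothetical helper: run the package upgrade on every application named
# on the command line, using the same apt invocation as the commands above.
upgrade_ovn_packages() {
    for app in "$@"; do
        juju exec -a "$app" \
            'apt update && apt -y install --only-upgrade openvswitch-common ovn-common'
    done
}
```

A cloud that uses dedicated chassis would then run upgrade_ovn_packages ovn-central ovn-dedicated-chassis.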
Fail-safe mode on OVN < 22.03¶
To prevent an OVN data plane outage during the upgrade to 22.03 the
ovn-controller daemon must be placed into fail-safe mode. This section
corresponds to upstream’s documented fail-safe method.
First stop the ovn-northd service on all ovn-central units:
juju exec -a ovn-central 'systemctl stop ovn-northd'
Secondly, identify the Southbound database leader unit (see the Querying OVN page for guidance).
Finally, manually set the
northd version to an arbitrary string. The
ovn-controller processes will detect this change and adapt to be able to
understand the data that the upgraded
northd daemon will subsequently
insert into the database (use the Southbound leader unit found above):
juju exec -u <sb-db-leader-unit> 'ovn-sbctl set sb-global . options:northd_internal_version="<string>"'
An example invocation of the above if the Southbound leader unit is ovn-central/2:
juju exec -u ovn-central/2 'ovn-sbctl set sb-global . options:northd_internal_version="safe"'
The above command contains the string ‘safe’. Any string will suffice provided that it is different from the current OVN version.
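Before moving on, it may be worth reading the value back to confirm that it was stored. The following sketch is not part of the upstream method; the function name fail_safe_matches is hypothetical. It uses the generic get command of ovn-sbctl and strips any quoting from the printed value before comparing.

```shell
# Hypothetical helper: succeeds when the northd_internal_version stored in the
# Southbound database equals the string that was just set.
# $1 = Southbound leader unit, $2 = expected string (e.g. safe)
fail_safe_matches() {
    unit="$1"; expected="$2"
    stored="$(juju exec -u "$unit" \
        'ovn-sbctl get sb-global . options:northd_internal_version' \
        | tr -d '"[:space:]')"
    [ "$stored" = "$expected" ]
}
```

For the example above: fail_safe_matches ovn-central/2 safe && echo "fail-safe string stored".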
Perform the upgrade¶
To ensure a smooth migration, guidance is provided below that includes verification steps.
Prior to upgrading the ovn-central application, change its software sources to ‘distro’ and change the charm’s channel to ‘22.03/stable’:
juju refresh ovn-central --channel 22.03/stable \
    --config <(printf "ovn-central:\n source: \"distro\"")
Now upgrade the application by selecting the UCA pocket for OVN 22.03 on Focal:
juju config ovn-central ovn-source=cloud:focal-ovn-22.03
As before, allow the ovn-central application to settle - use the juju status ovn-central command.
Verify: database migration¶
Ensure that the upgraded Northbound and Southbound database schemas match the expected (target) version. An example set of commands is provided below.
The Northbound database’s target version and actual version, respectively:
juju exec -a ovn-central 'ovsdb-tool schema-version /usr/share/ovn/ovn-nb.ovsschema'
Stdout: |
  6.1.0
UnitId: ovn-central/0
Stdout: |
  6.1.0
UnitId: ovn-central/1
Stdout: |
  6.1.0
UnitId: ovn-central/2

juju exec -a ovn-central 'ovsdb-client get-schema-version unix:/var/run/ovn/ovnnb_db.sock OVN_Northbound'
Stdout: |
  6.1.0
UnitId: ovn-central/0
Stdout: |
  6.1.0
UnitId: ovn-central/1
Stdout: |
  6.1.0
UnitId: ovn-central/2
The Southbound database’s target version and actual version, respectively:
juju exec -a ovn-central 'ovsdb-tool schema-version /usr/share/ovn/ovn-sb.ovsschema'
Stdout: |
  20.21.0
UnitId: ovn-central/0
Stdout: |
  20.21.0
UnitId: ovn-central/2
Stdout: |
  20.21.0
UnitId: ovn-central/1

juju exec -a ovn-central 'ovsdb-client get-schema-version unix:/var/run/ovn/ovnsb_db.sock OVN_Southbound'
Stdout: |
  20.21.0
UnitId: ovn-central/0
Stdout: |
  20.21.0
UnitId: ovn-central/1
Stdout: |
  20.21.0
UnitId: ovn-central/2
If versions do not match it might be that the database migration did not
succeed (see log files under
/var/log/ovn on the ovn-central units).
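Rather than comparing the two outputs by eye, the comparison can be done on each unit. This is a minimal sketch: the function name schema_matches is hypothetical, while the schema file and socket paths are the ones used in the commands above.

```shell
# Hypothetical helper, meant to run on an ovn-central unit.
# $1 is "nb" (Northbound) or "sb" (Southbound).
schema_matches() {
    case "$1" in
        nb) db=OVN_Northbound ;;
        sb) db=OVN_Southbound ;;
        *)  echo "usage: schema_matches nb|sb" >&2; return 2 ;;
    esac
    # Target version comes from the packaged schema file,
    # actual version from the running database server.
    target="$(ovsdb-tool schema-version "/usr/share/ovn/ovn-${1}.ovsschema")"
    actual="$(ovsdb-client get-schema-version "unix:/var/run/ovn/ovn${1}_db.sock" "$db")"
    if [ "$target" = "$actual" ]; then
        echo "$db: schema $actual matches target"
    else
        echo "$db: schema $actual does not match target $target" >&2
        return 1
    fi
}
```

A non-zero exit status for either database is the cue to inspect the log files mentioned above.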
Verify: cluster status¶
Check the status of the Northbound and Southbound database clusters. It is
expected that one unit has Role: leader and the others have Role: follower.
An example set of commands is provided below.
The Northbound database cluster:
juju exec -a ovn-central 'ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound' | egrep "Server ID|Role|Leader"
Server ID: 2a92 (2a9226b6-7a57-411a-94ee-092aa6a19e40)
Role: follower
Leader: bc3a
Server ID: adb2 (adb28a73-4e21-492c-81d0-f51adc6665a4)
Role: follower
Leader: bc3a
Server ID: bc3a (bc3a26b1-14c0-4133-b2c3-d8f64e4b722d)
Role: leader
Leader: self
The Southbound database cluster:
juju exec -a ovn-central 'ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound' | egrep "Server ID|Role|Leader"
Server ID: 8849 (8849b07b-cc32-47cf-8800-ed89fbc7db94)
Role: follower
Leader: fa7e
Server ID: 50b7 (50b7f34e-b295-4329-8d29-47039f697365)
Role: follower
Leader: fa7e
Server ID: fa7e (fa7e81bb-90e9-4c87-8ce4-cedcd54c6150)
Role: leader
Leader: self
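The "exactly one leader" expectation can also be checked mechanically. The sketch below only counts Role lines in the filtered output shown above; the function name one_leader is hypothetical.

```shell
# Reads the filtered cluster/status text on stdin and succeeds only when
# exactly one server reports "Role: leader". Lines such as "Leader: self"
# are not counted because they do not contain "Role: leader".
one_leader() {
    leaders="$(grep -c 'Role: leader' || true)"
    [ "$leaders" -eq 1 ]
}
```

To use it, pipe either of the cluster status commands above into one_leader and check the exit status, once for the Northbound cluster and once for the Southbound cluster.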
To upgrade the ovn-chassis application, change the charm’s channel to ‘22.03/stable’ and then select the UCA pocket for OVN 22.03 on Focal:
juju refresh ovn-chassis --channel 22.03/stable
juju config ovn-chassis ovn-source=cloud:focal-ovn-22.03
Once the ovn-chassis units settle in the active/idle state after the config
change, restart the OVN metadata agents with:
juju exec -a ovn-chassis 'systemctl restart neutron-ovn-metadata-agent'
Restarting the Neutron OVN metadata agents is especially important when
upgrading from OVN versions lower than 20.09. These versions used the
Chassis table in the SB database for chassis registration, whereas newer
versions use Chassis_Private. Without the service restart, metadata agents
will not re-register in the new database table and Neutron will not be able
to detect these agents.
To upgrade OVN packages used by neutron, configure the
neutron-api-plugin-ovn charm to use the overlay repository that contains
the ‘22.03’ release of OVN:
juju config neutron-api-plugin-ovn ovn-source="cloud:focal-ovn-22.03"
Verify: network agents¶
Ensure that all network agents are “alive” and “up”:
openstack network agent list
+---------------+----------------------+---------------+---------------+-------+-------+-------------------------------+
| ID            | Agent Type           | Host          | Avail... Zone | Alive | State | Binary                        |
+---------------+----------------------+---------------+---------------+-------+-------+-------------------------------+
| xxxx-xxxx-... | OVN Controller agent | xxxx-xxxx-... |               | :-)   | UP    | ovn-controller                |
| xxxx-xxxx-... | OVN Metadata agent   | xxxx-xxxx-... |               | :-)   | UP    | networking-ovn-metadata-agent |
| xxxx-xxxx-... | OVN Controller agent | xxxx-xxxx-... |               | :-)   | UP    | ovn-controller                |
| xxxx-xxxx-... | OVN Metadata agent   | xxxx-xxxx-... |               | :-)   | UP    | networking-ovn-metadata-agent |
+---------------+----------------------+---------------+---------------+-------+-------+-------------------------------+
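On clouds with many hypervisors the table is long, so a filter that prints only the unhealthy rows can help. The sketch below parses the default table layout shown above and assumes the Alive and State columns are the fifth and sixth columns, as in that example; the function name agents_healthy is hypothetical.

```shell
# Reads the default table output of `openstack network agent list` on stdin.
# Prints every agent row whose Alive column is not ":-)" or whose State
# column is not "UP", and fails if at least one such row exists.
agents_healthy() {
    awk -F'|' '
        NF >= 8 && $2 !~ /^ *ID *$/ {
            alive = $6; state = $7
            gsub(/ /, "", alive); gsub(/ /, "", state)
            if (alive != ":-)" || state != "UP") { print; bad++ }
        }
        END { exit (bad > 0) }
    '
}
```

For example: openstack network agent list | agents_healthy && echo "all agents alive and up".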
Raft leader election timeout¶
The Raft leader election timeout is a crucial factor in the upgrade. It is governed by the ovn-central charm’s ovsdb-server-election-timer configuration option, whose default value is ‘4’ (seconds).
The wall clock time that a database (Northbound or Southbound) cluster leader spends on the upgrade process must not exceed the election timer. If it does, the database unit attempting the upgrade (schema conversion) will be evicted from the cluster, thereby preventing its results from being stored. This scenario will lead to an endless retry loop.
Conversion happens on startup of the DB services after package upgrades. To prevent the aforementioned retry loop, the startup scripts have a 30 second hardcoded timeout. Therefore:
the maximum effective value for the ovsdb-server-election-timer option is ‘30’
an alternative upgrade path would be needed if the conversion cannot succeed within that maximum
There is no template answer for what the value of the option should be. External factors (e.g. server performance characteristics, load, database size, number of records) all have a role to play.
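If a schema conversion time has been measured in a staging environment, the two constraints described in this section can be written down as a simple check. This is only an illustration of the arithmetic: the function name timer_is_sufficient is hypothetical, and how the conversion time is measured is left to the operator.

```shell
# Hypothetical helper: given a measured conversion time and a candidate
# election timer (both in whole seconds), report whether the timer is usable.
# The limit of 30 reflects the hardcoded startup script timeout described above.
timer_is_sufficient() {
    conversion="$1"; timer="$2"; max=30
    if [ "$timer" -gt "$max" ]; then
        echo "timer ${timer}s exceeds the effective maximum of ${max}s" >&2
        return 1
    fi
    if [ "$conversion" -gt "$timer" ]; then
        echo "conversion (${conversion}s) would exceed the election timer (${timer}s)" >&2
        return 1
    fi
    echo "timer ${timer}s covers a ${conversion}s conversion"
}
```

For instance, a conversion measured at 12 seconds does not fit the default timer of 4 but does fit the maximum of 30.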