Beginning in Ocata, OpenStack Glance uses Alembic, which replaced SQLAlchemy Migrate as the database migration engine. Moving to Alembic is particularly motivated by the zero-downtime upgrade work. Refer to [GSPEC1] and [GSPEC2] for more information on zero-downtime upgrades in Glance and why a move to Alembic was deemed necessary.
Stop right now and go read [GSPEC1] and [GSPEC2] if you haven’t done so already. Those documents explain the strategy Glance has approved for database migrations, and we expect you to be familiar with them in what follows. This document focuses on the “how”, but unless you understand the “what” and “why”, you’ll be wasting your time reading this document.
Prior to Ocata, database migrations were conceived as monoliths. Thus, they did not need to carefully distinguish and manage database schema expansions, data migrations, or database schema contractions. The modern database migrations are more sensitive to the characteristics of changes being attempted and thus we clearly identify three phases of a database migration: (1) expand, (2) migrate, and (3) contract. A developer modifying the Glance database must supply a script for each of these phases.
Here’s a quick reminder of what each phase entails. For more information, see [GSPEC1].
Expand migrations MUST be additive in nature. Expand migrations should be seen as the minimal set of schema changes required by the new services that can be applied while the old services are still running. Expand migrations should optionally include temporary database triggers that keep the old and new columns in sync. If a database change needs data to be migrated between columns, then temporary database triggers are required to keep the columns in sync while the data migrations are in-flight.
As mentioned earlier, starting in Ocata Glance database migrations must be written for Alembic. All existing Glance migrations have been ported to Alembic. They can be found here [GMIGS1].
All Glance schema migrations must reside in
glance.db.sqlalchemy.alembic_migrations.versions
package
Every Glance schema migration must be a python module with the following structure
"""<docstring describing the migration>
Revision ID: <unique revision id>
Revises: <parent revision id>
"""
<your imports here>
revision = <unique revision id>
down_revision = <parent revision id>
depends_on = <id of dependent revision or None>
def upgrade():
<your schema changes here>
Identifiers revision
, down_revision
and depends_on
are
elaborated below.
The revision
identifier is a unique revision id for every migration.
It must conform to one of the following naming schemes.
All monolith migrations must conform to:
<release name><two-digit sequence number per release>
And, all expand/contract migrations must conform to:
<release name>_[expand|contract]<two-digit sequence number per release>
Example:
Monolith migration: ocata01
Expand migration: ocata_expand01
Contract migration: ocata_contract01
This name convention is devised with an intention to easily understand the
migration sequence. While the <release name>
mentions the release a
migration belongs to, the <two-digit sequence number per release>
helps
identify the order of migrations within each release. For modern migrations,
the [expand|contract]
part of the revision id helps identify the
revision branch a migration belongs to.
The down_revision
identifier MUST be specified for all Alembic migration
scripts. It points to the previous migration (or revision
in Alembic
lingo) on which the current migration is based. This essentially
establishes a migration sequence very much a like a singly linked list would
(except that we use a previous
link here instead of the more traditional
next
link.)
The very first migration, liberty
in our case, would have
down_revision
set to None
. All other migrations must point to the
last migration in the sequence at the time of writing the migration.
For example, Glance has two migrations in Mitaka, namely, mitaka01
and mitaka02
. The migration sequence for Mitaka should look like:
liberty
^
|
|
mitaka01
^
|
|
mitaka02
The depends_on
identifier helps establish dependencies between two
migrations. If a migration X
depends on running migration Y
first,
then X
is said to depend on Y
. This could be specified in the
migration as shown below:
revision = 'X'
down_revision = 'W'
depends_on = 'Y'
Naturally, every migration depends on the migrations preceding it in the
migration sequence. Hence, in a typical branch-less migration sequence,
depends_on
is of limited use. However, this could be useful for
migration sequences with branches. We’ll see more about this in the next
section.
All schema migration scripts must adhere to the naming convention mentioned below:
<unique revision id>_<very brief description>.py
Example:
Monolith migration: ocata01_add_visibility_remove_is_public.py
Expand migration: ocata_expand01_add_visibility.py
Contract migration: ocata_contract01_remove_is_public.py
To achieve zero-downtime upgrades, the Glance migration sequence has been
branched into expand
and contract
branches. As the name suggests,
the expand
branch contains only the expand migrations and the
contract
branch contains only the contract migrations. As per the
zero-downtime migration strategy, the expand migrations are run first
followed by contract migrations. To establish this dependency, we make the
contract migrations explicitly depend on their corresponding expand
migrations. Thus, running contract migrations without running expansions is
not possible.
For example, the Community Images migration in Ocata includes the
experimental E-M-C migrations. The expand migration is ocata_expand01
and the contract migration is ocata_contract01
. The dependency is
established as below.
revision = 'ocata_contract01'
down_revision = 'mitaka02'
depends_on = 'ocata_expand01'
Every contract migration in Glance MUST depend on its corresponding expand migration. Thus, the current Glance migration sequence looks as shown below:
liberty
^
|
|
mitaka01
^
|
|
mitaka02
^
|
+------------+------------+
| |
| |
ocata_expand01 <------ ocata_contract01
^ ^
| |
| |
pike_expand01 <------ pike_contract01
All Glance data migrations must reside in
glance.db.sqlalchemy.alembic_migrations.data_migrations
package.
The data migrations themselves are not Alembic migration scripts. And, hence they don’t require a unique revision id. However, they must adhere to a similar naming convention discussed above. That is:
<release name>_migrate<two-digit sequence number per release>_<very brief description>.py
Example:
Data Migration: ocata_migrate01_community_images.py
All data migrations modules must adhere to the following structure:
def has_migrations(engine):
<your code to determine whether or not there are any pending rows to be
migrated>
return <boolean>
def migrate(engine):
<your code to migrate rows in the database.>
return <number of rows migrated>
In Ocata and Pike, Glance required every database migration to include both monolithic and Expand-Migrate-Contract (E-M-C) style migrations. In Queens, E-M-C migrations became the default and a monolithic migration script is no longer required.
In Queens, the glance-manage tool was refactored so that the glance-manage
db sync
command runs the expand, migrate, and contract scripts “under
the hood”. From the viewpoint of the operator, there is no difference
between having a single monolithic script and having three scripts.
Since we are using the same scripts for offline and online (zero-downtime) database upgrades, as a developer you have to pay attention in your scripts to determine whether you need to add/remove triggers in the expand/contract scripts. See the changes to the ocata scripts in https://review.openstack.org/#/c/544792/ for an example of how to do this.
Alembic is a database migration engine written for SQLAlchemy. So, any
migration script written for SQLAlchemy Migrate should work with Alembic as
well provided the structural differences above (primarily adding
revision
, down_revision
and depends_on
) are taken care of.
Moreover, it maybe easier to do certain operations with Alembic.
Refer to [ALMBC] for information on Alembic operations.
A given database change may not require actions in each of the expand,
migrate, contract phases, but nonetheless, we require a script for each
phase for every change. In the case where an action is not required, a
no-op
script, described below, MUST be used.
For instance, if a database migration is completely contractive in nature,
say removing a column, there won’t be a need for expand and migrate
operations. But, including a no-op
expand and migrate scripts will make
it explicit and also preserve the one-to-one correspondence between expand,
migrate and contract scripts.
A no-op expand/contract Alembic migration:
"""An example empty Alembic migration script
Revision ID: foo02
Revises: foo01
"""
revision = foo02
down_revision = foo01
def upgrade():
pass
A no-op migrate script:
"""An example empty data migration script""" def has_migrations(engine): return False def migrate(engine): return 0
[GSPEC1] | (1, 2, 3, 4, 5) Database Strategy for Rolling Upgrades |
[GSPEC2] | (1, 2) Glance Alembic Migrations Spec |
[GMIGS1] | Glance Alembic Migrations Implementation |
[ALMBC] | Alembic Operations |
Except where otherwise noted, this document is licensed under Creative Commons Attribution 3.0 License. See all OpenStack Legal Documents.