The general recommendation is to run a galera cluster (using MySQL or MariaDB) for your database. Its active-active design allows for fast and easy failover.
Note that OpenStack does not play well with writing to multiple galera nodes in parallel; see the configuration recommendation below.
One database to rule them all?¶
You can consider deploying the database in two ways:
one galera cluster for each OpenStack service
one single big galera cluster for all OpenStack services
The recommendation is to split your galera setup into separate clusters, for several reasons:
Reduce the impact when a galera cluster is down
Allow intervention on a smaller part of the infrastructure
There is also no benefit to colocating multiple services on the same galera cluster.
This section is split into three parts:
the configuration for galera itself
the configuration for the reverse proxy in front of galera
the configuration for the OpenStack services
All of these settings need to be consistent on all nodes of the galera cluster.
Note that this guide does not include the general requirements to get the galera cluster set up in the first place. For this please see https://mariadb.com/kb/en/getting-started-with-mariadb-galera-cluster/
General health configs¶
[mysqld]
max_connections=4000
max_statement_time=20
To ensure your cluster runs smoothly, we recommend limiting the number of connections and the time statements are allowed to execute.
The value for max_connections should be set based on actual tests (testing with a lot of idle connections is fine).
The value of 20 seconds for max_statement_time is enough for all normal use-cases we know of.
You might only run into issues with regular Nova cleanup jobs if they do not run often enough.
When you have a large number of connections to your galera cluster, these connections might starve the galera replication thread. If the replication thread does not get enough CPU time, the galera cluster will lose its members and break.
To prevent this, the replication thread can be given realtime scheduling on the kernel side.
If you run galera as a non-privileged user (as you hopefully do), galera will need CAP_SYS_NICE in order to be allowed to change the priority.
If you run inside a container environment, you might need to set kernel.sched_rt_runtime_us=-1 (although that is suboptimal).
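A hedged sketch of what such a configuration can look like (the gcomm.thread_prio provider option comes from the Galera documentation, and AmbientCapabilities is the usual systemd mechanism for granting CAP_SYS_NICE; verify both against your versions):

```
[mysqld]
# run the galera group communication thread with round-robin
# realtime scheduling at priority 2
wsrep_provider_options = "gcomm.thread_prio=rr:2"
```

With systemd, the capability can be granted through a drop-in for the database service, e.g. AmbientCapabilities=CAP_SYS_NICE in the [Service] section.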
[mysqld]
tmp_table_size=64M
max_heap_table_size=64M
optimizer_switch=derived_merge=off
The temporary table sizes and the derived_merge optimizer switch are important settings if you have a large number of Neutron RBAC rules.
Reverse Proxy Configuration¶
You will need to run a reverse proxy in front of your galera cluster to ensure OpenStack only ever communicates with a single cluster node for write requests.
This is required because OpenStack does not cope well with the deadlocks that occur when writing to different nodes in parallel.
If you choose to run haproxy for this, you can use something like the following config:
defaults
  timeout client 300s

listen db_master
  bind 0.0.0.0:3306
  balance first
  option mysql-check
  server server-1 server-1.with.the.fqdn check inter 5s downinter 15s fastinter 2s resolvers cluster id 1
  server server-2 server-2.with.the.fqdn check inter 5s downinter 15s fastinter 2s resolvers cluster backup id 2
  server server-3 server-3.with.the.fqdn check inter 5s downinter 15s fastinter 2s resolvers cluster backup id 3

listen db_slave
  bind 0.0.0.0:3308
  balance roundrobin
  option mysql-check
  server server-1 server-1.with.the.fqdn check inter 5s downinter 15s fastinter 2s resolvers cluster id 3 backup
  server server-2 server-2.with.the.fqdn check inter 5s downinter 15s fastinter 2s resolvers cluster id 1 weight 10
  server server-3 server-3.with.the.fqdn check inter 5s downinter 15s fastinter 2s resolvers cluster id 2 weight 10
By using two blocks, we can separate the read-only SQL requests (listening on port 3308 here) from the write requests (listening on port 3306 here) and slightly lower the load on the first (master) MySQL backend.
Note the timeout client setting here, as it is relevant to the OpenStack configuration.
Database Connection Settings¶
The database configuration is normally in the [database] section of each service's configuration.
You should set the following:
connection = mysql+pymysql://login:pass@proxy:3306/db?charset=utf8
slave_connection = mysql+pymysql://login:pass@proxy:3308/db?charset=utf8
connection_recycle_time = 280
max_pool_size = 15
max_overflow = 25
connection is used by OpenStack services for write requests.
slave_connection is used by OpenStack services for read-only requests.
connection_recycle_time should be slightly smaller (5% to 10%) than the timeout client in the reverse proxy.
This ensures connections are recreated on the OpenStack side before the reverse proxy forcibly terminates them.
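As a small illustration of this relationship (the numbers come from the haproxy and [database] examples above; the 7% margin is an assumption within the recommended 5-10% range):

```shell
# "timeout client" in the haproxy example above is 300s.
# Keep connection_recycle_time 5-10% below it, so OpenStack recycles
# its connections before haproxy terminates them.
TIMEOUT_CLIENT=300                        # seconds, from "timeout client 300s"
RECYCLE=$(( TIMEOUT_CLIENT * 93 / 100 ))  # ~7% safety margin
echo "connection_recycle_time = ${RECYCLE}"
```

This yields 279 here; the example configuration above uses 280, which is likewise comfortably below the proxy timeout.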
max_pool_size and max_overflow define the number of connections an individual service worker is allowed to have.
You will need to tune these based on experience (although the above should be a good start).
Nova and Cinder use soft deletes inside their databases.
This means deleted entries remain in the database and just get a deleted flag set.
To prevent the database tables from growing forever, these deleted entries need to be removed regularly. For this you can use:
nova-manage db archive_deleted_rows and nova-manage db purge for Nova (https://docs.openstack.org/nova/latest/cli/nova-manage.html)
cinder-manage db purge for Cinder (https://docs.openstack.org/cinder/latest/cli/cinder-manage.html)
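A minimal sketch of such a periodic cleanup job (the flags shown, such as --max_rows and the Cinder age argument in days, are taken from the linked CLI documentation; verify them against your release):

```
# Archive soft-deleted Nova rows in batches, then purge the shadow tables
nova-manage db archive_deleted_rows --max_rows 1000 --until-complete
nova-manage db purge --all

# Purge Cinder entries that were soft-deleted more than 30 days ago
cinder-manage db purge 30
```

Running this regularly (for example from cron) keeps each batch small enough to stay under max_statement_time.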
If you have never run these cleanups before (or if your environment has a high number of resources being deleted), you might run into
a timeout due to the max_statement_time on the database cluster.
To work around this, the nova-manage commands support a --max_rows parameter to limit the batch size.
For Cinder you might need to run the SQL statements manually and add a LIMIT 1000 to them (the statements are part of the error output of the command).