post-deployment

ceph-health

Check the status of the Ceph cluster.

Uses ceph health to check whether the cluster is in a HEALTH_WARN state and prints a debug message if it is.
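
A minimal sketch of an equivalent Ansible check, assuming the ceph CLI is available on the monitor host (the actual role logic may differ):

    # Run `ceph health` and surface anything other than HEALTH_OK
    - name: Get Ceph cluster health
      command: ceph health
      register: ceph_health
      changed_when: false

    - name: Print a debug message when the cluster is in HEALTH_WARN
      debug:
        msg: "Ceph cluster reports: {{ ceph_health.stdout }}"
      when: "'HEALTH_WARN' in ceph_health.stdout"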

  • hosts: ceph_mon

  • groups: post-deployment, post-ceph

  • parameters:

    • tripleo_delegate_to: {{ groups['ceph_mon'] | default([]) }}

    • osd_percentage_min: 0

  • roles: ceph

Role documentation

container-status

Ensure container status.

Detect failed containers and raise an error.
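
A simplified sketch of this kind of check, assuming podman is the container runtime; it flags any exited container, not only those with a non-zero exit code:

    - name: List containers in the exited state
      command: podman ps --all --filter status=exited --quiet
      register: exited_containers
      changed_when: false

    - name: Raise an error when any container has exited
      fail:
        msg: "Found exited containers: {{ exited_containers.stdout_lines }}"
      when: exited_containers.stdout_lines | length > 0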

  • hosts: undercloud, allovercloud

  • groups: pre-upgrade, post-deployment, post-upgrade

  • parameters:

  • roles: container_status

Role documentation

containerized-undercloud-docker

Verify docker containers are up and ports are open.

Ensure relevant docker containers are up and running, with ports open to listen. We iterate through a list of container names and ports provided in defaults, and ensure the system has those available.
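
A rough sketch of the two checks, assuming the docker CLI is available; entries of open_ports that carry a search_regex (such as the SSH entry) are skipped here for brevity:

    # grep -q . fails the task when no running container matches the name
    - name: Ensure each expected container is running
      shell: docker ps --filter "name={{ item }}" --filter status=running --quiet | grep -q .
      loop: "{{ running_containers }}"
      changed_when: false

    - name: Ensure each numeric port is open and listening
      wait_for:
        port: "{{ item }}"
        timeout: 10
      loop: "{{ open_ports | select('number') | list }}"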

  • hosts: undercloud

  • groups: post-deployment, pre-upgrade

  • parameters:

    • open_ports: [111, 873, 3000, 3306, 4369, 5000, 5050, 5672, 6000, 6001, 6002, 6379, 6385, 8000, 8004, 8080, 8088, 8774, 8775, 8778, 8787, 8888, 8989, 9000, 9292, 9696, 11211, 15672, 25672, 35357, 39422, {'port': 22, 'search_regex': 'OpenSSH'}]

    • running_containers: ['glance_api', 'heat_api', 'heat_api_cfn', 'heat_api_cron', 'heat_engine', 'ironic_api', 'ironic_conductor', 'ironic_inspector', 'ironic_inspector_dnsmasq', 'ironic_neutron_agent', 'ironic_pxe_http', 'ironic_pxe_tftp', 'iscsid', 'keystone', 'keystone_cron', 'logrotate_crond', 'memcached', 'mistral_api', 'mistral_engine', 'mistral_event_engine', 'mistral_executor', 'mysql', 'neutron_api', 'neutron_dhcp', 'neutron_l3_agent', 'neutron_ovs_agent', 'nova_api', 'nova_api_cron', 'nova_compute', 'nova_conductor', 'nova_metadata', 'nova_placement', 'nova_scheduler', 'rabbitmq', 'swift_account_auditor', 'swift_account_reaper', 'swift_account_replicator', 'swift_account_server', 'swift_container_auditor', 'swift_container_replicator', 'swift_container_server', 'swift_container_updater', 'swift_object_auditor', 'swift_object_expirer', 'swift_object_replicator', 'swift_object_server', 'swift_object_updater', 'swift_proxy', 'swift_rsync', 'tripleo_ui', 'zaqar', 'zaqar_websocket']

  • roles: containerized_undercloud_docker

Role documentation

controller-token

Verify that keystone admin token is disabled.

This validation checks that the Keystone admin token is disabled on both the undercloud and the overcloud controllers after deployment.
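
A minimal sketch of such a check, assuming crudini is available to read the INI file:

    - name: Read admin_token from keystone.conf
      command: crudini --get {{ keystone_conf_file }} DEFAULT admin_token
      register: admin_token
      changed_when: false
      failed_when: false  # crudini exits non-zero when the key is absent, which is the desired state

    - name: Fail when an admin token is still configured
      fail:
        msg: The Keystone admin_token should be disabled (unset).
      when: admin_token.rc == 0 and admin_token.stdout | length > 0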

  • hosts: undercloud, Controller

  • groups: post-deployment

  • parameters:

    • keystone_conf_file: /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf

  • roles: controller_token

Role documentation

controller-ulimits

Check controller ulimits.

This will check the ulimits of each controller.
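
A simplified sketch of the idea; note that it inspects the limits of the Ansible login shell, which may differ from those of the running services:

    - name: Read the open-files and max-processes limits
      shell: ulimit -n && ulimit -u
      args:
        executable: /bin/bash
      register: ulimits
      changed_when: false

    - name: Fail when a limit is below the configured minimum
      fail:
        msg: "ulimits too low: nofiles={{ ulimits.stdout_lines[0] }}, nproc={{ ulimits.stdout_lines[1] }}"
      when: ulimits.stdout_lines[0] | int < nofiles_min or ulimits.stdout_lines[1] | int < nproc_min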

  • hosts: Controller

  • groups: post-deployment

  • parameters:

    • nofiles_min: 1024

    • nproc_min: 2048

  • roles: controller_ulimits

Role documentation

haproxy

HAProxy configuration.

Verify the HAProxy configuration has recommended values.
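
A naive sketch of one such check, reading the global maxconn with awk; the real role parses the configuration more carefully:

    - name: Read global maxconn from haproxy.cfg
      shell: awk '/^global/ {g=1} g && $1 == "maxconn" {print $2; exit}' {{ config_file }}
      register: global_maxconn
      changed_when: false

    - name: Fail when global maxconn is below the recommended minimum
      fail:
        msg: "global maxconn is {{ global_maxconn.stdout }}, expected at least {{ global_maxconn_min }}"
      when: global_maxconn.stdout | int < global_maxconn_min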

  • hosts: Controller

  • groups: post-deployment

  • parameters:

    • config_file: /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg

    • global_maxconn_min: 20480

    • defaults_maxconn_min: 4096

    • defaults_timeout_queue: 2m

    • defaults_timeout_client: 2m

    • defaults_timeout_server: 2m

    • defaults_timeout_check: 10s

  • roles: haproxy

Role documentation

healthcheck-service-status

Healthcheck systemd services check.

Check for failed healthcheck systemd services.
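
A minimal sketch of the check; the unit-name pattern is an assumption and may differ per release:

    - name: List failed healthcheck units
      command: systemctl list-units --state=failed --no-legend '*healthcheck*'
      register: failed_healthchecks
      changed_when: false

    - name: Fail when any healthcheck unit is in the failed state
      fail:
        msg: "Failed healthcheck units: {{ failed_healthchecks.stdout_lines }}"
      when: failed_healthchecks.stdout_lines | length > 0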

  • hosts: undercloud, allovercloud

  • groups: post-deployment

  • parameters:

    • retries_number: 1

    • delay_number: 1

    • inflight_healthcheck_services: []

  • roles: healthcheck_service_status

Role documentation

image-serve

Verify image-serve service is working and answering.

Ensures the image-serve vhost is configured and httpd is running.
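
A minimal sketch of the check; the port and URL are assumptions, as the vhost details depend on the deployment:

    - name: Ensure httpd is active
      command: systemctl is-active httpd
      changed_when: false

    # Port 8787 is an assumed image-serve port for this sketch
    - name: Ensure the image-serve vhost answers
      uri:
        url: http://localhost:8787/
        status_code: 200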

  • hosts: undercloud

  • groups: pre-upgrade, post-deployment, post-upgrade

  • parameters:

  • roles: image_serve

Role documentation

mysql-open-files-limit

MySQL Open Files Limit.

Verify the open-files-limit configuration is high enough. See https://access.redhat.com/solutions/1598733
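
A minimal sketch of the check, assuming the mysql client can reach the server from the node (on containerized deployments the command would run inside the database container):

    - name: Read open_files_limit from MySQL
      command: mysql -sN -e "SHOW VARIABLES LIKE 'open_files_limit'"
      register: open_files
      changed_when: false

    - name: Fail when the limit is below the minimum
      fail:
        msg: "open_files_limit is {{ open_files.stdout.split() | last }}, expected at least {{ min_open_files_limit }}"
      when: open_files.stdout.split() | last | int < min_open_files_limit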

  • hosts: Controller, Database

  • groups: post-deployment

  • parameters:

    • min_open_files_limit: 16384

  • roles: mysql_open_files_limit

Role documentation

neutron-sanity-check

Neutron Sanity Check.

Run neutron-sanity-check on the controller nodes to find potential issues with Neutron’s configuration. The tool expects to be given all the configuration files that are passed to the Neutron services.
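
A minimal sketch of the invocation; the configuration file paths are assumptions and differ on containerized deployments:

    - name: Run neutron-sanity-check against the service configuration
      command: >
        neutron-sanity-check
        --config-file /etc/neutron/neutron.conf
        --config-file /etc/neutron/plugins/ml2/ml2_conf.ini
      changed_when: false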

  • hosts: Controller

  • groups: post-deployment

  • parameters:

  • roles: neutron_sanity_check

Role documentation

nova-event-callback

Nova Event Callback Configuration Check.

This validation verifies that the Nova Event Callback feature is configured; it is generally enabled by default. It checks the following settings on the overcloud controller(s), as sketched after the list:
  • /etc/nova/nova.conf:

    • [DEFAULT]/vif_plugging_is_fatal = True

    • [DEFAULT]/vif_plugging_timeout >= 300

  • /etc/neutron/neutron.conf:

    • [nova]/auth_url = 'http://nova_admin_auth_ip:5000'

    • [nova]/tenant_name = 'service'

    • [DEFAULT]/notify_nova_on_port_data_changes = True

    • [DEFAULT]/notify_nova_on_port_status_changes = True
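
A minimal sketch of one of these checks, assuming crudini is available to read the INI files:

    - name: Read vif_plugging_is_fatal from nova.conf
      command: crudini --get {{ nova_config_file }} DEFAULT vif_plugging_is_fatal
      register: vif_fatal
      changed_when: false

    - name: Fail when VIF plugging failures are not fatal
      fail:
        msg: vif_plugging_is_fatal should be set to True
      when: vif_fatal.stdout | lower != 'true'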

  • hosts: Controller

  • groups: post-deployment

  • parameters:

    • nova_config_file: /var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf

    • neutron_config_file: /var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf

    • vif_plugging_fatal_check: vif_plugging_is_fatal

    • vif_plugging_timeout_check: vif_plugging_timeout

    • vif_plugging_timeout_value_min: 300

    • notify_nova_on_port_data_check: notify_nova_on_port_data_changes

    • notify_nova_on_port_status_check: notify_nova_on_port_status_changes

    • tenant_name_check: tenant_name

  • roles: nova_event_callback

Role documentation

ntp

Verify all deployed nodes have their clocks synchronised.

Each overcloud node should have its clock synchronised. The deployment should configure and run chronyd. This validation verifies that chronyd is indeed running and connected to an NTP server on all nodes.
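
A minimal sketch of such a check using chronyc:

    - name: Ensure chronyd is active
      command: systemctl is-active chronyd
      changed_when: false

    # "Leap status : Normal" in `chronyc tracking` means the clock is synchronised
    - name: Ensure the clock is synchronised to a time source
      shell: chronyc tracking | grep -q 'Leap status.*Normal'
      changed_when: false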

  • hosts: allovercloud

  • groups: post-deployment

  • parameters:

  • roles: ntp

Role documentation

openstack-endpoints

Check connectivity to various OpenStack services.

This validation gets the PublicVip address from the deployment and tries to access Horizon and get a Keystone token.
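
A rough sketch of the idea; the credentials file and the public_vip variable are assumptions for this example:

    - name: Request a Keystone token
      shell: source ~/overcloudrc && openstack token issue -f value -c id
      args:
        executable: /bin/bash
      changed_when: false

    - name: Ensure Horizon answers on the public VIP
      uri:
        url: "https://{{ public_vip }}/dashboard"
        validate_certs: false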

  • hosts: undercloud

  • groups: post-deployment, pre-upgrade, post-upgrade

  • parameters:

  • roles: openstack_endpoints

Role documentation

ovs-dpdk-pmd-cpus-check

Validates OVS DPDK PMD cores from all NUMA nodes.

OVS DPDK PMD CPUs must be provided from all NUMA nodes. A failed status post-deployment indicates the PMD CPU list is not configured correctly.
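
A sketch of the first step of such a check: read the configured PMD CPU mask, which can then be compared against the NUMA topology reported by lscpu:

    - name: Read the configured PMD CPU mask
      command: ovs-vsctl get Open_vSwitch . other_config:pmd-cpu-mask
      register: pmd_cpu_mask
      changed_when: false

    - name: Show the mask for comparison against the NUMA layout
      debug:
        msg: "pmd-cpu-mask is {{ pmd_cpu_mask.stdout }}"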

  • hosts: ComputeOvsDpdk

  • groups: post-deployment

  • parameters:

  • roles: ovs_dpdk_pmd

Role documentation

pacemaker-status

Check the status of the Pacemaker cluster.

This runs pcs status and checks for any failed actions. A failed status post-deployment indicates something is not configured correctly. This should also be run before upgrade as the process will likely fail with a cluster that’s not completely healthy.
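
A simplified sketch; matching on the word 'Failed' is a shortcut for the more careful parsing the role performs:

    - name: Collect pcs status
      command: pcs status
      register: pcs_status
      changed_when: false

    - name: Fail when pcs reports failed actions
      fail:
        msg: Pacemaker reports failed resource actions.
      when: "'Failed' in pcs_status.stdout"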

  • hosts: Controller

  • groups: post-deployment

  • parameters:

  • roles: pacemaker_status

Role documentation

rabbitmq-limits

RabbitMQ limits.

Make sure the RabbitMQ file descriptor limits are set to reasonable values.
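
A minimal sketch that reads the limit from the beam process; on containerized deployments the process would be inspected inside the rabbitmq container:

    # /proc/<pid>/limits prints: Max open files  <soft>  <hard>  files
    - name: Read the open-files limit of the RabbitMQ process
      shell: grep 'Max open files' /proc/$(pgrep -f beam.smp | head -1)/limits
      register: rabbit_fd_limit
      changed_when: false

    - name: Fail when the soft limit is below min_fd_limit
      fail:
        msg: "{{ rabbit_fd_limit.stdout }}"
      when: rabbit_fd_limit.stdout.split()[3] | int < min_fd_limit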

  • hosts: Controller

  • groups: post-deployment

  • parameters:

    • min_fd_limit: 16384

  • roles: rabbitmq_limits

Role documentation

service-status

Ensure services state.

Detect the status of services on the target host and fail if we find a failed service.
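
A minimal sketch of an equivalent systemd check:

    - name: List units in the failed state
      command: systemctl list-units --state=failed --no-legend
      register: failed_units
      changed_when: false

    - name: Fail when any failed unit is found
      fail:
        msg: "Failed units: {{ failed_units.stdout_lines }}"
      when: failed_units.stdout_lines | length > 0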

  • hosts: undercloud, allovercloud

  • groups: prep, pre-deployment, pre-upgrade, post-deployment, post-upgrade

  • parameters:

  • roles: service_status

Role documentation

stonith-exists

Validate stonith devices.

Verify that stonith devices are configured for your OpenStack Platform HA cluster. We do not configure stonith devices with the TripleO installer, because the hardware configuration differs in each environment and requires different fence agents. For instructions on configuring fencing, see https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/paged/director-installation-and-usage/86-fencing-the-controller-nodes
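
A minimal sketch of the check; the exact output of pcs varies between versions, so the emptiness test is an approximation:

    - name: List configured stonith resources
      command: pcs stonith
      register: stonith
      changed_when: false
      failed_when: false

    - name: Fail when no stonith device is defined
      fail:
        msg: No stonith devices are configured for this cluster.
      when: stonith.stdout | trim | length == 0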

  • hosts: Controller

  • groups: post-deployment

  • parameters:

  • roles: stonith_exists

Role documentation

tls-everywhere-post-deployment

Confirm that overcloud nodes are set up correctly.

Checks that overcloud nodes are registered with IdM and that all certs being tracked by certmonger are in the MONITORING state.
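
A minimal sketch of the certmonger part of the check, using getcert:

    - name: List certificates tracked by certmonger
      command: getcert list
      register: tracked_certs
      changed_when: false

    - name: Fail when any tracked certificate is not in the MONITORING state
      fail:
        msg: Some certmonger requests are not in the MONITORING state.
      when: "tracked_certs.stdout | regex_findall('status: (\\w+)') | difference(['MONITORING']) | length > 0"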

  • hosts: allovercloud

  • groups: post-deployment

  • parameters:

  • roles: tls_everywhere

Role documentation

validate-selinux

validate-selinux.

Ensures there are no SELinux denials on the system.
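
A minimal sketch of the underlying check, using ausearch against the audit log:

    # ausearch exits non-zero when no matching events are found
    - name: Search the audit log for AVC denials since boot
      command: ausearch -m avc -ts boot
      register: avc_denials
      changed_when: false
      failed_when: false

    - name: Fail when denials are present
      fail:
        msg: SELinux denials were found in the audit log.
      when: avc_denials.rc == 0 and avc_denials.stdout | length > 0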

  • hosts: all

  • groups: pre-deployment, post-deployment, pre-upgrade, post-upgrade

  • parameters:

    • validate_selinux_working_dir: /var/log/validations

    • validate_selinux_audit_source: /var/log/audit/audit.log

    • validate_selinux_skip_list_dest: {{ validate_selinux_working_dir }}/denials-skip-list.txt

    • validate_selinux_filtered_denials_dest: {{ validate_selinux_working_dir }}/denials-filtered.log

    • validate_selinux_strict: False

    • validate_selinux_filter: None

    • validate_selinux_skip_list: {}

  • roles: validate_selinux

Role documentation