RabbitMQ troubleshooting

This section provides tips on resolving common RabbitMQ issues.

RabbitMQ service hangs

It is quite common for the RabbitMQ service to hang when it is restarted or stopped. Therefore, it is highly recommended that you manually restart RabbitMQ on each controller node.

Note

The RabbitMQ service name may vary depending on your operating system or the vendor that supplies your RabbitMQ service.

  1. Restart the RabbitMQ service on the first controller node. The service rabbitmq-server restart command may not work in certain situations, so it is best to use:

    # service rabbitmq-server stop
    # service rabbitmq-server start
    
  2. If the service refuses to stop, run the pkill command to kill its processes, and then start the service:

    # pkill -KILL -u rabbitmq
    # service rabbitmq-server start
    
  3. Verify that the RabbitMQ processes are running:

    # ps -ef | grep rabbitmq
    # rabbitmqctl list_queues
    # rabbitmqctl list_queues 2>&1 | grep -i error
    
  4. If there are errors, run the cluster_status command to make sure there are no partitions:

    # rabbitmqctl cluster_status
    

    For more information, see the RabbitMQ documentation.

  5. Go back to the first step and try restarting the RabbitMQ service again. If you still have errors, remove the contents of the /var/lib/rabbitmq/mnesia/ directory between stopping and starting the RabbitMQ service.

  6. If there are no errors, restart the RabbitMQ service on the next controller node.
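Steps 3 and 4 above can be sketched as two small shell filters. This is a minimal sketch, not part of the original procedure: the function names `output_has_errors` and `cluster_has_partitions` are hypothetical, and the partition check assumes the Erlang-term output of RabbitMQ 3.x, where a healthy cluster prints `{partitions,[]}`.

```shell
# Succeed (exit 0) if the text on stdin mentions an error (step 3).
output_has_errors() {
    grep -qi 'error'
}

# Succeed (exit 0) if cluster_status output on stdin reports a partition
# (step 4). Assumes the RabbitMQ 3.x format, where "{partitions,[]}" means
# that no partition exists.
cluster_has_partitions() {
    # Strip whitespace so that a wrapped or padded term still matches, then
    # look for a partitions list whose first character is not "]".
    tr -d ' \t\n' | grep -q '{partitions,\[[^]]'
}

# Possible usage on a controller node (run as root):
#   if rabbitmqctl list_queues 2>&1 | output_has_errors; then
#       if rabbitmqctl cluster_status | cluster_has_partitions; then
#           echo "network partition detected"
#       fi
#   fi
```

Both functions only read text, so they can be tried against saved command output before being used on a live node.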

Since the Liberty release, OpenStack services will automatically recover from a RabbitMQ outage. You should only consider restarting OpenStack services after checking if RabbitMQ heartbeat functionality is enabled, and if OpenStack services are not picking up messages from RabbitMQ queues.
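One way to check whether heartbeats are enabled is to read the service configuration file. The sketch below assumes that the service uses oslo.messaging and keeps the `heartbeat_timeout_threshold` option in the `[oslo_messaging_rabbit]` section, where a value of 0 disables heartbeats. Neither the option name nor the function name `heartbeat_threshold` appears in the original text, so verify them against your release.

```shell
# Print the heartbeat_timeout_threshold value found in the
# [oslo_messaging_rabbit] section of an INI file read from stdin.
# Exits non-zero if the option is not set explicitly.
heartbeat_threshold() {
    awk -F '=' '
        # Track whether the current line is inside the section of interest.
        /^\[/ { in_section = ($0 ~ /^\[oslo_messaging_rabbit\]/); next }
        in_section && $1 ~ /^[ \t]*heartbeat_timeout_threshold[ \t]*$/ {
            gsub(/[ \t]/, "", $2)   # trim spaces around the value
            print $2
            found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Possible usage (the file path is an assumption):
#   heartbeat_threshold < /etc/nova/nova.conf
```

A non-zero exit status only means the option is not set in the file; in that case the service falls back to its built-in default.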

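RabbitMQ alerts

If you receive alerts for RabbitMQ, take the following steps to troubleshoot and resolve the issue:

  1. Determine which servers the RabbitMQ alarms are coming from.

  2. Attempt to boot a nova instance in the affected environment.

  3. If you cannot launch an instance, continue to troubleshoot the issue.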

  4. Log in to each of the controller nodes for the affected environment, and check the /var/log/rabbitmq log files for any reported issues.

  5. Look for connection issues identified in the log files.

  6. For each controller node in your environment, view the /etc/init.d directory to check whether it contains nova*, cinder*, neutron*, or glance*. Also check for RabbitMQ message queues that are growing without being consumed, which indicates which OpenStack service is affected. Restart the affected OpenStack service.

  7. For each compute node in your environment, view the /etc/init.d directory to check whether it contains nova*, cinder*, neutron*, or glance*. Also check for RabbitMQ message queues that are growing without being consumed, which indicates which OpenStack services are affected. Restart the affected OpenStack services.

  8. Open OpenStack Dashboard and launch an instance. If the instance launches, the issue is resolved.

  9. If you cannot launch an instance, check the /var/log/rabbitmq log files for reported connection issues.

  10. Restart the RabbitMQ service on all of the controller nodes:

    # service rabbitmq-server stop
    # service rabbitmq-server start
    

    Note

    This step applies if you have already restarted only the OpenStack components and they still cannot connect to the RabbitMQ service.

  11. Repeat steps 7-8.
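Steps 6 and 7 ask you to spot queues that grow without being consumed. A rough way to approximate this, sketched below, is to list queues that hold messages but have no consumers. The function name `find_stalled_queues` and its threshold argument are hypothetical, and a single snapshot cannot prove that a queue is growing, so run the check more than once and compare.

```shell
# Print the name and message count of every queue that holds at least
# MIN messages (first argument, default 1) and has no consumers.
# Reads the tab-separated output of
#   rabbitmqctl list_queues name messages consumers
# on stdin.
find_stalled_queues() {
    min_messages="${1:-1}"
    awk -F '\t' -v min="$min_messages" '
        # Banner or header lines do not have three fields with numeric
        # counts, so they are skipped.
        NF == 3 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            if ($2 + 0 >= min + 0 && $3 + 0 == 0) print $1 "\t" $2
        }
    '
}

# Possible usage on a controller node (run as root):
#   rabbitmqctl list_queues name messages consumers | find_stalled_queues 100
```

The queue names in the result usually hint at the OpenStack service that should be consuming them.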

Excessive database management memory consumption

Since the Liberty release, OpenStack with RabbitMQ 3.4.x or 3.6.x has an issue with the management database consuming the memory allocated to RabbitMQ. This is caused by statistics collection and processing. When a single node with RabbitMQ reaches its memory threshold, all exchange and queue processing is halted until the memory alarm recovers.

To address this issue:

  1. Check for current memory consumption:

    # rabbitmqctl status
    
  2. Edit the /etc/rabbitmq/rabbitmq.config configuration file, and set the collect_statistics_interval parameter to a value between 30000 and 60000 milliseconds. Alternatively, you can turn off statistics collection by setting the collect_statistics parameter to none.
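The setting from step 2 could look like the fragment printed by the sketch below. It assumes the classic Erlang-term format of rabbitmq.config, where `collect_statistics_interval` sits in the `rabbit` application section. The helper name `stats_config` is hypothetical. If your file already has a `rabbit` section, merge the tuple into it instead of adding a second section.

```shell
# Print a rabbitmq.config fragment that sets the statistics interval.
# The interval is given in milliseconds (first argument, default 30000).
stats_config() {
    interval="${1:-30000}"
    # Accept whole numbers only.
    case "$interval" in
        ''|*[!0-9]*) echo "interval must be a whole number" >&2; return 1 ;;
    esac
    # Stay inside the 30000-60000 ms range suggested in step 2.
    if [ "$interval" -lt 30000 ] || [ "$interval" -gt 60000 ]; then
        echo "interval must be between 30000 and 60000" >&2
        return 1
    fi
    cat <<EOF
[
  {rabbit, [
    {collect_statistics_interval, $interval}
  ]}
].
EOF
}

# Possible usage: review the fragment, then merge it by hand.
#   stats_config 45000
```

To turn statistics collection off instead, the tuple would be `{collect_statistics, none}` in the same section. RabbitMQ reads this file at startup, so restart the service after editing it.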

File descriptor limits when scaling a cloud environment

A cloud environment that is scaled to a certain size will require the file descriptor limits to be adjusted.

Run rabbitmqctl status to view the current file descriptor limits:

    {file_descriptors,
         [{total_limit,3996},
          {total_used,135},
          {sockets_limit,3594},
          {sockets_used,133}]},

Adjust the appropriate limits in the /etc/security/limits.conf configuration file.
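To judge how close a node is to its limit, the two totals in the status output can be turned into a percentage. This is a minimal sketch that assumes the RabbitMQ 3.x output format shown above, with each value on its own line. The function name `fd_usage_percent` is hypothetical, and the limits.conf lines in the comments use an illustrative value, not a recommendation.

```shell
# Print file descriptor usage as a whole-number percentage.
# Reads `rabbitmqctl status` output on stdin.
fd_usage_percent() {
    awk '
        # Keep only the digits of lines such as "[{total_limit,3996},".
        /total_limit,/ { gsub(/[^0-9]/, ""); limit = $0 }
        /total_used,/  { gsub(/[^0-9]/, ""); used = $0 }
        END {
            if (limit + 0 == 0) exit 1   # no limit found in the input
            printf "%d\n", used * 100 / limit
        }
    '
}

# Possible usage on a controller node (run as root):
#   rabbitmqctl status | fd_usage_percent
#
# Example /etc/security/limits.conf entries (assumed user name and value):
#   rabbitmq soft nofile 65536
#   rabbitmq hard nofile 65536
```

With the sample output above (135 of 3996 descriptors in use), the function prints 3.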

Except where otherwise noted, this document is licensed under Creative Commons Attribution 3.0 License. See all OpenStack Legal Documents.