.. _openstack_neutron_control_plane_performance_report:

*****************************************************************************
OpenStack Networking (Neutron) control plane performance report for 400 nodes
*****************************************************************************

:Abstract:

  This document includes OpenStack Networking (aka Neutron) control plane
  performance test results against two OpenStack environments: 200 nodes and
  378 nodes. All tests were performed according to
  :ref:`openstack_neutron_control_plane_performance_test_plan`.

Environment description
=======================

Lab A (200 nodes)
-----------------

3 controllers, 196 computes, 1 node for Grafana/Prometheus

Hardware configuration of each server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. table:: Description of controller servers

   +-------+----------------+---------------------------------+
   |server |vendor,model    |Supermicro MBD-X10DRI            |
   +-------+----------------+---------------------------------+
   |CPU    |vendor,model    |Intel Xeon E5-2650v3             |
   |       +----------------+---------------------------------+
   |       |processor_count |2                                |
   |       +----------------+---------------------------------+
   |       |core_count      |10                               |
   |       +----------------+---------------------------------+
   |       |frequency_MHz   |2300                             |
   +-------+----------------+---------------------------------+
   |RAM    |vendor,model    |8x Samsung M393A2G40DB0-CPB      |
   |       +----------------+---------------------------------+
   |       |amount_MB       |2097152                          |
   +-------+----------------+---------------------------------+
   |NETWORK|vendor,model    |Intel,I350 Dual Port             |
   |       +----------------+---------------------------------+
   |       |bandwidth       |1G                               |
   |       +----------------+---------------------------------+
   |       |vendor,model    |Intel,82599ES Dual Port          |
   |       +----------------+---------------------------------+
   |       |bandwidth       |10G                              |
   +-------+----------------+---------------------------------+
   |STORAGE|vendor,model    |Intel SSD DC S3500 Series        |
   |       +----------------+---------------------------------+
   |       |SSD/HDD         |SSD                              |
   |       +----------------+---------------------------------+
   |       |size            |240GB                            |
   |       +----------------+---------------------------------+
   |       |vendor,model    |2x WD WD5003AZEX                 |
   |       +----------------+---------------------------------+
   |       |SSD/HDD         |HDD                              |
   |       +----------------+---------------------------------+
   |       |size            |500GB                            |
   +-------+----------------+---------------------------------+

.. table:: Description of compute servers

   +-------+----------------+---------------------------------+
   |server |vendor,model    |SUPERMICRO 5037MR-H8TRF          |
   +-------+----------------+---------------------------------+
   |CPU    |vendor,model    |INTEL XEON Ivy Bridge 6C E5-2620 |
   |       +----------------+---------------------------------+
   |       |processor_count |1                                |
   |       +----------------+---------------------------------+
   |       |core_count      |6                                |
   |       +----------------+---------------------------------+
   |       |frequency_MHz   |2100                             |
   +-------+----------------+---------------------------------+
   |RAM    |vendor,model    |4x Samsung DDRIII 8GB DDR3-1866  |
   |       +----------------+---------------------------------+
   |       |amount_MB       |32768                            |
   +-------+----------------+---------------------------------+
   |NETWORK|vendor,model    |AOC-STGN-i2S - 2-port            |
   |       +----------------+---------------------------------+
   |       |bandwidth       |10G                              |
   +-------+----------------+---------------------------------+
   |STORAGE|vendor,model    |Intel SSD DC S3500 Series        |
   |       +----------------+---------------------------------+
   |       |SSD/HDD         |SSD                              |
   |       +----------------+---------------------------------+
   |       |size            |80GB                             |
   |       +----------------+---------------------------------+
   |       |vendor,model    |1x WD Scorpio Black BP WD7500BPKT|
   |       +----------------+---------------------------------+
   |       |SSD/HDD         |HDD                              |
   |       +----------------+---------------------------------+
   |       |size            |750GB                            |
   +-------+----------------+---------------------------------+

Lab B (378 nodes)
-----------------

The environment contains 4 types of servers:

- rally node
- controller node
- compute-osd node
- compute node

.. table:: Number of servers per role (node types are listed below)

   +------------+--------------+------+
   |Role        |Servers count |Type  |
   +============+==============+======+
   |rally       |1             |1 or 2|
   +------------+--------------+------+
   |controller  |3             |1 or 2|
   +------------+--------------+------+
   |compute     |291           |1 or 2|
   +------------+--------------+------+
   |compute-osd |34            |3     |
   +------------+--------------+------+
   |compute-osd |49            |1     |
   +------------+--------------+------+

Hardware configuration of each server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each server has one of the three hardware configurations described in the
tables below.

.. table:: Description of servers hardware type 1

   +-------+----------------+---------------------------------+
   |server |vendor,model    |Dell PowerEdge R630              |
   +-------+----------------+---------------------------------+
   |CPU    |vendor,model    |Intel,E5-2680 v3                 |
   |       +----------------+---------------------------------+
   |       |processor_count |2                                |
   |       +----------------+---------------------------------+
   |       |core_count      |12                               |
   |       +----------------+---------------------------------+
   |       |frequency_MHz   |2500                             |
   +-------+----------------+---------------------------------+
   |RAM    |vendor,model    |Samsung, M393A2G40DB0-CPB        |
   |       +----------------+---------------------------------+
   |       |amount_MB       |262144                           |
   +-------+----------------+---------------------------------+
   |NETWORK|interface_names |eno1, eno2                       |
   |       +----------------+---------------------------------+
   |       |vendor,model    |Intel,X710 Dual Port             |
   |       +----------------+---------------------------------+
   |       |bandwidth       |10G                              |
   |       +----------------+---------------------------------+
   |       |interface_names |enp3s0f0, enp3s0f1               |
   |       +----------------+---------------------------------+
   |       |vendor,model    |Intel,X710 Dual Port             |
   |       +----------------+---------------------------------+
   |       |bandwidth       |10G                              |
   +-------+----------------+---------------------------------+
   |STORAGE|dev_name        |/dev/sda                         |
   |       +----------------+---------------------------------+
   |       |vendor,model    || raid1 - Dell, PERC H730P Mini  |
   |       |                || 2 disks Intel S3610            |
   |       +----------------+---------------------------------+
   |       |SSD/HDD         |SSD                              |
   |       +----------------+---------------------------------+
   |       |size            |3.6TB                            |
   +-------+----------------+---------------------------------+

.. table:: Description of servers hardware type 2

   +-------+----------------+---------------------------------+
   |server |vendor,model    |Lenovo ThinkServer RD550         |
   +-------+----------------+---------------------------------+
   |CPU    |vendor,model    |Intel,E5-2680 v3                 |
   |       +----------------+---------------------------------+
   |       |processor_count |2                                |
   |       +----------------+---------------------------------+
   |       |core_count      |12                               |
   |       +----------------+---------------------------------+
   |       |frequency_MHz   |2500                             |
   +-------+----------------+---------------------------------+
   |RAM    |vendor,model    |Samsung, M393A2G40DB0-CPB        |
   |       +----------------+---------------------------------+
   |       |amount_MB       |262144                           |
   +-------+----------------+---------------------------------+
   |NETWORK|interface_names |enp3s0f0, enp3s0f1               |
   |       +----------------+---------------------------------+
   |       |vendor,model    |Intel,X710 Dual Port             |
   |       +----------------+---------------------------------+
   |       |bandwidth       |10G                              |
   |       +----------------+---------------------------------+
   |       |interface_names |ens2f0, ens2f1                   |
   |       +----------------+---------------------------------+
   |       |vendor,model    |Intel,X710 Dual Port             |
   |       +----------------+---------------------------------+
   |       |bandwidth       |10G                              |
   +-------+----------------+---------------------------------+
   |STORAGE|dev_name        |/dev/sda                         |
   |       +----------------+---------------------------------+
   |       |vendor,model    || raid1 - Lenovo 720ix           |
   |       |                || 2 disks Intel S3610            |
   |       +----------------+---------------------------------+
   |       |SSD/HDD         |SSD                              |
   |       +----------------+---------------------------------+
   |       |size            |799GB                            |
   +-------+----------------+---------------------------------+

.. table:: Description of servers hardware type 3

   +-------+----------------+---------------------------------+
   |server |vendor,model    |Lenovo ThinkServer RD650         |
   +-------+----------------+---------------------------------+
   |CPU    |vendor,model    |Intel,E5-2670 v3                 |
   |       +----------------+---------------------------------+
   |       |processor_count |2                                |
   |       +----------------+---------------------------------+
   |       |core_count      |12                               |
   |       +----------------+---------------------------------+
   |       |frequency_MHz   |2500                             |
   +-------+----------------+---------------------------------+
   |RAM    |vendor,model    |Samsung, M393A2G40DB0-CPB        |
   |       +----------------+---------------------------------+
   |       |amount_MB       |131916                           |
   +-------+----------------+---------------------------------+
   |NETWORK|interface_names |enp3s0f0, enp3s0f1               |
   |       +----------------+---------------------------------+
   |       |vendor,model    |Intel,X710 Dual Port             |
   |       +----------------+---------------------------------+
   |       |bandwidth       |10G                              |
   |       +----------------+---------------------------------+
   |       |interface_names |ens2f0, ens2f1                   |
   |       +----------------+---------------------------------+
   |       |vendor,model    |Intel,X710 Dual Port             |
   |       +----------------+---------------------------------+
   |       |bandwidth       |10G                              |
   +-------+----------------+---------------------------------+
   |STORAGE|dev_name        |/dev/sda                         |
   |       +----------------+---------------------------------+
   |       |vendor,model    || raid1 - Lenovo 720ix           |
   |       |                || 2 disks Intel S3610            |
   |       +----------------+---------------------------------+
   |       |SSD/HDD         |SSD                              |
   |       +----------------+---------------------------------+
   |       |size            |799GB                            |
   |       +----------------+---------------------------------+
   |       |dev_name        |/dev/sdb                         |
   |       +----------------+---------------------------------+
   |       |vendor,model    || raid10 - Lenovo 720ix          |
   |       |                || 10 disks 2T                    |
   |       +----------------+---------------------------------+
   |       |SSD/HDD         |HDD                              |
   |       +----------------+---------------------------------+
   |       |size            |9999GB                           |
   +-------+----------------+---------------------------------+

Network configuration of each server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All servers have the same network configuration:

.. image:: configs/Network_Scheme.png
   :alt: Network Scheme of the environment
   :width: 650px

Software configuration on servers with controller, compute and compute-osd roles
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. table:: Services on servers by role

   +------------+----------------------------+
   |Role        |Service name                |
   +============+============================+
   |controller  || horizon                   |
   |            || keystone                  |
   |            || nova-api                  |
   |            || nova-scheduler            |
   |            || nova-cert                 |
   |            || nova-conductor            |
   |            || nova-consoleauth          |
   |            || nova-consoleproxy         |
   |            || cinder-api                |
   |            || cinder-backup             |
   |            || cinder-scheduler          |
   |            || cinder-volume             |
   |            || glance-api                |
   |            || glance-glare              |
   |            || glance-registry           |
   |            || neutron-dhcp-agent        |
   |            || neutron-l3-agent          |
   |            || neutron-metadata-agent    |
   |            || neutron-openvswitch-agent |
   |            || neutron-server            |
   |            || heat-api                  |
   |            || heat-api-cfn              |
   |            || heat-api-cloudwatch       |
   |            || ceph-mon                  |
   |            || rados-gw                  |
   |            || memcached                 |
   |            || rabbitmq_server           |
   |            || mysqld                    |
   |            || galera                    |
   |            || corosync                  |
   |            || pacemaker                 |
   |            || haproxy                   |
   +------------+----------------------------+
   |compute     || nova-compute              |
   |            || neutron-l3-agent          |
   |            || neutron-metadata-agent    |
   |            || neutron-openvswitch-agent |
   +------------+----------------------------+
   |compute-osd || nova-compute              |
   |            || neutron-l3-agent          |
   |            || neutron-metadata-agent    |
   |            || neutron-openvswitch-agent |
   |            || ceph-osd                  |
   +------------+----------------------------+

.. table:: Software version on servers with controller, compute and compute-osd roles

   +------------+-------------------+
   |Software    |Version            |
   +============+===================+
   |OpenStack   |Mitaka             |
   +------------+-------------------+
   |Ceph        |Hammer             |
   +------------+-------------------+
   |Ubuntu      |Ubuntu 14.04.3 LTS |
   +------------+-------------------+

Outputs of some commands and the contents of the /etc folder are available in
the following archives:

| :download:`controller-1.tar.gz `
| :download:`controller-2.tar.gz `
| :download:`controller-3.tar.gz `
| :download:`compute-1.tar.gz `
| :download:`compute-osd-1.tar.gz `

Software configuration on servers with Rally role
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Rally should be installed on this server, as described in the
`Rally installation documentation`_.

.. table:: Software version on server with Rally role

   +------------+-------------------+
   |Software    |Version            |
   +============+===================+
   |Rally       |0.5.0              |
   +------------+-------------------+
   |Ubuntu      |Ubuntu 14.04.3 LTS |
   +------------+-------------------+

Test results
============

Test Case 1: Basic Neutron test suite
-------------------------------------

The following tests were run with the `default configuration`_ against
`Lab A (200 nodes)`_:

* create-and-list-floating-ips
* create-and-list-networks
* create-and-list-ports
* create-and-list-routers
* create-and-list-security-groups
* create-and-list-subnets
* create-and-delete-floating-ips
* create-and-delete-networks
* create-and-delete-ports
* create-and-delete-routers
* create-and-delete-security-groups
* create-and-delete-subnets
* create-and-update-networks
* create-and-update-ports
* create-and-update-routers
* create-and-update-security-groups
* create-and-update-subnets

The following chart compares the time needed for each scenario:

.. image:: reports/basic_neutron.png
   :alt: Basic Neutron scenarios timings comparison
   :width: 650px

For more detailed information, download the following report:
:download:`basic_neutron.html `

.. _default configuration: https://github.com/openstack/rally/tree/master/samples/tasks/scenarios/neutron

Test Case 2: Stressful Neutron test suite
-----------------------------------------

The following tests were run against both `Lab A (200 nodes)`_ and
`Lab B (378 nodes)`_:

* create-and-list-networks
* create-and-list-ports
* create-and-list-routers
* create-and-list-security-groups
* create-and-list-subnets
* boot-and-list-server
* boot-and-delete-server-with-secgroups
* boot-runcommand-delete

A short summary of the collected results:

+--------------------------------------+-----------------------+-------------------------+---------------------------------------------+
|Scenario                              |Iterations/concurrency |Time, sec                |Errors                                       |
|                                      +-----------+-----------+------------+------------+----------------------+----------------------+
|                                      |Lab A      |Lab B      |Lab A       |Lab B       |Lab A                 |Lab B                 |
+======================================+===========+===========+============+============+======================+======================+
|create-and-list-networks              |3000/50    |5000/50    |avg 2.375   |avg 3.654   |1                     |6                     |
|                                      |           |           |max 7.904   |max 11.669  |Internal server error |Internal server error |
|                                      |           |           |            |            |while processing your |while processing your |
|                                      |           |           |            |            |request               |request               |
+--------------------------------------+-----------+-----------+------------+------------+----------------------+----------------------+
|create-and-list-ports                 |1000/50    |2000/50    |avg 123.97  |avg 99.274  |1                     |0                     |
|                                      |           |           |max 277.977 |max 270.84  |Internal server error |                      |
|                                      |           |           |            |            |while processing your |                      |
|                                      |           |           |            |            |request               |                      |
+--------------------------------------+-----------+-----------+------------+------------+----------------------+----------------------+
|create-and-list-routers               |2000/50    |2000/50    |avg 15.59   |avg 12.94   |0                     |0                     |
|                                      |           |           |max 29.006  |max 19.398  |                      |                      |
+--------------------------------------+-----------+-----------+------------+------------+----------------------+----------------------+
|create-and-list-security-groups       |50/1       |1000/50    |avg 210.706 |avg 68.712  |0                     |0                     |
|                                      |           |           |max 210.706 |max 169.315 |                      |                      |
+--------------------------------------+-----------+-----------+------------+------------+----------------------+----------------------+
|create-and-list-subnets               |2000/50    |2000/50    |avg 25.973  |avg 17.415  |1                     |0                     |
|                                      |           |           |max 64.553  |max 50.415  |Internal server error |                      |
|                                      |           |           |            |            |while processing your |                      |
|                                      |           |           |            |            |request               |                      |
+--------------------------------------+-----------+-----------+------------+------------+----------------------+----------------------+
|boot-and-list-server                  |4975/50    |1000/50    |avg 21.445  |avg 14.375  |0                     |0                     |
|                                      |           |           |max 40.736  |max 25.21   |                      |                      |
+--------------------------------------+-----------+-----------+------------+------------+----------------------+----------------------+
|boot-and-delete-server-with-secgroups |4975/200   |1000/100   |avg 190.772 |avg 65.651  |394                   |0                     |
|                                      |           |           |max 443.518 |max 95.651  |Server has ERROR      |                      |
|                                      |           |           |            |            |status; The server    |                      |
|                                      |           |           |            |            |didn't respond in     |                      |
|                                      |           |           |            |            |time.                 |                      |
+--------------------------------------+-----------+-----------+------------+------------+----------------------+----------------------+
|boot-runcommand-delete                |2000/15    |3000/50    |avg 28.39   |avg 28.587  |34                    |1                     |
|                                      |           |           |max 35.756  |max 85.659  |Rally tired waiting   |Resource <Server:     |
|                                      |           |           |            |            |... to                |s_rally_b58e9bd       |
|                                      |           |           |            |            |become ('ICMP UP'),   |e_Y369JdPf> has ERROR |
|                                      |           |           |            |            |current status        |status. Deadlock      |
|                                      |           |           |            |            |('ICMP DOWN')         |found when trying to  |
|                                      |           |           |            |            |                      |get lock.             |
+--------------------------------------+-----------+-----------+------------+------------+----------------------+----------------------+

During the Rally runs, bugs affecting the
``boot-and-delete-server-with-secgroups`` and ``boot-runcommand-delete``
scenarios on Lab A were filed and fixed:

* `Bug LP #1610303 l2pop mech fails to update_port_postcommit on a loaded cluster`_ ,
  fix - https://review.openstack.org/353835
* `Bug LP #1614452 Port create time grows at scale due to dvr arp update`_ ,
  fix - https://review.openstack.org/357052

With these fixes applied on Lab B, the mentioned Rally scenarios passed
*successfully*.

Other bugs encountered:

* `Bug LP #1595837 Neutron failed to create network`_
* `Bug LP #1610153 nova list can sometimes return 404`_

Observed trends
~~~~~~~~~~~~~~~

Create and list networks: the total time spent on each iteration grows
linearly.

.. image:: reports/create_and_list_networks.png
   :alt: Create and list networks trend
   :width: 650px

Create and list routers: router list operation time gradually grows from 0.12
to 1.5 sec (2000 iterations).

.. image:: reports/create_and_list_routers_list.png
   :alt: Create and list routers trend (list operation)
   :width: 650px

Create and list routers: total load duration remains line-rate.

.. image:: reports/create_and_list_routers.png
   :alt: Create and list routers trend (total time)
   :width: 650px

Create and list subnets: subnet list operation time increases after
approximately 1750 iterations (from 4.5 sec at the 1700th iteration to
10.48 sec at the 1800th iteration).

.. image:: reports/create_and_list_subnets_list.png
   :alt: Create and list subnets trend (list operation)
   :width: 650px

Create and list subnets: subnet creation shows time peaks after 1750
iterations.

.. image:: reports/create_and_list_subnets_create.png
   :alt: Create and list subnets trend (create operation)
   :width: 650px

Create and list security groups: the secgroup list operation shows the most
rapid growth, with time increasing from 0.548 sec in the first iteration to
over 10 sec in the last iterations.

.. image:: reports/create_and_list_secgroup_list.png
   :alt: Create and list security groups trend (list operation)
   :width: 650px

More details can be found in the original Rally report:
:download:`stress_neutron.html `

Test Case 3: Neutron scalability test with many networks
--------------------------------------------------------

In our tests, 100 networks (each with a subnet, a router and a VM) were
created in each iteration.

+-----------------------+--------------+--------------+-------------------------------+
|Iterations/concurrency |Avg time, sec |Max time, sec |Errors                         |
+=======================+==============+==============+===============================+
|10/1                   |1237.389      |1294.549      |0                              |
+-----------------------+--------------+--------------+-------------------------------+
|20/3                   |1298.611      |1425.878      |1                              |
|                       |              |              |HTTPConnectionPool Read time   |
|                       |              |              |out                            |
+-----------------------+--------------+--------------+-------------------------------+

Load graph for run with 20 iterations/concurrency 3:

.. image:: reports/scale_neutron_networks.png
   :alt: Neutron scalability test with many networks trend
   :width: 650px

More details can be found in the original Rally report:
:download:`scale_neutron_networks.html `

Test Case 4: Neutron scalability test with many servers
-------------------------------------------------------

During each iteration, this test creates a large number of VMs (100 in our
case) on a single network, which makes it possible to check the case of many
ports per subnet.

+-----------------------+--------------+--------------+-------------------------------+
|Iterations/concurrency |Avg time, sec |Max time, sec |Errors                         |
+=======================+==============+==============+===============================+
|10/1                   |100.422       |104.315       |0                              |
+-----------------------+--------------+--------------+-------------------------------+
|20/3                   |119.767       |147.107       |0                              |
+-----------------------+--------------+--------------+-------------------------------+

Load graph for run with 20 iterations/concurrency 3:

.. image:: reports/scale_neutron_servers.png
   :alt: Neutron scalability test with many servers trend
   :width: 650px

More details can be found in the original Rally report:
:download:`scale_neutron_servers.html `

.. references:

.. _Rally installation documentation: https://rally.readthedocs.io/en/latest/install.html
.. _Bug LP #1610303 l2pop mech fails to update_port_postcommit on a loaded cluster: https://bugs.launchpad.net/neutron/+bug/1610303
.. _Bug LP #1614452 Port create time grows at scale due to dvr arp update: https://bugs.launchpad.net/neutron/+bug/1614452
.. _Bug LP #1595837 Neutron failed to create network: https://bugs.launchpad.net/mos/+bug/1595837
.. _Bug LP #1610153 nova list can sometimes return 404: https://bugs.launchpad.net/nova/+bug/1610153
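
The per-iteration timings of the many-servers test can be put on a per-VM
basis with a little arithmetic. The sketch below is illustrative only and is
not part of the original test tooling: it takes the two result rows and the
figure of 100 VMs per iteration from the test description, and the helper
name ``summarize`` is hypothetical. Dividing the iteration time by the VM
count gives a rough amortized figure, on the assumption that VM creation
dominates the iteration time.

.. code-block:: python

   # Figures copied from the many-servers results table; the 100 VMs per
   # iteration value comes from the test description.
   VMS_PER_ITERATION = 100

   RUNS = [
       # (iterations, concurrency, avg_sec, max_sec)
       (10, 1, 100.422, 104.315),
       (20, 3, 119.767, 147.107),
   ]


   def summarize(iterations, concurrency, avg_sec, max_sec):
       """Return total VM count and amortized per-VM time for one run.

       Hypothetical helper: the per-VM values are simple averages and
       ignore any per-iteration setup cost such as network creation.
       """
       return {
           "run": f"{iterations}/{concurrency}",
           "total_vms": iterations * VMS_PER_ITERATION,
           "avg_sec_per_vm": round(avg_sec / VMS_PER_ITERATION, 3),
           "max_sec_per_vm": round(max_sec / VMS_PER_ITERATION, 3),
       }


   SUMMARY = [summarize(*run) for run in RUNS]

   if __name__ == "__main__":
       for row in SUMMARY:
           print(row)

Under these assumptions, the amortized time per VM rises from about 1.0 sec
at concurrency 1 to about 1.2 sec at concurrency 3.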