2.1. Results of measuring API performance of Kubernetes

Abstract: This document presents performance test results of the Kubernetes API. All tests were performed according to the Measuring of API performance of container cluster systems methodology.

2.1.1. Environment description

2.1.1.1. Hardware configuration of each server

Description of servers hardware:

server name         node-{1..500}, kuber*        node-{1..355}
role                kubernetes cluster           kubernetes cluster
vendor,model        Dell, R630                   Lenovo, RD550-1U
operating_system    4.4.0-36-generic,            4.4.0-36-generic,
                    Ubuntu-xenial, x86_64        Ubuntu-xenial, x86_64
CPU
  vendor,model      Intel, E5-2680 v3            Intel, E5-2680 v3
  processor_count   2                            2
  core_count        12                           12
  frequency_MHz     2500                         2500
RAM
  vendor,model      Hynix, HMA42GR7MFR4N-TF      IBM, ???
  amount_MB         262144                       262144
NETWORK
  interface_name    bond0                        bond0
  vendor,model      Intel, X710 Dual Port        Intel, X710 Dual Port
  interfaces_count  2                            2
  bandwidth         10G                          10G
STORAGE
  dev_name          /dev/sda                     /dev/sda
  vendor,model      raid1 - PERC H730P Mini,     raid1 - LSI ????,
                    2 disks Intel S3610          2 disks Intel S3610
  SSD/HDD           SSD                          SSD
  size              800GB                        800GB
  • kuber* is a one-node Kubernetes cluster used to run the container with the test tool

2.1.1.2. Network scheme and part of configuration of hardware network switches

Network scheme of the environment:

[Figure: Network scheme of the environment]

Here is the relevant piece of the switch configuration for each switch port that is part of a server's bond0 interface:

switchport mode trunk
switchport trunk native vlan 600
switchport trunk allowed vlan 600-602,630-649
spanning-tree port type edge trunk
spanning-tree bpduguard enable
no snmp trap link-status

2.1.1.3. Software configuration of Kubernetes service

2.1.1.3.1. Setting up Kubernetes

Kubernetes was installed using the Kargo deployment tool. Kargo operates with the following roles:

  • master: Calico, Kubernetes API services
  • minion: Calico, Kubernetes minion services
  • etcd: etcd service

Kargo deploys Kubernetes cluster with the following matching hostnames and roles:

  • node1: minion+master+etcd
  • node2: minion+master+etcd
  • node3: minion+etcd
  • all other nodes: minion

We installed Kargo on top of a dedicated node and started the deployment (change the ADMIN_IP and SLAVE_IPS variables to the addresses of your nodes and SLAVES_COUNT to the number of nodes):

git clone https://review.openstack.org/openstack/fuel-ccp-installer
cd fuel-ccp-installer
# quote the heredoc delimiter so $WORKSPACE and $NODE_NAME are expanded
# when the script runs, not while the file is being written
cat > create_env_kargo.sh << 'EOF'
set -ex

export ENV_NAME="kargo-test"
export DEPLOY_METHOD="kargo"
export WORKSPACE="/root/workspace"
export ADMIN_USER="vagrant"
export ADMIN_PASSWORD="kargo"

# for 10 nodes
export SLAVES_COUNT=10
export ADMIN_IP="10.3.58.122"
export SLAVE_IPS="10.3.58.122 10.3.58.138 10.3.58.145 10.3.58.140 10.3.58.124 10.3.58.126 10.3.58.158 10.3.58.173 10.3.58.151 10.3.58.161"

export CUSTOM_YAML='docker_version: 1.12
hyperkube_image_repo: "quay.io/coreos/hyperkube"
hyperkube_image_tag: "v1.3.5_coreos.0"
etcd_image_repo: "quay.io/coreos/etcd"
etcd_image_tag: "v3.0.1"
calicoctl_image_repo: "calico/ctl"
#calico_node_image_repo: "calico/node"
calico_node_image_repo: "l23network/node"
calico_node_image_tag: "v0.20.0"
calicoctl_image_tag: "v0.20.0"
kube_apiserver_insecure_bind_address: "0.0.0.0"'

mkdir -p $WORKSPACE
echo "Running on $NODE_NAME: $ENV_NAME"
cd /root/fuel-ccp-installer
bash "./utils/jenkins/run_k8s_deploy_test.sh"

EOF
bash create_env_kargo.sh

Versions of some software:

Software   Version
Ubuntu     Ubuntu 16.04.1 LTS
Kargo      54d64106c74c72433c7c492a8a9a5075e17de35b
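
Once the deployment finishes, a quick sanity check is to query the API directly. Here is a minimal sketch, assuming the first master answers at ADMIN_IP from the script above on the insecure port 8080 (enabled via kube_apiserver_insecure_bind_address):

import json
from urllib.request import urlopen

# assumption: ADMIN_IP and the insecure port from the configuration above
api = "http://10.3.58.122:8080"
print(urlopen(api + "/healthz").read())           # the API answers b'ok' when healthy
nodes = json.load(urlopen(api + "/api/v1/nodes"))
print(len(nodes["items"]), "nodes registered")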

2.1.1.3.2. Operating system configuration

You can find outputs of some commands and the /etc folder in the following archive:

2.1.1.4. Software configuration of the test tool

2.1.1.4.1. Test tool preparation

The Kubernetes e2e tests have been used to collect API latencies during the tests. We ran the test from a Docker container with the tool inside. To build the container, create an e2e-tests directory and copy the files from the Files and scripts to build Docker container with e2e-test tool section into it. Then build the image:

root@kuber:~# cd e2e-tests
root@kuber:~/e2e-tests# docker build -t k8s_e2e ./

2.1.1.4.2. Test tool description

  • The test creates 30 pods per Kubernetes minion:
    • 300 on the 10-node cluster
    • 1500 on the 50-node cluster
    • 10650 on the 355-node cluster
  • The test actually spawns replication controllers, not pods directly
  • The test spawns three types of replication controllers:
    • small, which includes 5 pods
    • medium, which includes 30 pods
    • big, which includes 250 pods
  • After all containers are spawned, the test resizes them
  • The test performs 10 actions/sec

You can see more details in the load.py code.
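
As a rough illustration of the numbers above, here is a simplified sketch of how the per-node pod count could translate into replication controllers. This is not the actual load.py logic; in particular, the 50/25/25 split of pods between RC sizes is an assumption:

def rc_plan(nodes, pods_per_node=30):
    # Hypothetical distribution: ~half of the pods in small RCs,
    # ~a quarter in medium RCs, ~a quarter in big RCs (an assumption;
    # see load.py for the real logic)
    total_pods = nodes * pods_per_node
    return {
        "total_pods": total_pods,
        "small_rcs_of_5": total_pods // 2 // 5,
        "medium_rcs_of_30": total_pods // 4 // 30,
        "big_rcs_of_250": total_pods // 4 // 250,
    }

for nodes in (10, 50, 355):
    print(nodes, "nodes ->", rc_plan(nodes))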

Versions of some software:

Software                    Version
Ubuntu                      Ubuntu 14.04 LTS
e2e-test (Kubernetes repo)  v1.3.5
Docker                      1.11.2, build b9f10c9

2.1.1.4.3. Operating system configuration

You can find outputs of some commands and the /etc folder in the following archive: server_description_of_e2e-test_node

2.1.2. Testing process

2.1.2.1. Preparation

  1. Kubernetes was set up on top of 10 nodes as described in the Setting up Kubernetes section.
  2. The e2e-test container was run on top of the infrastructure one-node Kubernetes cluster called “kuber”. You can find k8s_e2e.yaml in Files and scripts to run Docker container with e2e-test tool. You need to change “${API_SERVER}” to the URI of the Kubernetes API (for example http://10.3.58.66:8080). You also need to specify the folder where results will be stored. For the 10-node cluster we created the “/var/lib/volumes/e2e-test/10_nodes” folder. This folder will be mounted into the container as a volume.
root@kuber:~/e2e-tests# mkdir -p /var/lib/volumes/e2e-test/10_nodes
# set API URI and volume folder:
root@kuber:~/e2e-tests# vim k8s_e2e.yaml
root@kuber:~/e2e-tests# kubectl create -f k8s_e2e.yaml
# To store log to a file:
root@kuber:~/e2e-tests# kubectl attach k8s-e2e 2>&1 | tee -a /var/lib/volumes/e2e-test/10_nodes/k8s-e2e.log
  3. After that we have a log file containing JSON with the Kubernetes API latencies. We can use the simple Python script from Script to convert JSON from log file to RST table to create rst tables from the log file.
root@kuber:~/e2e-tests# python create_rst_table_from_k8s_e2e_log.py /var/lib/volumes/e2e-test/10_nodes/k8s-e2e.log

Now we have the /var/lib/volumes/e2e-test/10_nodes/k8s-e2e.rst file with rst tables.

We performed steps 1 to 3 for Kubernetes clusters on top of 10, 50 and 355 nodes.

2.1.3. Results

2.1.3.1. 10-node cluster (all values are presented in milliseconds)

2.1.3.1.1. resourcequotas

Method Perc99 Perc90 Perc50
LIST 1.323 1.323 1.323

2.1.3.1.2. secrets

Method Perc99 Perc90 Perc50
GET 2.121 1.734 1.505

2.1.3.1.3. replicationcontrollers

Method Perc99 Perc90 Perc50
PUT 6.425 5.793 4.77
POST 6.849 4.074 3.433
GET 1.872 1.6 1.393
LIST 7.31 6.674 3.989
DELETE 5.573 5.468 5.122

2.1.3.1.4. namespaces

Method Perc99 Perc90 Perc50
POST 2.514 2.514 2.514

2.1.3.1.5. nodes

Method Perc99 Perc90 Perc50
PUT 14.585 9.123 8.21
GET 2.342 2.255 1.971

2.1.3.1.6. endpoints

Method Perc99 Perc90 Perc50
GET 1.786 1.575 1.327

2.1.3.1.7. pods

Method Perc99 Perc90 Perc50
PUT 9.142 6.858 5.742
GET 2.369 1.775 1.514
LIST 4.951 1.936 1.328
DELETE 15.229 12.946 11.485

2.1.3.2. 50-node cluster (all values are presented in milliseconds)

2.1.3.2.1. resourcequotas

Method Perc99 Perc90 Perc50
LIST 1.289 1.289 1.161

2.1.3.2.2. jobs

Method Perc99 Perc90 Perc50
LIST 1.564 1.564 1.564

2.1.3.2.3. secrets

Method Perc99 Perc90 Perc50
GET 8.046 1.709 1.488

2.1.3.2.4. replicasets

Method Perc99 Perc90 Perc50
LIST 1.801 1.801 1.801

2.1.3.2.5. replicationcontrollers

Method Perc99 Perc90 Perc50
PUT 28.672 5.783 5.244
POST 11.475 4.107 3.295
GET 3.42 1.563 1.376
LIST 25.058 20.305 11.274
DELETE 7.505 5.625 4.957

2.1.3.2.6. daemonsets

Method Perc99 Perc90 Perc50
LIST 1.782 1.782 1.782

2.1.3.2.7. deployments

Method Perc99 Perc90 Perc50
LIST 1.988 1.988 1.988

2.1.3.2.8. petsets

Method Perc99 Perc90 Perc50
LIST 5.269 5.269 5.269

2.1.3.2.9. namespaces

Method Perc99 Perc90 Perc50
POST 3.032 3.032 3.032

2.1.3.2.10. services

Method Perc99 Perc90 Perc50
LIST 2.084 2.084 2.084

2.1.3.2.11. bindings

Method Perc99 Perc90 Perc50
POST 17.604 5.612 4.728

2.1.3.2.12. endpoints

Method Perc99 Perc90 Perc50
PUT 5.118 4.572 4.109
GET 4.355 1.417 1.238

2.1.3.2.13. pods

Method Perc99 Perc90 Perc50
PUT 15.325 6.657 5.43
GET 5.453 1.745 1.498
LIST 14.656 4.422 2.943
DELETE 17.64 12.753 11.651

2.1.3.2.14. nodes

Method Perc99 Perc90 Perc50
PUT 16.434 7.589 6.505
GET 3.959 1.836 1.558

2.1.3.3. 355-node cluster (all values are presented in milliseconds)

2.1.3.3.1. resourcequotas

Method Perc99 Perc90 Perc50
LIST 17.992 1.157 0.876

2.1.3.3.2. jobs

Method Perc99 Perc90 Perc50
LIST 16.852 16.852 0.807

2.1.3.3.3. secrets

Method Perc99 Perc90 Perc50
GET 23.669 1.605 1.211

2.1.3.3.4. replicasets

Method Perc99 Perc90 Perc50
LIST 52.656 52.656 1.282

2.1.3.3.5. replicationcontrollers

Method Perc99 Perc90 Perc50
PUT 18.369 5.031 4.116
POST 28.599 7.342 2.929
DELETE 9.61 4.845 4.137
LIST 85.6 53.296 28.359
GET 16.689 1.397 1.167

2.1.3.3.6. daemonsets

Method Perc99 Perc90 Perc50
LIST 53.41 53.41 17.984

2.1.3.3.7. deployments

Method Perc99 Perc90 Perc50
LIST 19.634 19.634 9.899

2.1.3.3.8. petsets

Method Perc99 Perc90 Perc50
LIST 9.086 9.086 0.987

2.1.3.3.9. namespaces

Method Perc99 Perc90 Perc50
POST 2.513 2.513 2.513

2.1.3.3.10. services

Method Perc99 Perc90 Perc50
LIST 1.542 1.542 1.258

2.1.3.3.11. nodes

Method Perc99 Perc90 Perc50
PUT 35.889 7.488 5.77
GET 23.749 1.832 1.407

2.1.3.3.12. endpoints

Method Perc99 Perc90 Perc50
GET 16.444 1.359 1.095

2.1.3.3.13. pods

Method Perc99 Perc90 Perc50
PUT 26.753 5.988 4.446
GET 18.755 1.579 1.258
LIST 44.249 24.433 13.045
DELETE 23.212 11.478 9.783

2.1.3.4. Comparison

Here you can see a comparison of the results from the 10-, 50- and 355-node clusters. Please note that the number of pods and other items depends on the number of nodes:

  • 300 pods will be spawned on the 10-node cluster
  • 1500 pods will be spawned on the 50-node cluster
  • 10650 pods will be spawned on the 355-node cluster

[Figure: replicationcontrollers latency]  [Figure: pods latency]
[Figure: endpoints latency]  [Figure: nodes latency]
[Figure: resourcequotas latency]  [Figure: secrets latency]

2.1.3.5. Kubernetes pod startup latency measurement

For this testing purpose the MMM (MySQL/Master/Minions) testing suite was used (more information in the Pod startup time measurement toolkit section).

This toolkit was run against a 150-node Kubernetes environment installed via the Kargo deployment tool (these nodes were taken from the same node pool all previous Kubernetes API performance tests were run against). The most basic configuration (1 replication controller, N pods, each pod containing 1 container) was run against the environment. Additional configurations will be tested and the results published as part of further research.

The first run covers 500 pods started on a fresh Kubernetes environment (no tests had been run on it before; a warm-up run with a density of about 3 pods per node):

[Figure: Containers startup time (500 containers, first run)]

This irregular timing pattern is explained by the fact that the first pack of 500 containers was run against an environment that had not been warmed up: the minion images were not pre-loaded on the Kubernetes worker nodes, which means that the Docker registry was heavily stressed during the first run.

The same scenario run against the warmed-up environment shows a linear pattern (~50 milliseconds per container startup, with a density of about 3 pods per cluster node):

[Figure: Containers startup time (500 containers, second run)]

The pattern remains the same with a larger number of containers (15000 containers, the same ~50 milliseconds per container startup, 100 pods per cluster node):

[Figure: Containers startup time (15000 containers)]

2.1.4. Applications

2.1.4.1. Files and scripts to build Docker container with e2e-test tool

e2e-tests/Dockerfile:

FROM golang:1.6.3

RUN mkdir /reports && \
    apt-get update && \
    apt-get install -y rsync && \
    mkdir -p /go/src/k8s.io && \
    go get -u github.com/jteeuwen/go-bindata/go-bindata && \
    git clone -b v1.3.5 https://github.com/kubernetes/kubernetes.git /go/src/k8s.io/kubernetes

WORKDIR /go/src/k8s.io/kubernetes

RUN make all WHAT=cmd/kubectl && \
    make all WHAT=vendor/github.com/onsi/ginkgo/ginkgo && \
    make all WHAT=test/e2e/e2e.test

COPY entrypoint.sh /
RUN chmod +x /entrypoint.sh
CMD /entrypoint.sh

e2e-tests/entrypoint.sh:

#!/bin/bash
set -u -e

function escape_test_name() {
    # Escape regex metacharacters and normalize whitespace so the test name
    # can be used verbatim inside a ginkgo --focus/--skip regular expression
    sed 's/[]\$*.^|()[]/\\&/g; s/\s\+/\\s+/g' <<< "$1" | tr -d '\n'
}

TESTS_TO_SKIP=(
    '[k8s.io] Port forwarding [k8s.io] With a server that expects no client request should support a client that connects, sends no data, and disconnects [Conformance]'
    '[k8s.io] Port forwarding [k8s.io] With a server that expects a client request should support a client that connects, sends no data, and disconnects [Conformance]'
    '[k8s.io] Port forwarding [k8s.io] With a server that expects a client request should support a client that connects, sends data, and disconnects [Conformance]'
    '[k8s.io] Downward API volume should update annotations on modification [Conformance]'
    '[k8s.io] DNS should provide DNS for services [Conformance]'
    '[k8s.io] Load capacity [Feature:ManualPerformance] should be able to handle 3 pods per node'
)

# Join all names from TESTS_TO_SKIP into a single "|"-separated skip regex
function skipped_test_names () {
    local first=y
    for name in "${TESTS_TO_SKIP[@]}"; do
        if [ -z "${first}" ]; then
            echo -n "|"
        else
            first=
        fi
        echo -n "$(escape_test_name "${name}")\$"
    done
}

if [ -z "${API_SERVER}" ]; then
    echo "Must provide API_SERVER env var" 1>&2
    exit 1
fi

export KUBERNETES_PROVIDER=skeleton
export KUBERNETES_CONFORMANCE_TEST=y

# Configure kube config
cluster/kubectl.sh config set-cluster local --server="${API_SERVER}" --insecure-skip-tls-verify=true
cluster/kubectl.sh config set-context local --cluster=local --user=local
cluster/kubectl.sh config use-context local

if [ -z "${FOCUS}" ]; then
    # non-serial tests can be run in parallel mode
    GINKGO_PARALLEL=y go run hack/e2e.go --v --test -check_version_skew=false \
      --test_args="--ginkgo.focus=\[Conformance\] --ginkgo.skip=\[Serial\]|\[Flaky\]|\[Feature:.+\]|$(skipped_test_names)"

    # serial tests must be run without GINKGO_PARALLEL
    go run hack/e2e.go --v --test -check_version_skew=false --test_args="--ginkgo.focus=\[Serial\].*\[Conformance\] --ginkgo.skip=$(skipped_test_names)"
else
    go run hack/e2e.go --v --test -check_version_skew=false --test_args="--ginkgo.focus=$(escape_test_name "${FOCUS}") --ginkgo.skip=$(skipped_test_names)"
fi
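
For clarity, the skip expression that skipped_test_names builds can be sketched in Python. This is an illustration only, not part of the tool; note that unlike the sed call above, re.escape does not collapse runs of whitespace into \s+:

import re

# abridged list; see TESTS_TO_SKIP in entrypoint.sh for the full one
tests_to_skip = [
    "[k8s.io] DNS should provide DNS for services [Conformance]",
]
skip_regex = "|".join(re.escape(name) + "$" for name in tests_to_skip)
print(skip_regex)  # this value is passed to --ginkgo.skip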

2.1.4.2. Files and scripts to run Docker container with e2e-test tool

e2e-tests/k8s_e2e.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: k8s-e2e
spec:
  containers:
  - image: k8s_e2e
    name: k8s-e2e
    env:
    - name: E2E_REPORT_DIR
      value: /reports
    - name: API_SERVER
      value: ${API_SERVER}
    - name: FOCUS
      # focuses the run on the e2e "Load capacity" test; the "3 pods per
      # node" variant stays excluded via TESTS_TO_SKIP in entrypoint.sh
      value: "Load capacity"
    volumeMounts:
      - mountPath: /reports
        name: job-params
  restartPolicy: Never
  volumes:
    - hostPath:
        path: /var/lib/volumes/e2e-test/10_nodes
      name: job-params

2.1.4.3. Script to convert JSON from log file to RST table

e2e-tests/create_rst_table_from_k8s_e2e_log.py:

#!/usr/bin/python

import json
import logging
import sys

from tabulate import tabulate


def cut_json_data(file_with_results):
    # Extract the JSON blob that the e2e test prints between the
    # "Result:Performance" and "Finish:Performance" markers in the log
    json_data = "{"
    start = False
    end = False
    with open(file_with_results) as f:
        for line in f:
            end = end or "Finish:Performance" in line
            if end:
                break
            if start:
                json_data += line
            start = start or "Result:Performance" in line
    data = json.loads(json_data)
    return data


def get_resources_and_request_types(data):
    resources = {}
    for data_item in data["dataItems"]:
        resource = data_item["labels"]["Resource"]
        if resource not in resources:
            resources[resource] = {}
        type_of_request = data_item["labels"]["Verb"]
        resources[resource][type_of_request] = data_item["data"]
    return resources


def create_rst_tables(resource):
    headers = ["Method"]
    data = []
    for method, perc in resource.iteritems():
        if len(headers) == 1:
            # collect the percentile names (Perc50/Perc90/Perc99) only once
            headers += perc.keys()
        data.append([method] + perc.values())
    tables = tabulate(data, headers=headers, tablefmt="grid")
    return tables


def put_tables_to_file(file_with_results):
    rst_file = file_with_results.split(".")[0] + ".rst"
    data = cut_json_data(file_with_results)
    with open(rst_file, 'w') as f:
        for resource, data in \
                get_resources_and_request_types(data).iteritems():
            table_head = "\n" + resource + "\n"
            # underline the resource name with "^" to form an RST sub-heading
            table_head += "^" * len(resource) + "\n"
            f.write(table_head + create_rst_tables(data))


def main(file_with_results):
    put_tables_to_file(file_with_results)

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    main(sys.argv[1])
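
For reference, here is the shape of the performance JSON the script consumes, written as a Python literal. The dataItems/labels/data keys are the ones the functions above access; the numbers are made up for illustration:

sample = {
    "dataItems": [
        {
            "labels": {"Resource": "pods", "Verb": "GET"},
            "data": {"Perc50": 1.5, "Perc90": 1.8, "Perc99": 2.4},
        },
    ],
}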

2.1.4.4. Pod startup time measurement toolkit

For the Kubernetes pod startup latency measurement test case, the MMM (MySQL/Master/Minions) testing suite was used.

This is a client/server suite for testing the speed of k8s/docker/networking scheduling capabilities.

The architecture is simple and consists of the following:

  • MariaDB/MySQL service (replication controller with only one replica)
  • Master service, a simple Python application based on the Flask framework, with multiple threads and a producer/consumer queue for SQL inserts
  • Minion replication controller - a simple bash script which registers minions with the master service.

This approach guarantees that each container reports its status itself, so any issues (e.g. a too slow startup or a completely failed attempt to create a container) will show up in the testing results.
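
To make the flow concrete, here is a minimal sketch of the master-service idea in Python. It is illustrative only, not the actual MMM code; the /register endpoint, payload fields and port are assumptions:

import queue
import threading
import time

from flask import Flask, request

app = Flask(__name__)
inserts = queue.Queue()

@app.route("/register", methods=["POST"])  # endpoint name is an assumption
def register():
    # A minion reports its own startup; we timestamp it on arrival and
    # hand it off to the single DB-writer thread via the queue
    inserts.put((request.form.get("minion_id"), time.time()))
    return "ok"

def writer():
    while True:
        minion_id, ts = inserts.get()
        # The real suite would INSERT this record into MariaDB/MySQL here
        print("registered", minion_id, "at", ts)

threading.Thread(target=writer, daemon=True).start()

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)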

For more details, please proceed to the MMM (MySQL/Master/Minions) testing suite documentation.