5.2. Measuring performance of container repositories

status: ready

version: 1.0

Abstract:

This document describes a test plan for quantifying the performance of a docker repository as a function of the number of clients of the system.

Conventions:
  • Docker repository: a complete microservices architecture needs a repository for images. The repository should store images, support image versions, and provide an HA mode and scalability. Several repositories exist, such as Docker Registry2, Sonatype Nexus and JFrog Artifactory.
  • Pull from a docker repository is the process in which a client downloads a docker image from a docker repository.
  • Push to a docker repository is the process in which a client uploads a docker image to a docker repository.
  • Client is software that communicates with a docker repository to push/pull a docker image to/from that repository. We’ll use Docker as the client.

5.2.1. List of performance metrics

The table below lists the metrics and parameters used to describe the performance of a docker repository system:

List of performance metrics
Parameter         Description
PULL_TIME         The time a client spends reading data from the docker repository
PUSH_TIME         The time a client spends writing data to the docker repository
ITERATIONS_COUNT  The number of requests, or chains of requests, sent from a client to the docker repository, together with the corresponding responses, each performing an action or chain of actions such as a pull or a push
CONCURRENCY       The number of clients that pull/push data from/to the docker repository at the same time
DATA_SIZE         The size of the data that a client reads/writes from/to the docker repository during one request-response cycle

5.2.2. Test Plan

5.2.2.1. Test Environment

5.2.2.1.1. Preparation

A tool is needed to test a docker repository. We propose the script for collecting performance metrics, which can be found in the Applications section.

5.2.2.1.2. Environment description

Test results MUST include a description of the environment used. The following items should be included:

  • Hardware configuration of each server. If virtual machines are used then both physical and virtual hardware should be fully documented. An example format is given below:
Description of servers hardware
server   name
         role
         vendor,model
         operating_system
CPU      vendor,model
         processor_count
         core_count
         frequency_MHz
RAM      vendor,model
         amount_MB
NETWORK  interface_name
         vendor,model
         bandwidth
STORAGE  dev_name
         vendor,model
         SSD/HDD
         size
  • Configuration of hardware network switches. The configuration file from the switch can be downloaded and attached.
  • Configuration of virtual machines and virtual networks (if they are used). The configuration files can be attached, along with the mapping of virtual machines to host machines.
  • Network scheme. The plan should show how all hardware is connected and how the components communicate. All ethernet/fibrechannel and VLAN channels should be included. Each interface of every hardware component should be matched with the corresponding L2 channel and IP address.
  • Software configuration of the docker repository system. sysctl.conf and any other kernel file that is changed from the default should be attached. The list of installed packages should be attached. Specifications of the operating system, network interface configuration, and disk partitioning configuration should be included. If a distributed system is to be tested then the parts that are distributed need to be described.
  • Software configuration of the client nodes. The operating system, disk partitioning scheme, network interface configuration, installed packages and other components of the client nodes limit the performance a client can achieve when sending requests to and receiving responses from the docker repository.
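
Part of the server description above can be gathered automatically. The following is a minimal sketch in Python 3, not part of the original plan; the function name describe_host and the chosen field names are hypothetical, and it only covers values that the standard library exposes (vendor, RAM, network and storage details would still have to be collected by other means).

```python
import json
import os
import platform


def describe_host(role):
    """Collect a few fields of the server description from the local machine."""
    return {
        "name": platform.node(),
        # The role is supplied by the operator, e.g. "client" or "repository".
        "role": role,
        "operating_system": platform.platform(),
        # May be an empty string on some platforms.
        "cpu_model": platform.processor(),
        # Number of logical CPUs, which can differ from the physical core count.
        "logical_cpu_count": os.cpu_count(),
        "architecture": platform.machine(),
    }


if __name__ == "__main__":
    print(json.dumps(describe_host("client"), indent=2, sort_keys=True))
```

The JSON output could be attached to the test results alongside the manually collected items.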

5.2.2.2. Test Case #1: Uploading to a docker repository.

5.2.2.2.1. Description

This test is aimed at measuring the image uploading (push action) time.

5.2.2.2.2. List of performance metrics

List of test metrics to be collected during this test
Parameter               Description
PUSH_TIME(CONCURRENCY)  The time a client spends pushing data to the docker repository, as a function of the concurrency value

List of test parameters to be kept constant during this test:
Parameter         Value
ITERATIONS_COUNT  1000
DATA_SIZE         depends on your docker file

5.2.2.2.3. Measuring PUSH_TIME(CONCURRENCY) values

  1. Deploy the docker repository from scratch, to be sure that it contains no data.
  2. Build 1000 images.
  3. Run a client in a cycle with ITERATIONS_COUNT iterations and the CONCURRENCY concurrency value. The client should push the images created in step 2 and write the response time of each push to a log/report. Perform one cycle for each CONCURRENCY value from the following list:
    • CONCURRENCY=1
    • CONCURRENCY=10
    • CONCURRENCY=30
    • CONCURRENCY=50
    • CONCURRENCY=100
  4. From the results of the previous step, produce graphs and tables showing the response time as a function of the iteration number, one graph and one table per CONCURRENCY value. Then calculate the minimum, maximum, average and 95th percentile of PUSH_TIME for each CONCURRENCY value and fill in the following table:
PUSH_TIME(CONCURRENCY)
CONCURRENCY  PUSH_TIME
             minimum  maximum  average  95th percentile
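
The statistics requested in the last step could be computed along the following lines. This is a minimal sketch in Python 3 and not part of the original script. It assumes the results files use the iteration,spent_time CSV layout written by the script in the Applications section, and it assumes the nearest-rank definition of the 95th percentile, which the plan does not specify. The function names are hypothetical.

```python
import csv
import io


def read_times(csv_file):
    """Return the spent_time column of an open results file as floats."""
    return [float(row["spent_time"]) for row in csv.DictReader(csv_file)]


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, so no floating-point rounding is involved.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(values):
    """Compute the four columns of the results table for one CONCURRENCY value."""
    return {
        "minimum": min(values),
        "maximum": max(values),
        "average": sum(values) / len(values),
        "p95": percentile(values, 95),
    }


if __name__ == "__main__":
    # Stand-in data in the same layout as push_results.csv.
    sample = io.StringIO("iteration,spent_time\n1,2.0\n2,4.0\n3,3.0\n4,7.0")
    print(summarize(read_times(sample)))
```

With real data, the file object passed to read_times would be an opened results file, one per CONCURRENCY value.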
         

5.2.2.3. Test Case #2: Downloading from a docker repository.

5.2.2.3.1. Description

This test is aimed at measuring the image downloading (pull action) time.

5.2.2.3.2. List of performance metrics

List of test metrics to be collected during this test
Parameter               Description
PULL_TIME(CONCURRENCY)  The time a client spends pulling data from the docker repository, as a function of the concurrency value

List of test parameters to be kept constant during this test:
Parameter         Value
ITERATIONS_COUNT  1000
DATA_SIZE         depends on your docker file

5.2.2.3.3. Measuring PULL_TIME(CONCURRENCY) values

  1. Deploy the docker repository from scratch, to be sure that it contains no data.
  2. Build 1000 images.
  3. Upload the 1000 images to the docker repository.
  4. Delete the created images from the local docker on the machine with the test tool, where the docker images were created. After this step the images should exist in the docker repository and no longer in the local docker.
  5. Run a client in a cycle with ITERATIONS_COUNT iterations and the CONCURRENCY concurrency value. The client should pull the images uploaded in step 3 and write the response time of each pull to a log/report. Perform one cycle for each CONCURRENCY value from the following list:
    • CONCURRENCY=1
    • CONCURRENCY=10
    • CONCURRENCY=30
    • CONCURRENCY=50
    • CONCURRENCY=100
  6. From the results of the previous step, produce graphs and tables showing the response time as a function of the iteration number, one graph and one table per CONCURRENCY value. Then calculate the minimum, maximum, average and 95th percentile of PULL_TIME for each CONCURRENCY value and fill in the following table:
PULL_TIME(CONCURRENCY)
CONCURRENCY  PULL_TIME
             minimum  maximum  average  95th percentile
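
The cycle over CONCURRENCY values described in the steps above can be sketched as follows. This is an illustrative Python 3 outline under stated assumptions, not the script from the Applications section: the helper names (docker_pull, timed, run_cycle, sweep) are hypothetical, and a thread pool stands in for the concurrent clients. The action is passed in as a parameter so the same driver could be reused for pushes.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

CONCURRENCY_VALUES = [1, 10, 30, 50, 100]


def docker_pull(image):
    """Pull one image with the Docker CLI; raises if the command fails."""
    subprocess.run(["docker", "pull", image], check=True)


def timed(action, image):
    """Return the wall-clock seconds that action(image) takes."""
    start = time.perf_counter()
    action(image)
    return time.perf_counter() - start


def run_cycle(action, images, concurrency):
    """Run action once per image with at most `concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda image: timed(action, image), images))


def sweep(action, images, concurrency_values=CONCURRENCY_VALUES):
    """Map each CONCURRENCY value to the list of measured times."""
    return {c: run_cycle(action, images, c) for c in concurrency_values}


if __name__ == "__main__":
    # Stand-in action so the sketch runs without a docker repository.
    fake_images = ["image-%d" % i for i in range(1, 6)]
    results = sweep(lambda image: time.sleep(0.01), fake_images, [1, 2])
    for concurrency, times in sorted(results.items()):
        print(concurrency, len(times))
```

For a real run, docker_pull would be passed as the action together with the list of image names uploaded in step 3. The local copies would presumably have to be removed again (step 4) before each cycle, since Docker does not download layers that are already present locally.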
         

5.2.3. Applications

5.2.3.1. List of container repositories

Name of container repositories Version
Docker Registry2  
Sonatype Nexus  
JFrog Artifactory  

5.2.3.2. Script for collecting performance metrics

This script has been tested with Python 2.7. There are three variables that you need to change:

  • iterations: the number of images to be created, uploaded to the repository and downloaded from the repository.
  • concurrency: the number of threads that work at the same time.
  • repo_address: the address and port of the repository service.
#!/usr/bin/python

from subprocess import Popen
from time import time
from threading import Thread
from Queue import Queue

iterations = 1000
concurrency = 30
repo_address = "172.20.9.16:5000"

repo_ref = "/test-1"
repo_url = repo_address + repo_ref
container_name = "nginx"
container_tag = "latest"
work_dir = "containers/nginx"
build_results_file = "build_results.csv"
push_results_file = "push_results.csv"
pull_results_file = "pull_results.csv"
delete_local_results_file = "delete_local_results.csv"

results_files = [build_results_file, push_results_file, pull_results_file, delete_local_results_file]
for results_file in results_files:
    outfile = open(results_file, 'w')
    outfile.write("iteration,spent_time")
    outfile.close()

work_queue = Queue()


def build_container(iteration):
    start_time = time()
    build_command = Popen(['docker', 'build', '--no-cache=true', '-t', repo_url + '/' + container_name + '-' + str(iteration) + ':' + container_tag, '--file=' + work_dir + '/Dockerfile', work_dir])
    build_command.wait()
    end_time = time()
    action_time = end_time - start_time
    print "Iteration", iteration, "has been done in", action_time
    outfile = open(build_results_file, 'a')
    outfile.write('\n' + str(iteration) + "," + str(action_time))
    outfile.close()


def push_container(iteration):
    start_time = time()
    # No tag is given, so docker uses the default tag "latest".
    push_command = Popen(['docker', 'push', repo_url + '/' + container_name + '-' + str(iteration)])
    push_command.wait()
    end_time = time()
    action_time = end_time - start_time
    print "Iteration", iteration, "has been done in", action_time
    outfile = open(push_results_file, 'a')
    outfile.write('\n' + str(iteration) + "," + str(action_time))
    outfile.close()


def delete_local_images(iteration):
    start_time = time()
    delete_local_images_command = Popen(['docker', 'rmi', repo_url + '/' + container_name + '-' + str(iteration)])
    delete_local_images_command.wait()
    end_time = time()
    action_time = end_time - start_time
    print "Iteration", iteration, "has been done in", action_time
    outfile = open(delete_local_results_file, 'a')
    outfile.write('\n' + str(iteration) + "," + str(action_time))
    outfile.close()


def pull_container(iteration):
    start_time = time()
    # No tag is given, so docker uses the default tag "latest".
    pull_command = Popen(['docker', 'pull', repo_url + '/' + container_name + '-' + str(iteration)])
    pull_command.wait()
    end_time = time()
    action_time = end_time - start_time
    print "Iteration", iteration, "has been done in", action_time
    outfile = open(pull_results_file, 'a')
    outfile.write('\n' + str(iteration) + "," + str(action_time))
    outfile.close()


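def repeat():
    # Imported here so that this function is self-contained (Python 2.7 module name).
    from Queue import Empty
    while True:
        try:
            iteration = work_queue.get_nowait()
        except Empty:
            # Another worker took the last item between checks; nothing left to do.
            break
        container_action(iteration)
        work_queue.task_done()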


def fill_queue(iterations):
    for iteration in range(1, (iterations + 1)):
        work_queue.put(iteration)

container_actions = [build_container, push_container, delete_local_images, pull_container]
for container_action in container_actions:
    fill_queue(iterations)
    for thread_num in range(1, (concurrency + 1)):
        if work_queue.empty() is True:
            break
        worker = Thread(target=repeat)
        worker.start()
    work_queue.join()
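
Instead of editing the three variables in the source, they could be read from the command line. The following is a minimal sketch in Python 3 using the standard argparse module. The option names (--iterations, --concurrency, --repo-address) and the parse_args helper are hypothetical and not part of the script above; the defaults are the values hard-coded in it.

```python
import argparse


def parse_args(argv=None):
    """Parse the three tunable settings; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(
        description="Collect build/push/pull timings for a docker repository.")
    parser.add_argument("--iterations", type=int, default=1000,
                        help="number of images to build, push and pull")
    parser.add_argument("--concurrency", type=int, default=30,
                        help="number of threads working at the same time")
    parser.add_argument("--repo-address", default="172.20.9.16:5000",
                        help="address and port of the repository service")
    return parser.parse_args(argv)
```

For example, parse_args(["--concurrency", "10"]) returns a namespace whose concurrency attribute is 10, while iterations and repo_address keep their defaults.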

5.2.3.3. Proposed docker file

FROM ubuntu:14.04
RUN apt-get update && apt-get install -y nginx
EXPOSE 80
CMD /usr/sbin/nginx -g 'daemon off;'