Monitoring

Statsd reporting

Zuul supports the statsd protocol. When enabled and configured (see below), the Zuul scheduler emits raw metrics to a statsd receiver, which you can in turn use to generate graphs.

Configuration

Statsd support relies on the statsd Python module. Zuul starts even if that module is not installed, so an existing Zuul installation may be missing it.

Statsd is configured through the environment variables STATSD_HOST and STATSD_PORT. The statsd module reads them directly; zuul.conf has no equivalent parameters yet. Your init script must set both variables before executing Zuul.

Your init script most likely loads a configuration file named /etc/default/zuul, which would then contain the environment variables:

$ cat /etc/default/zuul
STATSD_HOST=10.0.0.1
STATSD_PORT=8125
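To check that both variables are actually visible to a process started from your init script, a small sketch like the following may help. It uses only the Python standard library, not the statsd module itself. The helper name statsd_target and the fallback values (localhost, 8125) are assumptions made for illustration.

```python
import os


def statsd_target(environ=None):
    """Return the (host, port) pair described by STATSD_HOST and STATSD_PORT.

    The fallback values are illustrative assumptions, used only when a
    variable is unset.
    """
    env = os.environ if environ is None else environ
    host = env.get("STATSD_HOST", "localhost")
    port = int(env.get("STATSD_PORT", "8125"))
    return host, port


if __name__ == "__main__":
    # Print the target this process would use, e.g. when run from the
    # same environment as the init script.
    print("statsd target: %s:%d" % statsd_target())
```

Running it from the same environment as the Zuul init script shows whether the values from /etc/default/zuul were picked up.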

Metrics

These metrics are emitted by the Zuul Scheduler:

gerrit.event.<type> (counter)

Gerrit emits different kinds of messages over its stream-events interface. Zuul will report counters for each type of event it receives from Gerrit.

Refer to your Gerrit installation documentation for a complete list of Gerrit event types.
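As a rough illustration of how an event maps to a counter name, the sketch below derives the key from one line of stream-events output, which Gerrit emits as a JSON object with a "type" field. The function name gerrit_event_key is hypothetical and this is not Zuul's own code.

```python
import json


def gerrit_event_key(line):
    """Return the counter name for one JSON line from Gerrit stream-events."""
    event = json.loads(line)
    # The event type becomes the last component of the metric name.
    return "gerrit.event.%s" % event["type"]
```

For example, a line whose type is patchset-created would be counted under gerrit.event.patchset-created.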

zuul.pipeline

Holds metrics specific to jobs. This hierarchy includes:

zuul.pipeline.<pipeline name>

A set of metrics for each pipeline, named as defined in the Zuul config.

zuul.pipeline.<pipeline name>.all_jobs (counter)

Number of jobs triggered by the pipeline.

zuul.pipeline.<pipeline name>.current_changes (gauge)

The number of items currently being processed by this pipeline.

zuul.pipeline.<pipeline name>.job

Subtree detailing per-job statistics:

zuul.pipeline.<pipeline name>.job.<jobname>

The triggered job name.

zuul.pipeline.<pipeline name>.job.<jobname>.<result> (counter, timer)

A counter for each type of result (e.g., SUCCESS, FAILURE, or ERROR) for the job. If the result is SUCCESS or FAILURE, Zuul will additionally report the duration of the build as a timer.

zuul.pipeline.<pipeline name>.resident_time (timer)

A timer metric reporting how long each item has been in the pipeline.

zuul.pipeline.<pipeline name>.total_changes (counter)

The number of changes processed by the pipeline since Zuul started.

zuul.pipeline.<pipeline name>.wait_time (timer)

How long each item spent in the pipeline before its first job started.

zuul.pipeline.<pipeline name>.<project>

This hierarchy holds more specific metrics for each project participating in the pipeline. If the project name contains a / character, it will be replaced with a . character.
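The name substitution can be sketched as follows. The function name project_metric_prefix and the sample project name are hypothetical, and the snippet only mirrors the rule stated above rather than reproducing Zuul's implementation.

```python
def project_metric_prefix(pipeline, project):
    """Build the per-project key prefix, replacing '/' with '.'."""
    return "zuul.pipeline.%s.%s" % (pipeline, project.replace("/", "."))
```

With this rule, a project named openstack/nova in the gate pipeline would report under zuul.pipeline.gate.openstack.nova, so each path segment of the project name becomes its own component of the metric name.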

zuul.pipeline.<pipeline name>.<project>.current_changes (gauge)

The number of items of this project currently being processed by this pipeline.

zuul.pipeline.<pipeline name>.<project>.resident_time (timer)

A timer metric reporting how long each item for this project has been in the pipeline.

zuul.pipeline.<pipeline name>.<project>.total_changes (counter)

The number of changes for this project processed by the pipeline since Zuul started.

As an example, given a job named myjob in the gate pipeline that took 40 seconds to build and succeeded, the Zuul scheduler will emit the following statsd events:

  • zuul.pipeline.gate.job.myjob.SUCCESS +1
  • zuul.pipeline.gate.job.myjob.SUCCESS 40 seconds
  • zuul.pipeline.gate.all_jobs +1
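
On the wire, statsd messages take the form name:value|type, with c for counters and ms for timers (in milliseconds). The sketch below shows what the events for such a build might look like as statsd messages. The helper names are hypothetical, and it assumes the timer is reported under the per-result key, as the metric list above describes; it is not Zuul's own code.

```python
def counter(key, value=1):
    """Format a statsd counter message."""
    return "%s:%d|c" % (key, value)


def timer(key, milliseconds):
    """Format a statsd timer message (value in milliseconds)."""
    return "%s:%d|ms" % (key, milliseconds)


def build_events(pipeline, job, result, duration_seconds):
    """Return the statsd messages for one completed build."""
    key = "zuul.pipeline.%s.job.%s.%s" % (pipeline, job, result)
    events = []
    # Only SUCCESS and FAILURE results carry a build duration.
    if result in ("SUCCESS", "FAILURE"):
        events.append(timer(key, int(duration_seconds * 1000)))
    events.append(counter(key))
    events.append(counter("zuul.pipeline.%s.all_jobs" % pipeline))
    return events
```

Calling build_events("gate", "myjob", "SUCCESS", 40) yields one timer message with a value of 40000 ms and two counter messages, one for the job result and one for all_jobs.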