Collector configuration

Common options

Options common to all collectors are specified in the [collect] section of the configuration file. The following options are available:

  • collector: Defaults to gnocchi. The name of the collector to load. Must be one of [gnocchi, monasca, prometheus].

  • period: Defaults to 3600. Duration (in seconds) of the collect period.

  • wait_periods: Defaults to 2. Number of collect periods to wait before the current timestamp. This avoids missing data that hasn’t been retrieved by the data source yet.

  • metrics_conf: Defaults to /etc/cloudkitty/metrics.yml. Path of the metric collection configuration file. See “Metric collection” section below for details.

  • scope_key: Defaults to project_id. Key at which the scope can be found. The scope defines how data collection is split between the processors.

Collector-specific options

Collector-specific options must be specified in the collector_{collector_name} section of cloudkitty.conf.

Gnocchi

Section: collector_gnocchi.

  • gnocchi_auth_type: Defaults to keystone. Defines what authentication method should be used by the gnocchi collector. Must be one of basic (for gnocchi basic authentication) or keystone (for classic keystone authentication). If keystone is chosen, credentials can be specified in a section pointed at by the auth_section parameter.

  • gnocchi_user: For gnocchi basic authentication only. The gnocchi user.

  • gnocchi_endpoint: For gnocchi basic authentication only. The gnocchi endpoint.

  • interface: Defaults to internalURL. For keystone authentication only. The interface to use for keystone URL discovery.

  • region_name: Defaults to RegionOne. For keystone authentication only. Region name.

Monasca

Section: collector_monasca.

  • interface: Defaults to internal. The interface to use for keystone URL discovery.

  • monasca_service_name: Defaults to monasca. Name of the Monasca service in Keystone.

Note

By default, cloudkitty retrieves all metrics from Monasca in the project it is identified in. However, some metrics may need to be fetched from another tenant (for example if ceilometer is publishing metrics to monasca in the service tenant but monasca-agent is publishing metrics to the admin tenant). See the monasca-specific section in “Metric collection” below for details on how to configure this.

Prometheus

Section: collector_prometheus.

  • prometheus_url: Prometheus HTTP API URL.

  • prometheus_user: For HTTP basic authentication. The username.

  • prometheus_password: For HTTP basic authentication. The password.

  • cafile: Option to allow custom certificate authority file.

  • insecure: Option to explicitly allow untrusted HTTPS connections.

Metric collection

Metric collection is highly configurable in cloudkitty. In order to keep the main configuration file as clean as possible, metric collection is configured in a separate YAML file. The path to this file defaults to /etc/cloudkitty/metrics.yml, but can be configured:

[collect]
metrics_conf = /my/custom/path.yml

Minimal Configuration

This config file has the following format:

metrics: # top-level key
  metric_one: # metric name
    unit: squirrel
    groupby: # attributes by which metrics should be grouped
      - id
    metadata: # additional attributes to retrieve
      - color

At the top level of the file, a metrics key is required. It contains a dict of metrics to collect, each key of the dict being the name of a metric as it is called in the data source (volume.size or image.size for example).

For each metric, the following attributes are required:

  • unit: the unit in which the metric will be stored after conversion. This is just an indication for humans and has absolutely no impact on metric collection, conversion or rating.

  • groupby: A list of attributes by which metrics should be grouped on collection. These attributes allow data to be re-grouped when it is retrieved through the v2 API. A typical use case would be to group data by ID, project ID, domain ID and user ID on collection, but only by user ID on retrieval.

  • metadata: A list of additional attributes that should be collected for the given metric. These can be used for rating rules and will appear in monthly reports. However, it is not possible to group on these attributes. If you need to group on a metadata attribute, move it to the groupby list.

Note

The scope_key is automatically added to groupby.
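
As an illustration of the typical use case described above, the following sketch groups a metric by several attributes at collection time. The attribute names project_id, domain_id and user_id, as well as the volume_type metadata, are assumptions: the attributes actually available depend on your data source.

```yaml
metrics:
  volume.size: # metric name as it is called in the data source
    unit: GiB
    groupby: # finest grouping available at retrieval time
      - id
      - project_id # assumed attribute name
      - domain_id # assumed attribute name
      - user_id # assumed attribute name
    metadata: # usable in rating rules, but not for grouping
      - volume_type # assumed attribute name
```

With the default scope_key, project_id would be added to groupby automatically; it is listed here only for readability.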

Optional parameters

Unit conversion

If you need to convert the collected qty (from MiB to GiB for example), it can be done with the factor and offset options. factor defaults to 1 and offset to 0. These options are used to calculate the final result with the following formula: qty = collected_qty * factor + offset.

Note

factor and offset can be floats, integers or fractions.

Example from the default configuration file, conversion from B to MiB for the image.size metric:

metrics:
  image.size:
    groupby:
      - id
    metadata:
      - disk_format
    unit: MiB # Final unit
    factor: 1/1048576 # Dividing by 1024 * 1024

Note

Here we don’t add anything, so there is no need to specify offset.
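
When a conversion also needs an offset, both options can be given as fractions. The sketch below applies the formula above to a purely hypothetical metric collected in degrees Fahrenheit and stored in degrees Celsius; the metric name and its attributes are invented for the sake of the arithmetic.

```yaml
metrics:
  hypothetical.temperature: # hypothetical metric name
    groupby:
      - id
    metadata:
      - sensor_type # hypothetical attribute
    unit: degC # Final unit
    factor: 5/9 # Celsius = Fahrenheit * 5/9 - 160/9
    offset: -160/9
    # A collected qty of 212 becomes 212 * 5/9 - 160/9 = 900/9 = 100
```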

Quantity mutation

It is also possible to mutate the collected qty with the mutate option. Five values are accepted for this parameter:

  • NONE: This is the default. The collected data is not modified.

  • CEIL: The qty is rounded up to the closest integer.

  • FLOOR: The qty is rounded down to the closest integer.

  • NUMBOOL: If the collected qty equals 0, leave it at 0. Else, set it to 1.

  • NOTNUMBOOL: If the collected qty equals 0, set it to 1. Else, set it to 0.

Warning

Quantity mutation is done after conversion. Example:

factor: 10
mutate: CEIL

Consequently, the configuration above converts 9.9 to 99 (9.9 -> 99 -> 99) and not to 100 (9.9 -> 10 -> 100).
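
To make the ordering concrete with a realistic conversion, here is a sketch that counts image storage in whole GiB, rounding up. It reuses the image.size metric from the conversion example; the choice of GiB and of CEIL is an assumption made for illustration.

```yaml
metrics:
  image.size:
    groupby:
      - id
    metadata:
      - disk_format
    unit: GiB # Final unit
    factor: 1/1073741824 # Dividing by 1024 * 1024 * 1024
    mutate: CEIL # Applied after the conversion
    # 1610612736 B -> 1.5 GiB (conversion) -> 2 (CEIL)
```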

A typical use case for the NUMBOOL mutator is instance uptime collection with the gnocchi collector. In order to know if an instance is running or paused, you can use the cpu metric, which is at 0 when the instance is paused. The qty is mutated with NUMBOOL because each cpu metric always represents exactly one instance. Rating rules are then defined based on the instance metadata. Example:

metrics:
  cpu:
    unit: instance
    mutate: NUMBOOL
    groupby:
      - id
    metadata:
      - flavor_id

The NOTNUMBOOL mutator is useful for status-like metrics where 0 denotes the billable state. For example, the following Prometheus metric has a value of 0 when the instance is in ACTIVE state but 4 if the instance is in ERROR state:

metrics:
  openstack_nova_server_status:
    unit: instance
    mutate: NOTNUMBOOL
    groupby:
      - id
    metadata:
      - flavor_id

Display name

Sometimes, you’ll want to use another name for a metric, either to shorten it a bit or to make it more explicit. For example, the cpu metric from the previous section could be called instance. That’s what the alt_name option does:

metrics:
  cpu:
    unit: instance
    alt_name: instance
    mutate: NUMBOOL
    groupby:
      - id
    metadata:
      - flavor_id

Collector-specific configuration

Some collectors require extra options. These must be specified through the extra_args option. Some options have defaults, others must always be specified. The extra args for each collector are detailed below.

Gnocchi

Note

In order to retrieve metrics from Gnocchi, CloudKitty uses the dynamic aggregates endpoint. It builds an operation of the following format: (aggregate RE_AGGREGATION_METHOD (metric METRIC_NAME AGGREGATION_METHOD)). This means “retrieve all aggregates of type AGGREGATION_METHOD for the metric named METRIC_NAME and re-aggregate them using RE_AGGREGATION_METHOD”.

By default, the re-aggregation method is the same as the aggregation method.

Setting the re-aggregation method to a different value than the aggregation method is useful when the granularity of the aggregates does not match CloudKitty’s collect period, or when using rate: aggregation, as you probably don’t want a rate of rates, but rather a sum or max of rates.

  • resource_type: No default value. The resource type the current metric is bound to.

  • resource_key: Defaults to id. The attribute containing the unique resource identifier. This is an advanced option; do not modify it unless you know what you’re doing.

  • aggregation_method: Defaults to max. The aggregation method to use when retrieving measures from gnocchi. Must be one of min, max, mean, rate:min, rate:max, rate:mean.

  • re_aggregation_method: Defaults to aggregation_method. The re-aggregation method to use when retrieving measures from gnocchi.

  • force_granularity: Defaults to 0. If > 0, this granularity will be used for metric aggregations. Else, the lowest available granularity will be used (meaning the granularity covering the longest period).

  • use_all_resource_revisions: Defaults to True. This option is useful when using Gnocchi with the patch introduced via https://github.com/gnocchixyz/gnocchi/pull/1059. That patch can cause queries to return more than one entry per granularity (timespan), according to the revisions a resource has. This can be problematic when using the ‘mutate’ option of CloudKitty. This option allows operators to discard all datapoints returned from Gnocchi except the last one in the granularity queried by CloudKitty for a resource id. By default, the previous behavior is maintained: CloudKitty uses all of the data points returned.

  • custom_query: Provides a means for operators to customize the aggregation query executed against Gnocchi. By default, the following query is used: (aggregate RE_AGGREGATION_METHOD (metric METRIC_NAME AGGREGATION_METHOD)). This option enables operators to take full advantage of the operations available in Gnocchi, such as arithmetic and logical operations. When using a custom aggregation query, you can keep the placeholders RE_AGGREGATION_METHOD, AGGREGATION_METHOD, and METRIC_NAME: they will be replaced at runtime by values from the metric configuration.

    One example use case is metrics that are supposed to be always-growing values, such as RadosGW usage data. The usage data is affected by usage data trimming on RadosGW, which can lead to swaps (meaning that the right-side value of the series is smaller than the left-side value) in the data series in Gnocchi. To handle this situation, one could, for instance, use the following custom query: (div (+ (aggregate RE_AGGREGATION_METHOD (metric METRIC_NAME AGGREGATION_METHOD)) (abs (aggregate RE_AGGREGATION_METHOD (metric METRIC_NAME AGGREGATION_METHOD)))) 2). This custom query returns 0 when the values of the series swap.
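
Putting these options together, a Gnocchi-backed metrics file might look like the sketch below. The metric names, attribute names, resource types and the 300-second granularity are assumptions about the deployment, not values taken from CloudKitty; only the option names and the custom query come from the descriptions above.

```yaml
metrics:
  network.incoming.bytes: # assumed metric name
    unit: B
    groupby:
      - id
    metadata:
      - instance_id # assumed attribute name
    extra_args:
      resource_type: instance_network_interface # assumed resource type
      aggregation_method: rate:max
      re_aggregation_method: max # a max of rates rather than a rate of rates
      force_granularity: 300 # assumes aggregates exist at this granularity

  radosgw.objects.size: # assumed metric name
    unit: B
    groupby:
      - id
    metadata:
      - display_name # assumed attribute name
    extra_args:
      resource_type: ceph_account # assumed resource type
      aggregation_method: rate:max
      re_aggregation_method: max
      # The placeholders are replaced at runtime by the values above
      custom_query: "(div (+ (aggregate RE_AGGREGATION_METHOD (metric METRIC_NAME AGGREGATION_METHOD)) (abs (aggregate RE_AGGREGATION_METHOD (metric METRIC_NAME AGGREGATION_METHOD)))) 2)"
```

The custom query computes (x + |x|) / 2, which leaves positive values unchanged and turns negative values into 0.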

Monasca

  • resource_key: Defaults to resource_id. The attribute containing the unique resource identifier. This is an advanced option; do not modify it unless you know what you’re doing.

  • aggregation_method: Defaults to max. The aggregation method to use when retrieving measures from monasca. Must be one of min, max, mean.

  • forced_project_id: Defaults to None. Force the given metric to be fetched from a specific tenant instead of the one cloudkitty is identified in. For example, if cloudkitty is identified in the service project, but needs to fetch a metric from the admin project, its ID should be specified through this option. If this option is set to SCOPE_ID, the metric will be fetched from the current project (this assumes that scopes are configured to be projects/tenants).
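
A Monasca metric using these options could be declared as in the sketch below. The metric name and the dimension names used in groupby and metadata are assumptions; adapt them to what your Monasca deployment actually publishes.

```yaml
metrics:
  image.size: # assumed to be published to Monasca under this name
    unit: MiB
    factor: 1/1048576
    groupby:
      - resource_id # assumed dimension name
    metadata:
      - disk_format # assumed dimension name
    extra_args:
      aggregation_method: max # default, shown for clarity
      # Fetch the metric from the project currently being processed
      # (assumes that scopes are projects/tenants)
      forced_project_id: SCOPE_ID
      # To fetch from one fixed project instead, give its ID here
```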

Prometheus

  • aggregation_method: Defaults to max. The aggregation method to use when retrieving measures from prometheus. Must be one of avg, min, max, sum, count, stddev, stdvar.

  • query_function: Optional argument. The function to apply to an instant vector after the aggregation_method or range_function has altered the data. Must be one of abs, ceil, exp, floor, ln, log2, log10, round, sqrt. For more information on these functions, see the Prometheus documentation on query functions.

  • query_prefix: Optional argument. An arbitrary prefix to add to the Prometheus query generated by CloudKitty, separated by a space.

  • query_suffix: Optional argument. An arbitrary suffix to add to the Prometheus query generated by CloudKitty, separated by a space.

  • range_function: Optional argument. The function to apply instead of the implicit {aggregation_method}_over_time. Must be one of changes, delta, deriv, idelta, irange, irate, rate. For more information on these functions, see the Prometheus documentation on query functions.
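
As a sketch of how these arguments fit into the metrics file, the example below extends the openstack_nova_server_status metric shown earlier and adds a second, hypothetical counter metric. The name http_requests_total and its label names are assumptions, and the exact query CloudKitty generates from these options is not shown here.

```yaml
metrics:
  openstack_nova_server_status:
    unit: instance
    mutate: NOTNUMBOOL
    groupby:
      - id
    metadata:
      - flavor_id
    extra_args:
      aggregation_method: max # default, shown for clarity

  http_requests_total: # hypothetical counter metric
    unit: request
    groupby:
      - id # assumed label name
    metadata:
      - handler # assumed label name
    extra_args:
      aggregation_method: sum
      range_function: delta # used instead of the implicit sum_over_time
      query_function: ceil # applied after range_function
```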