This repository was archived by the owner on Feb 7, 2024. It is now read-only.

Expose task metrics for Prometheus #419

Merged
merged 6 commits into from
Apr 23, 2021

@mgrabovsky mgrabovsky commented Apr 22, 2021

Introduce a new HTTP endpoint, /metrics, that serves basic figures about retrace tasks in a machine-readable format. This allows integration with a Prometheus server.

The following metrics are currently exposed:

  • Free disk space on the volume where task data are stored (/var/spool/retrace-server by default).
  • Number of retrace workers currently running.
  • Total number of retrace jobs denied because the server capacity was exceeded (see the MaxParallelTasks configuration option).
  • Total number of finished retrace tasks, broken down by result (failed or successful).

See https://retrace-stg.aws.fedoraproject.org/metrics for a testing version.
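
For readers unfamiliar with what a Prometheus server scrapes, the following is a minimal sketch of how the four figures above might be rendered in the Prometheus text exposition format. It is not the code from this pull request: the metric names (`retrace_free_disk_bytes` and so on), the label values, and the function names are assumptions made up for illustration.

```python
import shutil
from typing import Dict


def free_bytes(path: str) -> int:
    """Return the free space, in bytes, on the volume that holds `path`."""
    return shutil.disk_usage(path).free


def render_metrics(free: int, workers_running: int, denied_total: int,
                   finished_total: Dict[str, int]) -> str:
    """Render the figures in the Prometheus text exposition format.

    All metric names here are hypothetical, not the ones used by the server.
    """
    lines = [
        "# HELP retrace_free_disk_bytes Free space on the task data volume.",
        "# TYPE retrace_free_disk_bytes gauge",
        f"retrace_free_disk_bytes {free}",
        "# HELP retrace_workers_running Retrace workers currently running.",
        "# TYPE retrace_workers_running gauge",
        f"retrace_workers_running {workers_running}",
        "# HELP retrace_jobs_denied_total Jobs denied due to exceeded capacity.",
        "# TYPE retrace_jobs_denied_total counter",
        f"retrace_jobs_denied_total {denied_total}",
        "# HELP retrace_tasks_finished_total Finished tasks by result.",
        "# TYPE retrace_tasks_finished_total counter",
    ]
    # One sample per result label, in a stable order.
    for result in sorted(finished_total):
        lines.append(
            f'retrace_tasks_finished_total{{result="{result}"}} '
            f"{finished_total[result]}"
        )
    return "\n".join(lines) + "\n"
```

Free space and running workers are modelled as gauges because they can go up and down, while the denied and finished totals are counters because they only grow. A caller could pass `free_bytes("/var/spool/retrace-server")` as the first argument.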

@mgrabovsky mgrabovsky requested a review from msrb April 22, 2021 10:50
@mgrabovsky mgrabovsky force-pushed the metrics branch 2 times, most recently from 756cf7c to 2ee3092 Compare April 22, 2021 11:51
* Use HTTPS everywhere.
* Remove some outdated information.
* Use `@var` instead of `@command` for configuration options.

Factor out the path to /usr/bin/ps into a compile-time variable and
add a dependency on procps-ng in the spec file.

/usr/bin/df from coreutils is used to check for free disk
space.

Also remove some duplicate dependencies.
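
As an illustration of the two commits above, here is a rough sketch of how `ps` and `df` could be invoked and their output parsed. The two binary paths come from the commit messages; the chosen command-line flags, the function names, and the worker process name passed by the caller are assumptions, not the project's actual code.

```python
import subprocess

# Paths named in the commit messages; the project sets the ps path at build time.
PS_BIN = "/usr/bin/ps"
DF_BIN = "/usr/bin/df"


def parse_df_avail(output: str) -> int:
    """Parse `df --output=avail -B1 PATH` output: a header line, then a number."""
    lines = output.strip().splitlines()
    return int(lines[-1].strip())


def free_disk_bytes(path: str) -> int:
    """Ask GNU df for the available bytes on the volume holding `path`."""
    result = subprocess.run(
        [DF_BIN, "--output=avail", "-B1", path],
        check=True, capture_output=True, text=True,
    )
    return parse_df_avail(result.stdout)


def count_processes(ps_output: str, name: str) -> int:
    """Count lines of `ps -e -o comm=` output that equal `name`."""
    return sum(1 for line in ps_output.splitlines() if line.strip() == name)


def running_workers(name: str) -> int:
    """Count running processes whose command name is `name` (hypothetical)."""
    result = subprocess.run(
        [PS_BIN, "-e", "-o", "comm="],
        check=True, capture_output=True, text=True,
    )
    return count_processes(result.stdout, name)
```

Keeping the parsing separate from the subprocess calls means the parsers can be checked against canned output on machines that have neither tool installed. Note that Linux usually truncates the `comm` field to 15 characters, so a long worker name would need to be matched in its truncated form.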

Introduce a new HTTP endpoint `/metrics` serving basic figures regarding
retrace tasks in a machine-readable format. This allows integration with
a Prometheus server.

The following metrics are exposed at the moment:

- Free disk space on the volume where task data are located
  (`/var/spool/retrace-server` by default).
- Number of retrace workers currently running.
- Total number of retrace jobs denied because of exceeded server
  capacity (see the `MaxParallelTasks` configuration option).
- Total number of retrace tasks finished by result -- failed or
  successful.

Add a configuration option for exposing the metrics endpoint.
Keep it disabled by default.
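
The opt-in behaviour described in the last commit could look roughly like the sketch below. The section name `retrace` and the option name `EnableMetrics` are hypothetical placeholders, since the conversation does not name the actual option; only the "disabled by default" behaviour is taken from the commit message.

```python
import configparser


def metrics_enabled(config_text: str) -> bool:
    """Return True only if the (hypothetical) EnableMetrics option is switched on.

    A missing section or option falls back to False, so the endpoint
    stays disabled unless an administrator enables it explicitly.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.getboolean("retrace", "EnableMetrics", fallback=False)
```

A request handler for `/metrics` could then answer with 404 or 403 whenever `metrics_enabled(...)` returns `False`, which keeps existing deployments unchanged after an upgrade.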
Member

@msrb msrb left a comment

This looks really great. I just submitted a request via abrt-retrace-client and I can see the metrics reflecting the fact that there is now a job running on the server.

+1 😉

@mgrabovsky
Contributor Author

OK, I'll just go ahead and merge this. We can watch how it performs on retrace-stg for now and tweak details later.
