llvm-premerge-checks/scripts/metrics/README.md

# Metrics

To measure the impact and usefulness of the pre-merge checks, we want to collect
a set of metrics. This document summarizes the metrics and tools. All data
shall be collected as time series so that we can see changes over time.

* Impact - The metrics we ultimately want to improve
    * Percentage of [Buildbot builds](http://lab.llvm.org:8011/) on master
      that fail. (Buildbot_percentage_failing)
    * Time to fix a broken master build: time from the first failing build
      until the build is fixed. (BuildBot_time_to_fix)
    * Percentage of Revisions on Phabricator where a broken build was fixed 
      afterwards. This would indicate that a bug was found and fixed during 
      the code review phase. (Premerge_fixes)
    * Number of reverts on master. This indicates that something was broken on
      master that slipped through the pre-merge tests or was submitted without
      any review. (Upstream_reverts)

* Users and behavior - Interesting to see and useful to adapt our approach.
    * Percentage of commits to master that went through Phabricator.
    * Number of participants in pre-merge tests.
    * Percentage of Revisions with pre-merge tests executed.
    * Number of 30-day active committers on master and Phabricator.

* Builds - See how the infrastructure is doing.
    * Time between the upload of a diff and the availability of build results.
    * Percentage of Revisions with successful/failed tests.
    * Number of pre-merge builds/day.
    * Build queuing time.
    * Individual times for `cmake`, `ninja all`, `ninja check-all` per 
      OS/architecture.
    * Result storage size.
    * Percentage of builds failing.
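
As an illustration of how two of the impact metrics could be computed, here is a minimal Python sketch. It assumes build results are already available as a list of records; the `Build` type and its fields are hypothetical and not part of any existing tooling. It also uses the finish time of a build as its timestamp, which is only one possible reading of "start of failing builds".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Build:
    """One finished build. Hypothetical record, not a Buildbot type."""
    finished: datetime
    passed: bool


def percentage_failing(builds: List[Build]) -> float:
    """Buildbot_percentage_failing: share of failing builds in percent."""
    if not builds:
        return 0.0
    failed = sum(1 for b in builds if not b.passed)
    return 100.0 * failed / len(builds)


def times_to_fix(builds: List[Build]) -> List[timedelta]:
    """BuildBot_time_to_fix: one duration per streak of failing builds,
    measured from the first failing build to the next passing build."""
    durations = []
    broken_since = None
    for build in sorted(builds, key=lambda b: b.finished):
        if not build.passed and broken_since is None:
            broken_since = build.finished  # streak starts here
        elif build.passed and broken_since is not None:
            durations.append(build.finished - broken_since)
            broken_since = None  # build is fixed
    return durations


# Made-up sample data, for illustration only.
sample = [
    Build(datetime(2020, 2, 17, 10, 0), True),
    Build(datetime(2020, 2, 17, 11, 0), False),
    Build(datetime(2020, 2, 17, 12, 0), False),
    Build(datetime(2020, 2, 17, 13, 30), True),
]
print(percentage_failing(sample))              # 50.0
print([str(d) for d in times_to_fix(sample)])  # ['2:30:00']
```

A streak that is still broken at the end of the data contributes no duration in this sketch; whether to count it up to "now" is an open design choice.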

# Requirements

* Must: 
    * Do not collect/store personal data.
* Should:
    * Minimize the amount of additional tools/scripts we need to maintain.
    * Collect all metrics in a central location for easy evaluation (e.g. 
      database, CSV files).
* Nice to have:
    * As the data is from an open source project and available anyway, give 
      public access to the metrics (numbers and charts). 
    * Send out alerts/notifications.
    * Show live data in charts.


# Data sources

This section will explain where we can get the data from.

* Buildbot statistics

# Solution

We need to find solutions for these parts:
* Collect the data (regularly).
* Store the time series somewhere.
* Create & display charts.

Some ideas for this:
* Bunch of scripts:
    * Run a set of scripts manually from time to time to generate the
      metrics. Phabricator already has a database, and most entries there
      have timestamps, so we could also reconstruct the history from that.
    * TODO: Figure out if we can collect the most important metrics this way. 
      This requires that we can reconstruct historic values from the current
      logs/git/database/... entries.
* Jenkins + CSV + Sheets:
    * Collect the data with Jenkins.
    * Store the numbers as CSV in this repo.
    * Create the charts manually in Google Sheets.
* Do it yourself:
    * Collect the data with Jenkins jobs.
    * Store the data in Prometheus.
    * Visualize with Grafana.
    * Host all tools ourselves.
* Stackdriver on GCP:
    * TODO: Figure out if we can get all the required data into Stackdriver.
* Jupyter notebooks:
    * TODO: Figure out how that works.
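
For the CSV-based idea, storing a time series could be as simple as appending one row per sample. The following Python sketch assumes a three-column layout (`timestamp`, `metric`, `value`); that layout and the `append_sample` helper are hypothetical and only meant to show the shape of the data, not a decided format.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def append_sample(csv_path: Path, metric: str, value: float,
                  timestamp: datetime) -> None:
    """Append one sample to a CSV time series.

    Writes a header row first if the file does not exist yet.
    The column layout is an assumption made for this sketch.
    """
    is_new = not csv_path.exists()
    with csv_path.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["timestamp", "metric", "value"])
        # ISO 8601 timestamps sort correctly as plain strings.
        writer.writerow([timestamp.isoformat(), metric, value])
```

A periodic job could then call, for example, `append_sample(Path("metrics.csv"), "Upstream_reverts", count, datetime.now(timezone.utc))`, where `count` is the value it just measured. Keeping all metrics in one long-format file makes it easy to import into a spreadsheet, and no personal data is involved as long as only aggregated numbers are written.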