SQL_queries	added more queries	2021-12-14 11:40:34 +01:00
.gitignore	script to get statistics from git repo	2020-02-21 13:19:11 +01:00
analyze_jobs.ipynb	rename master to main branch following LLVM	2020-12-10 09:29:24 +01:00
buildbot_monitoring.py	use same db for all tables	2021-06-18 12:21:41 +02:00
buildbot_status_emails.py	creating tmp dir as needed	2020-05-05 16:54:08 +02:00
buildbots.py	fixed initialisation	2020-02-17 14:55:10 +01:00
buildkite_master_stats.py	rename master to main branch following LLVM	2020-12-10 09:29:24 +01:00
connect_db.sh	cron jobs for buildbot and phab monitoring	2021-05-20 21:44:19 +02:00
download_buildkite_builds_pg.sh	invoke script	2023-09-04 18:21:09 +02:00
jenkins.py	fixed report writing	2020-05-06 17:24:26 +02:00
load_buildkite.py	update buildkite monitoring	2021-06-28 10:16:36 +02:00
Pipfile	Cron job to load BK data to DB	2021-05-20 17:30:43 +02:00
Pipfile.lock	Bump urllib3 from 1.26.16 to 1.26.17 in /scripts/metrics	2023-10-04 10:24:52 +02:00
pull_phab_build_stats.py	added more phabricator metrics	2020-08-13 13:14:54 +02:00
README.md	rename master to main branch following LLVM	2020-12-10 09:29:24 +01:00
repo.py	rename master to main branch following LLVM	2020-12-10 09:29:24 +01:00
repo_hist.py	improved git metrics script (#295 )	2021-04-27 16:42:38 +02:00
repo_hist_db.py	repo_hist_db now using postgres DB	2021-05-11 10:09:51 +02:00
server_monitoring.py	use same db for all tables	2021-06-18 12:21:41 +02:00

README.md

Metrics

To measure the impact and usefulness of the pre-merge checks, we want to collect a set of metrics. This document summarizes the metrics and the tools used to collect them. All data shall be collected as time series, so that we can see changes over time.

Impact - The metrics we ultimately want to improve
- Percentage of Buildbot builds on main that fail. (Buildbot_percentage_failing)
- Time to fix a broken main build: time from the first failing build until the build passes again. (BuildBot_time_to_fix)
- Percentage of Revisions on Phabricator where a broken build was fixed afterwards. This would indicate that a bug was found and fixed during the code review phase. (Premerge_fixes)
- Number of reverts on main. This indicates that something was broken on main that slipped through the pre-merge tests or was submitted without any review. (Upstream_reverts)
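As a minimal sketch, the first two impact metrics could be computed from a list of build results as shown here. The `Build` record and its field names are assumptions made for illustration and are not taken from Buildbot's data model; the finish time of a build is used as an approximation of when a breakage started or ended.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Build:
    # Hypothetical record; field names are not taken from Buildbot.
    finished_at: datetime
    passed: bool


def percentage_failing(builds: List[Build]) -> float:
    """Share of failing builds in percent (0.0 for an empty list)."""
    if not builds:
        return 0.0
    failed = sum(1 for b in builds if not b.passed)
    return 100.0 * failed / len(builds)


def times_to_fix(builds: List[Build]) -> List[timedelta]:
    """Length of each broken period: first failing build to next passing build."""
    durations: List[timedelta] = []
    broken_since: Optional[datetime] = None
    for b in sorted(builds, key=lambda b: b.finished_at):
        if not b.passed and broken_since is None:
            # A new broken period starts with the first failing build.
            broken_since = b.finished_at
        elif b.passed and broken_since is not None:
            # The first passing build afterwards ends the broken period.
            durations.append(b.finished_at - broken_since)
            broken_since = None
    return durations
```

A broken period that has not been fixed yet is not reported by `times_to_fix`; whether such open periods should count is a design choice this sketch leaves open.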
Users and behavior - Interesting to see and useful to adapt our approach.
- Percentage of commits to main that went through Phabricator.
- Number of participants in pre-merge tests.
- Percentage of Revisions with pre-merge tests executed.
- Number of 30-day active committers on main and Phabricator.
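The 30-day active committer count could be derived roughly as follows. This is a sketch under assumptions: the input is a hypothetical iterable of `(author_id, commit_time)` pairs, and the function name and parameters are made up for illustration. Only the aggregate count is returned, so no author identifiers need to be stored, in line with the requirement not to keep personal data.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def active_committers(
    commits: Iterable[Tuple[str, datetime]],
    as_of: datetime,
    window_days: int = 30,
) -> int:
    """Count distinct authors with at least one commit in the window ending at as_of."""
    window_start = as_of - timedelta(days=window_days)
    authors = set()
    for author_id, committed_at in commits:
        if window_start < committed_at <= as_of:
            authors.add(author_id)
    # Only the number of distinct authors leaves this function.
    return len(authors)
```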
Builds - See how the infrastructure is doing.
- Time from upload of a diff until build results are available.
- Percentage of Revisions with successful/failed tests.
- Number of pre-merge builds/day.
- Build queuing time.
- Individual times for cmake, ninja all, ninja check-all per OS/architecture.
- Result storage size.
- Percentage of builds failing.
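Several of the build metrics reduce to differences between timestamps. The sketch below assumes a hypothetical `PremergeBuild` record with four timestamps; the record and its field names are illustrative and not taken from Buildkite or Phabricator.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from typing import Dict, List


@dataclass
class PremergeBuild:
    # Hypothetical record; all field names are assumptions.
    diff_uploaded_at: datetime
    scheduled_at: datetime
    started_at: datetime
    finished_at: datetime


def queuing_time(build: PremergeBuild) -> timedelta:
    """Time the build waited between being scheduled and being started."""
    return build.started_at - build.scheduled_at


def time_to_results(build: PremergeBuild) -> timedelta:
    """Time from upload of the diff until build results are available."""
    return build.finished_at - build.diff_uploaded_at


def builds_per_day(builds: List[PremergeBuild]) -> Dict[date, int]:
    """Number of builds started on each calendar day."""
    return dict(Counter(b.started_at.date() for b in builds))
```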

Requirements

Must:
- Do not collect/store personal data.
Should:
- Minimize the amount of additional tools/scripts we need to maintain.
- Collect all metrics in a central location for easy evaluation (e.g. database, CSV files).
Nice to have:
- As the data is from an open source project and available anyway, give public access to the metrics (numbers and charts).
- Send out alerts/notifications.
- Show live data in charts.

Data sources

This section will explain where we can get the data from.

Buildbot statistics

Solution

We need to find solutions for these parts:

- Collect the data (regularly).
- Store the time series somewhere.
- Create & display charts.

Some ideas for this:

Bunch of scripts:
- Run a bunch of scripts manually to generate the metrics every now and then. Phabricator already has a database and most entries there have timestamps. So we could also reconstruct the history from that.
- TODO: Figure out if we can collect the most important metrics this way. This requires that we can reconstruct historic values from the current logs/git/database/... entries.
Jenkins + CSV + Sheets:
- Collect data with Jenkins.
- Store numbers as CSV in this repo.
- Create charts manually in Google Sheets.
Do it yourself:
- Collect data with Jenkins jobs.
- Store the data in Prometheus.
- Visualize with Grafana.
- Host all tools ourselves.
Stackdriver on GCP:
- TODO: Figure out if we can get all the required data into Stackdriver.
Jupyter notebooks:
- TODO: Figure out how that works.
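To illustrate the ideas of reconstructing history from timestamped entries and storing the numbers as CSV, here is a minimal sketch. It assumes the only input is a collection of event timestamps (for example build start times); the function names and the CSV column layout are hypothetical and do not describe the existing scripts in this directory.

```python
import csv
from collections import Counter
from datetime import date, datetime, timedelta
from typing import Dict, Iterable, TextIO


def daily_series(timestamps: Iterable[datetime]) -> Dict[date, int]:
    """Rebuild a gap-free daily time series from timestamped events."""
    counts = Counter(ts.date() for ts in timestamps)
    if not counts:
        return {}
    day, last_day = min(counts), max(counts)
    series: Dict[date, int] = {}
    while day <= last_day:
        # Days without events are kept as explicit zeros.
        series[day] = counts.get(day, 0)
        day += timedelta(days=1)
    return series


def write_series_csv(series: Dict[date, int], out: TextIO, metric: str) -> None:
    """Write the series as CSV with a 'date' column and one metric column."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", metric])
    for day, value in series.items():
        writer.writerow([day.isoformat(), value])
```

Writing explicit zeros for days without events keeps charts built on top of the CSV from silently interpolating across gaps.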