mirror of https://we.phorge.it/source/phorge.git synced 2024-12-26 23:40:57 +01:00

Count lines in build log slices more cheaply

Summary:
See PHI766. Ref T13164. Build log chunk processing does a `preg_split()` on slices, but this isn't terribly efficient.

We can get the same count more cheaply by calling `substr_count()` three times (for `\r\n`, `\r`, and `\n`) and subtracting the `\r\n` pairs that would otherwise be counted twice.

(I also tried `preg_match_all()`, which was between the two in speed.)
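
As a rough illustration of why the two approaches agree, the sketch below compares them on a few hand-written slices. The function names and sample strings are hypothetical and are not part of the change; only the two counting expressions are taken from it.

```php
<?php

// Hypothetical helper: the original approach, splitting on any newline
// sequence and counting the resulting pieces.
function count_newlines_split($slice) {
  return count(preg_split("/\r\n|\r|\n/", $slice)) - 1;
}

// Hypothetical helper: the replacement approach. Every "\r\n" pair also
// shows up once in the "\r" total and once in the "\n" total, so the sum
// below reduces to ($n_r + $n_n - $n_rn).
function count_newlines_substr($slice) {
  $n_rn = substr_count($slice, "\r\n");
  $n_r = substr_count($slice, "\r");
  $n_n = substr_count($slice, "\n");
  return ($n_rn) + ($n_r - $n_rn) + ($n_n - $n_rn);
}

// Sample slices chosen for illustration, covering each newline style
// and a mixture of all three.
$samples = array(
  "",
  "no newline",
  "a\nb\n",
  "a\r\nb\r\n",
  "a\rb\r",
  "mixed\r\n\r\r\n\n\n\rend",
);

foreach ($samples as $slice) {
  if (count_newlines_split($slice) !== count_newlines_substr($slice)) {
    throw new Exception('Mismatch for: '.json_encode($slice));
  }
}
```

The regular expression tries `\r\n` before `\r` or `\n`, so a pair is always consumed as a single newline. That is the behavior the subtraction reproduces.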

Test Plan:
- Used `bin/harbormaster rebuild-log --id X --force` to rebuild logs. Verified that the linemap is identical before/after this change.
- Saw local time for the 18MB log in PHI766 drop from ~1.7s to ~900ms, and `preg_split()` drop out of the profiler (we're now spending the biggest chunk of time on `gzdeflate()`).
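
A minimal timing sketch for comparing the three approaches locally might look like the following. The input is synthetic (an assumption standing in for a real log slice) and the variable names are hypothetical, so the absolute numbers will not match the figures above. It only shows how the comparison could be reproduced.

```php
<?php

// Synthetic input: an assumption, not the log from PHI766.
$slice = str_repeat("line of build output\r\nanother line\n", 100000);

// Three ways of counting newlines in a slice.
$counters = array(
  'preg_split' => function ($s) {
    return count(preg_split("/\r\n|\r|\n/", $s)) - 1;
  },
  'preg_match_all' => function ($s) {
    return preg_match_all("/\r\n|\r|\n/", $s);
  },
  'substr_count' => function ($s) {
    $n_rn = substr_count($s, "\r\n");
    return substr_count($s, "\r") + substr_count($s, "\n") - $n_rn;
  },
);

// Run each counter once and report its count and wall-clock time.
$results = array();
foreach ($counters as $name => $counter) {
  $start = microtime(true);
  $results[$name] = $counter($slice);
  $elapsed_ms = (microtime(true) - $start) * 1000;
  printf("%-16s %8d newlines %10.2f ms\n", $name, $results[$name], $elapsed_ms);
}
```

A single run per approach is noisy. Repeating each measurement several times and using a profiler, as in the test plan, gives a more reliable picture.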

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13164

Differential Revision: https://secure.phabricator.com/D19545
epriestley 2018-07-27 13:22:36 -07:00
parent 690a460c8e
commit 9cf3b3bbf8

@@ -643,7 +643,13 @@ final class HarbormasterBuildLog
 $pos += $slice_length;
 $map_bytes += $slice_length;
-$line_count += count(preg_split("/\r\n|\r|\n/", $slice)) - 1;
+// Count newlines in the slice. This goofy approach is meaningfully
+// faster than "preg_match_all()" or "preg_split()". See PHI766.
+$n_rn = substr_count($slice, "\r\n");
+$n_r = substr_count($slice, "\r");
+$n_n = substr_count($slice, "\n");
+$line_count += ($n_rn) + ($n_r - $n_rn) + ($n_n - $n_rn);
 if ($map_bytes >= ($marker_distance - $max_utf8_width)) {
 $map[] = array(