mirror of https://we.phorge.it/source/phorge.git synced 2024-11-25 16:22:43 +01:00

Improve some behaviors around memory pressure when pushing many and/or large changes

Summary:
Ref T13142. When commits are pushed, we try to handle them on one of two pathways:

  - Normal changes: we load these into memory and potentially apply Herald content rules to them.
  - "Enormous" changes: we don't load these into memory and skip content rules for them.

The goal is to degrade gracefully when users push huge changes: they should work, just not support all the features.
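
A minimal sketch of the two-pathway split, assuming the decision is made purely on the byte length of the raw diff. The helper name `classify_change_by_size()` is hypothetical and is not part of the codebase, and whether the real comparison is strict or inclusive is an assumption here.

```php
<?php

// Hypothetical helper, not part of Phabricator: picks a pathway for a change
// based only on the byte length of its raw diff.
function classify_change_by_size($raw_diff, $enormous_limit) {
  // strlen() counts bytes. Assumption: a diff strictly larger than the limit
  // is treated as "enormous".
  if (strlen($raw_diff) > $enormous_limit) {
    return 'enormous';
  }
  return 'normal';
}

// 256MB, the threshold this change moves to.
$limit = 256 * 1024 * 1024;

// A tiny diff takes the normal pathway, so content rules could apply.
echo classify_change_by_size("+one small line\n", $limit), "\n";
```

A "normal" result means the change is loaded and content rules can run; an "enormous" result means the change is still accepted or rejected by other rules, but never loaded into memory.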

However, some changes can slip through the cracks right now:

  - If you push a lot of commits at once, we'll try to cache all of the changes smaller than 1GB in memory. This can require an arbitrarily large amount of RAM.
  - We calculate sizes by just looking at the `strlen()` of the diff, but a changeset takes more RAM in PHP than the raw diff does. So even if a diff is "only" 500MB, it can take much more memory than that. On systems with relatively little memory available, this may result in OOM while processing changes that are close to the "enormous" limit.

This change makes two improvements:

  - Instead of caching everything, cache only 64MB of things.
    - For most pushes, this is the same, since they have less than 64MB of diffs.
    - For pushes of single very large changes, this is a bit slower (more CPU) since we have to do some work twice.
    - For pushes of many changes, this is slower (more CPU) since we have to do some work twice, but, critically, doesn't require unlimited memory.
  - Instead of flagging changes as "enormous" at 1GB, flag them as "enormous" at 256MB.
    - This reduces how much memory is required to process the largest "non-enormous" changes.
    - This also gets us under Git's hard-coded 512MB "always binary" cutoff; see T13143.
    - This is still completely gigantic and way larger than any normal change should be.
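
The "cache only 64MB of things" idea can be sketched roughly like this. `BoundedChangesetCache` is a hypothetical stand-in, not a class in the codebase; the real change keeps a running size counter on the hook engine instead. The sketch uses a tiny budget so the behavior is easy to see.

```php
<?php

// Hypothetical stand-in for the hook engine's cache; names are illustrative.
final class BoundedChangesetCache {
  private $items = array();
  private $size = 0;
  private $limit;

  public function __construct($limit) {
    $this->limit = $limit;
  }

  // Cache the value only if it fits in the remaining byte budget.
  public function offer($key, $value, $size) {
    if ($this->size + $size > $this->limit) {
      return false;
    }
    $this->items[$key] = $value;
    $this->size += $size;
    return true;
  }

  // Return the cached value, or call $loader to redo the work on a miss.
  public function get($key, callable $loader) {
    if (array_key_exists($key, $this->items)) {
      return $this->items[$key];
    }
    return $loader($key);
  }
}

$cache = new BoundedChangesetCache(10);
$cache->offer('a', 'small', 5);   // Fits: 5 <= 10.
$cache->offer('b', 'too-big', 7); // Skipped: 5 + 7 > 10.
$cache->offer('c', 'tiny', 4);    // Fits: 5 + 4 <= 10.

// 'b' was never cached, so the loader runs again. This is the extra CPU cost.
echo $cache->get('b', function ($key) {
  return 'reloaded '.$key;
}), "\n";
```

Anything that misses the cache is simply loaded a second time when it is needed, which is the CPU-for-memory trade described above.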

An additional improvement would be to try to reduce the amount of memory we need to use to hold a change in process memory. I think the other changes here alone will fix the immediate issue in PHI657, but it would be nice if the "largest non-enormous change" required only a couple gigs of RAM.

Test Plan:
- Used `ini_set('memory_limit', '1G')` to artificially limit memory to 1GB.
- Pushed a series of two commits which add two 550MB text files. (Temporarily, I added a `--binary` flag to trick Git into showing real diffs for these; see T13143.)
- Got a memory limit error.
- Applied the "cache only 64MB of stuff" and "consider 256MB, not 1GB, to be enormous" changes.
- Pushed again, got properly rejected as enormous.
- Added `memory_get_usage()` calls to compare actual memory use against the reported "size" estimate. For these changes, a 639MB diff required 31,479MB of memory, a factor of about 50x. This is, uh, pretty not great.
- Allowed enormous changes, pushed again, push went through.
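
For reference, a rough sketch of the kind of measurement described in the test plan. The `measure_overhead()` helper and the toy line-splitting "parser" are hypothetical; the real measurement wrapped the actual diff parser, and the numbers this prints will differ from the ones reported above.

```php
<?php

// Hypothetical measurement helper: compares the strlen()-based size estimate
// with the growth in memory_get_usage() while the parsed structure is held.
function measure_overhead($raw, callable $parse) {
  $before = memory_get_usage();
  $parsed = $parse($raw);
  $after = memory_get_usage();

  return array(
    'estimate' => strlen($raw),
    'actual' => $after - $before,
    'parsed' => $parsed,
  );
}

// Toy "parser": one array entry per line, standing in for real changesets.
$raw = str_repeat("+line of text\n", 10000);
$result = measure_overhead($raw, function ($text) {
  return explode("\n", $text);
});

printf(
  "estimate: %d bytes, actual: %d bytes\n",
  $result['estimate'],
  $result['actual']);

// The ratio from the test plan numbers: 31,479MB / 639MB.
printf("reported factor: %.1fx\n", 31479 / 639);
```

The last line works out the reported ratio, roughly 49.3, which is where the "about 50x" figure comes from.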

Reviewers: amckinley

Maniphest Tasks: T13142

Differential Revision: https://secure.phabricator.com/D19455
epriestley 2018-05-18 12:23:23 -07:00
parent 3c5668b4a5
commit de999af614
2 changed files with 23 additions and 5 deletions

@@ -37,6 +37,7 @@ final class DiffusionCommitHookEngine extends Phobject {
   private $rejectDetails;
   private $emailPHIDs = array();
   private $changesets = array();
+  private $changesetsSize = 0;
   /* -( Config )------------------------------------------------------------- */
@@ -1121,11 +1122,22 @@ final class DiffusionCommitHookEngine extends Phobject {
       return;
     }
+    // See T13142. Don't cache more than 64MB of changesets. For normal small
+    // pushes, caching everything here can let us hit the cache from Herald if
+    // we need to run content rules, which speeds things up a bit. For large
+    // pushes, we may not be able to hold everything in memory.
+    $cache_limit = 1024 * 1024 * 64;
     foreach ($content_updates as $update) {
       $identifier = $update->getRefNew();
       try {
-        $changesets = $this->loadChangesetsForCommit($identifier);
-        $this->changesets[$identifier] = $changesets;
+        $info = $this->loadChangesetsForCommit($identifier);
+        list($changesets, $size) = $info;
+        if ($this->changesetsSize + $size <= $cache_limit) {
+          $this->changesets[$identifier] = $changesets;
+          $this->changesetsSize += $size;
+        }
       } catch (Exception $ex) {
         $this->changesets[$identifier] = $ex;
@@ -1207,7 +1219,11 @@ final class DiffusionCommitHookEngine extends Phobject {
     $changes = $parser->parseDiff($raw_diff);
     $diff = DifferentialDiff::newEphemeralFromRawChanges(
       $changes);
-    return $diff->getChangesets();
+    $changesets = $diff->getChangesets();
+    $size = strlen($raw_diff);
+    return array($changesets, $size);
   }
   public function getChangesetsForCommit($identifier) {
@@ -1221,7 +1237,9 @@ final class DiffusionCommitHookEngine extends Phobject {
       return $cached;
     }
-    return $this->loadChangesetsForCommit($identifier);
+    $info = $this->loadChangesetsForCommit($identifier);
+    list($changesets, $size) = $info;
+    return $changesets;
   }
   public function loadCommitRefForCommit($identifier) {

@@ -207,7 +207,7 @@ final class HeraldCommitAdapter
   }
   public static function getEnormousByteLimit() {
-    return 1024 * 1024 * 1024; // 1GB
+    return 256 * 1024 * 1024; // 256MB. See T13142 and T13143.
   }
   public static function getEnormousTimeLimit() {