mirror of https://we.phorge.it/source/phorge.git synced 2024-12-23 14:00:56 +01:00

Make the "Is this JSON?" DocumentEngine heuristic a little tighter

Summary:
See PHI749. Ref T13164. We currently misdetect files starting with `[submodule ...` as JSON.

Make this a bit stricter:

  - If the file is short (the snippet is the whole file), just check whether it is actually valid JSON.
  - If the file is long (longer than the snippet), give up and don't detect it as JSON.

This should get the right result in pretty much all the cases people care about, I think. We could make the long-file guesser better some day.
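
As a rough illustration of the two rules above, here is a minimal standalone sketch. It is not the Phabricator code: `looks_like_json()` and its parameters are hypothetical names, and it uses PHP's built-in `json_decode()` plus an `is_array()` check as an assumed stand-in for `phutil_json_decode()`, which only accepts arrays and objects.

```php
<?php

// Hypothetical sketch of the stricter heuristic, not the real implementation.
// $snippet is the leading chunk of the file; $byte_length is the size of the
// whole file in bytes.
function looks_like_json(string $snippet, int $byte_length): bool {
  // Long file: the snippet is only a prefix, so give up.
  if (strlen($snippet) < $byte_length) {
    return false;
  }

  // Short file: the snippet is the whole file, so just try to parse it.
  // With the second argument set to true, JSON objects and arrays both
  // decode to PHP arrays; scalars like "3" and invalid input do not.
  $decoded = json_decode($snippet, true);
  return is_array($decoded);
}

// The two cases from the test plan, using made-up file contents.
$gitmodules = "[submodule \"lib\"]\n\tpath = lib\n";
$duck = '{"duck": "quack"}';

var_dump(looks_like_json($gitmodules, strlen($gitmodules))); // bool(false)
var_dump(looks_like_json($duck, strlen($duck)));             // bool(true)
```

A file containing only `3` is also rejected here, because the decoded value is an integer rather than an array.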

Test Plan: Detected a `[submodule ...` file (no longer JSON) and a `{"duck": "quack"}` file (still JSON).

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13164

Differential Revision: https://secure.phabricator.com/D19544
epriestley 2018-07-27 11:26:42 -07:00
parent 727bc2234c
commit cb99396c64

@@ -119,11 +119,23 @@ final class PhabricatorDocumentRef
     }
 
     $snippet = $this->getSnippet();
-    if (!preg_match('/^\s*[{[]/', $snippet)) {
+
+    // If the file is longer than the snippet, we don't detect the content
+    // as JSON. We could use some kind of heuristic here if we wanted, but
+    // see PHI749 for a false positive.
+    if (strlen($snippet) < $this->getByteLength()) {
       return false;
     }
 
-    return phutil_is_utf8($snippet);
+    // If the snippet is the whole file, just check if the snippet is valid
+    // JSON. Note that `phutil_json_decode()` only accepts arrays and objects
+    // as JSON, so this won't misfire on files with content like "3".
+    try {
+      phutil_json_decode($snippet);
+      return true;
+    } catch (Exception $ex) {
+      return false;
+    }
   }
 
   public function getSnippet() {