Mirror of https://we.phorge.it/source/phorge.git (synced 2024-12-22 21:40:55 +01:00)
Correct a prose diff behavior when prose pieces include newlines
Summary: See <https://discourse.phabricator-community.org/t/bad-regex-in-prose-diff-logic/3969>. The prose splitting rules normally guarantee that newlines appear only at the beginning or end of blocks. However, if a prose sentence ends with text like "...x\n.", we can end up with a newline inside a "sentence". If we do, the regular expression that breaks it into pieces will fail. Arguably, this is an error in how sentences are split apart (we might prefer to split this into two sentences, "x\n" and ".", rather than a single "x\n." sentence) but in the general case it's not unreasonable for blocks to contain newlines, so a simple fix is to make the pattern more robust.

Test Plan: Added a failing test which includes this behavior, made it pass.

Differential Revision: https://secure.phabricator.com/D21295
Parent: f686a0b827
Commit: 36075f6ce5
2 changed files with 9 additions and 1 deletion
@@ -148,7 +148,7 @@ final class PhutilProseDifferenceEngine extends Phobject {
       // whitespace at the end.
 
       $matches = null;
-      preg_match('/^(\s*)(.*?)(\s*)\z/', $result, $matches);
+      preg_match('/^(\s*)(.*?)(\s*)\z/s', $result, $matches);
 
       if (strlen($matches[1])) {
         $results[] = $matches[1];

@@ -30,6 +30,14 @@ final class PhutilProseDiffTestCase
       ),
       pht('Remove Paragraph'));
 
+    $this->assertProseParts(
+      'xxx',
+      "xxxyyy\n.zzz",
+      array(
+        '= xxx',
+        "+ yyy\n.zzz",
+      ),
+      pht('Amend paragraph, and add paragraph starting with punctuation'));
 
     // Without smoothing, the alogorithm identifies that "shark" and "cat"
     // both contain the letter "a" and tries to express this as a very
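
The one-character change in the first hunk can be checked in isolation. The following is a minimal sketch, not code from the repository: `split_prose_piece()` is a hypothetical helper introduced only for illustration, and the input string is borrowed from the new test case. It applies the old and the new pattern to a piece that contains an interior newline. Without the `s` modifier, `.` does not match "\n", so the anchored pattern cannot reach `\z` and `preg_match()` reports no match; with `s`, `.` matches any character and the piece splits as expected.

```php
<?php

// Hypothetical helper (not part of Phorge): split a piece of text into
// leading whitespace, body, and trailing whitespace using the given pattern.
// Returns null when the pattern does not match at all.
function split_prose_piece($pattern, $piece) {
  $matches = null;
  if (!preg_match($pattern, $piece, $matches)) {
    return null;
  }
  return array($matches[1], $matches[2], $matches[3]);
}

// A piece with a newline in the middle, as in the new test case.
$piece = "yyy\n.zzz";

// Old pattern: "." stops at the newline, so the match fails.
var_dump(split_prose_piece('/^(\s*)(.*?)(\s*)\z/', $piece));

// New pattern: the "s" modifier lets "." match the newline as well.
var_dump(split_prose_piece('/^(\s*)(.*?)(\s*)\z/s', $piece));
```

In the engine itself there is no null check before `$matches[1]` is read, which is presumably why a failed match surfaced as an error rather than an empty result; that reading is an inference from the hunk shown above, not something the commit message states.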