phorge-phorge

mirror of https://we.phorge.it/source/phorge.git synced 2025-01-22 12:41:19 +01:00

History

epriestley 3f8eccdaec Put some whitespace behaviors back, but only for "diff alignment", not display Summary: Depends on D20185. Ref T13161. Fixes T6791. See some discusison in T13161. I want to move to a world where: - whitespace changes are always shown, so users writing YAML and Python are happy without adjusting settings; - the visual impact of indentation-only whitespace changes is significanlty reduced, so indentation changes are easy to read and users writing Javascript or other flavors of Javascript are happy. D20181 needs a little more work, but generally tackles these visual changes and lets us always show whitespace changes, but show them in a very low-impact way when they're relatively unimportant. However, a second aspect to this is how the diff is "aligned". If this file: ``` A ``` ..is changed to this file: ``` X A Y Z ``` ...diff tools will generally produce this diff: ``` + X A + Y + Z ``` This is good, and easy to read, and what humans expect, and it will "align" in two-up like this: ``` 1 X 1 A 2 A 3 Y 4 Z ``` However, if the new file looks like this instead: ``` X A' Y Z ``` ...we get a diff like this: ``` - A + X + A' + Y + Z ``` This one aligns like this: ``` 1 A 1 X 2 A' 3 Y 4 Z ``` This is correct if `A` and `A'` are totally different lines. However, if `A'` is pretty much the same as `A` and it just had a whitespace change, human viewers would prefer this alignment: ``` 1 X 1 A 2 A' 3 Y 4 Z ``` Note that `A` and `A'` are different, but we've aligned them on the same line. `diff`, `git diff`, etc., won't do this automatically, and a `.diff` doesn't have a way to say "these lines are more or less the same even though they're different", although some other visual diff tools will do this. Although `diff` can't do this for us, we can do it ourselves, and already have the code to do it, because we already nearly did this in the changes removed in D20185: in "Ignore All" or "Ignore Most" mode, we pretty much did this already. This mostly just restores a bit of the code from D20185, with some adjustments/simplifications. Here's how it works: - Rebuild the text of the old and new files from the diff we got out of `arc`, `git diff`, etc. - Normalize the files (for example, by removing whitespace from each line). - Diff the normalized files to produce a second diff. - Parse that diff. - Take the "alignment" from the normalized diff (whitespace removed) and the actual text from the original diff (whitespace preserved) to build a new diff with the correct text, but also better diff alignment. Originally, we normalized with `diff -bw`. I've replaced that with `preg_replace()` here mostly just so that we have more control over things. I believe the two behaviors are pretty much identical, but this way lets us see more of the pipeline and possibly add more behaviors in the future to improve diff quality (e.g., normalize case? normalize text encoding?). Test Plan: {F6217133} (Also, fix a unit test.) Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13161, T6791 Differential Revision: https://secure.phabricator.com/D20187	2019-02-19 13:11:50 -08:00
..
data	Put some whitespace behaviors back, but only for "diff alignment", not display	2019-02-19 13:11:50 -08:00
DifferentialParseRenderTestCase.php	Put some whitespace behaviors back, but only for "diff alignment", not display	2019-02-19 13:11:50 -08:00

epriestley 3f8eccdaec Put some whitespace behaviors back, but only for "diff alignment", not display

Summary:
Depends on D20185. Ref T13161. Fixes T6791.

See some discusison in T13161. I want to move to a world where:

- whitespace changes are always shown, so users writing YAML and Python are happy without adjusting settings;
- the visual impact of indentation-only whitespace changes is significanlty reduced, so indentation changes are easy to read and users writing Javascript or other flavors of Javascript are happy.

D20181 needs a little more work, but generally tackles these visual changes and lets us always show whitespace changes, but show them in a very low-impact way when they're relatively unimportant.

However, a second aspect to this is how the diff is "aligned". If this file:

```
A
```

..is changed to this file:

```
X
A
Y
Z
```

...diff tools will generally produce this diff:

```
+ X
A
+ Y
+ Z
```

This is good, and easy to read, and what humans expect, and it will "align" in two-up like this:

```
1 X
1 A 2 A
3 Y
4 Z
```

However, if the new file looks like this instead:

```
X
A'
Y
Z
```

...we get a diff like this:

```
- A
+ X
+ A'
+ Y
+ Z
```

This one aligns like this:

```
1 A
1 X
2 A'
3 Y
4 Z
```

This is correct if `A` and `A'` are totally different lines. However, if `A'` is pretty much the same as `A` and it just had a whitespace change, human viewers would prefer this alignment:

```
1 X
1 A 2 A'
3 Y
4 Z
```

Note that `A` and `A'` are different, but we've aligned them on the same line. `diff`, `git diff`, etc., won't do this automatically, and a `.diff` doesn't have a way to say "these lines are more or less the same even though they're different", although some other visual diff tools will do this.

Although `diff` can't do this for us, we can do it ourselves, and already have the code to do it, because we already nearly did this in the changes removed in D20185: in "Ignore All" or "Ignore Most" mode, we pretty much did this already.

This mostly just restores a bit of the code from D20185, with some adjustments/simplifications. Here's how it works:

- Rebuild the text of the old and new files from the diff we got out of `arc`, `git diff`, etc.
- Normalize the files (for example, by removing whitespace from each line).
- Diff the normalized files to produce a second diff.
- Parse that diff.
- Take the "alignment" from the normalized diff (whitespace removed) and the actual text from the original diff (whitespace preserved) to build a new diff with the correct text, but also better diff alignment.

Originally, we normalized with `diff -bw`. I've replaced that with `preg_replace()` here mostly just so that we have more control over things. I believe the two behaviors are pretty much identical, but this way lets us see more of the pipeline and possibly add more behaviors in the future to improve diff quality (e.g., normalize case? normalize text encoding?).

Test Plan:
{F6217133}

(Also, fix a unit test.)

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13161, T6791

Differential Revision: https://secure.phabricator.com/D20187

2019-02-19 13:11:50 -08:00

data

Put some whitespace behaviors back, but only for "diff alignment", not display

2019-02-19 13:11:50 -08:00

DifferentialParseRenderTestCase.php

Put some whitespace behaviors back, but only for "diff alignment", not display

2019-02-19 13:11:50 -08:00