Summary:
Depends on D19370. See T13124. See PHI549. The particular install in PHI549 migrated a large amount of data via the fallback hunk migration script, which does not compress hunks.
Add a mode to `bin/differential migrate-hunk` that amounts to "compress all the hunks which would benefit from compression".
Test Plan: Ran `bin/differential migrate-hunk` with `--auto`, `--all`, `--to`, `--id`, and `--dry-run` in various mixtures. Forced a bunch of hunks to raw ("byte") format, saw it cleanly upgrade them to compressed ("gzde") format.
Reviewers: amckinley
Reviewed By: amckinley
Differential Revision: https://secure.phabricator.com/D19371
Summary: Depends on D19056. Fixes T8475. Ref T13054. Merges "ModernHunk" back into "Hunk".
Test Plan: Grepped for `modernhunk`. Reviewed revisions. Created a new revision. Used `bin/differential migrate-hunk` to migrate hunks between storage formats and back.
Maniphest Tasks: T13054, T8475
Differential Revision: https://secure.phabricator.com/D19057
Summary:
Ref T12932. For long-lived installs, one of the largest tables tends to be the hunk data table. Although it doesn't grow tremendously fast, it's also well suited to storage in Files instead of the database (infrequent access, relatively large blobs of data, mostly one-at-a-time access), and earlier work anticipated eventually adding support for Files storage.
Make Files storage work, and provide `bin/differential migrate-hunk` to manually test/migrate hunks. This is currently the only way hunks get moved to file storage, but I expect to add a GC step which moves them to File storage after 30 days shortly.
The immediate motivation for this is to relieve storage pressure on db001/db002 so we have more headroom for deploying the Ferret engine and its larger indexes (see also T12819).
Test Plan:
- Used `bin/differential migrate-hunk` to move a hunk to and from file storage, verified it survived intact.
- Downloaded the actual stored file, sanity-checked it. Verified permissions.
- Destroyed a diff with `bin/remove destroy`, saw the hunk and file storage destroyed.
- Verified that going from file -> text destroys the old file properly with `migrate-hunk --trace ...`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12932
Differential Revision: https://secure.phabricator.com/D18584
Summary:
This has been replaced by `PolicyCodex` after D16830. Also:
- Rebuild Celerity map to fix grumpy unit test.
- Fix one issue on the policy exception workflow to accommodate the new code.
Test Plan:
- `arc unit --everything`
- Viewed policy explanations.
- Viewed policy errors.
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que, PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D16831
Summary:
Ref T7447. Fixes T7600. This likely needs significant adjustment, but implements content-aware comment porting for line changes.
Specifically, this moves lines around to adjust their position considering added and removed lines between the diffs and across rebases.
It does not try to do any actual content (line against line) matching.
Test Plan:
- Unit tests.
- Poking around in the web UI seems to generate mostly reasonable-ish results?
- This may be a huge step backward in some cases that I just haven't hit.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: yelirekim, epriestley
Maniphest Tasks: T7600, T7447
Differential Revision: https://secure.phabricator.com/D12741
Summary: Ref T7776. This could get better, but I think I got most of the big stuff. It's ~4x faster now.
Test Plan:
Before:
{F393338}
After:
{F393339}
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7776
Differential Revision: https://secure.phabricator.com/D12730
Summary:
Ref T1266. This prepares to fix case (2) on T1266 by improving the robustness of hunk parsing.
In particular, the copy detection code abuses this API because it isn't currently expressive or flexible enough.
Make it more flexible and cover it exhaustively.
I'll move callsites to the new stuff in upcoming revisions.
Test Plan: Unit tests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1266
Differential Revision: https://secure.phabricator.com/D12144
Summary:
Ref T4045. Ref T5179. While we'll eventually need to force a migration, we can let installs (particularly large installs) do an online migration for now. This moves hunks to the new storage format one at a time.
(Note that nothing writes to the new store yet, so this is the only way to populate it.)
WARNING: Installs, don't run this yet! It won't compress the data. Wait until it can also do compression.
Test Plan: Added a `break;` after migrating one row and moved a few rows over. Spot checked them in the database and viewed the affected diffs.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4045, T5179
Differential Revision: https://secure.phabricator.com/D9291
Summary:
Ref T4045. Ref T5179. Hunk storage has two major issues:
- It's utf8, but actual diffs are binary.
- It's huge and can't be compressed or archived.
This introduces a second datastore which solves these problems: by recording hunk encoding, supporting compression, and supporting alternate storage. There's no actual compression or storage support yet, but there's space in the table for them.
Since nothing actually uses hunk IDs, it's fine to have these tables exist at the same time and use the same IDs. We can migrate data between the tables gradually without requiring downtime or disrupting installs.
Test Plan:
- There are no writes to the new table yet.
- The only effect this has is making us issue one extra query when looking for hunks.
- Observed the query issue, but everything else continue working fine.
- Created a new diff.
- Ran unit tests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4045, T5179
Differential Revision: https://secure.phabricator.com/D9290
Summary:
Ref T4045. We have a lot of direct queries against the hunk table right now. These are messy, not really policy-aware, and limit our options on T4045.
This query is unusual (it requires changesets, and does not accept IDs). This keeps us from having to load changeset -> diff -> revision in order to do policy checks. We could also fix this with smarter policy checks and caching, but I'd rather not open that can of worms for now. This object is very low level and relatively unusual, and this small deviation from convention seems like the cleanest cut to make to keep this from snowballing.
Test Plan: Used Herald dry runs to verify that the affected rules still output the same data.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4045
Differential Revision: https://secure.phabricator.com/D8765
Summary: Ref T4045. These three methods are fairly copy-pastey. Provide a more formal DifferentialHunk API for querying various types of line ranges.
Test Plan: Used test console to verify that "added content", "removed content", and "changed content" rules still produce the same data.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4045
Differential Revision: https://secure.phabricator.com/D8764
Summary:
Does what it says on the label. We already had 'Any changed file content', now we have 'Any added file content' and 'Any removed file content'.
- There is a bit of copied/pasted code here: I'm open to suggestions on how to refactor it so it's less redundant.
- The wording seems a little awkward, and as @epriestley mentioned in T3829, moved code will be detected less than ideally.
Test Plan: Created Herald Rules, verified via dry run that they were triggered in appropriate situations.
Reviewers: epriestley
Reviewed By: epriestley
CC: Korvin, aran
Maniphest Tasks: T3829
Differential Revision: https://secure.phabricator.com/D7214
Summary:
This commit doesn't change license of any file. It just makes the license implicit (inherited from LICENSE file in the root directory).
We are removing the headers for these reasons:
- It wastes space in editors, less code is visible in editor upon opening a file.
- It brings noise to diff of the first change of any file every year.
- It confuses Git file copy detection when creating small files.
- We don't have an explicit license header in other files (JS, CSS, images, documentation).
- Using license header in every file is not obligatory: http://www.apache.org/dev/apply-license.html#new.
This change is approved by Alma Chao (Lead Open Source and IP Counsel at Facebook).
Test Plan: Verified that the license survived only in LICENSE file and that it didn't modify externals.
Reviewers: epriestley, davidrecordon
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T2035
Differential Revision: https://secure.phabricator.com/D3886
Summary: I will also need `getRemovedLines()` so refactor this first.
Test Plan:
New test case.
Viewed uncached diff.
Verified that the only callsite of `getAddedLines()` trims lines.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D2875
Summary:
- `kill_init.php` said "Moving 1000 files" - I hope that this is not some limit in `FileFinder`.
- [src/infrastructure/celerity] `git mv utils.php map.php; git mv api/utils.php api.php`
- Comment `phutil_libraries` in `.arcconfig` and run `arc liberate`.
NOTE: `arc diff` timed out so I'm pushing it without review.
Test Plan:
/D1234
Browsed around, especially in `applications/repository/worker/commitchangeparser` and `applications/` in general.
Auditors: epriestley
Maniphest Tasks: T1103
2012-06-01 12:32:44 -07:00
Renamed from src/applications/differential/storage/hunk/DifferentialHunk.php (Browse further)