Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.
Test Plan: Eyeballed it.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin, hach-que
Differential Revision: https://secure.phabricator.com/D9431
Summary: Ref T4045. Ref T5179. When saving a modern hunk, deflate it if we have the function and deflating it will save a nontrivial number of bytes.
Test Plan:
- Used `bin/hunks migrate` to move some hunks over, saw ~70-80% compression on most standard hunks.
- Viewed changesets using compressed hunks.
- Profiled `gzinflate()` and verified the cost is trivial (<< 1ms) at least for normal diffs.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4045, T5179
Differential Revision: https://secure.phabricator.com/D9292
Summary:
Fixes T5094. In some cases we do slightly expensive transformations to resources (inlining images, replacing URIs, building packages). We can throw cache in front of them easily since URIs are already permanently associated with a single resource.
Also browse around and move some CSS/JS into packages.
Test Plan:
Added logging to verify the caches are working, saw moderately improved performance.
Browsed around looking at resources tab in developer console, saw fewer total requests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5094
Differential Revision: https://secure.phabricator.com/D9175
Summary: Most requests examine the same buckets, especially the first bucket. Let them just read it out of request cache.
Test Plan: Observed most bucket fetches resolving in <10us instead of <10ms.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D9080
Summary:
Ref T2683. This is a refinement and simplification of D5257. In particular:
- D5257 only cached the commit chain, not path changes. This meant that we had to go issue an awkward query (which was slow on Facebook's install) periodically while reading the cache. This was reasonable locally but killed performance at FB scale. Instead, we can include path information in the cache. It is very rare that this is large except in Subversion, and we do not need to use this cache in Subversion. In other VCSes, the scale of this data is quite small (a handful of bytes per commit on average).
- D5257 required a large, slow offline computation step. This relies on D9044 to populate parent data so we can build the cache online at will, and let it expire with normal LRU/LFU/whatever semantics. We need this parent data for other reasons anyway.
- D5257 separated graph chunks per-repository. This change assumes we'll be able to pull stuff from APC most of the time and that the cost of switching chunks is not very large, so we can just build one chunk cache across all repositories. This allows the cache to be simpler.
- D5257 needed an offline cache, and used a unique cache structure. Since this one can be built online it can mostly use normal cache code.
- This also supports online appends to the cache.
- Finally, this has a timeout to guarantee a ceiling on the worst case: the worst case is something like a query for a file that has never existed, in a repository which receives exactly 1 commit every time other repositories receive 4095 commits, on a cold cache. If we hit cases like this we can bail after warming the cache up a bit and fall back to asking the VCS for an answer.
This cache isn't perfect, but I believe it will give us substantial gains in the average case. It can often satisfy "average-looking" queries in 4-8ms, and pathological-ish queries in 20ms on my machine; `hg` usually can't even start up in less than 100ms. The major thing that's attractive about this approach is that it does not require anything external or complicated, and will "just work", even producing reasonble improvements for users without APC.
In followups, I'll modify queries to use this cache and see if it holds up in more realistic workloads.
Test Plan:
- Used `bin/repository cache` to examine the behavior of this cache.
- Did some profiling/testing from the web UI using `debug.php`.
- This //appears// to provide a reasonable fast way to issue this query very quickly in the average case, without the various issues that plagued D5257.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley, jhurwitz
Maniphest Tasks: T2683
Differential Revision: https://secure.phabricator.com/D9045
Summary:
Ref T1191. I believe we only have three meaningful binary fields across all applications:
- The general cache may contain gzipped content.
- The file storage blob may contain arbitrary binary content.
- The Passphrase secret can store arbitrary binary data (although it currently never does).
This adds Lisk config for binary fields, and uses `%B` where necessary.
Test Plan:
- Added and executed unit tests.
- Forced file uploads to use MySQL, uploaded binaries.
- Disabled the CONFIG_BINARY on the file storage blob and tried again, got an appropraite failure.
- Tried to register with an account containing a G-Clef, and was stopped before the insert.
Reviewers: btrahan, arice
Reviewed By: arice
CC: arice, chad, aran
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D8316
Summary: This modularizes the rest of the GC submethods. Turned out there was nothing tricky.
Test Plan: Ran `bin/phd debug garbage` and got reasonable looking behavior and output.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D7971
Summary:
Ref T2015. Not directly related to Drydock, but I bumped into this. All these scripts currently enumerate their workflows explicitly.
Instead, use `PhutilSymbolLoader` to automatically discover workflows. This reduces code duplication and errors (see all the bad `extends` this diff fixes) and lets third parties add new workflows (not clearly valuable?).
Test Plan: Ran `bin/x help` for each modified script.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7840
Summary:
Ref T2015. Not directly related to Drydock, but I've wanted to do this for a bit.
Introduce a common base class for all the workflows in the scripts in `bin/*`. This slightly reduces code duplication by moving `isExecutable()` to the base, but also provides `getViewer()`. This is a little nicer than `PhabricatorUser::getOmnipotentUser()` and gives us a layer of indirection if we ever want to introduce more general viewer mechanisms in scripts.
Test Plan: Lint; ran some of the scripts.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7838
Summary:
See <https://github.com/facebook/phabricator/issues/323>. We have a very old cache management script which doesn't purge all the modern caches (and does purge some caches which are no longer in use). Update it so it purges all the modern caches (remarkup, general, changeset), no longer purges outdated caches, and is easier to use.
Also delete a lot of "this script has moved" scripts from the last few rounds of similar cleanup, I believe all of these have been in master for at least several months, which should be enough time for users to get used to the new stuff.
Test Plan: Ran `bin/cache` with various arguments. Verified caches were purged.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D5978
Summary: See D5657. Also cleans up the namespacing stuff a little bit.
Test Plan: Disabled APC and verified that setup checks didn't run normally, but did run after restart and on `/config/issue/`.
Reviewers: vrana
Reviewed By: vrana
CC: aran
Differential Revision: https://secure.phabricator.com/D5658
Summary:
Submoduling is slightly convenient for developers but hellishly difficult for many users. Since we make about a dozen updates to Javelin per year, just include the source directly.
Even if we run `git submodule status` more often, this creates additional problems for users with PATH misconfigured.
Fixes T2062 by nuking it from orbit.
Test Plan: Loaded site, browsed around. Grepped for references to submodules.
Reviewers: btrahan, vrana
CC: aran
Maniphest Tasks: T2062
Differential Revision: https://secure.phabricator.com/D4581
Summary:
- When a setup issue is nonfatal (i.e., a warning), instruct the user to edit the value from the web UI instead of using `bin/config`.
- When the user edits configuration in response to a setup issue, send them back to the issue when they're done.
- When an issue relates to PHP configuration, link to the PHP documentation on configuration.
- Add new-style setup check for timezone issues.
Test Plan: Mucked with my timezone config, resolved the issues I created.
Reviewers: codeblock, btrahan, vrana
Reviewed By: codeblock
CC: aran
Maniphest Tasks: T2221, T2228
Differential Revision: https://secure.phabricator.com/D4298
Summary:
See T2062. This cache allows us to essentially implement this sort of block:
if (this_code_has_not_run_since_the_last_server_restart()) {
...
}
This will let us do setup checks automatically (i.e., without a specialized setup mode) without imposing hundreds of milliseconds of `git submodule status` and similar checks on every page load, even if an install does not have APC.
Broadly, the major goals here are:
- Reduce user errors and support costs related to misconfiguration (e.g., failure to update submodules).
- Simplify setup and configuration (remove 'phabricator.setup', remove/reduce PHABRICATOR_ENV).
- Move as much configuration to the web as possible (required for SaaS).
Test Plan:
Added this block to webroot/index.php:
$cache = PhabricatorCaches::getSetupCache();
$result = $cache->getKeys(array('x'));
if (empty($result['x'])) {
phlog('Cache miss + set.');
$cache->setKeys(array('x' => 'y'));
} else {
phlog('Cache hit.');
}
Verified it used APC correctly.
Disabled APC and verified it degraded to a reasonable disk-based behavior.
If we miss both of these we end up with no actual caching, but that's the best we can do. This code will also run too early in setup for it to be appropriate to raise exceptions out of this pathway -- later on, we can raise a warning that APC is not installed.
Reviewers: btrahan, vrana
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2227, T2062
Differential Revision: https://secure.phabricator.com/D4281
Summary:
See discussion in D4204. Facebook currently has a 314MB remarkup cache with a 55MB index, which is slow to access. Under the theory that this is an index size/quality problem (the current index is on a potentially-384-byte field, with many keys sharing prefixes), provide a more general index with fancy new features:
- It implements PhutilKeyValueCache, so it can be a component in cache stacks and supports TTL.
- It has a 12-byte hash-based key.
- It automatically compresses large blocks of data (most of what we store is highly-compressible HTML).
Test Plan:
- Basics:
- Loaded /paste/, saw caches generate and save.
- Reloaded /paste/, saw the page hit cache.
- GC:
- Ran GC daemon, saw nothing.
- Set maximum lifetime to 1 second, ran GC daemon, saw it collect the entire cache.
- Deflate:
- Selected row formats from the database, saw a mixture of 'raw' and 'deflate' storage.
- Used profiler to verify that 'deflate' is fast (12 calls @ 220us on my paste list).
- Ran unit tests
Reviewers: vrana, btrahan
Reviewed By: vrana
CC: aran
Differential Revision: https://secure.phabricator.com/D4259
Summary:
This commit doesn't change license of any file. It just makes the license implicit (inherited from LICENSE file in the root directory).
We are removing the headers for these reasons:
- It wastes space in editors, less code is visible in editor upon opening a file.
- It brings noise to diff of the first change of any file every year.
- It confuses Git file copy detection when creating small files.
- We don't have an explicit license header in other files (JS, CSS, images, documentation).
- Using license header in every file is not obligatory: http://www.apache.org/dev/apply-license.html#new.
This change is approved by Alma Chao (Lead Open Source and IP Counsel at Facebook).
Test Plan: Verified that the license survived only in LICENSE file and that it didn't modify externals.
Reviewers: epriestley, davidrecordon
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T2035
Differential Revision: https://secure.phabricator.com/D3886
Summary:
The immediate issue this addresses is T1366, adding a rendering cache to Phriction. For wiki pages with code blocks especially, rerendering them each time is expensive.
The broader issue is that out markup caches aren't very good right now. They have three major problems:
**Problem 1: the data is stored in the wrong place.** We currently store remarkup caches on objects. This means we're always loading it and passing it around even when we don't need it, can't genericize cache management code (e.g., have one simple script to drop/GC caches), need to update authoritative rows to clear caches, and can't genericize rendering code since each object is different.
To solve this, I created a dedicated cache database that I plan to move all markup caches to use.
**Problem 2: time-variant rules break when cached.** Some rules like `**bold**` are time-invariant and always produce the same output, but some rules like `{Tnnn}` and `@username` are variant and may render differently (because a task was closed or a user is on vacation). Currently, we cache the raw output, so these time-variant rules get locked at whatever values they had when they were first rendered. This is the main reason Phriction doesn't have a cache right now -- I wanted `{Tnnn}` rules to reflect open/closed tasks.
To solve this, I split markup into a "preprocessing" phase (which does all the parsing and evaluates all time-invariant rules) and a "postprocessing" phase (which evaluates time-variant rules only). The preprocessing phase is most of the expense (and, notably, includes syntax highlighting) so this is nearly as good as caching the final output. I did most of the work here in D737 / D738, but we never moved to use it in Phabricator -- we currently just do the two operations serially in all cases.
This diff splits them apart and caches the output of preprocessing only, so we benefit from caching but also get accurate time-variant rendering.
**Problem 3: cache access isn't batched/pipelined optimally.** When we're rendering a list of markup blocks, we should be able to batch datafetching better than we do. D738 helped with this (fetching is batched within a single hunk of markup) and this improves batching on cache access. We could still do better here, but this is at least a step forward.
Also fixes a bug with generating a link in the Phriction history interface ($uri gets clobbered).
I'm using PHP serialization instead of JSON serialization because Remarkup does some stuff with non-ascii characters that might not survive JSON.
Test Plan:
- Created a Phriction document and verified that previews don't go to cache (no rows appear in the cache table).
- Verified that published documents come out of cache.
- Verified that caches generate/regenerate correctly, time-variant rules render properly and old documents hit the right caches.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1366
Differential Revision: https://secure.phabricator.com/D2945