Summary:
Ref T13164. See PHI790. Older versions of APCu reported cache keys as "key" from `apcu_cache_info()`. APC and newer APCu report it as "info".
Check both indexes for compatibility.
Test Plan:
- Locally, with newer APCu, saw no behavioral change.
- Will double check on `admin`, which has an older APCu with the "key" behavior.
- (I hunted this down by dumping `apcu_cache_info()` on `admin` to see what was going on.)
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13164
Differential Revision: https://secure.phabricator.com/D19569
Summary:
Ref T13000. This marks each table as either "data" (normal data), "cache" (automatically rebuilt, no need to ever dump) or "index" (can be manually rebuilt).
By default, `bin/storage dump` dumps data and index tables, but not cache tables.
With `--no-indexes`, it dumps only data tables. Indexes can be rebuilt after a restore with `bin/search index --all ...`.
Test Plan:
- Ran `--no-indexes` and normal dumps with `--trace`, verified that cache and index (former case) or cache only (latter case) tables were dumped with `--no-data`.
- Verified dump has the same number of `CREATE TABLE` statements as before the changes.
- Reviewed persistence tags in the web UI (note Ferret engine tables are "Index"):
{F5210886}
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13000
Differential Revision: https://secure.phabricator.com/D18682
Summary:
Ref T12997. See that task for more details. Briefly, an unusual dataset (where commits are mentioned hundreds of times by other commits) is causing some weird memory behavior in the daemons.
Forcing PHP to GC cycles explicitly after each task completes seems to help with this, by cleaning up some of the memory between tasks. A more thorough fix might be to untangle the `$xactions` structure, but that's significantly more involved.
Test Plan:
- Did this locally in a controlled environment, saw an immediate collection of a 500MB `$xactions` cycle.
- Put a similar change in production, memory usage seemed less to improve. It's hard to tell for sure that this does anything, but it shouldn't hurt.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T12997
Differential Revision: https://secure.phabricator.com/D18657
Summary:
See PHI36. APCu originally had `apc_` methods, but at some point dropped these and only provides `apcu_` methods.
When the `apcu_` method is present, use it. It may not be present for older versions of APCu, so keep the fallback.
Test Plan:
- With modern APCu, clicked "Purge Caches" in Config > Caches.
- Before: fatal on bad `apc_clear_caches` call.
- After: Valid cache clear.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D18449
Summary: Ref T12859. This is an older command with a lot of hard-coded flags. Modularize cache purging in a modern way so it can be extended.
Test Plan: Ran `bin/cache purge --trace` with various valid and invalid flags.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12859
Differential Revision: https://secure.phabricator.com/D18146
Summary: Ref T9640. APCu 5.0+ (for PHP7) uses `apcu_*` functions instead of `apc_` functions. Test for function existence and call the appropriate functions.
Test Plan: {F2352695}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9640
Differential Revision: https://secure.phabricator.com/D17198
Summary:
Ref T11954. I want to store some lists/arrays in the mutable (database) cache, but it only supports string storage.
Provide a serializing wrapper which flattens when values are written and expands them when they're read.
Test Plan: Used by D16997. See that revision.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11954
Differential Revision: https://secure.phabricator.com/D16999
Summary:
Ref T11954. Depends on D16993. We have a couple of "look up the class for this key" queries which are costly enough to show up on a profile.
These aren't huge wins, but they're pretty easy. We currently do this like this:
```
$class_map = load_every_subclass();
return idx($class_map, $key);
```
However, we don't need to load EVERY subclass if we're only looking for, say, the Conduit method subclass which implements `user.whoami`. This allows us to cache that map and find the right class efficiently.
This cache is self-validating and completely safe even in development.
Test Plan:
- Used `curl` to make queries to `user.whoami`, verified that content was identical before and after the change.
- Used `ab -n100` to roughly measure 99th percentile time, which dropped from 74ms to 65ms. This is a small improvement (13% in the best case, here) but it benefits every Conduit method call.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11954
Differential Revision: https://secure.phabricator.com/D16994
Summary:
Ref T11954. Depends on D16992. We have some data which can be generated and cached at runtime. Three examples are:
- Class map from Conduit method names to implementing classes.
- Class map from PHID types to implementing classes.
- The main routing map.
None of these are huge wins but they impose global costs and can be shaved down through caching without introducing an enormous amount of new complexity.
The cost to these maps is that sometimes you'll need to restart your webserver, even in development mode if these caches are active. However, in some cases these changes are very rare, and in other cases we can just leave the cache disabled in development mode without a huge complexity cost.
Specifically, the Conduit/PHID type class maps are self-validating and can not go bad, even in development mode.
The routing map will be able to, but I plan to just disable it in development mode.
This provides a general-purpose pure APC cache stack for storing this data.
Test Plan: See future changes.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11954
Differential Revision: https://secure.phabricator.com/D16993
Summary:
Ref T11469. This isn't directly related, but has been on my radar for a while: building SSH keyfiles (particular for installs with a lot of keys, like ours) can be fairly slow.
At least one cluster instance is making multiple clone requests per second. While that should probably be rate limited separately, caching this should mitigate the impact of these requests.
This is pretty straightforward to cache since it's exactly the same every time, and only changes when users modify SSH keys (which is rare).
Test Plan:
- Ran `bin/auth-ssh`, saw authfile generate.
- Ran it again, saw it read from cache.
- Changed an SSH key.
- Ran it again, saw it regenerate.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11469
Differential Revision: https://secure.phabricator.com/D16744
Summary:
Ref T11613. In D16503/T11598 I refined the setup flow to improve messaging for early-stage setup issues, but failed to fully untangle things.
We sometimes still try to access a cache which uses configuration before we build configuration, which causes an error.
Instead, store "are we in flight / has setup ever worked?" in a separate cache which doesn't use the cache namespace. This stops us from trying to read config before building config.
Test Plan:
Hit bad extension error with a fake extension, got a proper setup help page:
{F1812803}
Solved the error, reloaded, broke things again, got a "friendly" page:
{F1812805}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11613
Differential Revision: https://secure.phabricator.com/D16542
Summary: Ref T4103. Provide a CLI mechanism for purging the user cache.
Test Plan:
- Purged with `--purge-user` and `--purge-all`.
- Verified cache table got wiped.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4103
Differential Revision: https://secure.phabricator.com/D16033
Summary:
Ref T4571. When a database goes down briefly, we fall back to replicas.
However, this fallback is slow (not good for users) and keeps sending a lot of traffic to the master (might be bad if the root cause is load-related).
Keep track of recent connections and fully degrade into "severed" mode if we see a sequence of failures over a reasonable period of time. In this mode, we send much less traffic to the master (faster for users; less load for the database).
We do send a little bit of traffic still, and if the master recovers we'll recover back into normal mode seeing several connections in a row succeed.
This is similar to what most load balancers do when pulling web servers in and out of pools.
For now, the specific numbers are:
- We do at most one health check every 3 seconds.
- If 5 checks in a row fail or succeed, we sever or un-sever the database (so it takes about 15 seconds to switch modes).
- If the database is currently marked unhealthy, we reduce timeouts and retries when connecting to it.
Test Plan:
- Configured a bad `master`.
- Browsed around for a bit, initially saw "unrechable master" errors.
- After about 15 seconds, saw "major interruption" errors instead.
- Fixed the config for `master`.
- Browsed around for a while longer.
- After about 15 seconds, things recovered.
- Used "Cluster Databases" console to keep an eye on health checks: it now shows how many recent health checks were good:
{F1213397}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4571
Differential Revision: https://secure.phabricator.com/D15677
Summary:
Ref T4571. There will be a very long path beyond this, but add a basic read-only mode. You can explicitly enable this to put Phabricator in a sort of "maintenance" mode today if you're swapping databases or something.
In the long term, we'll automatically degrade into this mode if the master database is down.
Test Plan:
- Enabled read-only mode.
- Browsed around.
- Didn't immediately see anything that was totally 100% broken.
Most stuff is 80-90% broken right now. For example:
- Stuff like submitting comments doesn't work, and gives you a confusing, unhelpful error.
- None of the UI really knows that it's read-only. EditEngine stuff should all hide itself and say "you can't add new comments while an install is in read-only mode", for example, but currently does not.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4571
Differential Revision: https://secure.phabricator.com/D15662
Summary:
Fixes T9874.
- Stop using the phrase "restart your webserver". Instead, say "restart Phabricator".
- Write a document explaining that "Restart Phabricator" means to restart all of the server processes, depending on how your configuration is set up, and approximately how to do that.
- Link to this document.
- In places where we are not specifically giving instructions and the user isn't expected to do anything, be intentionally vague so as to avoid being misleading.
Test Plan:
- Read document.
- Hit "exetnsion" and "PHP config" setup checks, got "restart Phabricator" with documentation links in both cases.
Reviewers: chad
Maniphest Tasks: T9874
Differential Revision: https://secure.phabricator.com/D14636
Summary:
Fixes T9599. When APC/APCu are not available, we fall back to a disk-based cache.
We try to share this cache across webserver processes like APC/APCu would be shared in order to improve performance, but are just kind of guessing how to coordinate it. From T9599, it sounds like we don't always get this right in every configuration.
Since this is complicated and error prone, just stop trying to do this. This cache has bad performance anyway (no production install should be using it), and we have much better APC/APCu setup instructions now than we did when I wrote this. Just using the PID is simpler and more correct.
Test Plan:
- Artificially disabled APC.
- Reloaded the page, saw all the setup stuff run.
- Reloaded the page, saw no setup stuff run (i.e., cache was hit).
- Restarted the webserver.
- Reloaded the page, saw all the setup stuff run.
- Reloaded again, got a cache hit.
I don't really know how to reproduce the exact problem with the parent PID not working, but from T9599 it sounds like this fixed the issue and from my test plan we still appear to get correct behavior in the standard/common case.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9599
Differential Revision: https://secure.phabricator.com/D14302
Summary:
Fixes T9494. This:
- Removes all the random GC.x.y.z config.
- Puts it all in one place that's locked and which you use `bin/garbage set-policy ...` to adjust.
- Makes every TTL-based GC configurable.
- Simplifies the code in the actual GCs.
Test Plan:
- Ran `bin/garbage collect` to collect some garbage, until it stopped collecting.
- Ran `bin/garbage set-policy ...` to shorten policy. Saw change in web UI. Ran `bin/garbage collect` again and saw it collect more garbage.
- Set policy to indefinite and saw it not collect garabge.
- Set policy to default and saw it reflected in web UI / `collect`.
- Ran `bin/phd debug trigger` and saw all GCs fire with reasonable looking queries.
- Read new docs.
{F857928}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9494
Differential Revision: https://secure.phabricator.com/D14219
Summary: Reachable via the cache config page, restricted to admins only. This makes it convenient to hotfix phabricator without requiring a restart.
Test Plan:
- Local dev machine doesn't have apc, so I get the not installed message.
- Faked the name and isEnabled parameters, verified dialog shows up as expected.
- Didn't test clear code
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: tycho.tatitscheff, joshuaspence, Korvin
Differential Revision: https://secure.phabricator.com/D14064
Summary: All classes should extend from some other class. See D13275 for some explanation.
Test Plan: `arc unit`
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D13283
Summary:
Ref T8424. This adds a standard KeyValueCache to serve as a request cache.
In particular, I need to cache Spaces (they are frequently accessed, sometimes by multiple viewers) but not have them survive longer than the scope of one request.
This request cache is explicitly destroyed by each web request and each daemon request.
In the very long term, building this kind of construct supports reusing PHP interpreters to run web requests (see some discussion in T2312).
Test Plan:
- Added and executed unit tests.
- Ran every daemon.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T8424
Differential Revision: https://secure.phabricator.com/D13153
Summary: Use `__CLASS__` instead of hard-coding class names. Depends on D12605.
Test Plan: Eyeball it.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: hach-que, Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D12806
Summary:
Ref T7811. Fixes two minor issues I observed in the cluster:
- Sometimes APC doesn't give us key names. Not sure exactly what's up here, but we can do a better job with this.
- The `%` in `25%` actually needs more escaping, since it's interpreted by both `pht()` (immediately) and `console_format()` (later).
Test Plan:
- First one is just from an error log, not sure how to repro offhand.
- Ran `bin/phd help start` for the second one.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7814, T7811
Differential Revision: https://secure.phabricator.com/D12395
Summary: Fixes T7791.
Test Plan: grep'd for the typo and only the typo declaration had that functon name.
Reviewers: epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T7791
Differential Revision: https://secure.phabricator.com/D12334
Summary: Ref T5501. These settings reduce error log noise.
Test Plan: Faked into this branch and hit the warning.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5501
Differential Revision: https://secure.phabricator.com/D12319
Summary:
Ref T5501. Currently, we emit some bad warnings about, e.g., "apc.stat" on PHP 5.5+ systems with OPcache, where the warnings are not relevant.
Generate and raise warnings out of the CacheSpec pipeline so we only run relevant code.
Test Plan: Faked various warnings and saw them render correctly.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5501
Differential Revision: https://secure.phabricator.com/D12318
Summary: Ref T5501. This expands cache information a little more.
Test Plan: {F362975}
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5501
Differential Revision: https://secure.phabricator.com/D12316
Summary:
Ref T5501. This code was headed down a bad road; dump an indirection layer between rendering and data gatehring.
In particular, this will make it much easier to lift these issues into setup warnings eventually.
Test Plan: Viewed cache status page.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5501
Differential Revision: https://secure.phabricator.com/D12315
Summary:
Ref T1191. Now that the whole database is covered, we don't need to do as much work to build expected schemata. Doing them database-by-database was helpful in converting, but is just reudndant work now.
Instead of requiring every application to build its Lisk objects, just build all Lisk objects.
I removed `harbormaster.lisk_counter` because it is unused.
It would be nice to autogenerate edge schemata, too, but that's a little trickier.
Test Plan: Database setup issues are all green.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley, hach-que
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10620
Summary:
Ref T1191. When changing the column type of an AUTO_INCREMENT column, we currently may lose the autoincrement attribute.
Instead, support it. This is a bit messy because AUTO_INCREMENT columns interact with PRIMARY KEY columns (tables may only have one AUTO_INCREMENT column, and it must be a primary key). We need to migrate in more phases to avoid this issue.
Introduce new `auto` and `auto64` types to represent autoincrement IDs.
Test Plan:
- Saw autoincrement show up correctly in web UI.
- Fixed an autoincrement issue on the XHProf storage table with `bin/storage adjust` safely.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10607
Summary:
Ref T1191.
- Adds definitions for missing keys and keys with wrong uniqueness. Generally, I defined these before fixing the key query to actually pull all keys and support uniqueness.
- Moves "key uniqueness" to note severity; this is fixable (probably?) and there are no remaining issues.
- Moves "Missing Key" to note severity; missing keys are fixable and all remaining missing keys are really missing (either missing edge keys, or missing PHID keys):
{F210089}
- Moves "Surplus Key" to note seveirty; surplus keys are fixable all remaining surplus keys are really surplus (duplicate key in Harbormaster, key on unused column in Worker):
{F210090}
Test Plan:
- Vetted missing/surplus/unique messages.
- 146 issues remaining.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10590
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.
Test Plan: Eyeballed it.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin, hach-que
Differential Revision: https://secure.phabricator.com/D9431
Summary: Ref T4045. Ref T5179. When saving a modern hunk, deflate it if we have the function and deflating it will save a nontrivial number of bytes.
Test Plan:
- Used `bin/hunks migrate` to move some hunks over, saw ~70-80% compression on most standard hunks.
- Viewed changesets using compressed hunks.
- Profiled `gzinflate()` and verified the cost is trivial (<< 1ms) at least for normal diffs.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4045, T5179
Differential Revision: https://secure.phabricator.com/D9292
Summary:
Fixes T5094. In some cases we do slightly expensive transformations to resources (inlining images, replacing URIs, building packages). We can throw cache in front of them easily since URIs are already permanently associated with a single resource.
Also browse around and move some CSS/JS into packages.
Test Plan:
Added logging to verify the caches are working, saw moderately improved performance.
Browsed around looking at resources tab in developer console, saw fewer total requests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5094
Differential Revision: https://secure.phabricator.com/D9175
Summary: Most requests examine the same buckets, especially the first bucket. Let them just read it out of request cache.
Test Plan: Observed most bucket fetches resolving in <10us instead of <10ms.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D9080
Summary:
Ref T2683. This is a refinement and simplification of D5257. In particular:
- D5257 only cached the commit chain, not path changes. This meant that we had to go issue an awkward query (which was slow on Facebook's install) periodically while reading the cache. This was reasonable locally but killed performance at FB scale. Instead, we can include path information in the cache. It is very rare that this is large except in Subversion, and we do not need to use this cache in Subversion. In other VCSes, the scale of this data is quite small (a handful of bytes per commit on average).
- D5257 required a large, slow offline computation step. This relies on D9044 to populate parent data so we can build the cache online at will, and let it expire with normal LRU/LFU/whatever semantics. We need this parent data for other reasons anyway.
- D5257 separated graph chunks per-repository. This change assumes we'll be able to pull stuff from APC most of the time and that the cost of switching chunks is not very large, so we can just build one chunk cache across all repositories. This allows the cache to be simpler.
- D5257 needed an offline cache, and used a unique cache structure. Since this one can be built online it can mostly use normal cache code.
- This also supports online appends to the cache.
- Finally, this has a timeout to guarantee a ceiling on the worst case: the worst case is something like a query for a file that has never existed, in a repository which receives exactly 1 commit every time other repositories receive 4095 commits, on a cold cache. If we hit cases like this we can bail after warming the cache up a bit and fall back to asking the VCS for an answer.
This cache isn't perfect, but I believe it will give us substantial gains in the average case. It can often satisfy "average-looking" queries in 4-8ms, and pathological-ish queries in 20ms on my machine; `hg` usually can't even start up in less than 100ms. The major thing that's attractive about this approach is that it does not require anything external or complicated, and will "just work", even producing reasonble improvements for users without APC.
In followups, I'll modify queries to use this cache and see if it holds up in more realistic workloads.
Test Plan:
- Used `bin/repository cache` to examine the behavior of this cache.
- Did some profiling/testing from the web UI using `debug.php`.
- This //appears// to provide a reasonable fast way to issue this query very quickly in the average case, without the various issues that plagued D5257.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley, jhurwitz
Maniphest Tasks: T2683
Differential Revision: https://secure.phabricator.com/D9045
Summary:
Ref T1191. I believe we only have three meaningful binary fields across all applications:
- The general cache may contain gzipped content.
- The file storage blob may contain arbitrary binary content.
- The Passphrase secret can store arbitrary binary data (although it currently never does).
This adds Lisk config for binary fields, and uses `%B` where necessary.
Test Plan:
- Added and executed unit tests.
- Forced file uploads to use MySQL, uploaded binaries.
- Disabled the CONFIG_BINARY on the file storage blob and tried again, got an appropraite failure.
- Tried to register with an account containing a G-Clef, and was stopped before the insert.
Reviewers: btrahan, arice
Reviewed By: arice
CC: arice, chad, aran
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D8316
Summary: This modularizes the rest of the GC submethods. Turned out there was nothing tricky.
Test Plan: Ran `bin/phd debug garbage` and got reasonable looking behavior and output.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D7971
Summary:
Ref T2015. Not directly related to Drydock, but I bumped into this. All these scripts currently enumerate their workflows explicitly.
Instead, use `PhutilSymbolLoader` to automatically discover workflows. This reduces code duplication and errors (see all the bad `extends` this diff fixes) and lets third parties add new workflows (not clearly valuable?).
Test Plan: Ran `bin/x help` for each modified script.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7840
Summary:
Ref T2015. Not directly related to Drydock, but I've wanted to do this for a bit.
Introduce a common base class for all the workflows in the scripts in `bin/*`. This slightly reduces code duplication by moving `isExecutable()` to the base, but also provides `getViewer()`. This is a little nicer than `PhabricatorUser::getOmnipotentUser()` and gives us a layer of indirection if we ever want to introduce more general viewer mechanisms in scripts.
Test Plan: Lint; ran some of the scripts.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7838