Summary:
Ref T4571. If we fail to connect to the master, automatically try to degrade into a temporary read-only mode ("UNREACHABLE") for the remainder of the request, if possible.
If the request was something like "load the homepage", that'll work fine. If it was something like "submit a comment", there's nothing we can do and we just have to fail.
Detecting this condition imposes a performance penalty: every request checks the connection and gives the database a long time to respond, since we don't want to drop writes unless we have to. So the degraded mode works, but it's really slow, and may perpetuate the problem if the root issue is load-related.
This lays the groundwork for improving this case by degrading futher into a "SEVERED" mode which will persist across requests. In the future, if several requests in a short period of time fail, we'll sever the database host and refuse to try to connect to it for a little while, connecting directly to replicas instead (basically, we're "health checking" the master, like a load balancer would health check a web application server). This will give us a better (much faster) degraded mode in a major service disruption, and reduce load on the master if the root cause is load-related, giving it a better chance of recovering on its own.
Test Plan:
- Disabled master in config by changing the host/username, got degraded automatically to UNREACAHBLE mode immediately.
- Faked full SEVERED mode, requests hit replicas and put me in the mode properly.
- Made stuff work, hit some good pages.
- Hit some non-cluster pages.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4571
Differential Revision: https://secure.phabricator.com/D15674
Summary: Ref T4571. If `cluster.databases` is configured but only has replicas, implicitly drop to read-only mode and send writes to a replica.
Test Plan:
- Disabled the `master`, saw Phabricator automatically degrade into read-only mode against replicas.
- (Also tested: explicit read-only mode, non-cluster mode, properly configured cluster mode).
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4571
Differential Revision: https://secure.phabricator.com/D15672
Summary:
Ref T4571. Allows users to click the "read-only mode" notification to get more information about why an install is in read-only mode.
Installs can be in this mode for several reasons (explicit administrative action, no masters defined, no masters reachable), and it's useful to be able to tell the difference.
Test Plan: {F1212930}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4571
Differential Revision: https://secure.phabricator.com/D15671
Summary:
Ref T4571. Ref T10759. Ref T10758. This isn't complete, but gets most of the job done:
- When `cluster.databases` is set up, most things ignore `mysql.host` now.
- You can `bin/storage upgrade` and stuff works.
- You can browse around in the web UI and stuff works.
There's still a lot of weird tricky stuff to navigate, and this has real no advantages over configuring a single server yet (no automatic failover, etc).
Test Plan:
- Configured `cluster.databases` to point at my `t1.micro` hosts in EC2 (master + replica).
- Ran `bin/storage upgrade`, got a new install setup on them properly.
- Survived setup warnings, browsed around.
- Switched back to local config, ran `bin/storage upgrade`, browsed around, went through setup checks.
- Intentionally broke config (bad hosts, no masters) and things seemed to react reasonably well.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4571, T10758, T10759
Differential Revision: https://secure.phabricator.com/D15668
Summary:
Ref T4571. There will be a very long path beyond this, but add a basic read-only mode. You can explicitly enable this to put Phabricator in a sort of "maintenance" mode today if you're swapping databases or something.
In the long term, we'll automatically degrade into this mode if the master database is down.
Test Plan:
- Enabled read-only mode.
- Browsed around.
- Didn't immediately see anything that was totally 100% broken.
Most stuff is 80-90% broken right now. For example:
- Stuff like submitting comments doesn't work, and gives you a confusing, unhelpful error.
- None of the UI really knows that it's read-only. EditEngine stuff should all hide itself and say "you can't add new comments while an install is in read-only mode", for example, but currently does not.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4571
Differential Revision: https://secure.phabricator.com/D15662
Summary: Fixes T8762.
Test Plan: Ran `bin/storage upgrade --namespace ... --user limited`, saw a more specific error.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T8762
Differential Revision: https://secure.phabricator.com/D15080
Summary:
Ref T10195. Distinguish between "database does not exist" and "database exists, you just don't have permission to access it".
We can't easily get this information out of INFORMATION_SCHEMA but can just `SHOW TABLES IN ...` every database that looks like it's missing and then look at the error code.
Test Plan:
- Created a user `limited` with limited access.
- Ran `bin/storage adjust`.
- Got hopefully more helpful messages about access problems, instead of "Missing" errors.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10195
Differential Revision: https://secure.phabricator.com/D15079
Summary: Fixes T9715. Adds a MySQL-based lock to ensure that schema migrations are not applied on multiple hosts simultaneously.
Test Plan: Ran `./bin/storage upgrade` concurrently. One invocation was successful whilst the other hit a `PhutilLockException`.
Reviewers: #blessed_reviewers, epriestley
Subscribers: Korvin
Maniphest Tasks: T9715
Differential Revision: https://secure.phabricator.com/D14463
Test Plan:
I didn't put any skill points in spelling since I need
combat skills to survive in a nuclear wasteland, but spell check says
this is better.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D14522
Summary: Rename the XHPAST database from `{$NAMESPACE}_xpastview` to `{$NAMESPACE}_xhpast`.
Test Plan: Ran `./bin/storage --namespace test upgrade --no-quickstart`.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D14442
Summary:
It's hard for us to predict how long patches and migrations will take in the general case since it varies a lot from install to install, but we can give installs some kind of rough heads up about longer patches. I'm planning to just put a sort of hint for things in the changelog, something like this:
{F905579}
To make this easier, start storing how long stuff took. I'll write a little script to dump this into a table for the changelog.
Test Plan:
Ran `bin/storage status`:
{F905580}
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D14320
Summary: Ref T9514. I missed these when I swapped out the console stuff recently.
Test Plan: Ran `bin/storage probe`, saw bold instead of escape sequences.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9514
Differential Revision: https://secure.phabricator.com/D14240
Summary:
Ref T8672. Ref T9187. Root issue in at least one case is:
- User makes a commit including a file with some non-UTF8 text (say, a Japanese file full of Shift-JIS).
- We pass the file to the TransactionEditor so it can inline or attach the patch if the server is configured for these things.
- When inlining patches, we convert them to UTF8 before inlining. We must do this since the rest of the mail is UTF8.
- When attaching patches, we send them in the original encoding (as file attachments). This is correct, and means we need to give the worker the raw patch in whatever encoding it was originally in: we can't just convert it to utf8 earlier, or we'd attach the wrong patch in some cases.
- TransactionEditor does its thing (e.g., creates the commit), then gets ready to send mail about whatever it did.
- The publishing work now happens in the daemon queue, so we prepare to queue a PublishWorker and pass it the patch (with some other data).
- When we queue workers, we serialize the state data with JSON.
So far, so good. But this is where things go wrong:
- JSON can't encode binary data, and can't encode Shift-JIS. The encoding silently fails and we ignore it.
Then we get to the worker, and things go wrong-er:
- Since the data is bad, we fatal. This isn't a permanent failure, so we continue retrying the task indefinitely.
This applies several fixes:
# When queueing tasks, fail loudly when JSON encoding fails.
# In the worker, fail permanently when data can't be decoded.
# Allow Editors to specify that some of their data is binary and needs special handling.
This is fairly messy, but some simpler alternatives don't seem like good ways forward:
- We can't convert to UTF8 earlier, because we need the original raw patch when adding it as an attachment.
- We could encode //only// this field, but I suspect some other fields will also need attention, so that adding a mechanism will be worthwhile. In particular, I suspect filenames //may// be causing a similar problem in some cases.
- We could convert task data to always use a serialize()-based binary safe encoding, but this is a larger change and I think it's correct that things are UTF8 by default, even if it makes a bit of a mess. I'd rather have an explicit mess like this than a lot of binary data floating around.
The change to make `LiskDAO` will almost certainly catch some other problems too, so I'm going to hold this until after `stable` is cut. These problems were existing problems (i.e., the code was previously breaking or destroying data) so it's definitely correct to catch them, but this will make the problems much more obvious/urgent than they previously were.
Test Plan:
- Created a commit with a bunch of Shift-JIS stuff in a file.
- Tried to import it.
Prior to patch:
- Broken PublishWorker with distant, irrelevant error message.
With patch partially applied (only new error checking):
- Explicit, local error message about bad key in serialized data.
With patch fully applied:
- Import went fine and mail generated.
Reviewers: chad
Reviewed By: chad
Subscribers: devurandom, nevogd
Maniphest Tasks: T8672, T9187
Differential Revision: https://secure.phabricator.com/D13939
Summary: Use `PhutilClassMaQuery` instead of `PhutilSymbolLoader`, mostly for consistency. Depends on D13588.
Test Plan: Poked around a bunch of pages.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D13589
Summary: DRAFT - throw together Phurl skeleton.
Test Plan: The idea is that `some/long/url` will become `install/Udet4d` and can be viewed and edited at `install/Udet4d/view` and `install/Udet4d/edit`, respectively?
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: joshuaspence, chad, epriestley, Korvin
Maniphest Tasks: T6049
Differential Revision: https://secure.phabricator.com/D13681
Summary:
Basic plumbing for Badges application.
- You can make Badges.
- You can look at a list of them.
- They can be edited.
- They can be assigned to people.
- You can revoke them from people.
- You can subscribe to them.
Test Plan: Make Badges with various options. Give them to people. Take them away from people.
Reviewers: lpriestley, epriestley
Reviewed By: epriestley
Subscribers: tycho.tatitscheff, johnny-bit, epriestley, Korvin
Maniphest Tasks: T6526
Differential Revision: https://secure.phabricator.com/D13626
Summary: All classes should extend from some other class. See D13275 for some explanation.
Test Plan: `arc unit`
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D13283
Summary:
Ref T8424. I'm using Paste as a testbed application because Spaces make some degree of sense for it but it's also flat/simple.
This doesn't do anything interesting or useful and mostly just making the next (more interesting) diff smaller.
Test Plan:
- Ran `bin/storage upgrade -f`.
- Browsed pastes.
- Created a paste.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T8424
Differential Revision: https://secure.phabricator.com/D13154
Summary: Use `__CLASS__` instead of hard-coding class names. Depends on D12605.
Test Plan: Eyeball it.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: hach-que, Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D12806
Summary:
This was broken in D12680.
```
EXCEPTION: (PhutilJSONParserException) Parse error on line 0 at column 0: 'null' is not a valid JSON object. at [<phutil>/src/parser/PhutilJSONParser.php:41]
#0 PhutilJSONParser::parse(string) called at [<phutil>/src/utils/utils.php:1062]
#1 phutil_json_decode(string) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1640]
#2 LiskDAO::applyLiskDataSerialization(array, boolean) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1386]
#3 LiskDAO::willReadData(array) called at [<phabricator>/src/infrastructure/storage/lisk/PhabricatorLiskDAO.php:214]
#4 PhabricatorLiskDAO::willReadData(array) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:608]
#5 LiskDAO::loadFromArray(array) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:652]
#6 LiskDAO::loadAllFromArray(array) called at [<phabricator>/src/applications/transactions/query/PhabricatorApplicationTransactionQuery.php:62]
#7 PhabricatorApplicationTransactionQuery::loadPage() called at [<phabricator>/src/infrastructure/query/policy/PhabricatorPolicyAwareQuery.php:227]
#8 PhabricatorPolicyAwareQuery::execute() called at [<phabricator>/src/infrastructure/query/policy/PhabricatorCursorPagedPolicyAwareQuery.php:143]
#9 PhabricatorCursorPagedPolicyAwareQuery::executeWithCursorPager(AphrontCursorPagerView) called at [<phabricator>/src/applications/base/controller/PhabricatorController.php:577]
#10 PhabricatorController::buildTransactionTimeline(PhabricatorPaste, PhabricatorPasteTransactionQuery) called at [<phabricator>/src/applications/paste/controller/PhabricatorPasteViewControll
```
Test Plan: No exception shown.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D12714
Summary:
Ref T6930. This application collects and displays performance samples -- roughly, things Phabricator spent some kind of resource on. It will collect samples on different types of resources and events:
- Wall time (queries, service calls, pages)
- Bytes In / Bytes Out (requests)
- Implicit requests to CSS/JS (static resources)
I've started with the simplest case (static resources), since this can be used in an immediate, straghtforward way to improve packaging (look at which individual files have the most requests recently).
There's no aggregation yet and a lot of the data isn't collected properly. Future diffs will add more dimension data (controllers, users), more event and resource types (queries, service calls, wall time), and more display options (aggregation, sorting).
Test Plan: {F389344}
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T6930
Differential Revision: https://secure.phabricator.com/D12623
Summary:
Ref T7149. When users give us dumpfiles for import, they will almost inevitably use the `phabricator` namespace. They need to be renamed to use an instance namespace.
We can do this either by:
- importing the data first, then renaming; or
- renaming first, then importing.
This implements the second one, basically `storage renamespace --in dump.sql --from phabricator --to instancename > instance.sql`.
Renaming first is a little hackier since we have to `preg_match()` a SQL dump file, but I think it's better overall:
- With only one database, it lets you dump/import without downtime.
- If you have development stuff in a development environment in the `phabricator` namespace, you don't have to move it aside to do an import.
- No possibility that two people doing an import at the same time on the same box will collide with each other.
- You can do the rename once and then repeat the import process with the renamed dump more easily.
- No tricky stuff with modern Phabricator running against an old dump and the database names not matching up.
None of this is super important, but it just makes large dumps a bit easier to work with, and the dumpfile format is regular enough that this seems unlikely to ever really not work.
Test Plan: Renamespaced a dump, did a `diff -u`, saw all the relevant parts changed (and only those parts changed).
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7149
Differential Revision: https://secure.phabricator.com/D12105
Summary:
Ref T7522. This seems like the least-bad approach to a messy issue:
- When backfilling accounts from an imported instance, I need to write ExternalAccount rows to the instance to link instance accounts with upstream accounts.
- We do this in the daemons in some other cases, which lets us run all the code in the context of the instance. However, I really want to do this in-process here because it's way way simpler and we need to do writes to //both// the instance and the upstream, and they're interleaved, and they depend on one another.
- I can hard-code the query with `qsprintf()` but that feels like 100x worse than this.
This allows me to do this:
```
id(new PhabricatorExternalAccount())
->setForcedConnnection($instance_conn)
->...
->save();
```
...and get a write to the instance database, which is at least not completely a minefield.
Test Plan: Backfilled instance accounts and got interleaved instance and upstream writes as expected.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7522
Differential Revision: https://secure.phabricator.com/D12098
Summary:
Fixes T7422. After the recent fix for "sort" columns, we can end up with invalid SQL in some cases when running quickstart.
In particular, we do "COLLATE binary CHARACTER SET utf8_general_ci" (which is invalid).
Preprocess these so we get "COLLATE utf8 CHARACTER SET utf8_general_ci" (which is valid and correct).
Test Plan: Ran `bin/storage upgrade -f --namespace blahblhbaba` with and without `--disable-utf8mb4`.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7422
Differential Revision: https://secure.phabricator.com/D11929
Summary:
Fixes T7287. This trades off 4-byte character support for case insensitivity in these columns, which is a much better trade on the balance.
Also adds more warnings about old MySQL. Note that we already issue a warning when you run "storage adjust" (which I've made stronger) and already "strongly recommend" MySQL 5.5 or newer in the install documentation.
Test Plan:
- Ran `storage adjust --disable-utf8mb4` to go to old definitions, then ran `storage adjust` to get back to the new ones. Everything seemed OK in both cases.
- Verified that utf8mb4 data can be migrated out of these colums with `--unsafe` (which will truncate).
- Verified that manual explains this.
- Faked my way into the setup warning.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7287
Differential Revision: https://secure.phabricator.com/D11893
Summary:
Ref T6840. This feels a little dirty; open to alternate suggestions.
We currently have a race condition where multiple daemons may load a commit and then save it at the same time, when processing "reverts X" text. Prior to this feature, two daemons would never load a commit at the same time.
The "reverts X" load/save has no effect (doesn't change any object properties), but it will set the state back to the loaded state on save(). This overwrites any flag updates made to the commit in the meantime, and can produce the race in T6840.
In other cases (triggers, harbormaster, repositories) we deal with this kind of problem with "append-only-updates + single-consumer", or a bunch of locking. There isn't really a good place to add a single consumer for commits, since a lot of daemons need to access them. We could move the flags column to a separate table, but this feels pretty complicated. And locking is messy, also mostly because we have so many consumers.
Just exempting this column (which has unusual behavior) from `save()` feels OK-ish? I don't know if we'll have other use cases for this, and I like it even less if we never do, but this patch is pretty small and feels fairly understandable (that said, I also don't like that it can make some properties just silently not update if you aren't on the lookout).
So, this is //a// fix, and feels simplest/least-bad for the moment to me, I thiiink.
Test Plan: Added and executed unit tests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T6840
Differential Revision: https://secure.phabricator.com/D11822
Summary: mysqldump output can end up having weird encoding issues when raw BLOBs are in the output, preventing the backup restoration from succeeding. This hex-encodes blobs in the dump from the backup workflow causing the output file to only contain ASCII and ensure imports are successful.
Test Plan: Had issues restoring a backup from the original `mysqldump` command issued by this workflow. Ran the same command with this flag added and I was able to restore the backup.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D11704
Summary: Fixes T7078. Adds a `./bin/storage shell` command which passes through to a MySQL shell. This is slightly more convenient than running `mysql` manually.
Test Plan: Ran `./bin/storage shell` and got a MySQL shell.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin, epriestley
Maniphest Tasks: T7078
Differential Revision: https://secure.phabricator.com/D11548
Summary: Fixes T7050. I got the regexp slightly wrong and didn't catch it because it works fine on modern MySQL.
Test Plan: `arc unit --everything` still passes.
Reviewers: btrahan, chad
Reviewed By: chad
Subscribers: epriestley
Maniphest Tasks: T7050
Differential Revision: https://secure.phabricator.com/D11522
Summary:
One advantage I wanted to get out of T1191 is automated rebuilds of `quickstart.sql`. If they don't actually work, I'd like to know sooner rather than later. We haven't rebuilt in a couple months, so give it a shot.
Ran into two issues:
- Some very old patches specify overlong keys which don't work if your default charsets are utf8mb4. Shorten these. No real users have applied these in a very long time.
- Some gymnastics around `corpus` for the new Conpherence search index.
Test Plan:
- Ran `arc unit --everything`, got clean results.
- Cost to do a storage upgrade on an empty namespace dropped from ~4s to ~3s.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D11454
Summary:
We have to do some garbage nonsense to write database backups right now, see T6996.
When storage isn't initialized, we previously ended up with this message gzipped in a file and an empty error. Make the behavior slightly more tolerable.
Test Plan: Saw a meaningful error after trying to back up an uninitialized database.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D11449
Summary: Ref T6822.
Test Plan: `grep`. This method is only called from within `PhutilArgumentWorkflow::__construct`.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin, epriestley
Maniphest Tasks: T6822
Differential Revision: https://secure.phabricator.com/D11415
Summary: Ref T6822.
Test Plan: `grep`. This method is only called from within `LiskDAO::establishConnection()`.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T6822
Differential Revision: https://secure.phabricator.com/D11412
Summary:
Ref T6881. This is part 1 of my 35-step plan to support subscriptions that bill monthly.
Expanding the capabilities of counters will let me use them to create a logical clock on time-based event updates, build a daemon on top of that, and eventually get time-based triggers.
Test Plan: Added and executed unit tests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: chad, epriestley
Maniphest Tasks: T6881
Differential Revision: https://secure.phabricator.com/D11395
Summary:
Fixes T6548.
- This workflow doesn't work under reasonable configurations and isn't trivial to fix (see T6548).
- We don't need it; this just makes things a little bit faster if you have to migrate everything (e.g., immediately after T1191) and the installs we know about have generally upgraded by now.
- This keeps kicking PKCS8 keys out of cache which is a pain.
Test Plan: Ran `bin/storage adjust` without it doing an implicit cache purge.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T6548
Differential Revision: https://secure.phabricator.com/D11377
Summary: Ref T6822. There are a bunch of places where we call `$something->generatePHID(...)` externally (outside of the class). Therefore, these methods need to be `public`.
Test Plan: I wouldn't expect //increasing// method visibility to break anything.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T6822
Differential Revision: https://secure.phabricator.com/D11363
Summary: Ref T6822.
Test Plan: Visual inspection. This method is only called from within the `PhabricatorTestCase` class.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T6822
Differential Revision: https://secure.phabricator.com/D11245