Summary: Fixes T12775. Currently, we do not validate this option and it's possible to configure it in an invalid way.
Test Plan: Tried to misconfigure things, was helpfully pointed toward errors.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12775
Differential Revision: https://secure.phabricator.com/D18041
Summary:
Ref T12611. Currently, the HTTP/SSH logs don't have an option to include the instance name.
Add such an option.
Leave it out of the default logs because most installs don't use this.
Test Plan: See next changes.
Reviewers: chad, amckinley
Reviewed By: chad
Maniphest Tasks: T12611
Differential Revision: https://secure.phabricator.com/D17776
Summary:
Ref T11476. This is a bit hacky, but makes `Application` extend `LiskDAO` so we can apply transactions to it with an `Editor` class.
Also fixes schema stuff so builds should produce a clean bill of health again.
This might only get you slightly further, yell if you run into more trouble.
Test Plan:
- Ran `bin/storage upgrade -f` and got no warnings.
- Browsed around, nothing exploded?
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T11476
Differential Revision: https://secure.phabricator.com/D17738
Summary:
Ref T12563. Before broadcasting messages from the server, store them in a history buffer.
A future change will let clients retrieve them.
Test Plan:
- Used the web frontend to look at the buffer, reloaded over time, sent messages. Saw buffer size go up as I sent messages and fall after 60 seconds.
- Set size to 4 messages, sent a bunch of messages, saw the buffer size max out at 4 messages.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12563
Differential Revision: https://secure.phabricator.com/D17707
Summary: Also fixes insufficiently-escaped regex examples
Test Plan: Made several changes to http://local.phacility.com/config/edit/syntax.filemap/ and observed validation failures on malformed regexes, and success on well-formed regexes.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Maniphest Tasks: T12532
Differential Revision: https://secure.phabricator.com/D17684
Test Plan:
Created new paste with title '.arcconfig' without choosing a language; observed that the paste gets highlighted as JSON.
JSON mode:
{F4901762}
Javascript mode:
{F4901763}
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Maniphest Tasks: T11667
Differential Revision: https://secure.phabricator.com/D17682
Summary: Ref T12509. This encourages code to move away from HMAC+SHA1 by making the method name more obviously undesirable.
Test Plan: `grep`, browsed around.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12509
Differential Revision: https://secure.phabricator.com/D17632
Summary:
Ref T12509. This adds support for HMAC+SHA256 (instead of HMAC+SHA1). Although HMAC+SHA1 is not currently broken in any sense, SHA1 has a well-known collision and it's good to look at moving away from HMAC+SHA1.
The new mechanism also automatically generates and stores HMAC keys.
Currently, HMAC keys largely use a per-install constant defined in `security.hmac-key`. In theory this can be changed, but in practice essentially no install changes it.
We generally (in fact, always, I think?) don't use HMAC digests in a way where it matters that this key is well-known, but it's slightly better if this key is unique per class of use cases. Principally, if use cases have unique HMAC keys they are generally less vulnerable to precomputation attacks where an attacker might generate a large number of HMAC hashes of well-known values and use them in a nefarious way. The actual threat here is probably close to nonexistent, but we can harden against it without much extra effort.
Beyond that, this isn't something users should really have to think about or bother configuring.
Test Plan:
- Added unit tests.
- Used `bin/files integrity` to verify, strip, and recompute hashes.
- Tampered with a generated HMAC key, verified it invalidated hashes.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12509
Differential Revision: https://secure.phabricator.com/D17630
Summary: Ref T12450. These are like 95% my fault, but Elastic appears to spell the name "Elasticsearch" consistently in their branding.
Test Plan: `grep ElasticSearch`
Reviewers: chad, 20after4
Maniphest Tasks: T12450
Differential Revision: https://secure.phabricator.com/D17601
Summary:
[ ] Write an "Upgrading: ..." guidance task with narrow instructions for installs that are upgrading.
[ ] Do we need to add an indexing activity (T11932) for installs with ElasticSearch?
[ ] We should more clearly detail exactly which versions of ElasticSearch are supported (for example, is ElasticSearch <2 no longer supported)? From T9893 it seems like we may //only// have supported ElasticSearch <2 before, so are the two regions of support totally nonoverlapping and all ElasticSearch users will need to upgrade?
[ ] Documentation should provide stronger guidance toward MySQL and away from Elastic for the vast majority of installs, because we've historically seen users choosing Elastic when they aren't actually trying to solve any specific problem.
[ ] When users search for fulltext results in Maniphest and hit too many documents, the current behavior is approximately silent failure (see T12443). D17384 has also lowered the ceiling for ElasticSearch, although previous changes lowered it for MySQL search. We should not fail silently, and ideally should build toward T12003.
[ ] D17384 added a new "keywords" field, but MySQL does not search it (I think?). The behavior should be as consistent across MySQL and Elastic as we can make it. Likely cleaner is giving "Project" objects a body, with "slugs" and "description" separated by newlines?
[ ] `PhabricatorSearchEngineTestCase` is now pointless and only detects local misconfigurations.
[ ] It would be nice to build a practical test suite instead, where we put specific documents into the index and then search for them. The upstream test could run against MySQL, and some `bin/search test` could run against a configured engine like ElasticSearch. This would make it easier to make sure that behavior was as uniform as possible across engine implementations.
[ ] Does every assigned task now match "user" in ElasticSearch?
[x] `PhabricatorElasticFulltextStorageEngine` has a `json_encode()` which should be `phutil_json_encode()`.
[ ] `PhabricatorSearchService` throws an untranslated exception.
[ ] When a search cluster is down, we probably don't degrade with much grace (unhandled exception)?
[ ] I haven't run bin/search init, but bin/search index doesn't warn me that I may want to. This might be worth adding. The UI does warn me.
[ ] bin/search init warns me that the index is "incorrect". It might be more clear to distinguish between "missing" and "incorrect", since it's more comforting to users to see "everything is as we expect, doing normal first-time setup now" than "something is wrong, fixing it".
[ ] CLI message "Initializing search service "ElasticSearch"" does not end with a period, which is inconsistent with other UI messages.
[ ] It might be nice to let bin/search commands like init and index select a specific service (or even service + host) to act on, as bin/storage --ref ... now does. You can generally get the result you want by fiddling with config.
[ ] When a service isn't writable, bin/search init reports "Search cluster has no hosts for role "write".". This is accurate but does not provide guidance: it might be more useful to the user to explain "This service is not writable, so we're skipping index check for it.".
[x] Even with write off for MySQL, bin/search index --type task --trace still updates MySQL, I think? I may be misreading the trace output. But this behavior doesn't make sense if it is the actual behavior, and it seems like reindexAbstractDocument() uses "all services", not "writable services", and the MySQL engine doesn't make sure it's writable before indexing.
[x] Searching or user fails to find task Grant users tokens when a mention is created, suggesting that stemming is not working.
[x] Searching for users finds that task, but fails to find a task containing "per user per month" in a comment, also suggesting that stemming is not working.
[x] Searching for maniphest fails to find task maniphest.query elephant, suggesting that tokenization in ElasticSearch is not as good as the MySQL tokenization for these words (see D17330).
[x] The "index incorrect" warning UI uses inconsistent title case.
[x] The "index incorrect" warning UI could format the command to be run more cleanly (with addCommand(), I think).
refs T12450
Test Plan:
* Stared blankly at the code.
* Disabled 'write' role on mysql fulltext service.
* Edited a task, ran search indexer, verified that the mysql index wasn't being updated.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin
Maniphest Tasks: T12450
Differential Revision: https://secure.phabricator.com/D17564
Summary:
The goal is to make fulltext search back-ends more extensible, configurable and robust.
When this is finished it will be possible to have multiple search storage back-ends and
potentially multiple instances of each.
Individual instances can be configured with roles such as 'read', 'write' which control
which hosts will receive writes to the index and which hosts will respond to queries.
These two roles make it possible to have any combination of:
* read-only
* write-only
* read-write
* disabled
This 'roles' mechanism is extensible to add new roles should that be needed in the future.
In addition to supporting multiple elasticsearch and mysql search instances, this refactors
the connection health monitoring infrastructure from PhabricatorDatabaseHealthRecord and
utilizes the same system for monitoring the health of elasticsearch nodes. This will
allow Wikimedia's phabricator to be redundant across data centers (mysql already is,
elasticsearch should be as well).
The real-world use-case I have in mind here is writing to two indexes (two elasticsearch clusters
in different data centers) but reading from only one. Then toggling the 'read' property when
we want to migrate to the other data center (and when we migrate from elasticsearch 2.x to 5.x)
Hopefully this is useful in the upstream as well.
Remaining TODO:
* test cases
* documentation
Test Plan:
(WARNING) This will most likely require the elasticsearch index to be deleted and re-created due to schema changes.
Tested with elasticsearch versions 2.4 and 5.2 using the following config:
```lang=json
"cluster.search": [
{
"type": "elasticsearch",
"hosts": [
{
"host": "localhost",
"roles": { "read": true, "write": true }
}
],
"port": 9200,
"protocol": "http",
"path": "/phabricator",
"version": 5
},
{
"type": "mysql",
"roles": { "write": true }
}
]
Also deployed the same changes to Wikimedia's production Phabricator instance without any issues whatsoever.
```
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin, epriestley
Tags: #elasticsearch, #clusters, #wikimedia
Differential Revision: https://secure.phabricator.com/D17384
Summary:
Fixes T12409. Config entries may be marked as "deleted", and `bin/config set --database` doesn't un-delete them, so the edit doesn't do anything.
The "most correct" fix here is to swap to transactions so we run the same code, but just fix this narrowly for now since it's one line of code.
Test Plan:
- Set `maniphest.default-priority` to `123`.
- Deleted `maniphest.default-priority` from the web UI by deleting all the text in the box.
- Before patch: `bin/config set --database maniphest.default-priority 789` had no effect.
- After patch: `bin/config set --database maniphest.default-priority 789` worked.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12409
Differential Revision: https://secure.phabricator.com/D17506
Summary:
Fixes T12400. Adds a "Has MFA" filter to People so you can figure out who you need to harass before turning on "require MFA".
When you run this as a non-admin, you don't currently actually hit the exception: the query just doesn't work. I think this is probably okay, but if we add more of these it might be better to make the "this didn't work" more explicit since it could be confusing in some weird edge cases (like, an administrator sending a non-administrator a link which they expect will show the non-administrator some interesting query results, but they actually just get no constraint). The exception is more of a fail-safe in case we make application changes in the future and don't remember this weird special case.
Test Plan:
- As an administrator and non-administrator, used People and Conduit to query MFA, no-MFA, and don't-care-about-MFA. These queries worked for an admin and didn't work for a non-admin.
- Viewed the list as an administrator, saw MFA users annotated.
- Viewed config help, clicked link as an admin, ended up in the right place.
{F4093033}
{F4093034}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12400
Differential Revision: https://secure.phabricator.com/D17500
Summary:
Fixes T12306. Currently, we warn about daemons not running even if they're in normal "alive" states, particularly "waiting to restart after a failure".
This check was made more strict in D12088, back when we tried to version check running daemons. Since we implemented auto-restart-after-config-change we don't do this anymore, so it should be fine to make this more lax again.
Test Plan:
- Faked an exception for all tasks.
- Before patch: reloading the daemon setup error sometimes raised a false positive ("waiting" daemon detected as dead).
- After patch: daemon setup error no longer triggers.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12306
Differential Revision: https://secure.phabricator.com/D17397
Summary: Ref T12240. When you "Reply All" to a Phabricator mail, we make an effort not to send the response to recipients who you hit with the original message. This isn't perfect and we can't always get it right, but the old description implies it's a bigger problem than it should be in practice.
Test Plan: Read text.
Reviewers: chad, eadler
Reviewed By: chad
Maniphest Tasks: T12240
Differential Revision: https://secure.phabricator.com/D17331
Summary: Fixes T12216. I'd like to remove this option eventually, but just narrow its scope in the config description for now.
Test Plan: Read config description.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12216
Differential Revision: https://secure.phabricator.com/D17317
Summary:
Fixes T12195. For the past few years, Recaptcha (now part of Google) has supported
a new, "no captcha" one-click user interface. This new UI is stable, doesn't
require any typing or reading words, and can even work without JavaScript (if
the administrator enables it on the Recaptcha side).
Furthermore, the new Recaptcha has a completely trivial API that can be dealt
with in a few lines of code. Thus, the external `recaptcha` php library is now
gone.
This API is a complete replacement for the old one, and does not require any
upgrade path for users or Phabricator administrators - public and secret keys
for the "new" Recaptcha UI are the exact same as the "classic" Recaptcha. Any
old Recaptcha keys for a domain will continue to work.
Note that Google is currently testing Yet Another new Captcha API, called
"Invisible reCAPTCHA", that will not require user interaction at all. In fact,
the user will not even be aware there //is even a captcha form//, as far as I
understand. However, this new API is 1) in beta, 2) requires new Recaptcha keys
(so it cannot be a drop-in replacement), and 3) requires more drastic API
changes, as form submission buttons must instead invoke JavaScript code, rather
than a token being passed along with the form submission. This would require far
more extensive changes to the controllers. Maybe when it's several years old, it
can be considered.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Test Plan:
Created a brand-new Phabricator installation, saw the new Captcha UI
on administrator sign up. Logged out, made 5 invalid login attempts, and saw the
new Captcha UI. Reworked the conditional to invert the condition, etc to test
and make sure the API responded properly.
Reviewers: epriestley, #blessed_reviewers, chad
Reviewed By: epriestley, #blessed_reviewers
Subscribers: avivey, Korvin
Maniphest Tasks: T12195
Differential Revision: https://secure.phabricator.com/D17304
Summary: Ref T9640. On 7.0 we had signal handling issues so we can never support it, but async signals should resolve them on 7.1 or newer.
Test Plan: On PHP 7.1, got through the setup warning.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9640
Differential Revision: https://secure.phabricator.com/D17197
Summary: Ref T9640. This option was removed in PHP7, so there's no reason to warn about it.
Test Plan: No longer saw a setup warning on PHP7.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9640
Differential Revision: https://secure.phabricator.com/D17196
Summary:
Fixes T12040. In T12039, a user running local patches followed the report instructions as far as grabbing version information, but didn't update or revert their local changes or try against a clean install before reporting.
This obviously isn't ideal for us, but it's understandable (grabbing version information is much easier than upgrading/reverting), and we can do better about making this information useful: when compiling version information, try to figure out the branchpoint from a known upstream `master` branch by listing remotes, then running `git merge-base` against them.
Additionally, explicitly document that we want upstream hashes. We have to have a fallback case in this document anyway (for when you can't get to Config) so hopefully this makes it more likely that we get useful information in initial reports.
Test Plan: {F2229574}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12040
Differential Revision: https://secure.phabricator.com/D17103
Summary: Ref T571. This was accidentally left behind in D12266.
Test Plan: Used {key command F} to search for "bulk".
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T571
Differential Revision: https://secure.phabricator.com/D17034
Summary:
- Fixes T11995. This got moved but I missed renaming this callsite.
- Fixes T11993. If you have valid credentials, but haven't run `storage upgrade` yet, we can hit this exception during setup. Just ignore it instead.
Test Plan:
- Saved global settings, no more fatal.
- Changed `storage-namespace` to junk, loaded web UI with valid database credentials.
{F2106358}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11993, T11995
Differential Revision: https://secure.phabricator.com/D17024
Summary: Fixes T11544. Attempt to detect if we're on a tiny, burstable-CPU AWS instance and complain.
Test Plan:
- Completely faked this locally.
- Hit the URI on an EC2 instance to check that it's correct (got back "m3.large", since that was the instance class).
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11544
Differential Revision: https://secure.phabricator.com/D17014
Summary:
Ref T11553. With some regularity, users make various configuration mistakes which we can detect by making a request to ourselves.
I use a magical header to make this request because we want to test everything else (parameters, path).
- Fixes T4854, probably. Tries to detect mod_pagespeed by looking for a header. This is a documentation-based "fix", I didn't actually install mod_pagespeed or formally test this.
- Fixes T6866. We now test for parameters (e.g., user somehow lost "QSA").
- Ref T6709. We now test that stuff is decoded exactly once (e.g., user somehow lost "B").
- Fixes T4921. We now test that Authorization survives the request.
- Fixes T2226. Adds a setup check to determine whether gzip is enabled on the web server, and attempts to enable it at the PHP level.
- Fixes `<space space newline newline space><?php` in `preamble.php`.
Test Plan: Tested all of these setup warnings, although mostly by faking them.
Reviewers: joshuaspence, chad
Reviewed By: chad
Subscribers: Korvin
Maniphest Tasks: T4854, T4921, T6709, T6866, T11553, T2226
Differential Revision: https://secure.phabricator.com/D12622
Summary:
Ref T11939. Depends on D16984. Now that CIDRLists can contain IPv6 addresses, blacklist all of the reserved IPv6 space.
This reserved blacklist is used to prevent users from accessing internal services via "Import Calendar" or "Add Macro".
They can't actually reach IPv6 addresses via these mechanisms yet because we need to do more work to support outbound IPv6 requests, but make sure reserved IPv6 space is blacklisted already when that support eventaully arrives.
Also, clean up some error messages (e.g., for trying to hit a bad URI in "Add Macro").
Test Plan:
- Loaded pages with default blacklist.
- Tried to make requests into IPv6 space.
- Currently, this is impossible because of `parse_url()` and `gethostynamel()` calls.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11939
Differential Revision: https://secure.phabricator.com/D16986
Summary:
Ref T11922. When we deploy on Saturday I need to rebuild all the cluster indexes, but some instances won't have anything indexed so they won't actually trigger the activity.
Add a `--force` flag that just clears an activity even if the activity is not required.
Test Plan: Ran `bin/config done reindex --force` several times.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11922
Differential Revision: https://secure.phabricator.com/D16970
Summary:
Ref T11922. After updating to HEAD of `master`, you need to manually rebuild the index. We don't do this during `bin/storage upgrade` because it can take a very long time (`secure.phabricator.com` took roughly an hour) and can happen while Phabricator is running.
However, if we don't warn users about this they'll just get a broken index unless they go read the changelog (or file an issue, then we tell them to go read the changelog).
This adds a very simple table for notes to administrators so we can write a "you need to go rebuild the index" note, then adds one.
Administrators clear the note by completing the activity and running `bin/config done reindex`. This isn't automatic because there are various strategies you can use to approach the issue, which I'll discuss in greater detail in the linked documentation.
Also, fix an issue where `bin/storage upgrade --apply <patch>` could try to re-mark an already-applied patch as applied.
Test Plan:
- Ran storage ugrades.
- Got instructions to rebuild search index.
- Cleared instructions with `bin/config done reindex`.
Reviewers: chad
Reviewed By: chad
Subscribers: avivey
Maniphest Tasks: T11922
Differential Revision: https://secure.phabricator.com/D16965
Summary:
Ref T11741. This makes everything work if we switch to InnoDB, but never actually switches yet.
Since the default minimum word length (3) and stopword list (36 common English words) in InnoDB are generally pretty reasonable, I just didn't add any setup advice for them. I figure we're better off with simpler setup until we identify some real problem that the builtin stopwords create.
Test Plan: Swapped the `false` to `true`, ran `storage adjust`, got InnoDB fulltext indexes, searched for stuff, got default "AND" behavior.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11741
Differential Revision: https://secure.phabricator.com/D16942
Summary:
Ref T11741. On recent-enough versions of MySQL, we would prefer to use InnoDB for fulltext indexes instead of MyISAM.
Allow `bin/storage adjust` to read actual and expected table engines, and apply adjustments as necessary.
We have one existing bad table that uses the wrong engine, `metamta_applicationemail`. This change corrects that table.
Test Plan:
- Ran `bin/storage upgrade`.
- Saw the adjustment phase apply this change properly:
```
>>>[463] <query> ALTER TABLE `local_metamta`.`metamta_applicationemail` COLLATE = 'utf8mb4_bin', ENGINE = 'InnoDB'
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11741
Differential Revision: https://secure.phabricator.com/D16941
Summary:
Ref T11741. Fixes T10642. Parse and compile user queries with a consistent ruleset, then submit queries to the backend using whatever ruleset MySQL is configured with.
This means that `ft_boolean_syntax` no longer needs to be configured (we'll just do the right thing in all cases).
This should improve behavior with RDS immediately (T10642), and allow us to improve behavior with InnoDB in the future (T11741).
Test Plan:
- Ran various queries in the UI, saw the expected results.
- Ran bad queries, got useful errors.
- Searched threads in Conpherence.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10642, T11741
Differential Revision: https://secure.phabricator.com/D16939
Summary:
Ref T11044. This is still catching the older exceptions, which are now more general.
If you loaded the web UI without MySQL running, this meant you got a less-helpful error.
Test Plan: Stopped MySQL, loaded web UI, got a more-helpful error.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16930
Summary:
Ref T11044. This was old Facebook cruft for reading configuration from SMC (and maybe doing some other questionable things). See D183.
(See also D175 for discussion of this from 2011.)
In modern Phabricator, you can subclass `SiteConfig` to provide dynamic configuration, and we do so in the Phacility cluster. This lets you change any config, and change in response to requests (e.g., for instancing) and is generally more powerful than this mechanism was.
This configuration provider theoretically let you roll your own replication or partitioning, but in practice I believe no one ever did, and no one ever could have anyway without more support in the upstream (for migrations, read-after-write, etc).
Test Plan:
- Grepped for removed option.
- Browsed around with clustering off.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16911
Summary:
Ref T11044. One popular tool in a modern operations environment is Puppet. The primary purpose of this tool is to randomly revert hosts to older or different configurations.
Introducing an element of chaotic unpredictability into operations trains staff to be on high alert at all times, rather than lulled into complacency by predictability or consistency.
When Puppet reverts a Phabricator host's configuration to an older version, we might start writing data to a lot of crazy places where it shouldn't go. This will create a big sticky mess that is virtually impossible to undo, mostly because we'll get two files with ID 123 or two tasks with ID 456 or whatever else and good luck with that.
Instead, after changing the partition layout, require `bin/storage partition` to be run. This writes a copy of the config everywhere.
Then, when we start serving web requests, make sure every database has the exact same config. This will foil Puppet by refusing to run requests on hosts it has reverted.
Test Plan:
- Changed partition configuration.
- Ran Phabricator.
- FOILED!
- Ran `bin/storage partition` to sync config.
- Things worked again.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16910
Summary:
Ref T11044. Fixes T10931. This option has essentially never been useful for anything, and we've picked the best implementation for a long time (MySQLi if available, MySQL if not).
I am not aware of any reason to ever set this manually. If someone comes up with some bizarre but legitimate use case that I haven't thought of, we can modularize it.
Test Plan: Browsed around. Grepped for `mysql.implementation`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10931, T11044
Differential Revision: https://secure.phabricator.com/D16909
Summary:
Fixes T10759. Fixes T11817. This runs all the general sanity/configuration checks on all the active servers.
None of these warnings are very important, and this doesn't change any logical stuff.
Depends on D16904.
Test Plan: Painstakingly triggered each warning, verified that they rendered correctly and that messages told me which host was affected.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10759, T11817
Differential Revision: https://secure.phabricator.com/D16905
Summary:
Ref T10759. Check master/replica status during startup.
After D16903, this also means that we check this status after a database comes back online after being unreachable.
If a master is replicating, fatal (since this can do a million kinds of bad things).
If a replica is not replicating, warn (this just means the replica is behind so some data is at risk).
Also: if your masters were actually configured properly (mine weren't until this change detected it), we would throw away patches as we applied them, so they would only apply to the //first// master. Instead, properly apply all migration patches to all masters.
Test Plan:
- Started Phabricator with a replicating master, got a fatal.
- Stopped replication on a replica, got a warning.
- With two non-replicating masters, upgraded storage.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10759
Differential Revision: https://secure.phabricator.com/D16904
Summary:
Ref T10759. We may "discover" the presence of a fatal setup error later, after starting Phabricator.
This can happen in a few ways, but most are unlikely. The one I'm immediately concerned about is:
- Phabricator starts up during a disaster with some databases unreachable.
- We start with warnings (unreachable databases are generally not fatal, since it's OK for some subset of hosts to be down in replicated/partitioned setups).
- The unreachable databases later recover and become accessible again.
- When we run checks against them, we discover that they are misconfigured.
Currently, "fatal" setup issues are not truly fatal if we're "in flight" -- we've survived setup checks at least once in the past. This is bad in the scenario above.
Especially with partitioning, it could lead to mangled data in a disaster scenario where operations staff makes a small configuration mistake while trying to get things running again.
Instead, if we "discover" a fatal error while already "in flight", reset the whole setup process as though the webserver had just restarted. Don't serve requests again until we can make it through setup without hitting fatals.
Test Plan:
- Started Phabricator with multiple masters, one of which was down and broken.
- Got a warning about the bad master.
- Revived the master.
- Before: Phabricator detects the fatal, but keeps serving requests.
- After: Phabricator detects the fatal, resets the webserver, and stops serving requests until the fatal is resolved.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10759
Differential Revision: https://secure.phabricator.com/D16903
Summary:
Ref T10759. Currently, these checks run only against configured masters. Instead, check every host.
These checks also sort of cheat through restart during a recovery, when some hosts will be unreachable: they test for "disaster" by seeing if no masters are reachable, and just skip all the checks in that case.
This is bad for at least two reasons:
- After recent changes, it is possible that //some// masters are dead but it's still OK to start. For example, "slowvote" may have no master, but everything else is reachable. We can safely run without slowvote.
- It's possible to start during a disaster and miss important setup checks completely, since we skip them, get a clean bill of health, and never re-test them.
Instead:
- Test each host individually.
- Fundamental problems (lack of InnoDB, bad schema) are fatal on any host.
- If we can't connect, raise it as a //warning// to make sure we check it later. If you start during a disaster, we still want to make sure that schemata are up to date if you later recover a host.
In particular, I'm going to add these checks soon:
- Fatal if a "master" is replicating.
- Fatal if a "replica" is not replicating.
- Fatal if a database partition config differs from web partition config.
- When we let a database off with a warning because it's down, and later upgrade it to a fatal because we discover it is broken after it comes up again, fatal everything. Currently, we keep running if we "discover" the presence of new fatals after surviving setup checks for the first time.
Test Plan:
- Configured with multiple masters, intentionally broke one (simulating a disaster where one master is lost), saw Phabricator still startup.
- Tested individual setup checks by intentionally breaking them.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10759
Differential Revision: https://secure.phabricator.com/D16902
Summary:
Ref T11044. I'm going to hold this until after the release cut, but I think it's good to go.
This allows installs to configure multiple masters in `cluster.databases` and partition applications across them (for example, put Maniphest on a dedicated database).
When we make a Maniphest connection we go look up which master we should be hitting first, then connect to it.
This has at least approximately been planned for many years, so the actual change is largely just making sure that your config makes sense.
Test Plan:
- Configured `db001.epriestley.com` and `db002.epriestley.com` as master/master.
- Partitioned applications between them.
- Interacted with various applications, saw writes go to the correct host.
- Viewed "Database Servers" and saw partitioning information.
- Ran schema upgrades.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16876
Summary: Depends on D16847. Ref T11044. This updates the remaining storage-related workflows from the CLI to accommodate multiple masters.
Test Plan:
- Configured multiple masters.
- Ran all `bin/storage` workflows.
- Ran `arc unit --everything`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16848
Summary:
Depends on D16115. Ref T11044. In the brave new world of multiple masters, we need to check the schemata on each master when looking for missing storage patches, keys, schema changes, etc.
This realigns all the "check out what's up with that schema" calls to work for multiple hosts, and updates the web UI to include a "Server" column and allow you to browse per-server.
This doesn't update `bin/storage`, so it breaks things on its own (and unit tests probably won't pass). I'll update that in the next change.
Test Plan: Configured local environment in cluster mode with multiple masters, saw both hosts' status reported in web UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16847
Summary:
Ref T11044. This moves toward partitioned application databases:
- You can define multiple masters.
- Convert all the easily-convertible code to become multi-master aware.
This doesn't convert most of `bin/storage` or "Config > Database (Stuff)" yet, as both are quite involved. They still work for now, but only operate on the first master instead of all masters.
Test Plan: Configured multiple masters, browsed around, ran `bin/storage` commands, ran `bin/storage --host ...`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16115
Summary:
This has been replaced by `PolicyCodex` after D16830. Also:
- Rebuild Celerity map to fix grumpy unit test.
- Fix one issue on the policy exception workflow to accommodate the new code.
Test Plan:
- `arc unit --everything`
- Viewed policy explanations.
- Viewed policy errors.
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que, PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D16831
Summary:
Fixes T11746. The opcache docs are on a different page, so point there if we're raising opcache issues.
(It's possible for a setup issue to say "configure X, or configure Y", where X is opcache and Y is non-opcache, so we may want to render both links.)
Test Plan: {F1867109}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11746
Differential Revision: https://secure.phabricator.com/D16685
Summary: Creates a background that renders inside the Quicksand frame, through sorcery.
Test Plan: Turn on Quicksand, visit lots of pages. See correct background colors. This probably blows something up I'm not testing.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D16642
Summary:
Ref T11672. At low loads, this causes us to use more connections, which is pushing some installs over the default limits.
Rather than trying to walk users through changing `max_connections`, `open_files_limit`, `fs.file-max`, `ulimit`, etc., just put things back for now. After T11044 we should have headroom to use persistent connections within the default limits on all reasonable systems..
Test Plan: Loaded Phabricator, poked around.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11672
Differential Revision: https://secure.phabricator.com/D16591