1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2025-01-12 07:41:04 +01:00
Commit graph

12909 commits

Author SHA1 Message Date
epriestley
467c4e84e5 Add an edge table to the search database
Summary:
Fixes T10778. This is a result of T10262: when we save a form configuration and adjust the policy, we try to scramble attached file secrets.

There aren't going to be any attached files, but there's also no edge table, so we fail.

We could skip this code, but we'll likely need an edge table here sooner or later so it's probably simpler in the long run to just add an empty one.

Test Plan:
   - Ran `bin/storage upgrade`, got a clean bill of health.
   - Saved a form configuration after making a policy edit, no more `edge` exception.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10778

Differential Revision: https://secure.phabricator.com/D15803
2016-04-26 11:26:26 -07:00
epriestley
dc3a13c5e8 Add bin/repository clusterize and document setup and migration for clusters
Summary: Ref T4292. This provides at least some sort of hint about how to set up cluster repositories.

Test Plan:
  - Read documentation.
  - Ran `bin/repository clusterize` to add + remove clusters.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15798
2016-04-26 10:07:17 -07:00
epriestley
8d9bc401e4 Improve Diffusion hosting setup instructions somewhat?
Summary:
Ref T10866. Fixes T10386. This attempts to make it a little more plausible to follow these directions:

  - Use simpler language in general.
  - Remove language suggesting that HTTP requires no additional configuration.
  - Suggest using a load balancer or an ugly port number instead of swapping SSH to a different port.
  - Be more granular about `sudo` setup.
  - Organize better?

Test Plan: Read documentation.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10386, T10866

Differential Revision: https://secure.phabricator.com/D15796
2016-04-26 10:06:59 -07:00
epriestley
3fda965288 When multiple web hosts are in service, don't require setup warnings to be dismissed on each one
Summary:
Fixes T10876. Currently, we can end up with a setup warning banner sticking on each web device, since the state is stored in local cache.

Instead:

  - When we actually run the setup checks, save the current state in the database.
  - Before we show a cached banner, make sure the database still says the checks are a problem.

This could lead to some inconsistencies if setup checks legitimately pass on some hosts but not on others. For example, if you have `git` installed on one machine but not on another, we may raise a setup warning ("No Git Binary!") about it on one host only.

For now, assume users have their operational environments in some sort of reasonable shape and can install the same stuff everywhere. In the future, we could split the issues into "global" and "per-host" issues if we run into problems with this.

Test Plan:
This is somewhat tricky to test locally since you really need multiple webservers to test it properly, but I:

  - Created some setup issues, saw banner.
  - Ignored/cleared them, saw banner go away.
  - Verified database cache writes were occurring properly.

Then I sort of faked it like this:

  - Created a setup issue.
  - Manually set the database cache value to `[]` ("no issues").
  - Reloaded page.
  - No more banner.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10876

Differential Revision: https://secure.phabricator.com/D15802
2016-04-26 10:03:45 -07:00
epriestley
8606fb588f Port "Staging Area" repository section to new management UI
Summary: Ref T10748. Brings this over and adds EditEngine support for it.

Test Plan: Viewed and edited staging area information.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10748

Differential Revision: https://secure.phabricator.com/D15801
2016-04-26 08:11:53 -07:00
epriestley
8e4a7742eb Port local storage path to new repository Manage UI
Summary: Ref T10748. This merges "Storage" and "Cluster" into a single UI which combines the information of both.

Test Plan: {F1246882}

Reviewers: chad

Reviewed By: chad

Subscribers: hach-que

Maniphest Tasks: T10748

Differential Revision: https://secure.phabricator.com/D15800
2016-04-26 07:59:22 -07:00
epriestley
0630fef9fc Prevent web queries from running for more than 30 seconds
Summary:
Ref T10849. This enforces a global 30-second per-query time limit for anything not coming from the CLI.

If we run into another issue with MySQL hanging in the future, this should prevent it from being nearly as bad as it was.

Test Plan:
  - Set value to 0, verified the UI threw an exception immediately.
  - Set value back to 30, browsed around a bunch of pages.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10849

Differential Revision: https://secure.phabricator.com/D15799
2016-04-26 07:59:09 -07:00
epriestley
2c870bad86 Document how to register cluster devices with Almanac
Summary:
Ref T4292. This is a required step in configuring a cluster: document and explain it.

Previously `bin/almanac register` could //also// add and trust keys. I've removed this capability since I think it's needless and complicated. If there's some real use for it eventually, we could add a `bin/almanac add-key` or whatever. The workflow is simpler and has better guard rails that point you in the correct direction now.

Test Plan:
  - Read documentation.
  - Ran `bin/almanac` with various good/bad flags.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15795
2016-04-25 14:58:58 -07:00
epriestley
550a82d438 Fix two minor formatting issues with bin/repository move-paths
Summary: This gets over-escaped instead of bolded right now, but I only ever hit it when exporting/importing and never both cleaning it up.

Test Plan: Ran `bin/repository move-paths`, saw bolded "Move" instead of ANSI escape sequences.

Reviewers: chad

Reviewed By: chad

Differential Revision: https://secure.phabricator.com/D15797
2016-04-25 12:29:15 -07:00
epriestley
892a9a1f07 Make cluster repositories more resistant to freezing
Summary:
Ref T10860. This allows us to recover if the connection to the database is lost during a push.

If we lose the connection to the master database during a push, we would previously freeze the repository. This is very safe, but not very operator-friendly since you have to go manually unfreeze it.

We don't need to be quite this aggressive about freezing things. The repository state is still consistent after we've "upgraded" the lock by setting `isWriting = 1`, so we're actually fine even if we lost the global lock.

Instead of just freezing the repository immediately, sit there in a loop waiting for the master to come back up for a few minutes. If it recovers, we can release the lock and everything will be OK again.

Basically, the changes are:

  - If we can't release the lock at first, sit in a loop trying really hard to release it for a while.
  - Add a unique lock identifier so we can be certain we're only releasing //our// lock no matter what else is going on.
  - Do the version reads on the same connection holding the lock, so we can be sure we haven't lost the lock before we do that read.

Test Plan:
  - Added a `sleep(10)` after accepting the write but before releasing the lock so I could run `mysqld stop` and force this issue to occur.
  - Pushed like this:

```
$ echo D >> record && git commit -am D && git push
[master 707ecc3] D
 1 file changed, 1 insertion(+)
# Push received by "local001.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "local001.phacility.net"...
# Acquired read lock immediately.
# Device "local001.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "local001.phacility.net".
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 254 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
BEGIN SLEEP
```

  - Here, I stopped `mysqld` from the CLI in another terminal window.

```
END SLEEP
# CRITICAL. Failed to release cluster write lock!
# The connection to the master database was lost while receiving the write.
# This process will spend 300 more second(s) attempting to recover, then give up.
```

  - Here, I started `mysqld` again.

```
# RECOVERED. Link to master database was restored.
# Released cluster write lock.
To ssh://local@localvault.phacility.com/diffusion/26/locktopia.git
   2cbf87c..707ecc3  master -> master
```

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10860

Differential Revision: https://secure.phabricator.com/D15792
2016-04-25 11:37:31 -07:00
epriestley
d0b5dac36b Make cluster repositories more chatty
Summary:
Ref T10860. At least in Git over SSH, we can freely echo a bunch of stuff to stderr and Git will print it to the console, so we can tell users what's going on.

This should make debugging, etc., easier. We could tone this down a little bit once things are more stable if it's a little too chatty.

Test Plan:
```
$ echo D >> record && git commit -am D && git push
[master ca5efff] D
 1 file changed, 1 insertion(+)
# Push received by "local001.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "local001.phacility.net"...
# Acquired read lock immediately.
# Device "local001.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "local001.phacility.net".
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 256 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
To ssh://local@localvault.phacility.com/diffusion/26/locktopia.git
   8616189..ca5efff  master -> master
```

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10860

Differential Revision: https://secure.phabricator.com/D15791
2016-04-25 11:20:57 -07:00
epriestley
dc75b4bd06 Move all cluster locking logic to a separate class
Summary: Ref T10860. This doesn't change anything, it just separates all this stuff out of `PhabricatorRepository` since I'm planning to add a bit more state to it and it's already pretty big and fairly separable.

Test Plan: Pulled, pushed, browsed Diffusion.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10860

Differential Revision: https://secure.phabricator.com/D15790
2016-04-25 11:20:29 -07:00
epriestley
1c0980a26a Fix two issues with Remarkup in Pholio
Summary:
Fixes T10865.

  - Mock descriptions did not markup.
  - Image descriptions did not get a proper container `<div />`.

Test Plan:
  - Created a mock with remarkup in the mock description and in an image description.
  - Viewed mock detail.
  - Saw list styles render properly in both mock description and image description.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10865

Differential Revision: https://secure.phabricator.com/D15793
2016-04-25 08:16:23 -07:00
epriestley
623ed1f434 Include directory-ownership note in sshd setup instructions
Summary: Fixes T9560. We suggest a root-owned location, but users who choose their own location instead can run into trouble.

Test Plan:
  - Changed parent directory to have an non-root owner, verified that `ssh` no longer worked.
  - Changed parent directory back to a root owner, verified `ssh` worked again.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T9560

Differential Revision: https://secure.phabricator.com/D15794
2016-04-25 08:16:14 -07:00
epriestley
aa9395e38f Fix bad variable causing aphlict to fail to start with no "logs" config
Summary: Fixes T10863. See that task for discussion.

Test Plan:
  - Configured `aphlict` with no "logs".
  - Started `aphlict`.
    - Before change: exception.
    - After change: worked.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10863

Differential Revision: https://secure.phabricator.com/D15788
2016-04-24 11:20:42 -07:00
epriestley
9d0891c7e1 Correct Aphlict documentation for Nginx proxying
Summary: Fixes T10857. This documentation did not accurately reflect proper configuration: in the Aphlict config, SSL is inferred from the presence of `ssl.*` configuration.

Test Plan: Read documentation.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10857

Differential Revision: https://secure.phabricator.com/D15787
2016-04-24 06:31:58 -07:00
epriestley
00885edc47 Don't try to synchronize repositories with no working copy
Summary:
Ref T4292. Sometimes, we may not have a working copy for a repository. The easiest way to get into this condition is to deactivate a repository.

We could try to clone + fetch in this case, but that's kind of complex, and there's an easy command that administrators can run manually. For now, just tell them to do that.

This affects the inactive repositories on `secure`, like rGITCOINS.

Test Plan: Removed working copy, got message.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15786
2016-04-22 08:12:19 -07:00
epriestley
ab20f243b3 Improve consistency of file access policies, particularly for LFS
Summary:
Ref T7789. Currently, we use different viewers if you have `security.alternate-file-domain` configured vs if you do not.

This is largely residual from the days of one-time-tokens, and can cause messy configuration-dependent bugs like the one in T7789#172057.

Instead, always use the omnipotent viewer. Knowledge of the secret key alone is sufficient to access a file.

Test Plan:
  - Disabled `security.alternate-file-domain`.
  - Reproduced an issue similar to the one described on T7789.
  - Applied change.
  - Clean LFS interaction.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T7789

Differential Revision: https://secure.phabricator.com/D15784
2016-04-22 08:12:08 -07:00
epriestley
711f13660e Synchronize working copies before doing a "bypassCache" commit read
Summary:
Ref T4292. When the daemons make a query for repository information, we need to make sure the working copy on disk is up to date before we serve the response, since we might not have the inforamtion we need to respond otherwise.

We do this automatically for almost all Diffusion methods, but this particular method is a little unusual and does not get this check for free. Add this check.

Test Plan:
  - Made this code throw.
  - Ran `bin/repository reparse --message ...`, saw the code get hit.
  - Ran `bin/repository lookup-user ...`, saw this code get hit.
  - Made this code not throw.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15783
2016-04-22 08:11:43 -07:00
epriestley
0f0105e783 Send the aphlict process log to the node log
Summary: I've possibly seen a couple of `aphlict` processes exit under suspicious circumstances (maybe?). Make sure any PHP errors get captured into the log.

Test Plan:
  - Added an exception after forking.
  - Before change: vanished into thin air.
  - After change: visible in the log.

Reviewers: chad

Reviewed By: chad

Differential Revision: https://secure.phabricator.com/D15782
2016-04-21 17:50:47 -07:00
epriestley
43935d5916 Don't cache resources we can't generate properly
Summary:
Fixes T10843. In a multi-server setup, we can do this:

  - Two servers, A and B.
  - You push an update.
  - A gets pushed first.
  - After A has been pushed, but before B has been pushed, a user loads a page from A.
  - It generates resource URIs like `/stuff/new/package.css`.
  - Those requests hit B.
  - B doesn't have the new resources yet.
  - It responds with old resources.
  - Your CDN caches things. You now have a poisoned CDN: old data is saved in a new URL.

To try to avoid this with as little work as possible and generally make it hard to get wrong, check the URL hash against the hash we would generate.

If they don't match, serve our best guess at the resource, but don't cache it. This should make things mostly keep working during the push, but prevent caches from becoming poisoned, and everyone should get a working version of everything after the push finishes.

Test Plan:
  - `curl`'d a resource, got a cacheable one.
  - Changed the hash a little, `curl`'d again. This time: valid resource, but not cacheable.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10843

Differential Revision: https://secure.phabricator.com/D15775
2016-04-21 11:56:54 -07:00
epriestley
9656fe48bc Add a "Repository Servers" cluster administration panel
Summary: Ref T4292. This adds a new high-level overview panel.

Test Plan: {F1238854}

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15772
2016-04-21 11:56:44 -07:00
lkassianik
bd8969a23c Calendar event list items 'Attending:' field should only show users who have confirmed attendance
Summary: Fixes T8897

Test Plan: Open any list view of Calendar events, every event should only show "Attending: ..." with users who are attending event.

Reviewers: chad, epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin

Maniphest Tasks: T8897

Differential Revision: https://secure.phabricator.com/D15779
2016-04-21 11:06:49 -07:00
epriestley
fb2b88a4a8 Fix Phriction link syntax a little more
Summary: This still wasn't quite right -- a link like `[[ Porcupine Facts ]]` with a space would not lookup correctly, and would render as `porcupine_facts`.

Test Plan: Verified that `[[ Porcupine Facts ]]` now works correctly.

Reviewers: chad

Reviewed By: chad

Differential Revision: https://secure.phabricator.com/D15780
2016-04-21 10:29:08 -07:00
epriestley
c986caebb2 Put all cluster docs in the right documentation group
Summary: Some of these had the wrong `@group` header.

Test Plan: `grep`

Reviewers: chad

Reviewed By: chad

Differential Revision: https://secure.phabricator.com/D15778
2016-04-20 20:38:08 -07:00
epriestley
34c488e165 Normalize Phriction links when looking them up in remarkup
Summary: Fixes T10845.

Test Plan: Verified that `[[ quack ]]` and `[[ QUACK ]]` both work. Previously, the link had to exactly match the capitalization of the target.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10845

Differential Revision: https://secure.phabricator.com/D15777
2016-04-20 19:07:01 -07:00
Joshua Spence
93e341fbda Fix ./bin/aphlict status
Summary: Fixes T10844. After recent changes to Aphlict (T6915 and T10697), `./bin/status` needs to be aware of the configuration file. As such, it is now necessary to run `./bin/aphlict status --config /path/to/config.json` rather than `./bin/aphlict status`.

Test Plan: Ran `./bin/aphlict start ...` and `./bin/aphlict status` and saw "Aphlict (`$PID`) is running".

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin

Maniphest Tasks: T10844

Differential Revision: https://secure.phabricator.com/D15776
2016-04-21 09:40:32 +10:00
epriestley
9419e4f13a Reduce strength of Herald and user subscription stories
Summary:
Fixes T8952. When Herald changes subscribers, it is zzzzz very boring.

When users change subscribers, it is still super boring (more boring than a merge, for example).

Test Plan: Viewed feed, saw fewer Herald stories.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T8952

Differential Revision: https://secure.phabricator.com/D15774
2016-04-20 14:40:49 -07:00
epriestley
df8c3c4fa5 Give application actors in feed reasonable icons
Summary:
Ref T8952. Currently, when an application (most commonly Herald, but sometimes Drydock, Diffusion, etc) publishes a feed story, we get an empty grey box for it in feed.

Instead, give the story a little application icon kind of "profile picture"-like thing.

Test Plan:
Here's how it looks:

{F1239003}

Feel free to tweak/counter-diff.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T8952

Differential Revision: https://secure.phabricator.com/D15773
2016-04-20 14:38:55 -07:00
epriestley
3b154a34c7 Use less hip lingo
Summary: Woah man.

Test Plan: spellcheck

Reviewers: chad, eadler

Reviewed By: chad, eadler

Differential Revision: https://secure.phabricator.com/D15771
2016-04-20 10:59:36 -07:00
epriestley
bd4fb3c9fa Implement bin/repository thaw for unfreezing cluster repositories
Summary:
Ref T10751. Add support tooling for manually prying your way out of trouble if disaster strikes.

Refine documentation, try to refer to devices as "devices" more consistently instead of sometimes calling them "nodes".

Test Plan: Promoted and demoted repository devices with `bin/repository thaw`.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10751

Differential Revision: https://secure.phabricator.com/D15768
2016-04-20 10:45:58 -07:00
epriestley
11aa902bd1 Show "Last Writer" and "Last Write At" in the UI, add more documentation
Summary:
Ref T10751. Make the UI more useful and explain what failure states mean and how to get out of them.

The `bin/repository thaw` command does not exist yet, I'll write that soon.

Test Plan: {F1238241}

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10751

Differential Revision: https://secure.phabricator.com/D15766
2016-04-20 10:45:03 -07:00
lkassianik
d9275da2d4 Better wording for cancelling/reinstating recurring events
Summary: Fixes T10744

Test Plan: Create recurring event, cancel one instance, cancel the parent event, reinstate event. Wording in the reinstating dialog should be clear about reinstating only instances that haven't been individually cancelled.

Reviewers: chad, epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin

Maniphest Tasks: T10744

Differential Revision: https://secure.phabricator.com/D15770
2016-04-20 10:10:59 -07:00
lkassianik
eeccaf99b6 When scrolling forward a month in calendar date picker from 1/31, next chosen date should be 2/29, not 3/1.
Summary: Fixes T9295

Test Plan: Create event, open datepicker for start date, choose 1/31/2016, open datepicker again, click right button to scroll month. New suggested date should be 2/29/2016

Reviewers: chad, epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin

Maniphest Tasks: T9295

Differential Revision: https://secure.phabricator.com/D15727
2016-04-20 09:15:45 -07:00
epriestley
7f15e8fbe8 Formally deprecate owners.query Conduit API method
Summary: This is completely obsoleted by `owners.search`. See D15472.

Test Plan: Viewed API method in UI console.

Reviewers: avivey, chad

Reviewed By: chad

Differential Revision: https://secure.phabricator.com/D15769
2016-04-20 09:04:45 -07:00
epriestley
11f8fffe5b Fix Phriction document linking in mail bodies
Summary:
Fixes T10840. When rendering mail, this rule wasn't falling through in quite the right way.

Also adjust where the rules are for this so the special styles show up in Maniphest, etc.

Test Plan:
Made this comment:

{F1238266}

Which produced this HTML:

{F1238267}

...and sent this mail:

{F1238283}

Reviewers: hach-que, chad

Reviewed By: chad

Maniphest Tasks: T10840

Differential Revision: https://secure.phabricator.com/D15767
2016-04-20 06:55:00 -07:00
epriestley
b9cf9e6f0d Fix an issue with PHID/handle management in push logs
Summary: Ref T10751. This cleans this up so it's a little more modern, and fixes a possible bad access on the log detail page.

Test Plan: Viewed push log list, viewed push log detail.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10751

Differential Revision: https://secure.phabricator.com/D15765
2016-04-20 04:47:10 -07:00
epriestley
48b015a3fa Add slightly more cluster repository documentation
Summary: Ref T10751. There are still some missing support tools here, but explain some of this a little better.

Test Plan: Read documentation.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10751

Differential Revision: https://secure.phabricator.com/D15764
2016-04-20 04:46:40 -07:00
epriestley
bab3690b54 Fill in missing cluster database documentation
Summary:
Ref T10751. Provide some guidance on replicas and promotion.

I'm not trying to walk administrators through the gritty details of this. It's not too complex, they should understand it, and the MySQL documentation is pretty thorough.

Test Plan: Read documentation.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10751

Differential Revision: https://secure.phabricator.com/D15763
2016-04-20 04:46:25 -07:00
Aviv Eyal
1344dda756 Parse Tags in commits message for revisions
Summary: This will stop breaking if you have subscribers and tags when updating a revision (`Error parsing field "Subscribers": The objects you have listed include objects which do not exist (Tags:)`), which I broke in D15749.

Test Plan: run through arc-diff --update that failed earlier.

Reviewers: chad, #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D15762
2016-04-20 01:46:17 +00:00
epriestley
287e761f19 Make repository synchronization safer when leaders are ambiguous
Summary:
Ref T4292. Right now, repository versions only get marked when a write happens.

This potentially creates a problem: if I pushed all the sync code to `secure` and enabled `secure002` as a repository host, the daemons would create empty copies of all the repositories on that host.

Usually, this would be fine. Most repositories have already received a write on `secure001`, so that working copy has a verison and is a leader.

However, when a write happened to a rarely-used repository (say, rKEYSTORE) that hadn't received any write recently, it might be sent to `secure002` randomly. Now, we'd try to figure out if `secure002` has the most up-to-date copy of the repository or not.

We wouldn't be able to, since we don't have any information about which node has the data on it, since we never got a write before. The old code could guess wrong and decide that `secure002` is a leader, then accept the write. Since this would bump the version on `secure002`, that would //make// it an authoritative leader, and `secure001` would synchronize from it passively (or on the next read or write), which would potentially destroy data.

Instead:

  - Refuse to continue in situations like this.
  - When a repository is on exactly one device, mark it as a leader with version "0".
  - When a repository is created into a cluster service, mark its version as "0" on all devices (they're all leaders, since the repository is empty).

This should mean that we won't lose data no matter how much weird stuff we run into.

Test Plan:
  - In single-node mode, used `repository update` to verify that `0` was written properly.
  - With multiple nodes, used `repository update` to verify that we refuse to continue.
  - Created a new repository, verified versions were initialized correctly.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15761
2016-04-19 13:07:02 -07:00
epriestley
6edf181a7e Record which cluster host received a push
Summary: Ref T4292. When we write a push log, also log which node received the request.

Test Plan: {F1230467}

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15759
2016-04-19 13:06:30 -07:00
epriestley
d87c500002 Synchronize (hosted, clustered, Git) repositories over Conduit + HTTP
Summary:
Ref T4292. We currently synchronize hosted, clustered, Git repositories when we receive an SSH pull or push.

Additionally:

  - Synchronize before HTTP reads and writes.
  - Synchronize reads before Conduit requests.

We could relax Conduit eventually and allow Diffusion to say "it's OK to give me stale data".

We could also redirect some set of these actions to just go to the up-to-date host instead of connecting to a random host and synchronizing it. However, this potentially won't work as well at scale: if you have a larger number of servers, it sends all of the traffic to the leader immediately following a write. That can cause "thundering herd" issues, and isn't efficient if replicas are in different geographical regions and the write just went to the east coast but most clients are on the west coast. In large-scale cases, it's better to go to the local replica, wait for an update, then serve traffic from it -- particularly given that writes are relatively rare. But we can finesse this later once things are solid.

Test Plan:
  - Pushed and pulled a Git repository over HTTP.
  - Browsed a Git repository from the web UI.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15758
2016-04-19 13:05:45 -07:00
epriestley
31bc023eff Synchronize (hosted, git, clustered, SSH) repositories prior to reads
Summary:
Ref T4292. Before we write or read a hosted, clustered Git repository over SSH, check if another version of the repository exists on another node that is more up-to-date.

If such a version does exist, fetch that version first. This allows reads and writes of any node to always act on the most up-to-date code.

Test Plan: Faked my way through this and got a fetch via `bin/repository update`; this is difficult to test locally and needs more work before we can put it in production.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15757
2016-04-19 13:05:17 -07:00
epriestley
c70f4815a9 Allow cluster devices to SSH to one another without acting as a user
Summary:
Ref T4292. When you run `git fetch` and connect to, say, `repo001.west.company.com`, we'll look at the current version of the repository in other nodes in the cluster.

If `repo002.east.company.com` has a newer version of the repository, we'll fetch that version first, then respond to your request.

To do this, we need to run `git fetch repo002.east.company.com ...` and have that connect to the other host and be able to fetch data.

This change allows us to run `PHABRICATOR_AS_DEVICE=1 git fetch ...` to use device credentials to do this fetch. (Device credentials are already supported and used, they just always connect as a user right now, but these fetches should be doable without having a user. We will have a valid user when you run `git fetch` yourself, but we won't have one if the daemons notice that a repository is out of date and want to update it, so the update code should not depend on having a user.)

Test Plan:
```
$ PHABRICATOR_AS_DEVICE=1 ./bin/ssh-connect local.phacility.com
Warning: Permanently added 'local.phacility.com' (RSA) to the list of known hosts.
PTY allocation request failed on channel 0
phabricator-ssh-exec: Welcome to Phabricator.

You are logged in as device/daemon.phacility.net.

You haven't specified a command to run. This means you're requesting an interactive shell, but Phabricator does not provide an interactive shell over SSH.

Usually, you should run a command like `git clone` or `hg push` rather than connecting directly with SSH.

Supported commands are: conduit, git-lfs-authenticate, git-receive-pack, git-upload-pack, hg, svnserve.
Connection to local.phacility.com closed.
```

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15755
2016-04-19 13:04:41 -07:00
epriestley
0db6eaca41 Consolidate handling of SSH usernames
Summary:
Ref T4292. This consolidates code for figuring out which user we should connect to hosts with.

Also narrows a lock window.

Test Plan: Browsed Diffusion, pulled and pushed through an SSH proxy.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15754
2016-04-19 13:04:04 -07:00
Eitan Adler
c9daa2b0ad Consistently refer to 'Projects' as 'Tags'
Summary:
In calendar, dashboard, diffusion, diviner, feed, fund,
maniphest, pholio, ponder, and slowvote use the term 'tags' if possible.

This intenctionally skips diffusion, differential, and the projects application itself.

Ref T10326 Ref T10349

Test Plan: inspection on a running, locally modified, system

Reviewers: avivey, epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin

Maniphest Tasks: T10835, T10326, T10349

Differential Revision: https://secure.phabricator.com/D15753
2016-04-19 16:48:21 +00:00
epriestley
c30fe65ee9 Remove the warning about the Git 2GB pathname issue
Summary:
Ref T10832. In practice, `git --version` is not a useful test for this issue:

  - Vendors like Debian have backported the patch into custom versions like `0.0.0.1-debian-lots-of-patches.3232`.
  - Vendors like Ubuntu distribute multiple different versions which report the same string from `git --version`, some of which are patched and some of which are not.

In other cases, we can perform an empirical test for the vulnerability. Here, we can not, because we can't write a 2GB path in a reasonable amount of time.

Since vendors (other than Apple) //generally// seem to be on top of this and any warning we try to raise based on `git --version` will frequently be incorrect, don't raise this warning.

I'll note this in the changelog instead.

Test Plan: Looked at setup issues, no more warning for vulnerable git version.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10832

Differential Revision: https://secure.phabricator.com/D15756
2016-04-19 07:01:45 -07:00
epriestley
575c01373e Extract repository command construction from Repositories
Summary:
Ref T4292. Ref T10366. Depends on D15751. Today, generating repository commands is purely a function of the repository, so they use protocols and credentials based on the repository configuration.

For example, a repository with an SSH "remote URI" always generate SSH "remote commands".

This needs to change in the future:

  - After T10366, repositories won't necessarily just have one type of remote URI. They can only have one at a time still, but the repository itself won't change based on which one is currently active.
  - For T4292, I need to generate intracluster commands, regardless of repository configuration. These will have different protocols and credentials.

Prepare for these cases by separating out command construction, so they'll be able to generate commands in a more flexible way.

Test Plan:
  - Added unit tests.
  - Browsed diffusion.
  - Ran `bin/phd debug pull` to pull a bunch of repos.
  - Ran daemons.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292, T10366

Differential Revision: https://secure.phabricator.com/D15752
2016-04-19 04:51:48 -07:00
Aviv Eyal
a3bb35e9d2 make Trigger Daemon sleep correctly when one-time triggers exist
Summary:
Trigger daemon is trying to find the next event to invoke before sleeping, but the query includes already-elapsed triggers.
It then tries to sleep for 0 seconds.

Test Plan:
On a new instance, schedule a single trigger of type `PhabricatorOneTimeTriggerClock` to a very near time.

Use top to see trigger daemon not going to 100% CPU once the event has elapsed.

Reviewers: #blessed_reviewers, epriestley

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D15750
2016-04-18 14:17:10 -07:00