Summary:
Our code is quite complex in areas where we prevents the 1+N queries problem explained in [[ http://www.phabricator.com/docs/phabricator/article/Performance_N+1_Query_Problem.html | a performance chapter ]].
This diff adds an abstraction for preventing this code.
Test Plan:
Run all examples mentioned in the doc-comments with logging the queries.
Generate and read docs.
Reviewers: epriestley, btrahan
Reviewed By: epriestley
CC: aran, Koolvin
Differential Revision: https://secure.phabricator.com/D2557
Summary:
It's also more readable so I think it's OK.
I've also filed a bug for HHVM.
Test Plan: `arc unit` in HHVM
Reviewers: epriestley, jungejason
Reviewed By: jungejason
CC: aran, Koolvin
Differential Revision: https://secure.phabricator.com/D2551
Summary:
- We currently write every PHID we generate to a table. This was motivated by two concerns:
- **Understanding Data**: At Facebook, the data was sometimes kind of a mess. You could look at a random user in the ID tool and see 9000 assocs with random binary data attached to them, pointing at a zillion other objects with no idea how any of it got there. I originally created this table to have a canonical source of truth about PHID basics, at least. In practice, our data model has been really tidy and consistent, and we don't use any of the auxiliary data in this table (or even write it). The handle abstraction is powerful and covers essentially all of the useful data in the app, and we have human-readable types in the keys. So I don't think we have a real need here, and this table isn't serving it if we do.
- **Uniqueness**: With a unique key, we can be sure they're unique, even if we get astronomically unlucky and get a collision. But every table we use them in has a unique key anyway. So we actually get pretty much nothing here, except maybe some vague guarantee that we won't reallocate a key later if the original object is deleted. But it's hard to imagine any install will ever have a collision, given that the key space is 36^20 per object type.
- We also currently use PHIDs and Users in tests sometimes. This is silly and can break (see D2461).
- Drop the PHID database.
- Introduce a "Harbormaster" database (the eventual CI tool, after Drydock).
- Add a scratch table to the Harbormaster database for doing unit test meta-tests.
- Now, PHID generation does no writes, and unit tests are isolated from the application.
- @csilvers: This should slightly improve the performance of the large query-bound tail in D2457.
Test Plan: Ran unit tests. Ran storage upgrade.
Reviewers: btrahan, vrana, jungejason
Reviewed By: btrahan
CC: csilvers, aran, nh, edward
Differential Revision: https://secure.phabricator.com/D2466
Summary:
Also add some formatting and links.
Also fix test broken by D2393.
Test Plan:
diviner .
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Koolvin
Differential Revision: https://secure.phabricator.com/D2461
Summary: We raise an improved exception for missing tables/columns, but not databases.
Test Plan: Hit a "no such database error", got a better error message pointing me at storage upgrades.
Reviewers: btrahan, jungejason
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D2429
Summary:
- We used to have connection-level caching, so we needed getTransactionKey() to make sure there was one transaction state per real connection. We now cache in Lisk and each Connection object is guaranteed to represent a real, unique connection, so we can make this a non-static.
- I kept the classes separate because it was a little easier, but maybe we should merge them?
- Also track/implement read/write locking.
- (The advantage of this over just writing LOCK IN SHARE MODE is that you can use, e.g., some Query class even if you don't have access to the queries it runs.)
Test Plan: Can you come up with a way to write unit tests for this? It seems like testing that it works requires deadlocking MySQL if the test is running in one process.
Reviewers: vrana, btrahan
Reviewed By: vrana
CC: aran
Differential Revision: https://secure.phabricator.com/D2398
Summary: Generally moves us toward having a sane approach to transaction handling.
Test Plan: See test case, which fails before this patch and passes afterwards.
Reviewers: vrana, btrahan, jungejason
Reviewed By: vrana
CC: aran
Differential Revision: https://secure.phabricator.com/D2394
Summary:
- Unit tests can request storage fixtures.
- We build one fixture across all tests in the process, which can quickstart (takes roughly 1s to build, 200ms to destroy for me). This is a one-time cost for running an arbitrary number of fixture-based tests.
- We isolate all the connections inside transactions for each test, so individual tests don't affect one another.
Test Plan: Ran unit tests, which cover the important properties of fixtures.
Reviewers: btrahan, vrana, jungejason, edward
Reviewed By: btrahan
CC: aran, davidreuss
Maniphest Tasks: T140
Differential Revision: https://secure.phabricator.com/D2345
Summary:
- Currently, connections are responsible for connection caching. However, I want unit tests to be able to say "throw away the entire connection cache" with storage fixtures, and this is difficult/impossible when connections are responsible for the cache.
- The only behavioral change is that previously we would use the same connection for read-mode and write-mode queries. We'll now establish two connections. No installs actually differentiate between the modes so it isn't particularly relevant what we do here. In the long term, we should probably check the "w" cache before building a new "r" connection, so transactional code which involves reads and writes works (we don't have any such code right now).
Test Plan: Loaded pages, verified only one connection was established per database. Ran unit tests.
Reviewers: btrahan, vrana, jungejason, edward
Reviewed By: vrana
CC: aran
Maniphest Tasks: T140
Differential Revision: https://secure.phabricator.com/D2342
Summary:
This addresses three issues with the current patch management system:
# Two people developing at the same time often pick the same SQL patch number, and then have to go rename it. The system catches this, but it's silly.
# Second/third-party developers can't use the same system to manage auxiliary storage they may want to add.
# There's no way to build mock databases for unit tests that need to do reads.
To resolve these things, you can now name your patches whatever you want and conflicts are just merge conflicts, which are less of a pain to fix than filename conflicts.
Dependencies are now a DAG, with implicit dependencies created on the prior patch if no dependencies are specified. Developers can add new concrete subclasses of `PhabricatorSQLPatchList` to add storage management, and define the dependency branchpoint of their patches so they apply in the correct order (although, generally, they should not depend on the mainline patches, presumably).
The commands `storage upgrade --namespace test1234` and `storage destroy --namespace test1234` will allow unit tests to build and destroy MySQL storage.
A "quickstart" mode allows an upgrade from scratch in ~1200ms. Destruction takes about 200ms. These seem like fairily reasonable costs to actually use in tests. Building from scratch patch-by-patch takes about 6000ms.
Test Plan:
- Created new databases from scratch with and without quickstart in a separate test namespace. Pointed the webapp at the test namespaces, browsed around, everything looked good.
- Compared quickstart and no-quickstart dump states, they're identical except for mysqldump timestamps and a few similar things.
- Upgraded a legacy database to the new storage format.
- Destroyed / dumped storage.
Reviewers: edward, vrana, btrahan, jungejason
Reviewed By: btrahan
CC: aran, nh
Maniphest Tasks: T140, T345
Differential Revision: https://secure.phabricator.com/D2323
Summary: This separates common MySQL stuff (identifiers and comments escaping, error codes, connection retries) from PHP extension specific stuff (connect, query, fetch, errors, escape string).
Test Plan:
/
Use `AphrontMySQLiDatabaseConnection` in `PhabricatorLiskDAO`, load homepage, edit task, save task.
Reviewers: epriestley
Reviewed By: epriestley
CC: nh, aran
Differential Revision: https://secure.phabricator.com/D2113
Summary:
MySQL doesn't treat `\` as escaping character in ##``##.
This isn't probably SQL injection hole because I've found no calls of this method with user input.
But better safe than sorry.
See also [[http://dev.mysql.com/doc/refman/5.1/en/server-sql-mode.html#sqlmode_no_backslash_escapes | NO_BACKSLASH_ESCAPES]].
Test Plan:
lang=sql
SELECT `a\`b`; -- Throws: Syntax error near '`'.
-- Should throw: Unknown column 'a`b'.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran
Differential Revision: https://secure.phabricator.com/D2109
Summary:
This is in preparation for getting the "View Options" dropdown working on audits.
- Use Files to serve raw data so we get all the security benefits of the alternate file domain. Although the difficulty of exploiting this is high (you need commit access to the repo) there's no reason to leave it dangling.
- Add a "contentHash" to Files so we can lookup files by content rather than adding some weird linker table. We can do other things with this later, potentially.
- Don't use 'data' URIs since they're crazy and we can just link to the file URI.
- When showing a binary file or an image, don't give options like "show highlighted text with blame" or "edit in external editor" since they don't make any sense.
- Use the existing infrastructure to figure out if things are images or binaries instead of an ad-hoc thing in this class.
Test Plan: Looked at text, image and binary files in Diffusion. Verified we reuse existing files if we've already generated them.
Reviewers: btrahan, vrana
Reviewed By: btrahan
CC: aran, epriestley
Maniphest Tasks: T904
Differential Revision: https://secure.phabricator.com/D1899
Summary: Last of the big final patches. Left a few debatable classes (12 out of about 400) that I'll deal with individually eventually.
Test Plan: Ran testEverythingImplemented.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, epriestley
Maniphest Tasks: T795
Differential Revision: https://secure.phabricator.com/D1881
Summary: This is the script used for conversion: P319
Test Plan:
Update diff with UTF-8 characters in description.
`sql/upgrade_schema.php`
Verify data in DB and that it looks good on web.
Reviewers: epriestley, nh
Reviewed By: epriestley
CC: aran, epriestley
Maniphest Tasks: T327
Differential Revision: https://secure.phabricator.com/D1830
Summary:
These are all unambiguously unextensible. Issues I hit:
- Maniphest Change/Diff controllers, just consolidated them.
- Some search controllers incorrectly extend from "Search" but should extend from "SearchBase". This has no runtime effects.
- D1836 introduced a closure, which we don't handle correctly (somewhat on purpose; we target PHP 5.2). See T962.
Test Plan: Ran "testEverythingImplemented" unit test to identify classes extending from `final` classes. Resolved issues.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, epriestley
Maniphest Tasks: T795
Differential Revision: https://secure.phabricator.com/D1843
Summary:
Make any Lisk object ephemeral, aka read-only, so that we can fiddle
around with their state safe in the knowledge that we'll never end up
writing that updated state back to the db.
Test Plan: Added a new test; ran test suite.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, epriestley
Differential Revision: https://secure.phabricator.com/D1836
Summary: New implicit fallthrough linter detected a few issues; none of these have behavioral impacts but they can clearly be tightened up. See D1824.
Test Plan: Lint; inspection.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, epriestley
Differential Revision: https://secure.phabricator.com/D1825
Summary: These are because they forgot to upgrade_schema.php like 99% of the time.
Test Plan: Hit such an error, got a better error message than before.
Reviewers: btrahan, jungejason
Reviewed By: jungejason
CC: aran, epriestley
Differential Revision: https://secure.phabricator.com/D1786
Summary:
In D1515, I introduced some excessively-complicated semantics for detecting
connections that are lost while transactional. These semantics cause us to
reenter establishConnection() and establish twice as many connections as we need
in the common case.
We don't need a hook there at all -- it's sufficient to throw the exception
rather than retrying the query when we encounter it. This doesn't have
reentrancy problems.
Test Plan:
- Added some encapsulation-violating hooks and a unit test for them
- Verified we no longer double-connect.
Reviewers: btrahan, nh
Reviewed By: btrahan
CC: aran, epriestley
Maniphest Tasks: T835
Differential Revision: https://secure.phabricator.com/D1576
Summary:
Restores a (simplified and improved) version of Lisk transactions.
This doesn't actually use transactions anywhere yet. DifferentialRevisionEditor
is the #1 (and only?) case where we have transaction problems right now, but
sticking save() inside a transaction unconditionally will leave us holding a
transaction open for like a million years while we run Herald rules, etc. I want
to do some refactoring there separately from this diff before making it
transactional.
NOTE: @jungejason / @nh, can one of you verify these unit tests pass on
HPHP/i/vm when you get a chance? I vaguely recall there was some problem with
(int)$resource. We can land this safely without verifying that, but should check
before we start using it anywhere.
Test Plan: Ran unit tests.
Reviewers: btrahan, nh, jungejason
Reviewed By: btrahan
CC: aran, epriestley
Maniphest Tasks: T605
Differential Revision: https://secure.phabricator.com/D1515
Summary:
We retried if a db connection was lost when executing a query, but not when
establishing a connection. I've seen a lot of failures establishing connections
in our install (they go away when retrying), so this diff retries when
establishing connections, and logs when we retry.
Test Plan:
- Loaded phabricator in a sandbox
- Temporarily added a check in the try block to throw if there were still
retries (to test logging, retry logic)
Reviewers: epriestley, blair
Reviewed By: epriestley
CC: aran, btrahan
Differential Revision: https://secure.phabricator.com/D1460
Summary:
This is kind of expensive and can be significant on, e.g., the
Maniphest task list view. Do a little more caching and some clever nonsense to
improve performance.
Test Plan:
Local cost on Maniphest "all tasks" view for this method dropped from
##82,856us## to ##24,607us## on 9,061 calls.
I wrote some unit test / microbenchmark things:
public function testGetIDCost() {
$u = new PhabricatorUser();
$n = 100000;
while ($n--) {
$u->getID();
}
$this->assertEqual(1, 1);
}
public function testGetCost() {
$u = new PhabricatorUser();
$n = 100000;
while ($n--) {
$u->getUsername();
}
$this->assertEqual(1, 1);
}
public function testSetCost() {
$u = new PhabricatorUser();
$n = 100000;
while ($n--) {
$u->setID(1);
}
$this->assertEqual(1, 1);
}
Before:
PASS 598ms testSetCost
PASS 584ms testGetCost
PASS 272ms testGetIDCost
After:
PASS 170ms testSetCost
PASS 207ms testGetCost
PASS 29ms testGetIDCost
Also, ran unit tests.
Reviewers: nh, btrahan, jungejason
Reviewed By: nh
CC: aran, epriestley, nh
Differential Revision: https://secure.phabricator.com/D1291
Summary: While we eventually need a plan to make this less of a toy black hole,
we can support DELETE now -- it's just SELECT that's tricky.
Test Plan: Ran unit tests.
Reviewers: zeeg, jungejason, btrahan
Reviewed By: btrahan
CC: aran, btrahan
Differential Revision: 1226
Summary:
- Use DifferentialRevisionQuery, not DifferentialRevisionListData, to select
revisions.
- Make UI simpler (I hope?) and more flexible, similar to Maniphest. It now
shows "Active", "Revisions", "Reviews" and "Subscribed" instead of a hodge-podge
of miscellaneous stuff. All now really has all revisions, not just open
revisions.
- Allow views to be filtered and sorted more flexibly.
- Allow anonymous users to use the per-user views, just don't default them
there.
NOTE: This might have performance implications! I need some help evaluating
them.
@nh / @jungejason / @aran, can one of you run some queries agianst FB's corpus?
The "active revisions" view is built much differently now. Before, we issued two
queries:
- SELECT (open revisions you authored that need revision) UNION ALL (open
revisions you are reviewing that need review)
- SELECT (open revisions you authored that need review) UNION ALL (open
revisions you are reviewing that need revision)
These two queries generate the "Action Required" and "Waiting on Others" views,
and are available in P247.
Now, we issue only one query:
- SELECT (open revisions you authored or are reviewing)
Then we divide them into the two tables in PHP. That query is available in P246.
On the secure.phabricator.com data, this new approach seems to be much better
(like, 10x better). But the secure.phabricator.com data isn't very large. Can
someone run it against Facebook's data (using a few heavy-hitting PHIDs, like
ola or something) to make sure it won't cause a regression?
In particular:
- Run the queries and make sure the new version doesn't take too long.
- Run the queries with EXPLAIN and give me the output maybe?
Test Plan:
- Looked at different filters.
- Changed "View User" PHID.
- Changed open/all.
- Changed sort order.
- Ran EXPLAIN / select against secure.phabricator.com corpus.
Reviewers: btrahan, nh, jungejason
Reviewed By: btrahan
CC: cpiro, aran, btrahan, epriestley, jungejason, nh
Maniphest Tasks: T586
Differential Revision: 1186
test
Summary: @jungejason reported seeing test failures here. I can't reproduce them
and my read of the code doesn't suggest why they might be happening, but add a
little more debug info in hopes of chasing this down.
Test Plan:
- Ran test in a loop for a long time, couldn't get it to fail.
- Changed assertEquals() condition to force test to fail, verified output
message was informative.
Reviewers: jungejason, btrahan
Reviewed By: jungejason
CC: aran, epriestley, jungejason
Differential Revision: 1212
Summary:
I broke this a little bit in my overzealous D1174, since this block validates
both '%nd' (nullable integer) and '%d' (non-nullable integer).
Clean up the conditional checks so we catch the bad case ('%d' on a PHID
converting to 0) but let the good case ('%nd' with null) through.
Test Plan: Unit tests failed; applied patch; unit tests pass.
Reviewers: btrahan, jungejason
Reviewed By: btrahan
CC: aran, btrahan
Maniphest Tasks: T670
Differential Revision: 1201
Summary: See T547. One of these I just missed in D1000; the comment change just
makes it easier to audit use of hash functions by cleaning up "grep" output.
Test Plan: Ran isolation unit test.
Reviewers: btrahan, jungejason, nh, tuomaspelkonen, aran
Reviewed By: jungejason
CC: aran, jungejason
Differential Revision: 1124
Summary:
I added some type checks in D990 to make sure $columns is an array, but was
overzealous and forgot that loadRawDataWhere needs to be able to take null
as $columns.
Test Plan:
Loaded phabricator and saw the error "Argument 1 passed to LiskDAO::loadRawDataWhere() must be an instance of array, null given" go away
Reviewers: epriestley
CC:
Differential Revision: 991
Summary:
Change the differential typeahead to only load columns that it needs. To do
this, I also enabled partial objects for PhabricatorUser (and made necessary
changes to support this). I also changed the functionality of Lisk's loadColumns
to either accept columns as multiple string arguments or a single array of
strings.
Test Plan:
With tokenizer.ondemand set to false, checked that the typeahead loaded and I
can type multiple people's names. Set tokenizer.ondemand to true and tried
again. In both cases, the typeahead worked.
Reviewers: epriestley
Reviewed By: epriestley
CC: jungejason, aran, epriestley, nh
Differential Revision: 990
Summary:
Added loadColumns, loadColumnsWhere instance methods to Lisk, so when you only
need some fields of your object loaded, you can do so. This will be useful for
places where we fetch a large number of rows, but only care about a few columns.
In that situation, these functions can be used so the db doesn't have to return
as much data.
Test Plan:
Loaded a typeahead to check that the existing lisk functions still work.
Modified typeahead to fetch data using loadColumns instead of loadAll and
checked that it still works.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, epriestley, nh, jungejason
Differential Revision: 947
Summary:
This diff changes the way Lisk should be used for custom setters and getters,
changing it from having subclasses of Lisk implement their custom setter or
getter to having them override the readField and writeField methods (which get
called by the getters and setters). This diff also has a configurable option
to throw an exception if a subclass of Lisk implements a custom setter or
getter.
Test Plan:
Without the config set to throw, tested in sandbox by browsing differential
and playing with the differential typeahead. With the config set to throw,
tried to load a phabricator page and saw in the error log an exception thrown
by Lisk because of custom getters in PhabricatorUser.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, jungejason, epriestley
Differential Revision: 974
Summary: These queries are safe to run without a CSRF token, and we need them
for the query analyzer in DarkConsole.
Test Plan: "Analyze Query Plans" works again.
Reviewers: jungejason, nh, tuomaspelkonen, aran
Reviewed By: nh
CC: aran, epriestley, nh
Differential Revision: 895
Summary:
Provide a catchall mechanism to find unprotected writes.
- Depends on D758.
- Similar to WriteOnHTTPGet stuff from Facebook's stack.
- Since we have a small number of storage mechanisms and highly structured
read/write pathways, we can explicitly answer the question "is this page
performing a write?".
- Never allow writes without CSRF checks.
- This will probably break some things. That's fine: they're CSRF
vulnerabilities or weird edge cases that we can fix. But don't push to Facebook
for a few days unless you're prepared to deal with this.
- **>>> MEGADERP: All Conduit write APIs are currently vulnerable to CSRF!
<<<**
Test Plan:
- Ran some scripts that perform writes (scripts/search indexers), no issues.
- Performed normal CSRF submits.
- Added writes to an un-CSRF'd page, got an exception.
- Executed conduit methods.
- Did login/logout (this works because the logged-out user validates the
logged-out csrf "token").
- Did OAuth login.
- Did OAuth registration.
Reviewers: pedram, andrewjcg, erling, jungejason, tuomaspelkonen, aran,
codeblock
Commenters: pedram
CC: aran, epriestley, pedram
Differential Revision: 777
query
Summary:
- Provide an example unit test, and document it.
- Document database isolation better.
- When we issue an unsimulated query to the isolated connection, throw a
helpful message.
- Pygments is complaining about my madeup "lang=demo", change it to
"lang=text".
Test Plan:
- Ran the unit test (sanity check).
- Ran all other unit tests (verify I didn't break isolation).
- Added a queryfx(..., 'SELECT 1') to a test and verified it throws.
- Read the documentation.
Reviewed By: edward
Reviewers: edward, jungejason, tuomaspelkonen, aran
CC: aran, edward
Differential Revision: 773
Summary:
- Services: Show summary panel of total service call costs and relative page weight.
- Services: Add "Analyze Query Plans" button, which issues EXPLAIN for each query and flags problems.
- XHPRof: iframe the profile.
Test Plan: Used the new query plan analysis to find missing keys causing table scans, see D627.
Reviewers: jungejason, tuomaspelkonen, aran
CC:
Differential Revision: 628
Summary:
Get rid of the Phabricator-level DarkConsole-specific API and use the more
general Phutil-level one.
Test Plan:
Loaded DarkConsole services plugin, viewed Diffusion, got execs in the trace.
Reviewed By: aran
Reviewers: aran, jungejason, tuomaspelkonen
CC: aran
Differential Revision: 293
Summary:
Currently you can still punch through Lisk isolation by calling
establishConnection(), and we do that all over the place. Rename getConnection()
to establishConnection() so that all existing callers are safe, and rename
establishConnection() to establishLiveConnection() so that it's not surprising
when this fails to stub in unit tests.
Not wedded to the name if anyone thinks "establishExternalConnection" or
something is clearer.
Test Plan:
Loaded site, browsed around, ran unit tests.
Reviewed By: aran
Reviewers: aran, tuomaspelkonen, jungejason
CC: aran
Differential Revision: 201
Summary:
Allow Lisk to be put into process-isolated mode which establishes
only isolated connections. By default, put it into this mode when running
unit tests. Build some simple unit tests around object insertion and
updating.
NOTE: The one flaw in this is that $dao->establishConnection() still
punches through the isolation layer. I need to do an API change to fix this
though so I'm holding it for now. It will probably just rename getConnection()
to establishConnection() and then rename establishConnection() to something
scary like establishLiveExternalConnection().
Test Plan:
Ran unit tests.
Reviewed By: aran
Reviewers: aran, tuomaspelkonen, jungejason
CC: aran, epriestley
Differential Revision: 194
Summary:
This provides a new connection which doesn't connect to
anything, so effects can be isolated to the current process (for
unit testing).
Test Plan:
Ran unit tests.
Reviewed By: aran
Reviewers: aran, tuomaspelkonen, jungejason
CC: aran, epriestley
Differential Revision: 193
Summary:
In a basically reasonable configuration where you connect
with a non-privileged user from the web workflow, upgrade_schema.php
won't have enough privileges. Allow the user to override the normal
auth with -u and -p.
Test Plan:
Tried to do a schema upgrade with an underprivileged user,
got a useful error message instead of garbage.
Reviewed By: Girish
Reviewers: Girish, davidrecordon, jungejason, tuomaspelkonen, aran
CC: aran, epriestley, Girish
Differential Revision: 191