1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-11-10 00:42:41 +01:00

Separate internal and external Query Cursors more cleanly, to fix pagination against broken objects

Summary:
Ref T13259.

(NOTE) This is "infrastructure/guts" only and breaks some stuff in Query subclasses. I'll fix that stuff in a followup, it's just going to be a larger diff that's mostly mechanical.

When a user clicks "Next Page" on a tasks view and gets `?after=100`, we want to show them the next 100 //visible// tasks. It's possible that tasks 1-100 are visible, but tasks 101-788 are not, and the next visible task is 789.

We load task ID `100` first, to make sure they can actually see it: you aren't allowed to page based on objects you can't see. If we let you, you could use "order=title&after=100", plus creative retitling of tasks, to discover the title of task 100: create tasks named "A", "B", etc., and see which one is returned first "after" task 100. If it's "D", you know task 100 must start with "C".

Assume the user can see task 100. We run a query like `id > 100` to get the next 100 tasks.

However, it's possible that few (or none) of these tasks can be seen. If the next visible task is 789, none of the tasks in the next page of results will survive policy filtering.

So, for queries after the initial query, we need to be able to page based on tasks that the user can not see: we want to be able to issue `id > 100`, then `id > 200`, and so on, until we overheat or find a page of results (if 789-889 are visible, we'll make it there before overheating).

Currently, we do this in a not-so-great way:

  - We pass the external cursor (`100`) directly to the subquery.
  - We query for that object using `getPagingViewer()`, which is a piece of magic that returns the real viewer on the first page and the omnipotent viewer on the 2nd..nth page. This is very sketchy.
  - The subquery builds paging values based on that object (`array('id' => 100)`).
  - We turn the last result from the subquery back into an external cursor (`200`) and save it for the next time.

Note that the last step happens BEFORE policy (and other) filtering.

The problems with this are:

  - The phantom-schrodinger's-omnipotent-viewer thing isn't explicity bad, but it's sketchy and generally not good. It feels like it could easily lead to a mistake or bug eventually.
  - We issue an extra query each time we page results, to convert the external cursor back into a map (`100`, `200`, `300`, etc).
  - In T13259, there's a new problem: this only works if the object is filtered out for policy reasons and the omnipotent viewer can still see it. It doesn't work if the object is filtered for some other reason.

To expand on the third point: in T13259, we hit a case where 100+ consecutive objects are broken (they point to a nonexistent `repositoryID`). These objects get filtered unconditionally. It doesn't matter if the viewer is omnipotent or not.

In that case: we set the next external cursor from the raw results (e.g., `200`). Then we try to load it (using the omnipotent viewer) to turn it into a map of values for paging. This fails because the object isn't loadable, even as the omnipotent viewer.

---

To fix this stuff, the new approach steps back a little bit. Primarily, I'm separating "external cursors" from "internal cursors".

An "External Cursor" is a string that we can pass in `?after=X` URIs. It generally identifies an object which the user can see.

An "Internal Cursor" is a raw result from `loadPage()`, i.e. before policy filtering. Usually, (but not always) this is a `LiskDAO` object that doesn't have anything attached yet and hasn't been policy filtered.

We now do this, broadly:

  - Convert the external cursor to an internal cursor.
  - Execute the query using internal cursors.
  - If necessary, convert the last visible result back into an external cursor at the very end.

This fixes all the problems:

  - Sketchy Omnipotent Viewer: We no longer ever use an omnipotent viewer. (We pick cursors out of the result set earlier, instead.)
  - Too Many Queries: We only issue one query at the beginning, when going from "external" to "internal". This query is generally unavoidable since we need to make sure the viewer can see the object and that it's a real / legitimate object. We no longer have to query an extra time for each page.
  - Total Failure on Invalid Objects: we now page directly with objects out of `loadPage()`, before any filtering, so we can page over invisible or invalid objects without issues.

This change switches us over to internal/external cursors, and makes simple cases (ID-based ordering) work correctly. It doesn't work for complex cases yet since subclasses don't know how to get paging values out of an internal cursor yet. I'll update those in a followup.

Test Plan: For now, poked around a bit. Some stuff is broken, but normal ID-based lists load correctly and page properly. See next diff for a more detailed test plan.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13259

Differential Revision: https://secure.phabricator.com/D20291
This commit is contained in:
epriestley 2019-03-18 10:39:56 -07:00
parent a6fd8f0479
commit 1b0ef43910
4 changed files with 221 additions and 102 deletions

View file

@ -4195,6 +4195,7 @@ phutil_register_library_map(array(
'PhabricatorPygmentSetupCheck' => 'applications/config/check/PhabricatorPygmentSetupCheck.php',
'PhabricatorQuery' => 'infrastructure/query/PhabricatorQuery.php',
'PhabricatorQueryConstraint' => 'infrastructure/query/constraint/PhabricatorQueryConstraint.php',
'PhabricatorQueryCursor' => 'infrastructure/query/policy/PhabricatorQueryCursor.php',
'PhabricatorQueryIterator' => 'infrastructure/storage/lisk/PhabricatorQueryIterator.php',
'PhabricatorQueryOrderItem' => 'infrastructure/query/order/PhabricatorQueryOrderItem.php',
'PhabricatorQueryOrderTestCase' => 'infrastructure/query/order/__tests__/PhabricatorQueryOrderTestCase.php',
@ -10294,6 +10295,7 @@ phutil_register_library_map(array(
'PhabricatorPygmentSetupCheck' => 'PhabricatorSetupCheck',
'PhabricatorQuery' => 'Phobject',
'PhabricatorQueryConstraint' => 'Phobject',
'PhabricatorQueryCursor' => 'Phobject',
'PhabricatorQueryIterator' => 'PhutilBufferedIterator',
'PhabricatorQueryOrderItem' => 'Phobject',
'PhabricatorQueryOrderTestCase' => 'PhabricatorTestCase',

View file

@ -4,6 +4,7 @@
* A query class which uses cursor-based paging. This paging is much more
* performant than offset-based paging in the presence of policy filtering.
*
* @task cursors Query Cursors
* @task clauses Building Query Clauses
* @task appsearch Integration with ApplicationSearch
* @task customfield Integration with CustomField
@ -15,8 +16,10 @@
abstract class PhabricatorCursorPagedPolicyAwareQuery
extends PhabricatorPolicyAwareQuery {
private $afterID;
private $beforeID;
private $externalCursorString;
private $internalCursorObject;
private $isQueryOrderReversed = false;
private $applicationSearchConstraints = array();
private $internalPaging;
private $orderVector;
@ -33,54 +36,182 @@ abstract class PhabricatorCursorPagedPolicyAwareQuery
private $ferretQuery;
private $ferretMetadata = array();
protected function getPageCursors(array $page) {
/* -( Cursors )------------------------------------------------------------ */
protected function newExternalCursorStringForResult($object) {
if (!($object instanceof LiskDAO)) {
throw new Exception(
pht(
'Expected to be passed a result object of class "LiskDAO" in '.
'"newExternalCursorStringForResult()", actually passed "%s". '.
'Return storage objects from "loadPage()" or override '.
'"newExternalCursorStringForResult()".',
phutil_describe_type($object)));
}
return (string)$object->getID();
}
protected function newInternalCursorFromExternalCursor($cursor) {
return $this->newInternalCursorObjectFromID($cursor);
}
protected function newPagingMapFromCursorObject(
PhabricatorQueryCursor $cursor,
array $keys) {
$object = $cursor->getObject();
return array(
$this->getResultCursor(head($page)),
$this->getResultCursor(last($page)),
'id' => (int)$object->getID(),
);
}
protected function getResultCursor($object) {
if (!is_object($object)) {
final protected function newInternalCursorObjectFromID($id) {
$viewer = $this->getViewer();
$query = newv(get_class($this), array());
$query
->setParentQuery($this)
->setViewer($viewer)
->withIDs(array((int)$id));
// We're copying our order vector to the subquery so that the subquery
// knows it should generate any supplemental information required by the
// ordering.
// For example, Phriction documents may be ordered by title, but the title
// isn't a column in the "document" table: the query must JOIN the
// "content" table to perform the ordering. Passing the ordering to the
// subquery tells it that we need it to do that JOIN and attach relevant
// paging information to the internal cursor object.
// We only expect to load a single result, so the actual result order does
// not matter. We only want the internal cursor for that result to look
// like a cursor this parent query would generate.
$query->setOrderVector($this->getOrderVector());
// We're executing the subquery normally to make sure the viewer can
// actually see the object, and that it's a completely valid object which
// passes all filtering and policy checks. You aren't allowed to use an
// object you can't see as a cursor, since this can leak information.
$result = $query->executeOne();
if (!$result) {
// TODO: Raise a more tailored exception here and make the UI a little
// prettier?
throw new Exception(
pht(
'Expected object, got "%s".',
gettype($object)));
'Cursor "%s" does not identify a valid object in query "%s".',
$id,
get_class($this)));
}
return $object->getID();
// Now that we made sure the viewer can actually see the object the
// external cursor identifies, return the internal cursor the query
// generated as a side effect while loading the object.
return $query->getInternalCursorObject();
}
protected function nextPage(array $page) {
// See getPagingViewer() for a description of this flag.
$this->internalPaging = true;
final private function getExternalCursorStringForResult($object) {
$cursor = $this->newExternalCursorStringForResult($object);
if ($this->beforeID !== null) {
$page = array_reverse($page, $preserve_keys = true);
list($before, $after) = $this->getPageCursors($page);
$this->beforeID = $before;
} else {
list($before, $after) = $this->getPageCursors($page);
$this->afterID = $after;
if (!is_string($cursor)) {
throw new Exception(
pht(
'Expected "newExternalCursorStringForResult()" in class "%s" to '.
'return a string, but got "%s".',
get_class($this),
phutil_describe_type($cursor)));
}
return $cursor;
}
final public function setAfterID($object_id) {
$this->afterID = $object_id;
final private function getExternalCursorString() {
return $this->externalCursorString;
}
final private function setExternalCursorString($external_cursor) {
$this->externalCursorString = $external_cursor;
return $this;
}
final protected function getAfterID() {
return $this->afterID;
final private function getIsQueryOrderReversed() {
return $this->isQueryOrderReversed;
}
final public function setBeforeID($object_id) {
$this->beforeID = $object_id;
final private function setIsQueryOrderReversed($is_reversed) {
$this->isQueryOrderReversed = $is_reversed;
return $this;
}
final protected function getBeforeID() {
return $this->beforeID;
final private function getInternalCursorObject() {
return $this->internalCursorObject;
}
final private function setInternalCursorObject(
PhabricatorQueryCursor $cursor) {
$this->internalCursorObject = $cursor;
return $this;
}
final private function getInternalCursorFromExternalCursor(
$cursor_string) {
$cursor_object = $this->newInternalCursorFromExternalCursor($cursor_string);
if (!($cursor_object instanceof PhabricatorQueryCursor)) {
throw new Exception(
pht(
'Expected "newInternalCursorFromExternalCursor()" to return an '.
'object of class "PhabricatorQueryCursor", but got "%s" (in '.
'class "%s").',
phutil_describe_type($cursor_object),
get_class($this)));
}
return $cursor_object;
}
final private function getPagingMapFromCursorObject(
PhabricatorQueryCursor $cursor,
array $keys) {
$map = $this->newPagingMapFromCursorObject($cursor, $keys);
if (!is_array($map)) {
throw new Exception(
pht(
'Expected "newPagingMapFromCursorObject()" to return a map of '.
'paging values, but got "%s" (in class "%s").',
phutil_describe_type($map),
get_class($this)));
}
foreach ($keys as $key) {
if (!array_key_exists($key, $map)) {
throw new Exception(
pht(
'Map returned by "newPagingMapFromCursorObject()" in class "%s" '.
'omits required key "%s".',
get_class($this),
$key));
}
}
return $map;
}
final protected function nextPage(array $page) {
if (!$page) {
return;
}
$cursor = id(new PhabricatorQueryCursor())
->setObject(last($page));
$this->setInternalCursorObject($cursor);
}
final public function getFerretMetadata() {
@ -218,7 +349,7 @@ abstract class PhabricatorCursorPagedPolicyAwareQuery
}
final protected function didLoadResults(array $results) {
if ($this->beforeID) {
if ($this->getIsQueryOrderReversed()) {
$results = array_reverse($results, $preserve_keys = true);
}
@ -230,10 +361,11 @@ abstract class PhabricatorCursorPagedPolicyAwareQuery
$this->setLimit($limit + 1);
if ($pager->getAfterID()) {
$this->setAfterID($pager->getAfterID());
if (strlen($pager->getAfterID())) {
$this->setExternalCursorString($pager->getAfterID());
} else if ($pager->getBeforeID()) {
$this->setBeforeID($pager->getBeforeID());
$this->setExternalCursorString($pager->getBeforeID());
$this->setIsQueryOrderReversed(true);
}
$results = $this->execute();
@ -241,15 +373,22 @@ abstract class PhabricatorCursorPagedPolicyAwareQuery
$sliced_results = $pager->sliceResults($results);
if ($sliced_results) {
list($before, $after) = $this->getPageCursors($sliced_results);
// If we have results, generate external-facing cursors from the visible
// results. This stops us from leaking any internal details about objects
// which we loaded but which were not visible to the viewer.
if ($pager->getBeforeID() || ($count > $limit)) {
$pager->setNextPageID($after);
$last_object = last($sliced_results);
$cursor = $this->getExternalCursorStringForResult($last_object);
$pager->setNextPageID($cursor);
}
if ($pager->getAfterID() ||
($pager->getBeforeID() && ($count > $limit))) {
$pager->setPrevPageID($before);
$head_object = head($sliced_results);
$cursor = $this->getExternalCursorStringForResult($head_object);
$pager->setPrevPageID($cursor);
}
}
@ -423,38 +562,39 @@ abstract class PhabricatorCursorPagedPolicyAwareQuery
$orderable = $this->getOrderableColumns();
$vector = $this->getOrderVector();
if ($this->beforeID !== null) {
$cursor = $this->beforeID;
$reversed = true;
} else if ($this->afterID !== null) {
$cursor = $this->afterID;
$reversed = false;
} else {
// No paging is being applied to this query so we do not need to
// construct a paging clause.
// If we don't have a cursor object yet, it means we're trying to load
// the first result page. We may need to build a cursor object from the
// external string, or we may not need a paging clause yet.
$cursor_object = $this->getInternalCursorObject();
if (!$cursor_object) {
$external_cursor = $this->getExternalCursorString();
if ($external_cursor !== null) {
$cursor_object = $this->getInternalCursorFromExternalCursor(
$external_cursor);
}
}
// If we still don't have a cursor object, this is the first result page
// and we aren't paging it. We don't need to build a paging clause.
if (!$cursor_object) {
return qsprintf($conn, '');
}
$reversed = $this->getIsQueryOrderReversed();
$keys = array();
foreach ($vector as $order) {
$keys[] = $order->getOrderKey();
}
$value_map = $this->getPagingValueMap($cursor, $keys);
$value_map = $this->getPagingMapFromCursorObject(
$cursor_object,
$keys);
$columns = array();
foreach ($vector as $order) {
$key = $order->getOrderKey();
if (!array_key_exists($key, $value_map)) {
throw new Exception(
pht(
'Query "%s" failed to return a value from getPagingValueMap() '.
'for column "%s".',
get_class($this),
$key));
}
$column = $orderable[$key];
$column['value'] = $value_map[$key];
@ -476,48 +616,6 @@ abstract class PhabricatorCursorPagedPolicyAwareQuery
}
/**
* @task paging
*/
protected function getPagingValueMap($cursor, array $keys) {
return array(
'id' => $cursor,
);
}
/**
* @task paging
*/
protected function loadCursorObject($cursor) {
$query = newv(get_class($this), array())
->setViewer($this->getPagingViewer())
->withIDs(array((int)$cursor));
$this->willExecuteCursorQuery($query);
$object = $query->executeOne();
if (!$object) {
throw new Exception(
pht(
'Cursor "%s" does not identify a valid object in query "%s".',
$cursor,
get_class($this)));
}
return $object;
}
/**
* @task paging
*/
protected function willExecuteCursorQuery(
PhabricatorCursorPagedPolicyAwareQuery $query) {
return;
}
/**
* Simplifies the task of constructing a paging clause across multiple
* columns. In the general case, this looks like:
@ -1072,10 +1170,7 @@ abstract class PhabricatorCursorPagedPolicyAwareQuery
array $parts,
$for_union = false) {
$is_query_reversed = false;
if ($this->getBeforeID()) {
$is_query_reversed = !$is_query_reversed;
}
$is_query_reversed = $this->getIsQueryOrderReversed();
$sql = array();
foreach ($parts as $key => $part) {

View file

@ -282,6 +282,13 @@ abstract class PhabricatorPolicyAwareQuery extends PhabricatorOffsetPagedQuery {
$this->didFilterResults($removed);
// NOTE: We call "nextPage()" before checking if we've found enough
// results because we want to build the internal cursor object even
// if we don't need to execute another query: the internal cursor may
// be used by a parent query that is using this query to translate an
// external cursor into an internal cursor.
$this->nextPage($page);
foreach ($visible as $key => $result) {
++$count;
@ -312,8 +319,6 @@ abstract class PhabricatorPolicyAwareQuery extends PhabricatorOffsetPagedQuery {
break;
}
$this->nextPage($page);
if ($overheat_limit && ($total_seen >= $overheat_limit)) {
$this->isOverheated = true;
break;

View file

@ -0,0 +1,17 @@
<?php
final class PhabricatorQueryCursor
extends Phobject {
private $object;
public function setObject($object) {
$this->object = $object;
return $this;
}
public function getObject() {
return $this->object;
}
}