mirror of https://we.phorge.it/source/phorge.git synced 2024-12-02 11:42:42 +01:00
phorge-phorge/src/applications/transactions/engineextension/PhabricatorTransactionsFulltextEngineExtension.php
epriestley 14e911a0d8 Index only the first 1,000 comments on any object
Summary:
Depends on D19502. Ref T13151. See PHI719. An install ended up with an object with 111,000+ comments on it because someone wrote a script to treat it like a logfile.

Although we seem to do mostly okay with this (locally, it only takes about 30s to index a similar object), we'll hit a wall somewhere (since we need to hold everything in memory), and it's hard to imagine a legitimate object with more than 1,000 comments. Just ignore comments past the first thousand.

(Conpherence threads may legitimately have more than 1,000 comments, but go through a different indexer.)
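
As a rough illustration of the cap described above, here is a minimal standalone sketch. It is not the Phabricator implementation: the helper `first_comments()` and the row shape (`id`, `content`) are hypothetical, and the real change applies the limit in the database query (`LIMIT 1000`) rather than slicing an in-memory array.

```php
<?php

/**
 * Return the content of the first $limit comments, oldest first.
 *
 * Hypothetical helper for illustration only. $comments is assumed to be a
 * list of rows shaped like ['id' => int, 'content' => string].
 */
function first_comments(array $comments, int $limit = 1000): array {
  // Sort ascending by ID so the earliest comments come first.
  usort($comments, function (array $a, array $b): int {
    return $a['id'] <=> $b['id'];
  });

  // Keep only the first $limit rows and extract their text.
  $kept = array_slice($comments, 0, $limit);
  return array_column($kept, 'content');
}

// Build five fake comments in reverse ID order, then keep the first three.
$comments = array();
for ($id = 5; $id >= 1; $id--) {
  $comments[] = array('id' => $id, 'content' => "comment {$id}");
}

$indexed = first_comments($comments, 3);
```

Anything past the cap is simply never handed to the indexer, which is what bounds memory use.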

Test Plan:
  - Piped some comments into `maniphest.edit` in a loop to create a task with 100K comments.
  - Ran `bin/search index Txxx --force` to reindex it, with `--trace`.
    - Before: task indexed in about 30s.
    - After: script loaded comments with LIMIT 1000 and indexed in a couple of seconds.
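
The first step of the test plan could be approximated along these lines. This is a sketch under assumptions, not the script actually used: `build_comment_request()` is a hypothetical helper, `T123` is a placeholder task identifier, and the request shape (`objectIdentifier` plus a list of `comment` transactions) is assumed to match what `maniphest.edit` accepts. Each generated body would then be sent to `maniphest.edit` through whatever Conduit client is available.

```php
<?php

/**
 * Build a JSON request body that adds one comment to a task.
 *
 * Hypothetical helper; the parameter names are assumed to match
 * the `maniphest.edit` Conduit method.
 */
function build_comment_request(string $task, string $text): string {
  return json_encode(array(
    'objectIdentifier' => $task,
    'transactions' => array(
      array('type' => 'comment', 'value' => $text),
    ),
  ));
}

// Generate a few request bodies. A real stress test would loop far more
// times (the test plan above used roughly 100K comments).
$requests = array();
for ($i = 1; $i <= 3; $i++) {
  $requests[] = build_comment_request('T123', "log line {$i}");
}
```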

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13151

Differential Revision: https://secure.phabricator.com/D19503
2018-06-22 17:41:05 -07:00

61 lines
1.7 KiB
PHP

<?php

final class PhabricatorTransactionsFulltextEngineExtension
  extends PhabricatorFulltextEngineExtension {

  const EXTENSIONKEY = 'transactions';

  public function getExtensionName() {
    return pht('Comments');
  }

  public function shouldEnrichFulltextObject($object) {
    return ($object instanceof PhabricatorApplicationTransactionInterface);
  }

  public function enrichFulltextObject(
    $object,
    PhabricatorSearchAbstractDocument $document) {

    $query = PhabricatorApplicationTransactionQuery::newQueryForObject($object);
    if (!$query) {
      return;
    }

    $query
      ->setViewer($this->getViewer())
      ->withObjectPHIDs(array($object->getPHID()))
      ->withComments(true)
      ->needComments(true);

    // See PHI719. Users occasionally create objects with huge numbers of
    // comments, which can be slow to index. We handle this with reasonable
    // grace: at time of writing, we can index a task with 100K comments in
    // about 30 seconds. However, we do need to hold all the comments in
    // memory in the AbstractDocument, so there's some practical limit to what
    // we can realistically index.

    // Since objects with more than 1,000 comments are not likely to be
    // legitimate objects with actual discussion, index only the first
    // thousand comments.

    $query
      ->setOrderVector(array('-id'))
      ->setLimit(1000);

    $xactions = $query->execute();

    foreach ($xactions as $xaction) {
      if (!$xaction->hasComment()) {
        continue;
      }

      $comment = $xaction->getComment();

      $document->addField(
        PhabricatorSearchDocumentFieldType::FIELD_COMMENT,
        $comment->getContent());
    }
  }

}