1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-11-09 16:32:39 +01:00

Add a generic multistep Markup cache

Summary:
The immediate issue this addresses is T1366, adding a rendering cache to Phriction. For wiki pages with code blocks especially, rerendering them each time is expensive.

The broader issue is that out markup caches aren't very good right now. They have three major problems:

**Problem 1: the data is stored in the wrong place.** We currently store remarkup caches on objects. This means we're always loading it and passing it around even when we don't need it, can't genericize cache management code (e.g., have one simple script to drop/GC caches), need to update authoritative rows to clear caches, and can't genericize rendering code since each object is different.

To solve this, I created a dedicated cache database that I plan to move all markup caches to use.

**Problem 2: time-variant rules break when cached.** Some rules like `**bold**` are time-invariant and always produce the same output, but some rules like `{Tnnn}` and `@username` are variant and may render differently (because a task was closed or a user is on vacation). Currently, we cache the raw output, so these time-variant rules get locked at whatever values they had when they were first rendered. This is the main reason Phriction doesn't have a cache right now -- I wanted `{Tnnn}` rules to reflect open/closed tasks.

To solve this, I split markup into a "preprocessing" phase (which does all the parsing and evaluates all time-invariant rules) and a "postprocessing" phase (which evaluates time-variant rules only). The preprocessing phase is most of the expense (and, notably, includes syntax highlighting) so this is nearly as good as caching the final output. I did most of the work here in D737 / D738, but we never moved to use it in Phabricator -- we currently just do the two operations serially in all cases.

This diff splits them apart and caches the output of preprocessing only, so we benefit from caching but also get accurate time-variant rendering.

**Problem 3: cache access isn't batched/pipelined optimally.** When we're rendering a list of markup blocks, we should be able to batch datafetching better than we do. D738 helped with this (fetching is batched within a single hunk of markup) and this improves batching on cache access. We could still do better here, but this is at least a step forward.

Also fixes a bug with generating a link in the Phriction history interface ($uri gets clobbered).

I'm using PHP serialization instead of JSON serialization because Remarkup does some stuff with non-ascii characters that might not survive JSON.

Test Plan:
  - Created a Phriction document and verified that previews don't go to cache (no rows appear in the cache table).
  - Verified that published documents come out of cache.
  - Verified that caches generate/regenerate correctly, time-variant rules render properly and old documents hit the right caches.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T1366

Differential Revision: https://secure.phabricator.com/D2945
This commit is contained in:
epriestley 2012-07-09 15:20:56 -07:00
parent 918ac5755c
commit 2b0b9a1573
10 changed files with 534 additions and 20 deletions

View file

@ -4,7 +4,7 @@ CREATE TABLE {$NAMESPACE}_harbormaster.harbormaster_object (
name VARCHAR(255) COLLATE utf8_general_ci,
dateCreated INT UNSIGNED NOT NULL,
dateModified INT UNSIGNED NOT NULL
);
) ENGINE=InnoDB, COLLATE utf8_general_ci;
CREATE TABLE {$NAMESPACE}_harbormaster.edge (
src VARCHAR(64) NOT NULL COLLATE utf8_bin,

View file

@ -0,0 +1,10 @@
CREATE TABLE {$NAMESPACE}_cache.cache_markupcache (
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
cacheKey VARCHAR(128) NOT NULL collate utf8_bin,
cacheData LONGTEXT NOT NULL COLLATE utf8_bin,
metadata LONGTEXT NOT NULL COLLATE utf8_bin,
dateCreated INT UNSIGNED NOT NULL,
dateModified INT UNSIGNED NOT NULL,
UNIQUE KEY (cacheKey),
KEY (dateCreated)
) ENGINE=InnoDB, COLLATE utf8_general_ci;

View file

@ -563,6 +563,7 @@ phutil_register_library_map(array(
'PhabricatorAuthController' => 'applications/auth/controller/PhabricatorAuthController.php',
'PhabricatorBaseEnglishTranslation' => 'infrastructure/internationalization/PhabricatorBaseEnglishTranslation.php',
'PhabricatorBuiltinPatchList' => 'infrastructure/storage/patch/PhabricatorBuiltinPatchList.php',
'PhabricatorCacheDAO' => 'applications/cache/storage/PhabricatorCacheDAO.php',
'PhabricatorCalendarBrowseController' => 'applications/calendar/controller/PhabricatorCalendarBrowseController.php',
'PhabricatorCalendarController' => 'applications/calendar/controller/PhabricatorCalendarController.php',
'PhabricatorCalendarDAO' => 'applications/calendar/storage/PhabricatorCalendarDAO.php',
@ -738,7 +739,9 @@ phutil_register_library_map(array(
'PhabricatorMailImplementationSendGridAdapter' => 'applications/metamta/adapter/PhabricatorMailImplementationSendGridAdapter.php',
'PhabricatorMailImplementationTestAdapter' => 'applications/metamta/adapter/PhabricatorMailImplementationTestAdapter.php',
'PhabricatorMailReplyHandler' => 'applications/metamta/replyhandler/PhabricatorMailReplyHandler.php',
'PhabricatorMarkupCache' => 'applications/cache/storage/PhabricatorMarkupCache.php',
'PhabricatorMarkupEngine' => 'infrastructure/markup/PhabricatorMarkupEngine.php',
'PhabricatorMarkupInterface' => 'infrastructure/markup/PhabricatorMarkupInterface.php',
'PhabricatorMercurialGraphStream' => 'applications/repository/daemon/PhabricatorMercurialGraphStream.php',
'PhabricatorMetaMTAAttachment' => 'applications/metamta/storage/PhabricatorMetaMTAAttachment.php',
'PhabricatorMetaMTAController' => 'applications/metamta/controller/PhabricatorMetaMTAController.php',
@ -1590,6 +1593,7 @@ phutil_register_library_map(array(
'PhabricatorAuthController' => 'PhabricatorController',
'PhabricatorBaseEnglishTranslation' => 'PhabricatorTranslation',
'PhabricatorBuiltinPatchList' => 'PhabricatorSQLPatchList',
'PhabricatorCacheDAO' => 'PhabricatorLiskDAO',
'PhabricatorCalendarBrowseController' => 'PhabricatorCalendarController',
'PhabricatorCalendarController' => 'PhabricatorController',
'PhabricatorCalendarDAO' => 'PhabricatorLiskDAO',
@ -1742,6 +1746,7 @@ phutil_register_library_map(array(
'PhabricatorMailImplementationPHPMailerLiteAdapter' => 'PhabricatorMailImplementationAdapter',
'PhabricatorMailImplementationSendGridAdapter' => 'PhabricatorMailImplementationAdapter',
'PhabricatorMailImplementationTestAdapter' => 'PhabricatorMailImplementationAdapter',
'PhabricatorMarkupCache' => 'PhabricatorCacheDAO',
'PhabricatorMetaMTAController' => 'PhabricatorController',
'PhabricatorMetaMTADAO' => 'PhabricatorLiskDAO',
'PhabricatorMetaMTAEmailBodyParserTestCase' => 'PhabricatorTestCase',
@ -2037,7 +2042,11 @@ phutil_register_library_map(array(
'PhrictionDAO' => 'PhabricatorLiskDAO',
'PhrictionDeleteController' => 'PhrictionController',
'PhrictionDiffController' => 'PhrictionController',
'PhrictionDocument' => 'PhrictionDAO',
'PhrictionDocument' =>
array(
0 => 'PhrictionDAO',
1 => 'PhabricatorMarkupInterface',
),
'PhrictionDocumentController' => 'PhrictionController',
'PhrictionDocumentPreviewController' => 'PhrictionController',
'PhrictionDocumentStatus' => 'PhrictionConstants',

View file

@ -0,0 +1,25 @@
<?php
/*
* Copyright 2012 Facebook, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
abstract class PhabricatorCacheDAO extends PhabricatorLiskDAO {
public function getApplicationName() {
return 'cache';
}
}

View file

@ -0,0 +1,34 @@
<?php
/*
* Copyright 2012 Facebook, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
final class PhabricatorMarkupCache extends PhabricatorCacheDAO {
protected $cacheKey;
protected $cacheData;
protected $metadata;
public function getConfiguration() {
return array(
self::CONFIG_SERIALIZATION => array(
'cacheData' => self::SERIALIZATION_PHP,
'metadata' => self::SERIALIZATION_JSON,
),
) + parent::getConfiguration();
}
}

View file

@ -61,7 +61,7 @@ final class PhrictionHistoryController
$rows = array();
foreach ($history as $content) {
$uri = PhrictionDocument::getSlugURI($document->getSlug());
$slug_uri = PhrictionDocument::getSlugURI($document->getSlug());
$version = $content->getVersion();
$diff_uri = new PhutilURI('/phriction/diff/'.$document->getID().'/');
@ -102,7 +102,7 @@ final class PhrictionHistoryController
phutil_render_tag(
'a',
array(
'href' => $uri.'?v='.$version,
'href' => $slug_uri.'?v='.$version,
),
'Version '.$version),
$handles[$content->getAuthorPHID()]->renderLink(),

View file

@ -17,9 +17,14 @@
*/
/**
* @task markup Markup Interface
*
* @group phriction
*/
final class PhrictionContent extends PhrictionDAO {
final class PhrictionContent extends PhrictionDAO
implements PhabricatorMarkupInterface {
const MARKUP_FIELD_BODY = 'markup:body';
protected $id;
protected $documentID;
@ -35,11 +40,55 @@ final class PhrictionContent extends PhrictionDAO {
protected $changeRef;
public function renderContent() {
$engine = PhabricatorMarkupEngine::newPhrictionMarkupEngine();
$markup = $engine->markupText($this->getContent());
return PhabricatorMarkupEngine::renderOneObject(
$this,
self::MARKUP_FIELD_BODY);
}
/* -( Markup Interface )--------------------------------------------------- */
/**
* @task markup
*/
public function getMarkupFieldKey($field) {
if ($this->shouldUseMarkupCache($field)) {
$id = $this->getID();
} else {
$id = PhabricatorHash::digest($this->getMarkupText($field));
}
return "phriction:{$field}:{$id}";
}
/**
* @task markup
*/
public function getMarkupText($field) {
return $this->getContent();
}
/**
* @task markup
*/
public function newMarkupEngine($field) {
return PhabricatorMarkupEngine::newPhrictionMarkupEngine();
}
/**
* @task markup
*/
public function didMarkupText(
$field,
$output,
PhutilMarkupEngine $engine) {
$toc = PhutilRemarkupEngineRemarkupHeaderBlockRule::renderTableOfContents(
$engine);
if ($toc) {
$toc =
'<div class="phabricator-remarkup-toc">'.
@ -53,8 +102,17 @@ final class PhrictionContent extends PhrictionDAO {
return
'<div class="phabricator-remarkup">'.
$toc.
$markup.
$output.
'</div>';
}
/**
* @task markup
*/
public function shouldUseMarkupCache($field) {
return (bool)$this->getID();
}
}

View file

@ -16,41 +16,268 @@
* limitations under the License.
*/
class PhabricatorMarkupEngine {
/**
* Manages markup engine selection, configuration, application, caching and
* pipelining.
*
* @{class:PhabricatorMarkupEngine} can be used to render objects which
* implement @{interface:PhabricatorMarkupInterface} in a batched, cache-aware
* way. For example, if you have a list of comments written in remarkup (and
* the objects implement the correct interface) you can render them by first
* building an engine and adding the fields with @{method:addObject}.
*
* $field = 'field:body'; // Field you want to render. Each object exposes
* // one or more fields of markup.
*
* $engine = new PhabricatorMarkupEngine();
* foreach ($comments as $comment) {
* $engine->addObject($comment, $field);
* }
*
* Now, call @{method:process} to perform the actual cache/rendering
* step. This is a heavyweight call which does batched data access and
* transforms the markup into output.
*
* $engine->process();
*
* Finally, do something with the results:
*
* $results = array();
* foreach ($comments as $comment) {
* $results[] = $engine->getOutput($comment, $field);
* }
*
* If you have a single object to render, you can use the convenience method
* @{method:renderOneObject}.
*
* @task markup Markup Pipeline
* @task engine Engine Construction
*/
final class PhabricatorMarkupEngine {
public static function extractPHIDsFromMentions(array $content_blocks) {
$mentions = array();
private $objects = array();
$engine = self::newDifferentialMarkupEngine();
foreach ($content_blocks as $content_block) {
$engine->markupText($content_block);
$phids = $engine->getTextMetadata(
PhabricatorRemarkupRuleMention::KEY_MENTIONED,
array());
$mentions += $phids;
}
/* -( Markup Pipeline )---------------------------------------------------- */
return $mentions;
/**
* Convenience method for pushing a single object through the markup
* pipeline.
*
* @param PhabricatorMarkupInterface The object to render.
* @param string The field to render.
* @return string Marked up output.
* @task markup
*/
public static function renderOneObject(
PhabricatorMarkupInterface $object,
$field) {
return id(new PhabricatorMarkupEngine())
->addObject($object, $field)
->process()
->getOutput($object, $field);
}
/**
* Queue an object for markup generation when @{method:process} is
* called. You can retrieve the output later with @{method:getOutput}.
*
* @param PhabricatorMarkupInterface The object to render.
* @param string The field to render.
* @return this
* @task markup
*/
public function addObject(PhabricatorMarkupInterface $object, $field) {
$key = $this->getMarkupFieldKey($object, $field);
$this->objects[$key] = array(
'object' => $object,
'field' => $field,
);
return $this;
}
/**
* Process objects queued with @{method:addObject}. You can then retrieve
* the output with @{method:getOutput}.
*
* @return this
* @task markup
*/
public function process() {
$keys = array();
foreach ($this->objects as $key => $info) {
if (!isset($info['markup'])) {
$keys[] = $key;
}
}
if (!$keys) {
return;
}
$objects = array_select_keys($this->objects, $keys);
// Build all the markup engines. We need an engine for each field whether
// we have a cache or not, since we still need to postprocess the cache.
$engines = array();
foreach ($objects as $key => $info) {
$engines[$key] = $info['object']->newMarkupEngine($info['field']);
}
// Load or build the preprocessor caches.
$blocks = $this->loadPreprocessorCaches($engines, $objects);
// Finalize the output.
foreach ($objects as $key => $info) {
$data = $blocks[$key]->getCacheData();
$engine = $engines[$key];
$field = $info['field'];
$object = $info['object'];
$output = $engine->postprocessText($data);
$output = $object->didMarkupText($field, $output, $engine);
$this->objects[$key]['output'] = $output;
}
return $this;
}
/**
* Get the output of markup processing for a field queued with
* @{method:addObject}. Before you can call this method, you must call
* @{method:process}.
*
* @param PhabricatorMarkupInterface The object to retrieve.
* @param string The field to retrieve.
* @return string Processed output.
* @task markup
*/
public function getOutput(PhabricatorMarkupInterface $object, $field) {
$key = $this->getMarkupFieldKey($object, $field);
if (empty($this->objects[$key])) {
throw new Exception(
"Call addObject() before getOutput() (key = '{$key}').");
}
if (!isset($this->objects[$key]['output'])) {
throw new Exception(
"Call process() before getOutput().");
}
return $this->objects[$key]['output'];
}
/**
* @task markup
*/
private function getMarkupFieldKey(
PhabricatorMarkupInterface $object,
$field) {
return $object->getMarkupFieldKey($field);
}
/**
* @task markup
*/
private function loadPreprocessorCaches(array $engines, array $objects) {
$blocks = array();
$use_cache = array();
foreach ($objects as $key => $info) {
if ($info['object']->shouldUseMarkupCache($info['field'])) {
$use_cache[$key] = true;
}
}
if ($use_cache) {
$blocks = id(new PhabricatorMarkupCache())->loadAllWhere(
'cacheKey IN (%Ls)',
array_keys($use_cache));
$blocks = mpull($blocks, null, 'getCacheKey');
}
foreach ($objects as $key => $info) {
if (isset($blocks[$key])) {
// If we already have a preprocessing cache, we don't need to rebuild
// it.
continue;
}
$text = $info['object']->getMarkupText($info['field']);
$data = $engines[$key]->preprocessText($text);
// NOTE: This is just debugging information to help sort out cache issues.
// If one machine is misconfigured and poisoning caches you can use this
// field to hunt it down.
$metadata = array(
'host' => php_uname('n'),
);
$blocks[$key] = id(new PhabricatorMarkupCache())
->setCacheKey($key)
->setCacheData($data)
->setMetadata($metadata);
if (isset($use_cache[$key])) {
// This is just filling a cache and always safe, even on a read pathway.
$unguarded = AphrontWriteGuard::beginScopedUnguardedWrites();
try {
$blocks[$key]->save();
} catch (AphrontQueryDuplicateKeyException $ex) {
// Ignore this, we just raced to write the cache.
}
unset($unguarded);
}
}
return $blocks;
}
/* -( Engine Construction )------------------------------------------------ */
/**
* @task engine
*/
public static function newManiphestMarkupEngine() {
return self::newMarkupEngine(array(
));
}
/**
* @task engine
*/
public static function newPhrictionMarkupEngine() {
return self::newMarkupEngine(array(
'header.generate-toc' => true,
));
}
/**
* @task engine
*/
public static function newPhameMarkupEngine() {
return self::newMarkupEngine(array(
'macros' => false,
));
}
/**
* @task engine
*/
public static function newFeedMarkupEngine() {
return self::newMarkupEngine(
array(
@ -61,6 +288,10 @@ class PhabricatorMarkupEngine {
));
}
/**
* @task engine
*/
public static function newDifferentialMarkupEngine(array $options = array()) {
return self::newMarkupEngine(array(
'custom-inline' => PhabricatorEnv::getEnvConfig(
@ -71,21 +302,37 @@ class PhabricatorMarkupEngine {
));
}
/**
* @task engine
*/
public static function newDiffusionMarkupEngine(array $options = array()) {
return self::newMarkupEngine(array(
));
}
/**
* @task engine
*/
public static function newProfileMarkupEngine() {
return self::newMarkupEngine(array(
));
}
/**
* @task engine
*/
public static function newSlowvoteMarkupEngine() {
return self::newMarkupEngine(array(
));
}
/**
* @task engine
*/
private static function getMarkupEngineDefaultConfiguration() {
return array(
'pygments' => PhabricatorEnv::getEnvConfig('pygments.enabled'),
@ -104,6 +351,10 @@ class PhabricatorMarkupEngine {
);
}
/**
* @task engine
*/
private static function newMarkupEngine(array $options) {
$options += self::getMarkupEngineDefaultConfiguration();
@ -198,4 +449,20 @@ class PhabricatorMarkupEngine {
return $engine;
}
public static function extractPHIDsFromMentions(array $content_blocks) {
$mentions = array();
$engine = self::newDifferentialMarkupEngine();
foreach ($content_blocks as $content_block) {
$engine->markupText($content_block);
$phids = $engine->getTextMetadata(
PhabricatorRemarkupRuleMention::KEY_MENTIONED,
array());
$mentions += $phids;
}
return $mentions;
}
}

View file

@ -0,0 +1,103 @@
<?php
/*
* Copyright 2012 Facebook, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/**
* An object which has one or more fields containing markup that can be
* rendered into a display format. Commonly, the fields contain Remarkup and
* are rendered into HTML. Implementing this interface allows you to render
* objects through @{class:PhabricatorMarkupEngine} and benefit from caching
* and pipelining infrastructure.
*
* An object may have several "fields" of markup. For example, Differential
* revisions have a "summary" and a "test plan". In these cases, the `$field`
* parameter is used to identify which field is being operated on. For simple
* objects like comments, you might only have one field (say, "body"). In
* these cases, the implementation can largely ignore the `$field` parameter.
*
* @task markup Markup Interface
*
* @group markup
*/
interface PhabricatorMarkupInterface {
/* -( Markup Interface )--------------------------------------------------- */
/**
* Get a key to identify this field. This should uniquely identify the block
* of text to be rendered and be usable as a cache key. If the object has a
* PHID, using the PHID and the field name is likley reasonable:
*
* "{$phid}:{$field}"
*
* @param string Field name.
* @return string Cache key.
*
* @task markup
*/
public function getMarkupFieldKey($field);
/**
* Build the engine the field should use.
*
* @param string Field name.
* @return PhutilRemarkupEngine Markup engine to use.
* @task markup
*/
public function newMarkupEngine($field);
/**
* Return the contents of the specified field.
*
* @param string Field name.
* @return string The raw markup contained in the field.
* @task markup
*/
public function getMarkupText($field);
/**
* Callback for final postprocessing of output. Normally, you can return
* the output unmodified.
*
* @param string Field name.
* @param string The finalized output of the engine.
* @param string The engine which generated the output.
* @return string Final output.
* @task markup
*/
public function didMarkupText(
$field,
$output,
PhutilMarkupEngine $engine);
/**
* Determine if the engine should try to use the markup cache or not.
* Generally you should use the cache for durable/permanent content but
* should not use the cache for temporary/draft content.
*
* @return bool True to use the markup cache.
* @task markup
*/
public function shouldUseMarkupCache($field);
}

View file

@ -155,6 +155,10 @@ final class PhabricatorBuiltinPatchList extends PhabricatorSQLPatchList {
'type' => 'db',
'name' => 'xhpastview',
),
'db.cache' => array(
'type' => 'db',
'name' => 'cache',
),
'0000.legacy.sql' => array(
'type' => 'sql',
'name' => $this->getPatchPath('0000.legacy.sql'),
@ -903,6 +907,10 @@ final class PhabricatorBuiltinPatchList extends PhabricatorSQLPatchList {
'type' => 'sql',
'name' => $this->getPatchPath('harbormasterobject.sql'),
),
'markupcache.sql' => array(
'type' => 'sql',
'name' => $this->getPatchPath('markupcache.sql'),
),
);
}