mirror of https://we.phorge.it/source/phorge.git
synced 2024-11-22 06:42:42 +01:00
Disallow webcrawlers to follow Paste line number anchor links
Summary:
Paste provides line anchor links in every single line of a paste. If webcrawlers follow these links, they index the very same Paste again. Thus disallow in robots.txt to reduce unneeded traffic and indexing time.

Closes T15662

Test Plan:
Go to `/robots.txt` in the web browser. Cross fingers that more webcrawlers abide by RFC 9309.

Reviewers: O1 Blessed Committers, valerio.bozzolan

Reviewed By: O1 Blessed Committers, valerio.bozzolan

Subscribers: tobiaswiese, valerio.bozzolan, Matthew, Cigaryno

Maniphest Tasks: T15662

Differential Revision: https://we.phorge.it/D25461
parent f42dd5819e
commit 76ed0c7ff7
1 changed file with 7 additions and 0 deletions
@@ -19,6 +19,13 @@ final class PhabricatorRobotsPlatformController
     $out[] = 'Disallow: /diffusion/';
     $out[] = 'Disallow: /source/';
 
+    // See T15662. Prevent indexing line anchor links in Pastes. Per RFC 9309
+    // section 2.2.3, percentage-encode "$" to avoid interpretation as end of
+    // match pattern. However, crawlers may not abide by it but follow the
+    // original standard at https://www.robotstxt.org/orig.html with no mention
+    // how to interpret characters like "$" and thus entirely ignore this rule.
+    $out[] = 'Disallow: /P*%24*';
+
     // Add a small crawl delay (number of seconds between requests) for spiders
     // which respect it. The intent here is to prevent spiders from affecting
     // performance for users. The possible cost is slower indexing, but that
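To illustrate why the rule is written with `%24` rather than a bare `$`, here is a minimal sketch of how a crawler following RFC 9309 might evaluate it. This is not code from Phorge or from any real crawler. The function name `robots_rule_matches` and the request paths are hypothetical, and the matching is deliberately simplified: only `*` and a trailing `$` are treated as special, and the only normalization applied to the requested path is percent-encoding a literal `$`.

```php
<?php

// Simplified sketch of RFC 9309 path matching, for illustration only.
// Assumptions: "*" matches any run of characters, a trailing "$" anchors the
// pattern to the end of the path, and a literal "$" in the requested path is
// percent-encoded as "%24" before comparison. Full RFC 9309 normalization
// covers more cases than this.
function robots_rule_matches(string $pattern, string $path): bool {
  $anchored = substr($pattern, -1) === '$';
  if ($anchored) {
    $pattern = substr($pattern, 0, -1);
  }

  // Quote each literal piece, then join the pieces with ".*" for every "*".
  $pieces = array_map(
    function (string $piece): string {
      return preg_quote($piece, '#');
    },
    explode('*', $pattern));
  $regex = '#^'.implode('.*', $pieces).($anchored ? '\z' : '').'#';

  $normalized = str_replace('$', '%24', $path);
  return preg_match($regex, $normalized) === 1;
}

// Hypothetical request paths.
var_dump(robots_rule_matches('/P*%24*', '/P123$7')); // bool(true): line anchor link
var_dump(robots_rule_matches('/P*%24*', '/P123'));   // bool(false): the paste itself
var_dump(robots_rule_matches('/P*%24*', '/T15662')); // bool(false): not under /P
```

Under these assumptions, the paste stays crawlable while its per-line anchor links are excluded. As the code comment in the diff notes, a crawler that only implements the original standard may not interpret `*` or `%24` this way at all.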