mirror of https://we.phorge.it/source/phorge.git synced 2024-09-19 16:58:48 +02:00

Rate limit requests by IP

Summary:
Fixes T3923. On `secure.phabricator.com`, we occasionally get slowed to a crawl when someone runs a security scanner against us, or 5 search bots decide to simultaneously index every line of every file in Diffusion.

Every time a user makes a request, give their IP address some points. If they get too many points in 5 minutes, start blocking their requests automatically for a while.

We give fewer points for logged in requests. We could further refine this (more points for a 404, more points for a really slow page, etc.) but let's start simply.

Also, provide a mechanism for configuring this, and configuring the LB environment stuff at the same time (this comes up rarely, but we don't have a good answer right now).

Test Plan: Used `ab` and reloading over and over again to hit rate limits. Read documentation.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: chad, epriestley

Maniphest Tasks: T3923

Differential Revision: https://secure.phabricator.com/D8713
epriestley 2014-04-08 18:36:21 -07:00
parent 597c6c07f7
commit 4d0935ba5e
5 changed files with 398 additions and 1 deletions

3
.gitignore vendored
View file

@ -33,6 +33,9 @@
# User-accessible hook for adhoc debugging scripts
/support/debug.php
# User-accessible hook for adhoc startup code
/support/preamble.php
# Users can link binaries here
/support/bin/*

View file

@ -197,6 +197,9 @@ Continue by:
@{article:Configuring Accounts and Registration}; or
- understanding advanced configuration topics with
@{article:Configuration User Guide: Advanced Configuration}; or
- configuring a preamble script to set up the environment properly behind a
load balancer, or adjust rate limiting with
@{article:Configuring a Preamble Script}; or
- configuring where uploaded files and attachments will be stored with
@{article:Configuring File Storage}; or
- configuring Phabricator so it can send mail with

View file

@ -0,0 +1,114 @@
@title Configuring a Preamble Script
@group config
Adjust environmental settings (SSL, remote IP, rate limiting) using a preamble
script.
= Overview =
If Phabricator is deployed in an environment where HTTP headers behave oddly
(usually, because it is behind a load balancer), it may not be able to detect
some environmental features (like the client's IP, or the presence of SSL)
correctly.
You can use a special preamble script to make arbitrary adjustments to the
environment and some parts of Phabricator's configuration in order to fix these
problems and set up the environment which Phabricator expects.
NOTE: This is an advanced feature. Most installs should not need to configure
a preamble script.
= Creating a Preamble Script =
To create a preamble script, write a file to:
phabricator/support/preamble.php
(This file is in Phabricator's `.gitignore`, so you do not need to worry about
colliding with `git` or interacting with updates.)
This file should be a valid PHP script. If you aren't very familiar with PHP,
you can check for syntax errors with `php -l`:
phabricator/ $ php -l support/preamble.php
No syntax errors detected in support/preamble.php
If present, this script will be executed at the very beginning of each web
request, allowing you to adjust the environment. For common adjustments and
examples, see the next sections.
= Adjusting Client IPs =
If your install is behind a load balancer, Phabricator may incorrectly detect
all requests as originating from the load balancer, rather than from the correct
client IPs. If this is the case and some other header (like `X-Forwarded-For`)
is known to be trustworthy, you can overwrite the `REMOTE_ADDR` setting so
Phabricator can figure out the client IP correctly:
```
name=Overwrite REMOTE_ADDR with X-Forwarded-For
<?php
$_SERVER['REMOTE_ADDR'] = $_SERVER['HTTP_X_FORWARDED_FOR'];
```
You should do this //only// if the `X-Forwarded-For` header is always
trustworthy. In particular, if users can make requests to the web server
directly, they can provide an arbitrary `X-Forwarded-For` header, and thereby
spoof an arbitrary client IP.
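If some requests can reach the web server without passing through the load balancer, one way to limit spoofing is to trust the header only when the connection itself comes from the balancer. The following is a minimal sketch under stated assumptions, not a definitive implementation: the helper name `preamble_resolve_client_ip()` and the balancer address `10.0.0.5` are hypothetical, and it assumes the balancer appends the address of the peer it saw to the end of `X-Forwarded-For`.

```php
<?php

// Hypothetical helper (not part of Phabricator): returns the client IP to
// use, trusting X-Forwarded-For only for requests from a known balancer.
function preamble_resolve_client_ip(array $server, array $trusted_proxies) {
  $remote = isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : null;

  // Direct connection or unknown peer: ignore the header entirely.
  if ($remote === null || !in_array($remote, $trusted_proxies, true)) {
    return $remote;
  }

  if (empty($server['HTTP_X_FORWARDED_FOR'])) {
    return $remote;
  }

  // The header may hold a comma-separated chain. Assumption: the balancer
  // appends the peer it saw, so the last entry is the one it vouches for.
  $chain = array_map('trim', explode(',', $server['HTTP_X_FORWARDED_FOR']));
  $candidate = end($chain);

  // Fall back to the socket address if the entry is not a valid IP.
  if (filter_var($candidate, FILTER_VALIDATE_IP) === false) {
    return $remote;
  }

  return $candidate;
}

// Assumption: 10.0.0.5 is the load balancer's address in this example.
$client_ip = preamble_resolve_client_ip($_SERVER, array('10.0.0.5'));
if ($client_ip !== null) {
  $_SERVER['REMOTE_ADDR'] = $client_ip;
}
```

Requests that arrive directly keep their original `REMOTE_ADDR`, so a forged header has no effect on them.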
= Adjusting SSL =
If your install is behind an SSL terminating load balancer, Phabricator may
detect requests as HTTP when the client sees them as HTTPS. This can cause
Phabricator to generate links with the wrong protocol, issue cookies without
the SSL-only flag, or reject requests outright.
To fix this, you can set `$_SERVER['HTTPS']` explicitly:
```
name=Explicitly Configure SSL Availability
<?php
$_SERVER['HTTPS'] = true;
```
You can also set this value to `false` to explicitly tell Phabricator that a
request is not an SSL request.
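If the load balancer serves both HTTP and HTTPS, a fixed value may be wrong for some requests. Many balancers report the original protocol in an `X-Forwarded-Proto` header; whether yours does is an assumption you should verify, and the header is only trustworthy if the balancer always sets or overwrites it. A hedged sketch, with the hypothetical helper name `preamble_is_https()`:

```php
<?php

// Hypothetical helper (not part of Phabricator): decide whether the original
// client request used HTTPS, based on X-Forwarded-Proto.
function preamble_is_https(array $server) {
  if (!isset($server['HTTP_X_FORWARDED_PROTO'])) {
    // Header absent: return null so the environment is left untouched.
    return null;
  }
  return strtolower(trim($server['HTTP_X_FORWARDED_PROTO'])) === 'https';
}

$is_https = preamble_is_https($_SERVER);
if ($is_https !== null) {
  $_SERVER['HTTPS'] = $is_https;
}
```

When the header is missing, this sketch leaves `$_SERVER['HTTPS']` as the web server reported it.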
= Adjusting Rate Limiting =
Phabricator performs coarse, IP-based rate limiting by default. In most
situations the default settings should be reasonable: they are set fairly high,
and intended to prevent only significantly abusive behavior.
However, if legitimate traffic is being rate limited (or you want to make the
limits more strict) you can adjust the limits in the preamble script.
```
name=Adjust Rate Limiting Behavior
<?php
// The default is 1000, so a value of 2000 increases the limit by a factor
// of 2: users will be able to make twice as many requests before being
// rate limited.
// You can set the limit to 0 to disable rate limiting.
PhabricatorStartup::setMaximumRate(2000);
```
By examining `$_SERVER['REMOTE_ADDR']` or similar parameters, you could also
adjust the rate limit dynamically: for example, remove it for requests from an
internal network, but impose a strict limit for external requests.
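A dynamic adjustment along those lines might look like the sketch below. The helper name `preamble_choose_rate()`, the `10.` prefix used to recognize the internal network, and the external limit of `500` are all illustrative assumptions; a real install would likely want a proper subnet check. If you also overwrite `REMOTE_ADDR` in the preamble, do that first so the limit is chosen from the real client address.

```php
<?php

// Hypothetical helper (not part of Phabricator): choose a rate limit for a
// client address.
function preamble_choose_rate($remote_addr) {
  // Assumption: addresses starting with "10." are the internal network.
  // This is a crude prefix match, not a real CIDR check.
  if (strncmp($remote_addr, '10.', 3) === 0) {
    return 0; // A limit of 0 disables rate limiting.
  }
  return 500; // Half of the default limit of 1000.
}

// The class_exists() guard only keeps this sketch runnable on its own; in a
// real preamble script PhabricatorStartup is already loaded.
if (isset($_SERVER['REMOTE_ADDR']) &&
    class_exists('PhabricatorStartup', false)) {
  PhabricatorStartup::setMaximumRate(
    preamble_choose_rate($_SERVER['REMOTE_ADDR']));
}
```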
Rate limiting is configured in the preamble script, rather than in normal
configuration, so that rejecting a rate limited client is as cheap as possible.
The limiting checks execute before any libraries or configuration are loaded,
and can emit a response within a few milliseconds.
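To get a feel for what a given limit means, it can help to model the scoring. Each request adds points to a per-minute bucket, and a client is limited when its total across the retained buckets exceeds the limit multiplied by the bucket count (5 in this change). The class below is a simplified in-memory sketch of that idea. `RateWindowSketch` is a hypothetical name, and the model omits the APC storage and the penalty points added for requests that are already limited.

```php
<?php

// Hypothetical in-memory model of per-minute score buckets. This is a sketch
// of the idea only, not the APC-backed implementation.
final class RateWindowSketch {
  private $buckets = array(); // minute => array(identity => points)
  private $maximumRate;
  private $bucketCount;

  public function __construct($maximum_rate = 1000, $bucket_count = 5) {
    $this->maximumRate = $maximum_rate;
    $this->bucketCount = $bucket_count;
  }

  // Record points for a request made at Unix timestamp $now.
  public function addScore($identity, $points, $now) {
    $minute = (int)($now / 60);
    if (!isset($this->buckets[$minute][$identity])) {
      $this->buckets[$minute][$identity] = 0;
    }
    $this->buckets[$minute][$identity] += $points;
  }

  // Sum the identity's points across retained buckets, discarding old ones.
  public function isLimited($identity, $now) {
    $current = (int)($now / 60);
    $score = 0;
    foreach ($this->buckets as $minute => $scores) {
      if ($minute < $current - $this->bucketCount) {
        unset($this->buckets[$minute]); // Too old: discard.
        continue;
      }
      if (isset($scores[$identity])) {
        $score += $scores[$identity];
      }
    }
    return $score > $this->maximumRate * $this->bucketCount;
  }
}
```

With the defaults, an anonymous request costs `1000 / 30` points, so a client that sends well over 150 such requests inside the window crosses the 5000 point threshold, and drops back under it once the old buckets expire.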
= Next Steps =
Continue by:
- returning to the @{article:Configuration Guide}.

View file

@ -8,10 +8,32 @@
* NOTE: This class MUST NOT have any dependencies. It runs before libraries
* load.
*
* Rate Limiting
* =============
*
* Phabricator limits the rate at which clients can request pages, and issues
* HTTP 429 "Too Many Requests" responses if clients request too many pages too
* quickly. Although this is not a complete defense against high-volume attacks,
* it can protect an install against aggressive crawlers, security scanners,
* and some types of malicious activity.
*
* To perform rate limiting, each page increments a score counter for the
* requesting user's IP. The page can give the IP more points for an expensive
* request, or fewer for an authenticated request.
*
* Score counters are kept in buckets, and writes move to a new bucket every
* minute. After a few minutes (defined by @{method:getRateLimitBucketCount}),
* the oldest bucket is discarded. This provides a simple mechanism for keeping
* track of scores without needing to store, access, or read very much data.
*
* Users are allowed to accumulate up to 1000 points per minute, averaged across
* all of the tracked buckets.
*
* @task info Accessing Request Information
* @task hook Startup Hooks
* @task apocalypse In Case Of Apocalypse
* @task validation Validation
* @task ratelimit Rate Limiting
*/
final class PhabricatorStartup {
@ -19,6 +41,7 @@ final class PhabricatorStartup {
private static $globals = array();
private static $capturingOutput;
private static $rawInput;
private static $maximumRate = 1000;
/* -( Accessing Request Information )-------------------------------------- */
@ -93,6 +116,10 @@ final class PhabricatorStartup {
self::setupPHP();
self::verifyPHP();
if (isset($_SERVER['REMOTE_ADDR'])) {
self::rateLimitRequest($_SERVER['REMOTE_ADDR']);
}
self::normalizeInput();
self::verifyRewriteRules();
@ -521,4 +548,229 @@ final class PhabricatorStartup {
"'post_max_size' is set to '{$config}'.");
}
/* -( Rate Limiting )------------------------------------------------------ */
/**
* Adjust the permissible rate limit score.
*
* By default, the limit is `1000`. You can use this method to set it to
* a larger or smaller value. If you set it to `2000`, users may make twice
* as many requests before rate limiting.
*
* @param int Maximum score before rate limiting.
* @return void
* @task ratelimit
*/
public static function setMaximumRate($rate) {
self::$maximumRate = $rate;
}
/**
* Check if the user (identified by `$user_identity`) has issued too many
* requests recently. If they have, end the request with a 429 error code.
*
* The key just needs to identify the user. Phabricator uses both user PHIDs
* and user IPs as keys, tracking logged-in and logged-out users separately
* and enforcing different limits.
*
* @param string Some key which identifies the user making the request.
* @return void If the user has exceeded the rate limit, this method
* does not return.
* @task ratelimit
*/
public static function rateLimitRequest($user_identity) {
if (!self::canRateLimit()) {
return;
}
$score = self::getRateLimitScore($user_identity);
if ($score > (self::$maximumRate * self::getRateLimitBucketCount())) {
// Give the user some bonus points for getting rate limited. This keeps
// bad actors who keep slamming the 429 page locked out completely,
// instead of letting them get a burst of requests through every minute
// after a bucket expires.
self::addRateLimitScore($user_identity, 50);
self::didRateLimit($user_identity);
}
}
/**
* Add points to the rate limit score for some user.
*
* If users have earned more than 1000 points per minute across all the
* buckets they'll be locked out of the application, so awarding 1 point per
* request roughly corresponds to allowing 1000 requests per minute, while
* awarding 50 points roughly corresponds to allowing 20 requests per minute.
*
* @param string Some key which identifies the user making the request.
* @param float The cost for this request; more points pushes them toward
* the limit faster.
* @return void
* @task ratelimit
*/
public static function addRateLimitScore($user_identity, $score) {
if (!self::canRateLimit()) {
return;
}
$current = self::getRateLimitBucket();
// There's a bit of a race here, if a second process reads the bucket before
// this one writes it, but it's fine if we occasionally fail to record a
// user's score. If they're making requests fast enough to hit rate
// limiting, we'll get them soon.
$bucket_key = self::getRateLimitBucketKey($current);
$bucket = apc_fetch($bucket_key);
if (!is_array($bucket)) {
$bucket = array();
}
if (empty($bucket[$user_identity])) {
$bucket[$user_identity] = 0;
}
$bucket[$user_identity] += $score;
apc_store($bucket_key, $bucket);
}
/**
* Determine if rate limiting is available.
*
* Rate limiting depends on APC, and isn't available unless the APC user
* cache is available.
*
* @return bool True if rate limiting is available.
* @task ratelimit
*/
private static function canRateLimit() {
if (!self::$maximumRate) {
return false;
}
if (!function_exists('apc_fetch')) {
return false;
}
return true;
}
/**
* Get the current bucket for storing rate limit scores.
*
* @return int The current bucket.
* @task ratelimit
*/
private static function getRateLimitBucket() {
return (int)(time() / 60);
}
/**
* Get the total number of rate limit buckets to retain.
*
* @return int Total number of rate limit buckets to retain.
* @task ratelimit
*/
private static function getRateLimitBucketCount() {
return 5;
}
/**
* Get the APC key for a given bucket.
*
* @param int Bucket to get the key for.
* @return string APC key for the bucket.
* @task ratelimit
*/
private static function getRateLimitBucketKey($bucket) {
return 'rate:bucket:'.$bucket;
}
/**
* Get the APC key for the smallest stored bucket.
*
* @return string APC key for the smallest stored bucket.
* @task ratelimit
*/
private static function getRateLimitMinKey() {
return 'rate:min';
}
/**
* Get the current rate limit score for a given user.
*
* @param string Unique key identifying the user.
* @return float The user's current score.
* @task ratelimit
*/
private static function getRateLimitScore($user_identity) {
$min_key = self::getRateLimitMinKey();
// Identify the oldest bucket stored in APC.
$cur = self::getRateLimitBucket();
$min = apc_fetch($min_key);
// If we don't have any buckets stored yet, store the current bucket as
// the oldest bucket.
if (!$min) {
apc_store($min_key, $cur);
$min = $cur;
}
// Destroy any buckets that are older than the minimum bucket we're keeping
// track of. Under load this normally shouldn't do anything, but will clean
// up an old bucket once per minute.
$count = self::getRateLimitBucketCount();
for ($cursor = $min; $cursor < ($cur - $count); $cursor++) {
apc_delete(self::getRateLimitBucketKey($cursor));
apc_store($min_key, $cursor + 1);
}
// Now, sum up the user's scores in all of the active buckets.
$score = 0;
for (; $cursor <= $cur; $cursor++) {
$bucket = apc_fetch(self::getRateLimitBucketKey($cursor));
if (isset($bucket[$user_identity])) {
$score += $bucket[$user_identity];
}
}
return $score;
}
/**
* Emit an HTTP 429 "Too Many Requests" response (indicating that the user
* has exceeded application rate limits) and exit.
*
* @return exit This method **does not return**.
* @task ratelimit
*/
private static function didRateLimit() {
$message =
"TOO MANY REQUESTS\n".
"You are issuing too many requests too quickly.\n".
"To adjust limits, see \"Configuring a Preamble Script\" in the ".
"documentation.";
header(
'Content-Type: text/plain; charset=utf-8',
$replace = true,
$http_error = 429);
echo $message;
exit(1);
}
}

View file

@ -1,6 +1,14 @@
<?php
require_once dirname(dirname(__FILE__)).'/support/PhabricatorStartup.php';
$phabricator_root = dirname(dirname(__FILE__));
require_once $phabricator_root.'/support/PhabricatorStartup.php';
// If the preamble script exists, load it.
$preamble_path = $phabricator_root.'/support/preamble.php';
if (file_exists($preamble_path)) {
require_once $preamble_path;
}
PhabricatorStartup::didStartup();
$show_unexpected_traces = false;
@ -142,6 +150,23 @@ try {
));
DarkConsoleXHProfPluginAPI::saveProfilerSample($access_log);
// Add points to the rate limits for this request.
if (isset($_SERVER['REMOTE_ADDR'])) {
$user_ip = $_SERVER['REMOTE_ADDR'];
// The base score for a request allows users to make 30 requests per
// minute.
$score = (1000 / 30);
// If the user was logged in, let them make more requests.
if ($request->getUser() && $request->getUser()->getPHID()) {
$score = $score / 5;
}
PhabricatorStartup::addRateLimitScore($user_ip, $score);
}
} catch (Exception $ex) {
PhabricatorStartup::didEncounterFatalException(
'Core Exception',