phorge-phorge/src/applications/diffusion/protocol/DiffusionMercurialWireProtocol.php

<?php

final class DiffusionMercurialWireProtocol {

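  /**
   * Get the argument specification for a Mercurial wire protocol command.
   *
   * @param string $command Name of the wire command, like "heads".
   * @return list<string> Argument specification for the command.
   * @throws Exception If the command is not a known wire command.
   */
  public static function getCommandArgs($command) {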
    // We need to enumerate all of the Mercurial wire commands because the
    // argument encoding varies based on the command. "Why?", you might ask,
    // "Why would you do this?".

    $commands = array(
      'batch' => array('cmds', '*'),
      'between' => array('pairs'),
      'branchmap' => array(),
      'branches' => array('nodes'),
      'capabilities' => array(),
      'changegroup' => array('roots'),
      'changegroupsubset' => array('bases heads'),
      'debugwireargs' => array('one two *'),
      'getbundle' => array('*'),
      'heads' => array(),
      'hello' => array(),
      'known' => array('nodes', '*'),
      'listkeys' => array('namespace'),
      'lookup' => array('key'),
      'pushkey' => array('namespace', 'key', 'old', 'new'),
      'stream_out' => array(''),
      'unbundle' => array('heads'),
    );

    if (!isset($commands[$command])) {
      throw new Exception(pht("Unknown Mercurial command '%s'!", $command));
    }

    return $commands[$command];
  }

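  /**
   * Test if a single Mercurial wire protocol command is read-only.
   *
   * @param string $command Name of the wire command.
   * @return bool True if the command is known to be read-only.
   */
  public static function isReadOnlyCommand($command) {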
    $read_only = array(
      'between' => true,
      'branchmap' => true,
      'branches' => true,
      'capabilities' => true,
      'changegroup' => true,
      'changegroupsubset' => true,
      'debugwireargs' => true,
      'getbundle' => true,
      'heads' => true,
      'hello' => true,
      'known' => true,
      'listkeys' => true,
      'lookup' => true,
      'stream_out' => true,
    );

    // Notably, the write commands are "pushkey" and "unbundle". The
    // "batch" command is theoretically read-only, but it wraps other
    // commands, so it is omitted here and its payload must be analyzed
    // explicitly (see isReadOnlyBatchCommand()).

    return isset($read_only[$command]);
  }

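  /**
   * Test if every command in a "batch" payload is read-only.
   *
   * @param string $cmds Raw "cmds" argument of a "batch" command.
   * @return bool True if all of the batched commands are read-only.
   * @throws Exception If the "cmds" string is empty.
   */
  public static function isReadOnlyBatchCommand($cmds) {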
    if (!strlen($cmds)) {
      // We expect a "batch" command to always have a "cmds" string, so err
      // on the side of caution and throw if we don't get any data here. This
      // either indicates a mangled command from the client or a programming
      // error in our code.
      throw new Exception(pht("Expected nonempty '%s' specification!", 'cmds'));
    }

    // For "batch" we get a "cmds" argument like:
    //
    //   heads ;known nodes=
    //
    // We need to examine the commands (here, "heads" and "known") to make sure
    // they're all read-only.

    // NOTE: Mercurial has some code to escape semicolons, but it does not
    // actually function for command separation. For example, these two batch
    // commands will produce completely different results (the former will run
    // the lookup; the latter will fail with a parser error):
    //
    //  lookup key=a:xb;lookup key=z* 0
    //  lookup key=a:;b;lookup key=z* 0
    //               ^
    //               |
    //               +-- Note semicolon.
    //
    // So just split unconditionally.

    $cmds = explode(';', $cmds);
    foreach ($cmds as $sub_cmd) {
      $name = head(explode(' ', $sub_cmd, 2));
      if (!self::isReadOnlyCommand($name)) {
        return false;
      }
    }

    return true;
  }

}
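
The batch analysis in isReadOnlyBatchCommand() can be illustrated with a minimal standalone sketch. It is not part of the class: the function names `sketch_batch_command_names()` and `sketch_is_read_only_batch()` are hypothetical, the allowlist is only a subset of the one in isReadOnlyCommand(), and plain PHP stands in for the libphutil helpers `head()` and `pht()`. It assumes, as the comments in the class do, that splitting on every semicolon is the right behavior.

```php
<?php

// Hypothetical standalone sketch of the "batch" read-only check. It mirrors
// the logic of isReadOnlyBatchCommand() but does not depend on libphutil.

/**
 * Return the command names found in a raw "cmds" string, in order.
 */
function sketch_batch_command_names(string $cmds): array {
  $names = array();
  // Split unconditionally on ";", with no attempt to honor escaping.
  foreach (explode(';', $cmds) as $sub_cmd) {
    // The name is everything before the first space; arguments follow it.
    $parts = explode(' ', $sub_cmd, 2);
    $names[] = $parts[0];
  }
  return $names;
}

/**
 * Return true only if every command in the batch is in the allowlist.
 */
function sketch_is_read_only_batch(string $cmds, array $read_only): bool {
  if ($cmds === '') {
    // Fail closed: an empty payload is treated as an error, not as "safe".
    throw new InvalidArgumentException("Expected nonempty 'cmds' string.");
  }
  foreach (sketch_batch_command_names($cmds) as $name) {
    if (!isset($read_only[$name])) {
      return false;
    }
  }
  return true;
}

// Assumed subset of the read-only commands, for illustration only.
$read_only = array('heads' => true, 'known' => true, 'lookup' => true);

var_dump(sketch_is_read_only_batch('heads ;known nodes=', $read_only));
var_dump(sketch_is_read_only_batch('heads ;pushkey namespace=x', $read_only));
```

The first call prints `bool(true)` and the second prints `bool(false)`, because "pushkey" is not in the allowlist. Any name that is not explicitly allowed, including an empty name produced by a trailing semicolon, causes the whole batch to be treated as a write.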

Blame history (oldest first):

2013-11-07 03:00:42 +01:00: Allow Phabricator to serve Mercurial repositories over HTTP
Summary: Ref T2230. This is easily the worst thing I've had to write in a while. I'll leave some notes inline.
Test Plan: Ran `hg clone http://...` on a hosted repo. Ran `hg push` on the same. Changed sync'd both ways.
Reviewers: asherkin, btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7520
Covers: the class declaration, getCommandArgs(), and isReadOnlyCommand().

2013-11-11 21:18:27 +01:00: Enable Mercurial reads and writes over SSH
Summary: Ref T2230. This is substantially more complicated than Git, but mostly because Mercurial's protocol is a like 50 ad-hoc extensions cobbled together. Because we must decode protocol frames in order to determine if a request is read or write, 90% of this is implementing a stream parser for the protocol. Mercurial's own parser is simpler, but relies on blocking reads. Since we don't even have methods for blocking reads right now and keeping the whole thing non-blocking is conceptually better, I made the parser nonblocking. It ends up being a lot of stuff. I made an effort to cover it reasonably well with unit tests, and to make sure we fail closed (i.e., reject requests) if there are any parts of the protocol I got wrong. A lot of the complexity is sharable with the HTTP stuff, so it ends up being not-so-bad, just very hard to verify by inspection as clearly correct.
Test Plan:
- Ran `hg clone` over SSH.
- Ran `hg fetch` over SSH.
- Ran `hg push` over SSH, to a read-only repo (error) and a read-write repo (success).
Reviewers: btrahan, asherkin
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7553
Covers: isReadOnlyBatchCommand().

2015-05-22 09:27:56 +02:00: phtize all the things
Summary: `pht`ize a whole bunch of strings in rP.
Test Plan: Intense eyeballing.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: hach-que, Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D12797
Covers: the two `throw new Exception(pht(...))` lines.