phorge-phorge/src/applications/files/query/PhabricatorFileChunkQuery.php

<?php

final class PhabricatorFileChunkQuery
  extends PhabricatorCursorPagedPolicyAwareQuery {

  private $chunkHandles;
  private $rangeStart;
  private $rangeEnd;
  private $isComplete;
  private $needDataFiles;

  public function withChunkHandles(array $handles) {
    $this->chunkHandles = $handles;
    return $this;
  }

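  /**
   * Select chunks whose byte range overlaps the requested range: chunks
   * with `byteEnd > $start` and `byteStart < $end`.
   */
  public function withByteRange($start, $end) {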
    $this->rangeStart = $start;
    $this->rangeEnd = $end;
    return $this;
  }

  public function withIsComplete($complete) {
    $this->isComplete = $complete;
    return $this;
  }

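  /**
   * When enabled, attach the data file to each chunk. Chunks with no data
   * yet get `null` attached; chunks whose data file can not be loaded are
   * dropped from the results.
   */
  public function needDataFiles($need) {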
    $this->needDataFiles = $need;
    return $this;
  }

  protected function loadPage() {
    $table = new PhabricatorFileChunk();
    $conn_r = $table->establishConnection('r');

    $data = queryfx_all(
      $conn_r,
      'SELECT * FROM %T %Q %Q %Q',
      $table->getTableName(),
      $this->buildWhereClause($conn_r),
      $this->buildOrderClause($conn_r),
      $this->buildLimitClause($conn_r));

    return $table->loadAllFromArray($data);
  }

  protected function willFilterPage(array $chunks) {

    if ($this->needDataFiles) {
      $file_phids = mpull($chunks, 'getDataFilePHID');
      $file_phids = array_filter($file_phids);
      if ($file_phids) {
        $files = id(new PhabricatorFileQuery())
          ->setViewer($this->getViewer())
          ->setParentQuery($this)
          ->withPHIDs($file_phids)
          ->execute();
        $files = mpull($files, null, 'getPHID');
      } else {
        $files = array();
      }

      foreach ($chunks as $key => $chunk) {
        $data_phid = $chunk->getDataFilePHID();
        if (!$data_phid) {
          $chunk->attachDataFile(null);
          continue;
        }

        $file = idx($files, $data_phid);
        // The data file is missing or not visible to the viewer, so drop
        // this chunk from the results.
        if (!$file) {
          unset($chunks[$key]);
          $this->didRejectResult($chunk);
          continue;
        }

        $chunk->attachDataFile($file);
      }
    }

    return $chunks;
  }

  private function buildWhereClause(AphrontDatabaseConnection $conn_r) {
    $where = array();

    if ($this->chunkHandles !== null) {
      $where[] = qsprintf(
        $conn_r,
        'chunkHandle IN (%Ls)',
        $this->chunkHandles);
    }

    if ($this->rangeStart !== null) {
      $where[] = qsprintf(
        $conn_r,
        'byteEnd > %d',
        $this->rangeStart);
    }

    if ($this->rangeEnd !== null) {
      $where[] = qsprintf(
        $conn_r,
        'byteStart < %d',
        $this->rangeEnd);
    }

    if ($this->isComplete !== null) {
      if ($this->isComplete) {
        $where[] = qsprintf(
          $conn_r,
          'dataFilePHID IS NOT NULL');
      } else {
        $where[] = qsprintf(
          $conn_r,
          'dataFilePHID IS NULL');
      }
    }

    $where[] = $this->buildPagingClause($conn_r);

    return $this->formatWhereClause($where);
  }

  public function getQueryApplicationClass() {
    return 'PhabricatorFilesApplication';
  }

}
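
The two byte-range conditions in `buildWhereClause()` together act as an interval-overlap test. The following is a minimal sketch of that test in plain PHP, assuming both the chunk range and the requested range are half-open (`[byteStart, byteEnd)` and `[start, end)`); the helper name `chunk_overlaps_range` and the sample chunks are hypothetical and not part of Phorge.

```php
<?php

// Hypothetical helper, not part of Phorge: mirrors the two range conditions
// ("byteEnd > %d" and "byteStart < %d") from buildWhereClause().
// Assumes both ranges are half-open: [byteStart, byteEnd) and [start, end).
function chunk_overlaps_range($byte_start, $byte_end, $start, $end) {
  return ($byte_end > $start) && ($byte_start < $end);
}

// Three illustrative 32-byte chunks covering bytes 0 through 95.
$chunks = array(
  array('byteStart' => 0, 'byteEnd' => 32),
  array('byteStart' => 32, 'byteEnd' => 64),
  array('byteStart' => 64, 'byteEnd' => 96),
);

// Select the chunks needed to serve the requested range [40, 70).
$selected = array();
foreach ($chunks as $chunk) {
  if (chunk_overlaps_range($chunk['byteStart'], $chunk['byteEnd'], 40, 70)) {
    $selected[] = $chunk;
  }
}

foreach ($selected as $chunk) {
  echo $chunk['byteStart'].'-'.$chunk['byteEnd']."\n";
}
```

Under these assumptions the first chunk is skipped because it ends before byte 40, while the second and third both intersect the request. A chunk that merely touches the boundary (for example `[0, 32)` against a request starting at 32) is not selected, because both comparisons are strict.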
Add a chunking storage engine for files

Summary: Ref T7149. This isn't complete and isn't active yet, but does basically work. I'll shore it up in the next few diffs.

The new workflow goes like this:

> Client, file.allocate(): I'd like to upload a file with length L, metadata M, and hash H.

Then the server returns `upload` (a boolean) and `filePHID` (a PHID). These mean:

| upload | filePHID | means |
|---|---|---|
| false | false | Server can't accept file. |
| false | true | File data already known, file created from hash. |
| true | false | Just upload normally. |
| true | true | Query chunks to start or resume a chunked upload. |

All but the last case are uninteresting and work like existing uploads with `file.uploadhash` (which we can eventually deprecate). In the last case:

> Client, file.querychunks(): Give me a list of chunks that I should upload.

This returns all the chunks for the file. Chunks have a start byte, an end byte, and a "complete" flag to indicate that the server already has the data.

Then, the client fills in chunks by sending them:

> Client, file.uploadchunk(): Here is the data for one chunk.

This stuff doesn't work yet or has some caveats:

- I haven't tested resume much.
- Files need an "isPartial()" flag for partial uploads, and the UI needs to respect it.
- The JS client needs to become chunk-aware.
- Chunk size is set crazy low to make testing easier.
- Some debugging flags that I'll remove soon-ish.
- Downloading works, but still streams the whole file into memory.
- This storage engine is disabled by default (hardcoded as a unit test engine) because it's still sketchy.
- Need some code to remove the "isPartial" flag when the last chunk is uploaded.
- Maybe do checksumming on chunks.

Test Plan:

- Hacked up `arc upload` (see next diff) to be chunk-aware and uploaded a readme in 18 32-byte chunks. Then downloaded it. Got the same file back that I uploaded.
- File UI now shows some basic chunk info for chunked files: {F336434}

Reviewers: btrahan
Reviewed By: btrahan
Subscribers: joshuaspence, epriestley
Maniphest Tasks: T7149
Differential Revision: https://secure.phabricator.com/D12060
2015-03-13 11:30:02 -07:00
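
The four `upload` / `filePHID` combinations in the table above can be read as a small client-side decision. This is only a sketch in plain PHP: the function name `choose_upload_action`, the returned action labels, the use of `null` to represent a missing PHID, and the sample PHID string are all assumptions made for illustration, not the actual `arc upload` logic.

```php
<?php

// Hypothetical helper: maps the two fields of a file.allocate response to
// the client action described in the table. $file_phid is null when the
// server returned no PHID (an assumption about how "false" is represented).
function choose_upload_action($upload, $file_phid) {
  $has_file = ($file_phid !== null);

  if (!$upload && !$has_file) {
    // Server can't accept the file.
    return 'reject';
  }

  if (!$upload && $has_file) {
    // File data already known; the file was created from the hash.
    return 'done';
  }

  if ($upload && !$has_file) {
    // Just upload normally, in a single request.
    return 'upload-whole';
  }

  // Query chunks to start or resume a chunked upload.
  return 'upload-chunks';
}

// Illustrative PHID value only.
echo choose_upload_action(true, 'PHID-FILE-example')."\n";
```

Only the last branch leads into the `file.querychunks` and `file.uploadchunk` calls; the other three need no chunk handling at all.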

Add support for partially uploaded files

Summary: Ref T7149. This flags allocated but incomplete files and doesn't explode when trying to download them. Files are marked complete when the last chunk is uploaded.

I added a key on `<authorPHID, isPartial>` so we can show you a list of partially uploaded files and prompt you to resume them at some point down the road.

Test Plan: Massaged debugging settings and uploaded README.md very slowly in 32b chunks. Saw the file lose its "Partial" flag when the last chunk finished.

Reviewers: btrahan
Reviewed By: btrahan
Subscribers: joshuaspence, epriestley
Maniphest Tasks: T7149
Differential Revision: https://secure.phabricator.com/D12063
2015-03-13 11:30:24 -07:00
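
One way to picture the relationship between chunk completeness and the "Partial" flag is the sketch below. It assumes chunk records shaped like the description of `file.querychunks` (a start byte, an end byte, and a "complete" flag); the array keys and both helper names are hypothetical, and the real implementation clears a stored flag on the file rather than recomputing it from the chunks.

```php
<?php

// Hypothetical chunk records: a start byte, an end byte, and a "complete"
// flag. The key names are assumptions, not the real Conduit response.
$chunks = array(
  array('byteStart' => 0, 'byteEnd' => 32, 'complete' => true),
  array('byteStart' => 32, 'byteEnd' => 64, 'complete' => false),
  array('byteStart' => 64, 'byteEnd' => 70, 'complete' => false),
);

// Hypothetical helper: the chunks a resuming client still has to send.
function remaining_chunks(array $chunks) {
  $remaining = array();
  foreach ($chunks as $chunk) {
    if (!$chunk['complete']) {
      $remaining[] = $chunk;
    }
  }
  return $remaining;
}

// Hypothetical helper: a file stays partial until no chunk is missing.
function is_partial(array $chunks) {
  return (bool)remaining_chunks($chunks);
}

echo count(remaining_chunks($chunks))." chunk(s) left to upload\n";

// Simulate uploading every missing chunk.
foreach ($chunks as $key => $chunk) {
  $chunks[$key]['complete'] = true;
}

echo is_partial($chunks) ? "still partial\n" : "complete\n";
```

In `PhabricatorFileChunkQuery`, the corresponding server-side filter is `withIsComplete()`, which tests whether `dataFilePHID` is set on each chunk.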