Summary:
Depends on D18828. Ref T7789. See <https://discourse.phabricator-community.org/t/git-lfs-fails-with-large-images/584>.
Currently, when you upload a large (>4MB) image, we may try to assess the dimensions for the image and for each individual chunk.
At best, this is slow and not useful. At worst, it fatals or consumes a ton of memory and I/O we don't need to be using.
Instead:
- Don't try to assess dimensions for chunked files.
- Don't try to assess dimensions for the chunks themselves.
- Squelch errors for bad data, etc., that `gd` can't actually read, since we recover sensibly.
Test Plan:
- Created a 2048x2048 PNG in Photoshop using the "Random Noise" filter which weighs 8.5MB.
- Uploaded it.
- Before patch: got complaints in log about `imagecreatefromstring()` failing, although the actual upload went OK in my environment.
- After patch: clean log, no attempt to detect the size of a big image.
- Also uploaded a small image, got dimensions detected properly still.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T7789
Differential Revision: https://secure.phabricator.com/D18830
Summary:
Ref T12857. This is generally fairly fuzzy for now, but here's something concrete: when we build a large file with `diffusion.filecontentquery`, we compute the MIME type of all chunks, not just the initial chunk.
Instead, pass a dummy MIME type to non-initial chunks so we don't try to compute them. This mirrors logic elsewhere, in `file.uploadchunk`. This should perhaps be centralized at some point, but it's a bit tricky since the file doesn't know that it's a chunk until later.
Also, clean up the `TempFile` immediately -- this shouldn't actually affect anything, but we don't need it to live any longer than this.
Test Plan:
- Made `hashFileContent()` return `null` to skip the chunk cache.
- Added `phlog()` to the MIME type computation.
- Loaded a 12MB file in Diffusion.
- Before patch: Saw 3x MIME type computations, one for each 4MB chunk.
- After patch: Saw 1x MIME type computation, for initial chunk only.
Reviewers: chad, amckinley
Reviewed By: chad
Maniphest Tasks: T12857
Differential Revision: https://secure.phabricator.com/D18138
Summary: Fixes T12828. This API was tightened up D17616, but I missed this callsite.
Test Plan: I've just got a good feeling in my bones.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12828
Differential Revision: https://secure.phabricator.com/D18119
Summary: Ensures that newly-made `File` objects get indexed into the new ngrams index. Fixes T8788.
Test Plan:
- uploaded a file with daemons stopped; confirmed no new rows in ngrams table
- started daemons; confirmed indexing of previously-uploaded files happened
- uploaded a new file with daemons running; confirmed it got added to the index
Not sure how to test the changes to `PhabricatorFileUploadSource->writeChunkedFile()` and `PhabricatorChunkedFileStorageEngine->allocateChunks()`. I spent a few minutes trying to find their callers, but the first looks like it requires a Diffusion repo and the 2nd is only accessible via Conduit. I can test that stuff if necessary, but it's such a small change that I'm not worried about it.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Maniphest Tasks: T8788
Differential Revision: https://secure.phabricator.com/D17718
Summary: Fixes T12536. Nothing reads this parameter; `PhabricatorFile::newChunkedFile` sets the `isPartial` flag automatically.
Test Plan: Grepped for `isPartial`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12536
Differential Revision: https://secure.phabricator.com/D17654
Summary:
Ref T11357. When creating a file, callers can currently specify a `ttl`. However, it isn't unambiguous what you're supposed to pass, and some callers get it wrong.
For example, to mean "this file expires in 60 minutes", you might pass either of these:
- `time() + phutil_units('60 minutes in seconds')`
- `phutil_units('60 minutes in seconds')`
The former means "60 minutes from now". The latter means "1 AM, January 1, 1970". In practice, because the GC normally runs only once every four hours (at least, until recently), and all the bad TTLs are cases where files are normally accessed immediately, these 1970 TTLs didn't cause any real problems.
Split `ttl` into `ttl.relative` and `ttl.absolute`, and make sure the values are sane. Then correct all callers, and simplify out the `time()` calls where possible to make switching to `PhabricatorTime` easier.
Test Plan:
- Generated an SSH keypair.
- Viewed a changeset.
- Viewed a raw diff.
- Viewed a commit's file data.
- Viewed a temporary file's details, saw expiration date and relative time.
- Ran unit tests.
- (Didn't really test Phragment.)
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que
Maniphest Tasks: T11357
Differential Revision: https://secure.phabricator.com/D17616
Summary:
Ref T7789. Ref T10604. This implements the `upload` action, which streams file data into Files.
This makes Git LFS actually work, at least roughly.
Test Plan:
- Tracked files in an LFS repository.
- Pushed LFS data (`git lfs track '*.png'; git add something.png; git commit -m ...; git push`).
- Pulled LFS data (`git checkout master^; rm -rf .git/lfs; git checkout master; open something.png`).
- Verified LFS refs show up in the gitlfsref table.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T7789, T10604
Differential Revision: https://secure.phabricator.com/D15492
Summary: Fixes T10273. The threshold is `null` if no chunk engines are available, but the code didn't handle this properly.
Test Plan: Disabled all chunk engines, reloaded, hit issue described in task. Applied patch, got clean file content.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10273
Differential Revision: https://secure.phabricator.com/D15179
Summary:
Fixes T10186. After D14970, `diffusion.filecontentquery` puts the content in a file and returns the file PHID.
However, it does this in a way that doesn't go through the chunking engine, so it will fail for files larger than the chunk threshold (generally, 8MB).
Instead, stream the file from the underlying command directly into chunked storage.
Test Plan:
- Made a commit including a really big file: 4dcd4c492b
- Used `diffusion.filecontentquery` to load file content.
- Parsed/imported commit locally.
- Used `diffusion.filecontentquery` to load content for smaller files (README, etc).
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10186
Differential Revision: https://secure.phabricator.com/D15072