Summary: Ref T13603. Migrate all code which interacts with the "ObjectHasFile" edge to use the "Attachments" table instead.
Test Plan:
- Edited a paste's view policy, confirmed associated file secret was scrambled.
- Verified I could still view paste content as a user who could not naturally view the underlying file.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13603
Differential Revision: https://secure.phabricator.com/D21819
Summary: Ref T13603. This adds a second write to the new "attachment" storage for all writers except one in Paste, which creates the file inline.
Test Plan:
- Updated a macro image, confirmed a write to "attachment" storage (transaction pathway).
- Updated a blog profile image, confirmed a write to "attachment" storage (legacy pathway).
Maniphest Tasks: T13603
Differential Revision: https://secure.phabricator.com/D21816
Summary:
Ref T13603. Currently, files are sometimes detached from objects. For example, when you change the image for a Macro, the old image is detached.
This is wrong: the image should remain attached so users who can view the macro can view the complete "alice changed the image from X to Y" transaction. To understand the change that was applied, you need to be able to view both files.
None of the workflows which currently detach files are consistent with how modern applications behave, except maybe one callsite in a unit test, and that one's kind of moot.
Get rid of this stuff and just use PHID extraction to perform file attachment in all cases.
Test Plan: Created and edited macros, verified files were properly attached and remained attached across edits.
Maniphest Tasks: T13603
Differential Revision: https://secure.phabricator.com/D21815
Summary: Ref T13603. Prepare to move this relationship out of edge storage into dedicated storage so it is easier to formalize in the UI.
Test Plan: Ran `bin/storage upgrade`.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13603
Differential Revision: https://secure.phabricator.com/D21813
Summary:
Ref T13629.
- Allow files to have custom alt text.
- If a file doesn't have alt text, try to generate a plausible default alt text with the information we have.
Test Plan:
- Viewed image files in DocumentEngine diffs, files, `{Fxxx}` embeds, and lightboxes.
- Saw default alt text in all cases, or custom alt text if provided.
- Set, modified, and removed file alt text. Viewed timeline and feed.
- Pulled alt text with "conduit.search".
Maniphest Tasks: T13629
Differential Revision: https://secure.phabricator.com/D21596
Summary:
Depends on D19919. Ref T11351. This method appeared in D8802 (note that "get...Object" was renamed to "get...Transaction" there, so this method was actually "new" even though a method of the same name had existed before).
The goal at the time was to let Harbormaster post build results to Diffs and have them end up on Revisions, but this eventually got a better implementation (see below) where the Harbormaster-specific code can just specify a "publishable object" where build results should go.
The new `get...Object` semantics ultimately broke some stuff, and the actual implementation in Differential was removed in D10911, so this method hasn't really served a purpose since December 2014. I think that removal broke the Harbormaster thing by accident and we just lived with it for a bit; then Harbormaster got some more work, and D17139 introduced "publishable" objects, which was a better approach. This was later refined by D19281.
So: the original problem (sending build results to the right place) has a good solution now, this method hasn't done anything for 4 years, and it was probably a bad idea in the first place since it's pretty weird/surprising/fragile.
Note that `Comment` objects still have an unrelated method with the same name. In that case, the method ties the `Comment` storage object to the related `Transaction` storage object.
Test Plan: Grepped for `getApplicationTransactionObject`, verified that all remaining callsites are related to `Comment` objects.
Reviewers: amckinley
Reviewed By: amckinley
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T11351
Differential Revision: https://secure.phabricator.com/D19920
Summary:
Depends on D19918. Ref T11351. In D19918, I removed all calls to this method. Now, remove all implementations.
All of these implementations just `return $timeline`; only the three sites in D19918 did anything interesting.
Test Plan: Used `grep willRenderTimeline` to find callsites, found none.
Reviewers: amckinley
Reviewed By: amckinley
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T11351
Differential Revision: https://secure.phabricator.com/D19919
Summary:
Ref T13164. See PHI766. Currently, when file data is stored in small chunks, we submit each chunk to the indexing engine.
However, chunks are never surfaced directly and can never be found via any search/query, so this work is pointless. Just skip it.
(It would be nice to do this a little more formally on `IndexableInterface` or similar as `isThisAnIndexableObject()`, but we'd have to add like a million empty "yes, index this always" methods to do that, and it seems unlikely that we'll end up with too many other objects like these.)
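For the record, the "more formal" version might look something like this sketch. Everything here is hypothetical, not existing Phabricator API:
```php
<?php

// Hypothetical sketch only: neither this interface method nor this class
// exists in Phabricator. Objects would opt in or out of indexing
// explicitly, and chunks (which no search can ever surface) would opt out.
interface HypotheticalIndexableInterface {
  public function isIndexableObject();
}

final class ExampleFileChunk implements HypotheticalIndexableInterface {
  public function isIndexableObject() {
    // Chunks are never surfaced directly, so indexing them is wasted work.
    return false;
  }
}
```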
Test Plan:
- Ran `bin/harbormaster rebuild-log --id ... --force` before and after change, saw about 200 fewer queries after the change.
- Uploaded a uniquely named file and searched for it to make sure I didn't break that.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13164
Differential Revision: https://secure.phabricator.com/D19563
Summary:
Depends on D19251. Ref T13105. This adds rendering engine support for PDFs.
It doesn't actually render them, it just renders a link which you can click to view them in a new window. This is much easier than actually rendering them inline and at least 95% as good most of the time (and probably more-than-100%-as-good some of the time).
This makes PDF a viewable MIME type by default and adds a narrow CSP exception for it. See also T13112.
Test Plan:
- Viewed PDFs in Files, got a link to view them in a new tab.
- Clicked the link in Safari, Chrome, and Firefox; got inline PDFs.
- Verified primary CSP is still `object-src 'none'` with `curl ...`.
- Interacted with the vanilla lightbox element to check that it still works.
Maniphest Tasks: T13105
Differential Revision: https://secure.phabricator.com/D19252
Summary: Ref T13103. Make favicons customizable, and perform dynamic compositing to add markers indicating things like "unread messages".
Test Plan: Viewed favicons in Safari, Firefox, and Chrome. With unread messages, saw a pink dot composited into the icon.
Maniphest Tasks: T13103
Differential Revision: https://secure.phabricator.com/D19209
Summary:
Depends on D19155. Ref T13094. Ref T4340.
We can't currently implement a strict `form-action 'self'` content security policy because some file downloads rely on a `<form />` which sometimes POSTs to the CDN domain.
Broadly, stop generating these forms. We just redirect instead, and show an interstitial confirm dialog if no CDN domain is configured. This makes the UX for installs with no CDN domain a little worse and the UX for everyone else better.
Then, implement the stricter Content-Security-Policy.
This also removes extra confirm dialogs for downloading Harbormaster build logs and data exports.
Test Plan:
- Went through the plain data export, data export with bulk jobs, ssh key generation, calendar ICS download, Diffusion data, Paste data, Harbormaster log data, and normal file data download workflows with a CDN domain.
- Went through all those workflows again without a CDN domain.
- Grepped for affected symbols (`getCDNURI()`, `getDownloadURI()`).
- Added an evil form to a page, tried to submit it, was rejected.
- Went through the ReCaptcha and Stripe flows again to see if they're submitting any forms.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13094, T4340
Differential Revision: https://secure.phabricator.com/D19156
Summary:
Depends on D18952. Ref T13049. For files larger than 8MB, we need to engage the chunk storage engine. `PhabricatorFile::newFromFileData()` always writes a single chunk, and can't handle files larger than the mandatory chunk threshold (8MB).
Use `IteratorUploadSource`, which can handle them, and "stream" the data into it. This should raise the limit from 8MB to 2GB (the maximum size of a string in PHP).
If we need to go above 2GB we could stream CSV and text pretty easily, and JSON without too much trouble, but Excel might be trickier. Hopefully no one is trying to export 2GB+ data files, though.
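As a rough sketch of the "streaming" idea (the generator below is illustrative, not the exact API): the upload source consumes an iterator, so export data can be written to chunk storage a few megabytes at a time instead of being handed over as one giant string.
```php
<?php

// Illustrative generator: yields the data 4MB at a time so the chunk
// engine never has to see the whole export at once.
function newChunkIterator($data, $chunk_size = 4194304) {
  $length = strlen($data);
  for ($offset = 0; $offset < $length; $offset += $chunk_size) {
    yield substr($data, $offset, $chunk_size);
  }
}

// foreach (newChunkIterator($export_data) as $chunk) { /* write chunk */ }
```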
Test Plan:
- Changed the JSON exporter to just export 8MB of the letter "q": `return str_repeat('q', 1024 * 1024 * 9);`.
- Before change: fatal, "no storage engine can store this file".
- After change: export works cleanly.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13049
Differential Revision: https://secure.phabricator.com/D18953
Summary:
Depends on D18828. Ref T7789. See <https://discourse.phabricator-community.org/t/git-lfs-fails-with-large-images/584>.
Currently, when you upload a large (>4MB) image, we may try to assess the dimensions for the image and for each individual chunk.
At best, this is slow and not useful. At worst, it fatals or consumes a ton of memory and I/O we don't need to be using.
Instead:
- Don't try to assess dimensions for chunked files.
- Don't try to assess dimensions for the chunks themselves.
- Squelch errors for bad data, etc., that `gd` can't actually read, since we recover sensibly.
Test Plan:
- Created a 2048x2048 PNG in Photoshop using the "Random Noise" filter; it weighs in at 8.5MB.
- Uploaded it.
- Before patch: got complaints in log about `imagecreatefromstring()` failing, although the actual upload went OK in my environment.
- After patch: clean log, no attempt to detect the size of a big image.
- Also uploaded a small image, got dimensions detected properly still.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T7789
Differential Revision: https://secure.phabricator.com/D18830
Summary: See PHI176. Depends on D18733. When deleting files, we currently issue a query that doesn't hit any keys.
Test Plan:
Ran `./bin/remove destroy --force --trace F56376` to get the query.
Ran ##SELECT * FROM `file` WHERE storageEngine = 'blob' AND storageHandle = '23366' LIMIT 1## before and after the change.
Before:
```
mysql> explain SELECT * FROM `file` WHERE storageEngine = 'blob' AND storageHandle = '23366' LIMIT 1;
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1  | SIMPLE      | file  | ALL  | NULL          | NULL | NULL    | NULL | 33866 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)
```
After:
```
mysql> explain SELECT * FROM `file` WHERE storageEngine = 'blob' AND storageHandle = '23366' LIMIT 1;
+----+-------------+-------+------+---------------+------------+---------+-------------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------------+---------+-------------+------+------------------------------------+
| 1  | SIMPLE      | file  | ref  | key_engine    | key_engine | 388     | const,const | 190  | Using index condition; Using where |
+----+-------------+-------+------+---------------+------------+---------+-------------+------+------------------------------------+
1 row in set (0.00 sec)
```
Reviewers: amckinley
Reviewed By: amckinley
Differential Revision: https://secure.phabricator.com/D18734
Summary:
Ref T13008. Depends on D18701. The overall goal here is to make turning `enable_post_data_reading` off not break things, so we can run rate limiting checks before we read file uploads.
The biggest blocker for this is that turning it off stops `$_FILES` from coming into existence.
This //appears// to mostly work. Specifically:
- Skip the `max_post_size` check when POST is off, since it's meaningless.
- Don't read or scrub `$_POST` at startup when POST is off.
- When we rebuild `$_REQUEST` and `$_POST` before processing requests, do multipart parsing if we need to, and rebuild `$_FILES`.
- Skip the `is_uploaded_file()` check if we built `$_FILES` ourselves.
This probably breaks a couple of small things, like maybe `__profile__` and other DarkConsole triggers over POST, and probably some other weird stuff. The parsers may also need more work than they've received so far.
I also need to verify that this actually works (i.e., lets us run code without reading the request body) but I'll include that in the change where I update the actual rate limiting.
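For context, this is a minimal sketch of the underlying PHP mechanics, not the actual parser in this change:
```php
<?php

// With "enable_post_data_reading = Off", PHP does not populate $_POST or
// $_FILES, but the raw request body remains readable from "php://input".
// That lets cheap checks (like rate limiting) run before we touch a
// possibly-huge upload, after which we parse the body ourselves.
if (!ini_get('enable_post_data_reading')) {
  // Rate limiting would run here, before the body is read.
  $raw_body = file_get_contents('php://input');
  // ...then parse $raw_body as "multipart/form-data" to rebuild $_FILES.
}
```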
Test Plan:
- Disabled `enable_post_data_reading`.
- Uploaded a file with a vanilla upload form (project profile image).
- Uploaded a file with drag and drop.
- Used DarkConsole.
- Submitted comments.
- Created a task.
- Browsed around.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13008
Differential Revision: https://secure.phabricator.com/D18702
Summary:
Ref T12857. This is generally fairly fuzzy for now, but here's something concrete: when we build a large file with `diffusion.filecontentquery`, we compute the MIME type of all chunks, not just the initial chunk.
Instead, pass a dummy MIME type to non-initial chunks so we don't try to compute them. This mirrors logic elsewhere, in `file.uploadchunk`. This should perhaps be centralized at some point, but it's a bit tricky since the file doesn't know that it's a chunk until later.
Also, clean up the `TempFile` immediately -- this shouldn't actually affect anything, but we don't need it to live any longer than this.
Test Plan:
- Made `hashFileContent()` return `null` to skip the chunk cache.
- Added `phlog()` to the MIME type computation.
- Loaded a 12MB file in Diffusion.
- Before patch: Saw 3x MIME type computations, one for each 4MB chunk.
- After patch: Saw 1x MIME type computation, for initial chunk only.
Reviewers: chad, amckinley
Reviewed By: chad
Maniphest Tasks: T12857
Differential Revision: https://secure.phabricator.com/D18138
Summary: Fixes T12587. Adds a new `PhabricatorFileDeleteTransaction` that enqueues `File` delete tasks.
Test Plan:
- hack `PhabricatorFileQuery` to ignore isDeleted state
- stop daemons
- upload a file, delete it from the UI
- check that the DB has updated isDeleted = 1
- check timeline rendering in `File` detail view
- start daemons
- confirm rows are deleted from DB
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin, thoughtpolice
Maniphest Tasks: T12587
Differential Revision: https://secure.phabricator.com/D17723
Summary:
Deletion is a possibly time-intensive process, especially with large files that are backed by high-latency, chunked storage (such as S3). Even ~200MB objects take minutes to delete, which makes for an unhappy experience. Fixes T10828.
Test Plan: Deleted a large file, and stared in awe of the swiftness with which I was redirected to the main file application.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: thoughtpolice, Korvin
Maniphest Tasks: T10828
Differential Revision: https://secure.phabricator.com/D15743
Summary: Ensures that newly-made `File` objects get indexed into the new ngrams index. Fixes T8788.
Test Plan:
- uploaded a file with daemons stopped; confirmed no new rows in ngrams table
- started daemons; confirmed indexing of previously-uploaded files happened
- uploaded a new file with daemons running; confirmed it got added to the index
Not sure how to test the changes to `PhabricatorFileUploadSource->writeChunkedFile()` and `PhabricatorChunkedFileStorageEngine->allocateChunks()`. I spent a few minutes trying to find their callers, but the first looks like it requires a Diffusion repo and the 2nd is only accessible via Conduit. I can test that stuff if necessary, but it's such a small change that I'm not worried about it.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Maniphest Tasks: T8788
Differential Revision: https://secure.phabricator.com/D17718
Summary: Follows the outline in D15656 for implementing ngram search for names of File objects. Also created FileFullTextEngine, because without implementing `PhabricatorFulltextInterface`, `./bin/search` complains that `File` is not an indexable type.
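For anyone unfamiliar with the approach, here's a toy sketch of ngram extraction. It's illustrative only; the real implementation lives in the shared ngrams infrastructure from D15656:
```php
<?php

// Toy ngram extraction: every 3-character substring of the padded,
// lowercased filename gets indexed, so substring searches become index
// lookups instead of table scans.
function extractNgrams($filename) {
  $ngrams = array();
  $padded = ' '.strtolower($filename).' ';
  $length = strlen($padded);
  for ($ii = 0; $ii <= $length - 3; $ii++) {
    $ngrams[substr($padded, $ii, 3)] = true;
  }
  return array_keys($ngrams);
}

// extractNgrams('photo.png') => ' ph', 'pho', 'hot', 'oto', 'to.', ...
```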
Test Plan:
- ran `./bin/storage upgrade` to apply the schema change
- confirmed the presence of a new `file_filename_ngrams` table
- added a couple file objects
- ran `bin/search index --type file --force`
- confirmed the presence of rows in `file_filename_ngrams`
- did a few keyword searches and saw expected results
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Maniphest Tasks: T8788
Differential Revision: https://secure.phabricator.com/D17702
Summary:
Fixes T12531. Strictness fallout from adding typechecking in D17616.
- `chunkedHash` is not a real parameter, so the new typechecking was unhappy about it.
- `mime-type` no longer allows `null`.
Test Plan:
- Ran `arc upload --conduit-uri ... 12MB.zero` on a 12MB file full of zeroes.
- Before patch: badness, failure, fallback to one-shot uploads.
- After patch: success and glory.
Reviewers: chad
Subscribers: joshuaspence
Maniphest Tasks: T12531
Differential Revision: https://secure.phabricator.com/D17651
Summary:
Ref T12470. Provides an "integrity" utility which runs in these modes:
- Verify: check that hashes match.
- Compute: backfill missing hashes.
- Strip: remove hashes. Useful for upgrading across a hash change.
- Corrupt: intentionally corrupt hashes. Useful for debugging.
- Overwrite: force hash recomputation.
Users normally shouldn't need to run any of this stuff, but this provides a reasonable toolkit for managing integrity hashes.
I'll recommend existing installs use `bin/files integrity --compute all` in the upgrade guidance to backfill hashes for existing files.
Test Plan:
- Ran the script in many modes against various files, saw expected operation, including:
- Verified a file, corrupted it, saw it fail.
- Verified a file, stripped it, saw it have no hash.
- Stripped a file, computed it, got a clean verify.
- Stripped a file, overwrote it, got a clean verify.
- Corrupted a file, overwrote it, got a clean verify.
- Overwrote a file, overwrote again, got a no-op.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12470
Differential Revision: https://secure.phabricator.com/D17629
Summary:
Ref T12470. This helps defuse attacks where an adversary can directly take control of whatever storage engine files are being stored in and change data there. These attacks would require a significant level of access.
Such attackers could potentially attack ranges of AES-256-CBC encrypted files by using Phabricator as a decryption oracle if they were also able to compromise a Phabricator account with read access to the files.
By storing a hash of the data (and, in the case of AES-256-CBC files, the IV) when we write files, and verifying it before we decrypt or read them, we can detect and prevent this kind of tampering.
This also helps detect mundane corruption and integrity issues.
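Conceptually, the discipline looks like this sketch (the actual hash construction in the patch differs in detail):
```php
<?php

// Store a digest over the IV and ciphertext at write time, and refuse to
// decrypt anything whose digest no longer matches.
function computeIntegrityHash($iv, $ciphertext) {
  return hash('sha256', $iv.$ciphertext);
}

function verifyBeforeRead($iv, $ciphertext, $expected_hash) {
  if (!hash_equals($expected_hash, computeIntegrityHash($iv, $ciphertext))) {
    throw new Exception('Integrity hash mismatch: data may be tampered with.');
  }
  // Only now is it safe to hand the ciphertext to the decryption routine.
}
```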
Test Plan:
- Added unit tests.
- Uploaded new files, saw them get integrity hashes.
- Manually corrupted file data, saw it fail. Used `bin/files cat --salvage` to read it anyway.
- Tampered with IVs, saw integrity failures.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12470
Differential Revision: https://secure.phabricator.com/D17625
Summary:
Fixes T12079. Currently, when a file is encrypted and a request has "Content-Range", we apply the range first, //then// decrypt the result. This doesn't work since you can't start decrypting something from somewhere in the middle (at least, not with our cipher selection).
Instead: decrypt the result, //then// apply the range.
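A minimal sketch of the corrected order, assuming AES-256-CBC and illustrative names:
```php
<?php

// Slice after decrypting, never before: CBC decryption can't start from
// an arbitrary midpoint of the ciphertext.
function readDecryptedRange($ciphertext, $key, $iv, $offset, $length) {
  $plaintext = openssl_decrypt(
    $ciphertext,
    'aes-256-cbc',
    $key,
    OPENSSL_RAW_DATA,
    $iv);
  return substr($plaintext, $offset, $length);
}
```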
Test Plan: Added failing unit tests, made them pass
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12079
Differential Revision: https://secure.phabricator.com/D17623
Summary:
Ref T12464. This defuses any possible SHA1-collision attacks by using SHA256, for which there is no known collision.
(SHA256 hashes are larger -- 256 bits -- so expand the storage column to 64 bytes to hold them.)
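As a toy sketch of the dedup idea (an in-memory array stands in for the real storage engine):
```php
<?php

// Identical bytes hash identically, so the second upload shares the first
// upload's blob while still getting its own File record (and policies).
$blobs = array();

function storeBlob(array &$blobs, $data) {
  $hash = hash('sha256', $data);
  if (!isset($blobs[$hash])) {
    $blobs[$hash] = $data; // First copy: actually write the bytes.
  }
  return $hash; // Both File records reference the same storage handle.
}

$a = storeBlob($blobs, 'same content');
$b = storeBlob($blobs, 'same content');
// $a === $b, and only one copy of the bytes is stored.
```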
Test Plan:
- Uploaded the same file twice, saw the two files generate the same SHA256 content hash and use the same underlying data.
- Tried with a fake hash algorithm ("quackxyz") to make sure the failure mode worked/degraded correctly if we don't have SHA256 for some reason. Got two valid files with two copies of the same data, as expected.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12464
Differential Revision: https://secure.phabricator.com/D17620
Summary:
Ref T12464. We currently use SHA1 to detect when two files have the same content so we don't have to store two copies of the data.
Now that a SHA1 collision is known, this is theoretically dangerous. T12464 describes the shape of a possible attack.
Before replacing this with something more robust, shore it up so everything works correctly if we don't hash at all. This mechanism is entirely optional; it only helps us store less data when some files are duplicates.
(This mechanism is also less important now than it once was, before we added temporary files.)
Test Plan: Uploaded multiple identical files, saw the uploads work and the files store separate copies of the same data.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12464
Differential Revision: https://secure.phabricator.com/D17619
Summary:
Ref T12464. This is a very old method which can return an existing file instead of creating a new one, if there's some existing file with the same content.
In the best case this is a bad idea: whatever reasonableness it once had predates policies, temporary files, etc. Modern methods like `newFromFileData()` do this right: they share underlying data in storage, but not the actual `File` records.
Specifically, this is the case where we get into trouble:
- I upload a private file with content "X".
- You somehow generate a file with the same content by, say, viewing a raw diff in Differential.
- If the diff had the same content, you get my file, but you don't have permission to see it or whatever so everything breaks and is terrible.
Just get rid of this.
Test Plan:
- Generated an SSH key.
- Viewed a raw diff in Differential.
- (Did not test Phragment.)
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que
Maniphest Tasks: T12464
Differential Revision: https://secure.phabricator.com/D17617
Summary:
Ref T11357. When creating a file, callers can currently specify a `ttl`. However, it's ambiguous what you're supposed to pass, and some callers get it wrong.
For example, to mean "this file expires in 60 minutes", you might pass either of these:
- `time() + phutil_units('60 minutes in seconds')`
- `phutil_units('60 minutes in seconds')`
The former means "60 minutes from now". The latter means "1 AM, January 1, 1970". In practice, because the GC normally runs only once every four hours (at least, until recently), and all the bad TTLs are cases where files are normally accessed immediately, these 1970 TTLs didn't cause any real problems.
Split `ttl` into `ttl.relative` and `ttl.absolute`, and make sure the values are sane. Then correct all callers, and simplify out the `time()` calls where possible to make switching to `PhabricatorTime` easier.
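Roughly, the resolution logic looks like this sketch (validation thresholds here are illustrative):
```php
<?php

// Callers must now say explicitly whether a TTL is an offset or a
// timestamp, and nonsense values fail loudly.
function resolveTTL($relative_ttl, $absolute_ttl) {
  if ($relative_ttl !== null) {
    // "Expires in 60 minutes": an offset in seconds from now.
    return time() + $relative_ttl;
  }
  if ($absolute_ttl !== null) {
    // "Expires at epoch T": must not be in the past (i.e., not 1970).
    if ($absolute_ttl < time()) {
      throw new Exception('Absolute TTL is in the past.');
    }
    return $absolute_ttl;
  }
  return null; // No expiration.
}
```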
Test Plan:
- Generated an SSH keypair.
- Viewed a changeset.
- Viewed a raw diff.
- Viewed a commit's file data.
- Viewed a temporary file's details, saw expiration date and relative time.
- Ran unit tests.
- (Didn't really test Phragment.)
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que
Maniphest Tasks: T11357
Differential Revision: https://secure.phabricator.com/D17616
Summary: Ref T11357. Implements a modern `file.search` for files, and freezes `file.info`.
Test Plan: Ran `file.search` from the Conduit console.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11357
Differential Revision: https://secure.phabricator.com/D17612
Summary:
Ref T11357. This moves editing and commenting (but not creation) to EditEngine.
Since only the name is really editable, this is pretty straightforward.
Test Plan: Renamed files; commented on files.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11357
Differential Revision: https://secure.phabricator.com/D17611
Summary: Ref T11357. A lot of file creation doesn't go through transactions, so we only actually have one real transaction type: editing a file name.
Test Plan:
Created and edited files.
{F4559287}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11357
Differential Revision: https://secure.phabricator.com/D17610
Summary:
This has been replaced by `PolicyCodex` after D16830. Also:
- Rebuild Celerity map to fix grumpy unit test.
- Fix one issue on the policy exception workflow to accommodate the new code.
Test Plan:
- `arc unit --everything`
- Viewed policy explanations.
- Viewed policy errors.
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que, PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D16831
Summary:
Ref T4190. Currently, only the endpoint and controller are working. I added caching, so subsequent attempts to proxy the same image should result in the same redirect URL. Still need to:
- Write a remarkup rule that uses the endpoint
Test Plan: Hit `/file/imageproxy/?uri=http://i.imgur.com/nTvVrYN.jpg` and was served the picture.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin, epriestley, yelirekim
Maniphest Tasks: T4190
Differential Revision: https://secure.phabricator.com/D16581
Summary:
Ref T11596. When exporting data from the Phacility cluster, we `bin/files migrate` data from S3 into a database dump on the `aux` tier.
With current semantics, this //moves// the data and destroys it in S3.
Add a `--copy` flag to //copy// the data instead. This leaves the old copy around, which is what we want for exports.
Test Plan:
- Ran `bin/files migrate` to go from `blob` to `disk` with `--copy`. Verified a copy was left in the database.
- Copied it back, verified a copy was left on disk (total: 2 database copies, 1 disk copy).
- Moved it back without copy, verified database was destroyed and disk was created (total: 1 database copy, 2 disk copies).
- Moved it back without copy, verified local disk was destroyed and blob was created (total: 2 database copies, 1 disk copy).
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11596
Differential Revision: https://secure.phabricator.com/D16497
Summary:
Fixes T10750. Files have some outdated cache/key code which prevents recording an edit history on file comments.
Remove this ancient cruft.
(Users must run `bin/storage adjust` after upgrading to this patch to reap the benefits.)
Test Plan:
- Ran `bin/storage adjust`.
- Edited a comment in Files.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10750
Differential Revision: https://secure.phabricator.com/D16312
Summary:
Fixes T11307. Fixes T8124. Currently, builtin files are tracked by using a special transform with an invalid source ID.
Just use a dedicated column instead. The transform thing is too clever/weird/hacky and exposes us to issues with the "file" and "transform" tables getting out of sync (possibly the issue in T11307?) and with race conditions.
Test Plan:
- Loaded profile "edit picture" page, saw builtins.
- Deleted all builtin files, put 3 second sleep in the storage engine write, loaded profile page in two windows.
- Before patch: one of them failed with a race.
- After patch: both of them loaded.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T8124, T11307
Differential Revision: https://secure.phabricator.com/D16271
Summary:
Fixes T11242. See that task for detailed discussion.
Previously, it didn't particularly matter that we didn't MIME-detect chunked files, since they were all just big blobs of junk (PSDs, zips/tarballs, whatever) that we handled uniformly.
However, videos are large, and their MIME type also matters.
- Detect the overall MIME type by detecting the MIME type of the first chunk (sketched below). This appears to work properly, at least for video.
- Skip MIME type detection on the other chunks, which we were performing and ignoring. This makes uploading chunked files a little faster, since we don't need to write stuff to disk.
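A sketch of the chunk-aware detection (names and the exact dummy type are illustrative):
```php
<?php

// Only the initial chunk's bytes are sniffed; later chunks get a
// placeholder type so we never write them to disk just to detect a type
// we'd throw away anyway.
function detectChunkMimeType($chunk_data, $is_initial_chunk) {
  if (!$is_initial_chunk) {
    return 'application/octet-stream';
  }
  $finfo = finfo_open(FILEINFO_MIME_TYPE);
  return finfo_buffer($finfo, $chunk_data);
}
```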
Test Plan:
Uploaded a 50MB video locally, saw it stored as chunks with a "video/mp4" MIME type, played it in the browser in Phabricator as an embedded HTML5 video.
{F1706837}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11242
Differential Revision: https://secure.phabricator.com/D16204
Summary:
Ref T11140. This makes encryption actually work:
- Provide a new configuration option, `keyring`, for specifying encryption keys.
- One key may be marked as `default`. This activates AES256 encryption for Files.
- Add `bin/files generate-key`. This helps with generating valid encryption keys.
- Add `bin/files encode`. This changes the storage encoding of a file, and helps test encodings and migrate existing data.
- Add `bin/files cycle`. This re-encodes the block key with a new master key, if your master key leaks or you're just paranoid.
- Document all these options and behaviors.
Test Plan:
- Configured a bad `keyring`, hit a bunch of different errors.
- Used `bin/files generate-key` to try to generate bad keys, got appropriate errors ("raw doesn't support keys", etc).
- Used `bin/files generate-key` to generate an AES256 key.
- Put the new AES256 key into the `keyring`, without `default`.
- Uploaded a new file, verified it still uploaded as raw data (no `default` key yet).
- Used `bin/files encode` to change a file to ROT13 and back to raw. Verified old data got deleted and new data got stored properly.
- Used `bin/files encode --key ...` to explicitly convert a file to AES256 with my non-default key.
- Forced a re-encode of an AES256 file, verified the old data was deleted and a new key and IV were generated.
- Used `bin/files cycle` to try to cycle raw/rot13 files, got errors.
- Used `bin/files cycle` to cycle AES256 files. Verified metadata changed but file data did not. Verified file data was still decryptable with metadata.
- Ran `bin/files cycle --all`.
- Ran `encode` and `cycle` on chunked files, saw commands fail properly. These commands operate on the underlying data blocks, not the chunk metadata.
- Set key to `default`, uploaded a file, saw it stored as AES256.
- Read documentation.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11140
Differential Revision: https://secure.phabricator.com/D16127
Summary:
Ref T11140. This doesn't do anything yet since there's no way to enable it and no way to store master keys.
Those are slightly tougher problems, and I'm not totally satisfied that I have an approach I really like for either one, so I may wait a bit before tackling them. Once they're solved, this does the mechanical encrypt/decrypt stuff, though.
This design is substantially similar to the AWS S3 server-side encryption design, and is intended as an analog for it. The design decisions AWS has made generally seem reasonable to me.
Each block of file data is encrypted with a unique key and a unique IV, and then that key and IV are encrypted with the master key (and a distinct, unique IV). This is better than just encrypting with the master key directly because:
- You can rotate the master key later and only need to re-encrypt a small amount of key data (about 48 bytes per file chunk), instead of re-encrypting all of the actual file data (up to 4MB per file chunk).
- Instead of putting the master key on every server, you can put it on some dedicated keyserver which accepts encrypted keys, decrypts them, and returns plaintext keys, and can send it 32-byte keys for decryption instead of 4MB blocks of file data.
- You have to compromise the master key, the database, AND the file store to get the file data. This is probably not much of a barrier realistically, but it does make attacks very slightly harder.
The "KeyRing" thing may change once I figure out how I want users to store master keys, but it was the simplest approach to get the unit tests working.
Test Plan:
- Ran unit tests.
- Dumped raw data, saw encrypted blob.
- No way to actually use this in the real application yet so it can't be tested too extensively.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11140
Differential Revision: https://secure.phabricator.com/D16124
Summary:
Ref T11140. When reading and writing files, we optionally apply a "storage format" to them.
The default format is "raw", which means we just store the raw data.
This change modularizes formats and adds a "rot13" format, which proves formatting works and is testable. In the future, I'll add real encryption formats.
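A sketch of the format layer's shape (interface and class names here are illustrative):
```php
<?php

// A format wraps raw bytes on write and unwraps them on read, so the
// storage engines themselves stay format-agnostic.
interface ExampleStorageFormat {
  public function encodeData($data);
  public function decodeData($data);
}

final class ExampleROT13Format implements ExampleStorageFormat {
  public function encodeData($data) {
    return str_rot13($data);
  }
  public function decodeData($data) {
    // ROT13 is its own inverse, which makes it convenient for testing.
    return str_rot13($data);
  }
}
```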
Test Plan:
- Added unit tests.
- Viewed files in web UI.
- Changed a file's format to rot13, saw the data get rotated on display.
- Set default format to rot13:
- Uploaded a small file, verified data was stored as rot13.
- Uploaded a large file, verified metadata was stored as "raw" (just a type, no actual data) and blob data was stored as rot13.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11140
Differential Revision: https://secure.phabricator.com/D16122
Summary: Ref T6916. Added video to remarkup using D7156 as reference.
Test Plan:
- Viewed video files (MP4, Ogg) in Safari, Chrome, and Firefox (some don't work, e.g., Ogg in Safari, but nothing we can really do about that).
- Used `alt`.
- Used `autoplay`.
- Used `loop`.
- Used `media=audio`.
- Viewed file detail page.
Reviewers: nateguchi2, chad, #blessed_reviewers
Reviewed By: chad, #blessed_reviewers
Subscribers: asherkin, ivo, joshuaspence, Korvin, epriestley
Tags: #remarkup
Maniphest Tasks: T6916
Differential Revision: https://secure.phabricator.com/D11297
Summary:
Ref T5187. This definitely feels a bit flimsy and I'm going to hold it until I cut the release since it changes a couple of things about Workflow in general, but it seems to work OK and most of it is fine.
The intent is described in T5187#176236.
In practice, most of that works like I describe; then the `phui-file-upload` behavior gets some weird glue to figure out if the input is part of the form. Not the most elegant system, but I think it'll hold until we come up with many reasons to write a lot more JavaScript.
Test Plan:
Used both drag-and-drop and the upload dialog to upload files in Safari, Firefox and Chrome.
{F1653716}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T5187
Differential Revision: https://secure.phabricator.com/D15953
Summary:
Ref T10262. Currently, we always render a tag like this when you `{F123}` an image in remarkup:
```
<img src="/xform/preview/abcdef/" />
```
This either generates the preview or redirects to an existing preview. This is a good behavior in general, because the preview may take a while to generate and we don't want to wait for it to generate on the server side.
However, this flickers a lot in Safari. We might be able to cache this, but we really shouldn't, since the preview URI isn't a legitimately stable/permanent one.
Instead, do a (cheap) server-side check to see if the preview already exists. If it does, return a direct URI. This gives us a stable thumbnail in Safari.
Test Plan:
- Dragged a dog picture into comment box.
- Typed text.
- Thing didn't flicker like crazy all the time in Safari.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10262
Differential Revision: https://secure.phabricator.com/D15646
Summary:
Ref T10262. This removes one-time tokens and makes file data responses always-cacheable (for 30 days).
The URI will stop working once any attached object changes its view policy, or the file view policy itself changes.
Files with `canCDN` (totally public data like profile images, CSS, JS, etc) use "cache-control: public" so they can be CDN'd.
Files without `canCDN` use "cache-control: private" so they won't be cached by the CDN. They could still be cached by a misbehaving local cache, but if you don't want your users seeing one another's secret files, you should configure your local network properly.
Our "Cache-Control" headers were also from 1999 or something, update them to be more modern/sane. I can't find any evidence that any browser has done the wrong thing with this simpler ruleset in the last ~10 years.
Test Plan:
- Configured alternate file domain.
- Viewed site: stuff worked.
- Accessed a file on primary domain, got redirected to alternate domain.
- Verified proper cache headers for `canCDN` (public) and non-`canCDN` (private) files.
- Uploaded a file to a task, edited task policy, verified it scrambled the old URI.
- Reloaded task, new URI generated transparently.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10262
Differential Revision: https://secure.phabricator.com/D15642
Summary:
Ref T10262. Files have an internal secret key which partially controls access to them and determines part of the URL you need to access them. Scramble (regenerate) the secret when:
- the view policy for the file itself changes (and the new policy is not "public" or "all users"); or
- the view policy or space for an object the file is attached to changes (and the file policy is not "public" or "all users").
This basically means that when you change the visibility of a task, any old URLs for attached files stop working and new ones are implicitly generated.
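A minimal sketch of what "scrambling" amounts to (the generator and secret length are illustrative):
```php
<?php

// The secret is embedded in file data URIs, so replacing it invalidates
// every previously-issued URI without touching the file data itself.
function scrambleSecret() {
  return bin2hex(random_bytes(16));
}
```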
Test Plan:
- Attached a file to a task, used `SELECT * FROM file WHERE id = ...` to inspect the secret.
- Set view policy to public, same secret.
- Set view policy to me, new secret.
- Changed task view policy, new secret.
- Changed task space, new secret.
- Changed task title, same old secret.
- Added and ran unit tests which cover this behavior.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10262
Differential Revision: https://secure.phabricator.com/D15641
Summary: Fixes T10603. This is the last of the ad-hoc temporary tokens.
Test Plan:
- Used a file token.
- Viewed type in {nav Config > Temporary Tokens}.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10603
Differential Revision: https://secure.phabricator.com/D15481