phorge-phorge

mirror of https://we.phorge.it/source/phorge.git synced 2024-12-11 16:16:14 +01:00

Author	SHA1	Message	Date
epriestley	ab579f2511	Never generate file download forms which point to the CDN domain, tighten "form-action" CSP Summary: Depends on D19155. Ref T13094. Ref T4340. We can't currently implement a strict `form-action 'self'` content security policy because some file downloads rely on a `<form />` which sometimes POSTs to the CDN domain. Broadly, stop generating these forms. We just redirect instead, and show an interstitial confirm dialog if no CDN domain is configured. This makes the UX for installs with no CDN domain a little worse and the UX for everyone else better. Then, implement the stricter Content-Security-Policy. This also removes extra confirm dialogs for downloading Harbormaster build logs and data exports. Test Plan: - Went through the plain data export, data export with bulk jobs, ssh key generation, calendar ICS download, Diffusion data, Paste data, Harbormaster log data, and normal file data download workflows with a CDN domain. - Went through all those workflows again without a CDN domain. - Grepped for affected symbols (`getCDNURI()`, `getDownloadURI()`). - Added an evil form to a page, tried to submit it, was rejected. - Went through the ReCaptcha and Stripe flows again to see if they're submitting any forms. Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam Maniphest Tasks: T13094, T4340 Differential Revision: https://secure.phabricator.com/D19156	2018-02-28 17:20:12 -08:00
epriestley	05a4c55c52	Explicitly add rel="noreferrer" to all external links Summary: See D19117. Instead of automatically figuring this out inside `phutil_tag()`, explicitly add rel="noreferrer" at the application level to all external links. Test Plan: - Grepped for `_blank`, `isValidRemoteURIForLink`, checked all callsites for user-controlled data. - Created a link menu item, verified noreferrer in markup. - Created a link custom field, verified no referrer in markup. - Verified noreferrer for `{nav href=...}`. Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam Differential Revision: https://secure.phabricator.com/D19118	2018-02-17 17:46:11 -08:00
epriestley	fe294d4034	Allow third-party code to extend upstream datasources via EngineExtension Summary: Depends on D19089. Fixes T13079. This is likely not the final form of this, but creates a defensible extension point. Test Plan: See T13079 for discussion. Maniphest Tasks: T13079 Differential Revision: https://secure.phabricator.com/D19090	2018-02-14 18:11:51 -08:00
epriestley	4bccb1547d	Modularize the "jump nav" behaviors in global search Summary: Depends on D19087. Ref T13079. This still doesn't feel like the most clean, general system in the world, but is a step forward from hard-coded `switch()` stuff. Test Plan: - Jumped to `r`. - Jumped to `a`. - Jumped to `r poe` (multiple results). - Jumped to `r poetry` (one result). - Jumped to `r syzygy` (no results). - Jumped to `p`. - Jumped to `p robot` (multiple results); `p assessment` (one result). - The behavior for `p <string>` has changed slightly but should be more powerful now (it's consistent with `r <string>`). - Jumped to `s <symbol>` and `s <context>-><symbol>`. - Jumped to `d`. - Jumped to `f`. - Jumped to `t`. - Jumped to `T123`, `D123`, `@dog`, `PHID-DREV-abcd`, etc. Maniphest Tasks: T13079 Differential Revision: https://secure.phabricator.com/D19088	2018-02-14 18:08:07 -08:00
epriestley	abe5fd57b0	Rename "QuickSearch" Engine/EngineExtension to "Datasource" Summary: Ref T13079. This recently-introduced Engine/EngineExtension are a good fit for adding more datasource functions in general, but we didn't think quite big enough in naming them. Test Plan: Used quick search typeahead, hit applications/users/monograms/symbols/etc. Maniphest Tasks: T13079 Differential Revision: https://secure.phabricator.com/D19087	2018-02-14 18:03:03 -08:00
epriestley	0d5379ee17	Fix an export bug where queries specified in the URI ("?param=value") were ignored when filtering the result set Summary: Depends on D18968. Ref T13049. Currently, if you visit `/query/?param=value`, there is no `queryKey` for the page but we build a query later on. Right now, we incorrectly link to `/query/all/export/` in this case (and export too many results), but we should actually link to `/query/<constructed query key>/export/` to export only the desired/previewed results. Swap the logic around a little bit so we look at the query we're actually executing, not the original URI, to figure out the query key we should use when building the link. Test Plan: Visited a `/?param=value` page, exported data, got a subset of the full data instead of everything. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18969	2018-01-30 11:19:37 -08:00
epriestley	a5b8be0316	Support export of user activity logs Summary: Depends on D18966. Ref T13049. Adds export support to user activity logs. These don't have PHIDs. We could add them, but just make the "phid" column test if the objects have PHIDs or not for now. Test Plan: - Exported user activity logs, got sensible output (with no PHIDs). - Exported some users to make sure I didn't break PHIDs, got an export with PHIDs. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18967	2018-01-30 11:12:32 -08:00
epriestley	84df122085	When exporting more than 1,000 records, export in the background Summary: Depends on D18961. Ref T13049. Currently, longer exports don't give the user any feedback, and exports that take longer than 30 seconds are likely to timeout. For small exports (up to 1,000 rows) continue doing the export in the web process. For large exports, queue a bulk job and do them in the workers instead. This sends the user through the bulk operation UI and is similar to bulk edits. It's a little clunky for now, but you get your data at the end, which is far better than hanging for 30 seconds and then fataling. Test Plan: Exported small result sets, got the same workflow as before. Exported very large result sets, went through the bulk flow, got reasonable results out. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18962	2018-01-29 16:08:02 -08:00
epriestley	ea58b6acea	Remove the old, non-modular Excel export workflow from Maniphest Summary: Depends on D18960. Ref T13049. Now that Maniphest fully supports "Export Data", remove the old hard-coded version. This is a backward compatibility break with the handful of installs that might have defined a custom export by subclassing `ManiphestExcelFormat`. I suspect this is almost zero installs, and that the additional data in the new format may serve most of the needs of this tiny number of installs. They can upgrade to `ExportEngineExtensions` fairly easily if this isn't true. Test Plan: - Viewed Maniphest, no longer saw the old export workflow. - Grepped for `export` and similar strings to try to hunt everything down. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18961	2018-01-29 16:06:13 -08:00
epriestley	00b4eae1f4	When PHPExcel is not installed, detect it and provide install instructions Summary: Depends on D18957. Ref T13049. To do Excel exports, PHPExcel needs to be installed on the system somewhere. This library is enormous (1K files, ~100K SLOC), which is why we don't just include it in `externals/`. This install process is a little weird and we could improve it, but users don't seem to have too much difficulty with it. This shouldn't be worse than the existing workflow in Maniphest, and I tried to make it at least slightly more clear. Test Plan: Uninstalled PHPExcel, got it marked "Unavailable" and got reasonably-helpful-ish guidance on how to get it to work. Reinstalled, exported, got a sheet. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18958	2018-01-29 16:03:34 -08:00
epriestley	61b8c12970	Make the data export format selector remember your last setting Summary: Depends on D18956. Ref T13049. Make the "Export Format" selector sticky. This is partly selfish, since it makes testing format changes a bit easier. It also seems like it's probably a good behavior in general: if you export to Excel once, that's probably what you're going to pick next time. Test Plan: Exported to excel. Exported again, got excel as the default option. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18957	2018-01-29 16:01:54 -08:00
epriestley	0409279595	Support Excel as a data export format Summary: Depends on D18954. Ref T13049. This brings over the existing Maniphest Excel export pipeline in a generic way. The `<Type>ExportField` classes know directly that `PHPExcel` exists, which is a little sketchy, but writing an Excel indirection layer sounds like a lot of work and I don't anticipate us changing Excel backends anytime soon, so trying to abstract this feels YAGNI. This doesn't bring over the install instructions for PHPExcel or the detection of whether or not it exists. I'll bring that over in a future change. Test Plan: Exported users as Excel, opened them up, got a sensible-looking Excel sheet. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18955	2018-01-29 16:00:41 -08:00
epriestley	a067f64ebb	Support export engine extensions and implement an extension for custom fields Summary: Depends on D18953. Ref T13049. Allow applications and infrastructure to supplement exportable fields for objects. Then, implement an extension for custom fields. Only a couple field types (int, string) are supported for now. Test Plan: Added some custom fields to Users, populated them, exported users. Saw custom fields in the export. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18954	2018-01-29 15:59:58 -08:00
epriestley	8b8a3142b3	Support export of data in files larger than 8MB Summary: Depends on D18952. Ref T13049. For files larger than 8MB, we need to engage the chunk storage engine. `PhabricatorFile::newFromFileData()` always writes a single chunk, and can't handle files larger than the mandatory chunk threshold (8MB). Use `IteratorUploadSource`, which can, and "stream" the data into it. This should raise the limit from 8MB to 2GB (maximum size of a string in PHP). If we need to go above 2GB we could stream CSV and text pretty easily, and JSON without too much trouble, but Excel might be trickier. Hopefully no one is trying to export 2GB+ datafiles, though. Test Plan: - Changed the JSON exporter to just export 8MB of the letter "q": `return str_repeat('q', 1024 * 1024 * 9);`. - Before change: fatal, "no storage engine can store this file". - After change: export works cleanly. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18953	2018-01-29 15:58:34 -08:00
epriestley	0de6210808	Give data exporters a header row Summary: Depends on D18951. Ref T13049. When we export to CSV or plain text, add a header row in the first line of the file to explain what each column means. This often isn't obvious with PHIDs, etc. JSON has keys and is essentially self-labeling, so don't do anything special. Test Plan: Exported CSV and text, saw new headers. Exported JSON, no changes. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18952	2018-01-29 15:17:30 -08:00
epriestley	213eb8e93d	Define common ID and PHID export fields in SearchEngine Summary: Ref T13049. All exportable objects should always have these fields, so make them builtins. This also sets things up for extensions (like custom fields). Test Plan: Exported user data, got the same export as before. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18951	2018-01-29 15:17:00 -08:00
epriestley	9d69118664	Add a discovery format hint for date fields in SearchEngine UIs Summary: See PHI316. Maniphest and other applications currently have controls like `Created After: [_____]` where you just get an empty text field. Although most formats work -- including relative formats like "3 days ago" -- and we validate inputs so you get an error if you enter something nonsensical, this still isn't very user friendly. T8060 or some other approach is likely the long term of this control. In the meantime, add placeholder text to suggest that `YYYY-MM-DD` or `X days ago` will work. Test Plan: Viewed date inputs, saw placeholder text. Reviewers: amckinley Reviewed By: amckinley Differential Revision: https://secure.phabricator.com/D18942	2018-01-26 13:11:10 -08:00
epriestley	0ec83132a8	Support basic export of user accounts Summary: Depends on D18934. Ref T13046. Add support for the new export flow to a second application. My goal here is mostly just to make sure that this is general enough to work in more than one place, and exporting user accounts seems plausible as a useful feature, although we do see occasional requests for this feature exactly (like <https://discourse.phabricator-community.org/t/users-export-to-csv/968>). The exported data may not truly be useful for much (no disabled/admin/verified/MFA flags, no external account data, no email addresses for policy reasons) but we can expand it as use cases arise. Test Plan: Exported user accounts in several formats. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13046 Differential Revision: https://secure.phabricator.com/D18935	2018-01-26 11:17:44 -08:00
epriestley	a79bb55f3f	Support CSV, JSON, and tab-separated text as export formats Summary: Depends on D18919. Ref T13046. Adds some simple modular exporters. Test Plan: Exported pull logs in each format. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13046 Differential Revision: https://secure.phabricator.com/D18934	2018-01-26 11:16:52 -08:00
epriestley	c0b8e4784b	Add a basic, general-purpose export workflow for all objects with SearchEngine support Summary: Depends on D18918. Ref T13046. Ref T5954. Pull logs can currently be browsed in the web UI, but this isn't very powerful, especially if you have thousands of them. Allow SearchEngine implementations to define exportable fields so that users can "Use Results > Export Data" on any query. In particular, they can use this workflow to download a file with pull logs. In the future, this can replace the existing "Export to Excel" feature in Maniphest. For now, we hard-code JSON as the only supported datatype and don't actually make any effort to format the data properly, but this leaves room to add more exporters (CSV, Excel) and data type awareness (integer casting, date formatting, etc) in the future. For sufficiently large result sets, this will probably time out. At some point, I'll make this use the job queue (like bulk editing) when the export is "large" (affects more than 1K rows?). Test Plan: Downloaded pull logs in JSON format. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13046, T5954 Differential Revision: https://secure.phabricator.com/D18919	2018-01-26 11:15:59 -08:00
Aviv Eyal	d8f2630d5c	Modernize QuickSearch typeahead Summary: Use ClassQuery to find datasources for the quick-search. Mostly, this allows extensions to add quicksearches. Test Plan: using `/typeahead/class/`, tested several search terms that make sense. Removed the tag interface from a datasource, which removed it from results. Reviewers: epriestley, amckinley, #blessed_reviewers Reviewed By: epriestley, #blessed_reviewers Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D18760	2017-11-30 15:07:49 +00:00
epriestley	d36f98a15a	Clarify acceptable values for `--threshold` in `search ngrams` Summary: See D18710. Test Plan: o_O Reviewers: amckinley Reviewed By: amckinley Differential Revision: https://secure.phabricator.com/D18712	2017-10-17 14:32:25 -07:00
epriestley	63d1230ade	Parameterize the common ngrams threshold Summary: Ref T13000. Since other changes have generally made the ngrams table manageable, I'm not planning to enable common ngrams by default at this time. Instead, make the threshold configurable with "--threshold" so we can guide installs through tuning this if they want (e.g. PHI110), and tune hosted instances. (This might eventually become automatic, but just smoothing this bit off for now feels reasonable to me.) Test Plan: Ran with `--reset`, and with various invalid and valid `--threshold` arguments. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13000 Differential Revision: https://secure.phabricator.com/D18710	2017-10-17 14:13:49 -07:00
Dmitri Iouchtchenko	9bd6a37055	Fix spelling Summary: Noticed a couple of typos in the docs, and then things got out of hand. Test Plan: - Stared at the words until my eyes watered and the letters began to swim on the screen. - Consulted a dictionary. Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: epriestley, yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam Differential Revision: https://secure.phabricator.com/D18693	2017-10-09 10:48:04 -07:00
epriestley	17e83b53d5	Add "bin/search query" for debugging query execution Summary: Ref T13000. Currently, queries can only be executed from the web UI, which requires logging in as a user. I really want to avoid doing that wherever we can, but being able to execute queries on an instance (and, particularly, see the ngrams and timings on the underlying lookups) would have been helpful in several cases. Improve tooling a bit in advance of the "common ngrams" stuff going out since it seems likely that it will be useful if issues arise. Test Plan: Ran `bin/search query --query ...`, got useful minimal output. Ran with `--trace` to get internals. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13000 Differential Revision: https://secure.phabricator.com/D18690	2017-10-06 08:50:34 -07:00
epriestley	66df5b1493	Add a garbage collector for common ngrams Summary: Ref T13000. After an ngram is marked as "common", we can delete it from the storage table. Currently, the only way to get ngrams marked as "common" is to manually run `bin/search ngrams`, so this has no impact on normal installs. Test Plan: Ran `bin/garbage collect`, saw it start chewing through my local Maniphest ngrams table and removing common ngrams. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13000 Differential Revision: https://secure.phabricator.com/D18687	2017-10-05 11:41:18 -07:00
epriestley	3e589cdd73	Add a workflow for populating (or depopulating) the common ngrams table Summary: Depends on D18672. Ref T13000. This does an on-demand build of the common ngrams table. Plan here is: - Push to `secure`. - Build the common ngrams table here. - See if stuff breaks? If it looks okay on this dataset, we can build out the GC support and try it in production. Test Plan: - Locally, my dataset has a bunch of `bin/lipsum` tasks with similar, common words. - Verified that ipsum terms now skip ngrams. For "lorem ipsum" search performance actually IMPROVED by skipping the ngrams table (12s to 9s). - Queried for normal terms, got very fast results using the ngram table, as normal. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13000 Differential Revision: https://secure.phabricator.com/D18673	2017-10-03 13:28:19 -07:00
epriestley	1de130c9f5	Allow the Ferret engine to remove "common" ngrams from the index Summary: Ref T13000. This adds support for tracking "common" ngrams, which occur in too many documents to be useful as part of the ngram index. If an ngram is listed in the "common" table, it won't be written when indexing documents, or queried for when searching for them. In this change, nothing actually writes to the "common" table. I'll start writing to the table in a followup change. Specifically, I plan to do this: - A new GC process updates the "common" table periodically, by writing ngrams which appear in more than X% of documents to it, for some value of X, if there are at least a minimum number of documents (maybe like 4,000). - A new GC process deletes ngrams that have been added to the common table from the existing indexes. Hopefully, this will pare down the ngrams index to something reasonable over time without requiring any manual tuning. Test Plan: - Ran some queries and indexes. - Manually inserted ngrams `xxx` and `yyy` into the ngrams table, searched and indexed, saw them ignored as viable ngrams for search/index. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13000 Differential Revision: https://secure.phabricator.com/D18672	2017-10-03 13:27:42 -07:00
epriestley	a3a6c4ed2e	Fix fatal when searching for "r matey prepare to be boarded" Summary: See <https://discourse.phabricator-community.org/t/unrecoverable-fatal-error-on-repository-search-in-top-search-bar/503/2>. The Ferret engine replaced `withNameContains()`, but I missed this obscure callsite. Test Plan: - Searched for `r matey prepare to be boarded`. - Before: fatal. - After: no fatal. - Also searched for `r <actual repository name>`, got repository. Reviewers: amckinley Reviewed By: amckinley Differential Revision: https://secure.phabricator.com/D18661	2017-09-29 09:45:39 -07:00
epriestley	086a125ad5	Improve performance of Ferret engine ngram extraction, particularly for large input strings Summary: See PHI87. Ref T12974. The `array_slice()` method of splitting the string apart can perform poorly for large input strings. I think this is mostly just the large number of calls plus building and returning an array being not entirely trivial. We can just use `substr()` instead, as long as we're a little bit careful about keeping track of where we're slicing the string if it has UTF8 characters. Test Plan: - Created a task with a single, unbroken blob of base64 encoded data as the description, roughly 100KB long. - Saw indexing performance improve from ~6s to ~1.5s after patch. - Before: https://secure.phabricator.com/xhprof/profile/PHID-FILE-nrxs4lwdvupbve5lhl6u/ - After: https://secure.phabricator.com/xhprof/profile/PHID-FILE-6vs2akgjj5nbqt7yo7ul/ Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T12974 Differential Revision: https://secure.phabricator.com/D18649	2017-09-27 10:41:39 -07:00
epriestley	a1d9a2389d	Improve Ferret engine indexing performance for large blocks of text Summary: See PHI87. Ref T12974. Currently, we do a lot more work here than we need to: we call `phutil_utf8_strtolower()` on each token, but can do it once at the beginning on the whole block. Additionally, since ngrams don't care about order, we only need to convert unique tokens into ngrams. This saves us some `phutil_utf8v()`. These calls can be slow for large inputs. Test Plan: - Created a ~4MB task description. - Ran `bin/search index Txxx --profile ...` to profile indexing performance before and after the change. - Saw total runtime drop from 38s to 9s. - Before: <https://secure.phabricator.com/xhprof/profile/PHID-FILE-wiht5d7lkyazaywwxovw/> - After: <https://secure.phabricator.com/xhprof/profile/PHID-FILE-efxv56q2hulr6kjrxbx6/> Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T12974 Differential Revision: https://secure.phabricator.com/D18647	2017-09-27 08:15:40 -07:00
epriestley	1ac52c09e7	Improve search highlighting for CJK and substring queries Summary: Fixes T12995. Currently, the result highlighter (which shows //where// terms matched) only works in "term" mode, not in "substring" mode. Provide better feedback and behavior: - When a term is a substring term, color it a little differently and add a tooltip. (This is partly to make it easier to debug/diagnose things, probably not enormously valuable to users.) - When a term is a substring term, highlight it anywhere in the results. Test Plan: Queried for latin and CJK terms. Here is CJK being highlighted: {F5192195} Here is substring vs non-substring implicit behavior: {F5192196} Here's ONLY terms being highlighted: {F5192198} Here's terms and substrings, since the query now has a substring: {F5192201} Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T12995 Differential Revision: https://secure.phabricator.com/D18635	2017-09-22 11:34:46 -07:00
epriestley	da0a08a7e1	Make "mysql" mean "Ferret engine" in Fulltext search Summary: Ref T12819. Swaps constants so existing configurations that use a "mysql" engine now use the Ferret engine, not an InnoDB/MyISAM FULLTEXT engine. Test Plan: Swapped my local config back to "mysql" (the default), saw Ferret engine results in the UI. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18590	2017-09-11 18:05:12 -07:00
epriestley	39b74572e6	Return fulltext tokens from the Ferret fulltext engine Summary: Ref T12819. These render the little "Searched For: X, Y, U V" hint about how something was parsed. (This might get a "substring" color or "title only" color or something in the future.) Test Plan: {F5178807} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18589	2017-09-11 18:04:56 -07:00
epriestley	6edf98eb3b	Unprototype the Ferret UI fields Summary: Ref T12819. Show the new Ferret engine fields (and enable the indexer) unconditionally. Also pull them to the top since they're fairly general-purpose and appear more broadly now, and also they actually work correctly (WOW). Some redundant fields (like "Name Contains" in Repositories and Owners) could probably be removed now, I may clean those up in a followup. Test Plan: Browsed around, saw Ferret fields in UI without "(Prototype)" suffix. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18587	2017-09-11 18:04:25 -07:00
epriestley	d67cc8e5c5	Remove some redundant information from the Ferret engine index Summary: Ref T12819. The "full" field has all other fields, and the "core" field has "title" and "body". Due to the way the "full" and "core" fields were being built, the "core" field also got included in the "full" field, so the "full" field has two copies of the title, two copies of the body, and then one copy of everything else. Put only one copy of each distinct thing in each "full" and "core". Also, simplify the logic a little bit so we build these virtual fields in a more consistent way. Test Plan: Ran `bin/search index` and looked at the fields in the database, saw less redundant information. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18580	2017-09-08 09:40:12 -07:00
epriestley	7ea6de6e9c	Split Ferret engine strings for tokenization on any sequence of whitespace Summary: Ref T12819. Currently, strings are split only on spaces, but newlines (and, if they exist, tabs) should also split strings. Without this, we can fail to get the proper term boundary tokens for words which begin at the start of a line or end at the end of a line. Test Plan: Reindexed a document with "xyz\nabc", saw `"yz "` and `" ab"` term boundary tokens generate properly. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18579	2017-09-08 09:39:57 -07:00
epriestley	2218caee0f	Reduce the amount of boilerplate that implementing FerretInterface requires Summary: See brief discussion in D18554. All the index tables are the same for every application (and, at this point, seem unlikely to change) and we never actually pass these objects around (they're only used internally). In some other cases (like Transactions) not every application has the same tables (for example, Differential has extra field for inline comments), and/or we pass the objects around (lots of stuff uses `$xactions` directly). However, in this case, and in Edges, we don't interact with any representation of the database state directly in much of the code, and it doesn't change from application to application. Just automatically define document, field, and ngram tables for anything which implements `FerretInterface`. This makes the query and index logic a tiny bit messier but lets us delete a ton of boilerplate classes. Test Plan: Indexed objects, searched for objects. Same results as before with much less code. Ran `bin/storage upgrade`, got a clean bill of health. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18559	2017-09-07 13:23:31 -07:00
epriestley	3ff9d4a4ca	Support Ferret engine for searching users Summary: Ref T12819. Adds support for indexing user accounts so they appear in global fulltext results. Also, always rank users ahead of other results. Test Plan: Indexed users. Searched for a user, got that user. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18552	2017-09-07 13:22:12 -07:00
epriestley	a2a2b3f7f4	Sort global fulltext results by overall relevance Summary: Ref T12819. Currently, under the Ferret engine, we query each application's index separately and then aggregate the results. At the moment, results are aggregated by type first, then by actual rank. For example, all the revisions appear first, then all the tasks. Instead, surface the internal ranking data from the underlying query and sort by it. Test Plan: Searched for "A B" with a task named "A B" and a revision named "A". Saw task first. Broadly, saw mixed task and revision order in result sets. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18551	2017-09-07 13:21:58 -07:00
epriestley	8059db894d	Use the Ferret engine fulltext document table to drive auxiliary fulltext constraints Summary: Ref T12819. I started trying to get individual engines to drive these constraints (e.g., `ManiphestTaskQuery` can do most of the work) but this is a big pain, especially since most engines don't support "any owner" or "no owner", and not everything has an owner, and so on and so on. Going down this path would have meant a huge pile of stub functions everywhere, I think. Instead, drive these through the main engine using the fulltext document table, which already has everything we need to apply these constraints in a uniform way. Also tweak some parts of query construction and result ordering. Test Plan: Searched for documents by author, owner, unowned, any owner, tags, subscribers, fulltext in global search. Got sensible results without any application-specific code. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18550	2017-09-07 13:21:42 -07:00
epriestley	4ea677ba97	Skeleton support for running global fulltext queries via the Ferret engine Summary: Ref T12819. Provides a Ferret-engine-based fulltext engine to ultimately replace the InnoDB fulltext engine. This is still pretty basic (hard-coded and buggy) but technically sort of works. To activate this, you must explicitly configure it, so it isn't visible to users yet. Test Plan: Searched for objects with global fulltext search, got a mixture of matching revisions and tasks back. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18548	2017-09-06 13:15:36 -07:00
epriestley	551c62b91a	Support Ferret engine queries in ApplicationSearch via extension instead of hard-code Summary: Ref T12819. Uses an extension rather than hard-coding support into Maniphest. Test Plan: Saw "Query" field appear in Differential, which also implements the interface and has support. Used field in both applications. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18547	2017-09-06 13:15:10 -07:00
epriestley	faca1deea5	Remove the fulltext "reconstructDocument()" method Summary: Ref T12819. This was originally intended for debugging, but never actually used and not clearly useful. There are no callers and it probably does not work. Just get rid of it. Test Plan: Grepped for callers; none exist. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18544	2017-09-06 11:48:35 -07:00
Chad Little	fc893658b8	Update menu item names for Applications -> Favorites Summary: Adds a `MenuName` method to applications that `ProfileMenuItem` uses instead of the application name if set. This improves the home/menu/new user experience at little cost. Also renamed the label from Applications to Favorites, since this menu gets altered to provide more than just applications. This also allows instances to set back to Maniphest if they so choose. Overall I think this direction resolves 95% of my concerns, with maybe a small potential downside which I don't really anticipate. We already name Dashboard panels by their object, and that hasn't really caused confusion. I think these links are similar. I click 'Tasks' and get presented a list of my tasks from Maniphest. Test Plan: Review each of the name changes as a default new install and a modified install. Reviewers: epriestley, amckinley Reviewed By: epriestley Spies: Korvin Differential Revision: https://secure.phabricator.com/D18524	2017-09-05 19:05:03 -07:00
epriestley	20aad35e60	Move Ferret engine "title:..." field definitions to the engine itself Summary: Ref T12819. Move these out of the core engine into the Ferret engine. In the future different applications can define different functions, like "summary:..." or whatever. This may get more formalization when I possibly do "author:" and such some time down the road. Test Plan: Searched for "title:...". Searched for "dog:...", got a useful error. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18536	2017-09-05 11:57:51 -07:00
epriestley	46abc11114	Reduce the number of magic strings in the Ferret implementation Summary: Ref T12819. Push more of the magic `' '` stuff into the engine and simplify calls to ngram construction. Also fixes a bug where a task with title "apple banana" and description "cherry doughnut" could match query "banana cherry" by separating separate term segments with newlines instead of spaces. Test Plan: - Indexed some objects. - Searched (term, substring, quoted terms). - Viewed index in database. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18534	2017-09-05 11:57:35 -07:00
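The segment-separator bug described in 46abc11114 can be illustrated with a minimal sketch. It is written in Python rather than Phorge's PHP, and `build_corpus` and `phrase_matches` are hypothetical names, not functions from the Ferret engine. The only assumption is that a multi-word query is matched as a substring of the joined field text.

```python
def build_corpus(segments, separator):
    """Join separate text segments (e.g. title, description) into one corpus."""
    return separator.join(segments)


def phrase_matches(corpus, phrase):
    """Naive phrase search: the phrase must appear verbatim in the corpus."""
    return phrase in corpus


segments = ["apple banana", "cherry doughnut"]

# Joined with spaces, the end of the title runs into the start of the
# description, so a phrase spanning both segments matches by accident.
joined_with_spaces = build_corpus(segments, " ")

# Joined with newlines, a space-separated phrase can no longer span segments.
joined_with_newlines = build_corpus(segments, "\n")

print(phrase_matches(joined_with_spaces, "banana cherry"))
print(phrase_matches(joined_with_newlines, "banana cherry"))
```

Phrases that sit entirely inside one segment, such as "apple banana", still match with either separator.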
epriestley	4a7593f47f	Consolidate more Ferret engine code into FerretEngine Summary: Ref T12819. Earlier I separated some ngram code into an "ngram engine" hoping to share it across the simple Ngrams stuff and the full Ferret stuff, but they actually use slightly different rules. Just pull more of this stuff into FerretEngine to reduce the number of moving pieces and the amount of code duplication. Test Plan: Searched for terms, rebuilt indexes. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18533	2017-09-05 11:57:18 -07:00
epriestley	577d498033	Create a virtual "core" field in the Ferret engine for "title and body together" Summary: See PHI46. The `core:` function means "find results in either the title or body, but not other auxiliary fields like comments". Test Plan: Searched for text present in the title (yes), body (yes), and comments (no) with the `core:...` prefix. Reviewers: chad Reviewed By: chad Differential Revision: https://secure.phabricator.com/D18514	2017-09-01 09:40:56 -07:00
epriestley	f4f73e0a7e	Separate fulltext engine extensions into "enrich" and "index" phases Summary: Ref T12819. Some of the extensions "enrich" the document (adding more fields or relationships), while others "index" it (insert it into some kind of index for later searching). Currently, these are all muddled under a single "index" phase. However, the Ferret extension cares about fields and relationships which other extensions may add. Split this into two phases: "enrich" adds fields and relationships so other extensions can read them later if they want. "Index" happens after the document is built and has all the fields and relationships. The specific problem this solves is that comments may not have been added to the document when the Ferret extension runs. By moving them to the "enrich" phase, the Ferret engine will be able to see and index comments. Test Plan: Ran `bin/search index ...`, grepped for `indexFulltextDocument`. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18513	2017-09-01 09:40:11 -07:00
epriestley	df9c24e750	Provide some "term vs substring" support for the Ferret engine Summary: Ref T12819. Distinguishes between "term" queries and "substring" queries, and tries to match them correctly most of the time. For example: - `example` matches "example", obviously. - `~amp` matches "example", but `amp` does not. - `examples` matches "example" through stemming. - `"examples"` does not match "example" (quoted text does not stem). - `"an examp"` does not match "an example" (quoted text is still term text). - `~"an examp"` matches "an example" (quoted, substring-operator text uses substring search). Test Plan: Ran searches similar to the above, they seemed to do what they should. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18500	2017-08-30 11:30:04 -07:00
epriestley	0e2e525bb4	Add a "terms" corpus to Ferret fields Summary: Ref T12819. Ferret currently does substring search, but this is not the default mode users expect: when you search for the "RICO" act, you do not expect to find documents containing "apRICOt" even though "RICO" is a substring. To support term search, index the corpus as a list of terms with punctuation removed and whitespace normalized so the engine can match against it. Test Plan: Ran `storage upgrade`, ran `search index`, saw sensible database results: ``` rawCorpus: This is the task description. Hark! Whom'st'dve eaten this "food" shall surely ~perish~?? #blessed normalCorpus: thi the task descript hark whom dve eaten food shall sure perish bless termCorpus: This is the task description Hark Whom'st'dve eaten this food shall surely perish blessed ``` Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18498	2017-08-30 11:29:14 -07:00
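The difference between substring search and term search in 0e2e525bb4 can be sketched as follows. This is a rough Python approximation, not the Ferret engine's actual tokenizer: `build_term_corpus`, `contains_substring`, and `contains_term` are hypothetical names, and the rule "keep word characters and apostrophes, collapse everything else to a single space" is an assumption inferred from the `termCorpus` sample in the test plan.

```python
import re


def build_term_corpus(raw):
    """Replace runs of punctuation/whitespace with one space (assumed rule)."""
    return re.sub(r"[^\w']+", " ", raw).strip()


def contains_substring(corpus, needle):
    """Substring search: matches anywhere, even inside a longer word."""
    return needle.lower() in corpus.lower()


def contains_term(corpus, term):
    """Term search: the needle must be a whole space-delimited term."""
    padded = " " + build_term_corpus(corpus).lower() + " "
    return (" " + term.lower() + " ") in padded


raw = ("This is the task description. Hark! Whom'st'dve eaten this "
       '"food" shall surely ~perish~?? #blessed')
print(build_term_corpus(raw))

print(contains_substring("a ripe apRICOt", "RICO"))  # substring hit
print(contains_term("a ripe apRICOt", "RICO"))       # not a whole term
print(contains_term("the RICO act", "RICO"))         # whole term
```

Padding both the corpus and the term with spaces is one simple way to require whole-term matches using only substring primitives.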
epriestley	77ef38f9a8	Aggregate corpus data in Ferret field rows Summary: Ref T12819. This addresses two issues: - One practical issue is that right now, if you search for "dog cat", and they appear in different fields (for example, "dog" appears ONLY in the title, while "cat" appears ONLY in a comment) we won't find the document. This is somewhat rare -- usually, if "dog" appears in the title, it's also repeated in the description -- but I think clearly a bug. To attack this, start automatically creating a virtual "ALL" field with the full document text which we'll use as the primary thing we match against. - For fields which may occur more than once -- today, only comments -- aggregate them all into one big "all of the text" row instead of writing one row per comment. This partly addresses the first point ("dog" in one comment and "cat" in a different comment won't be found) and partly makes some of the query gymnastics easier. Test Plan: Ran `bin/storage upgrade`, ran `bin/search index <Txxx>`, saw sensible corpus values in the database: ``` mysql> select * from maniphest_task_ffield\G ************************* 1. row *********************** id: 3 documentID: 1981 fieldKey: full rawCorpus: This is the task title This is the task description. normalCorpus: thi the task titl thi the task descript *********************** 2. row *********************** id: 4 documentID: 1981 fieldKey: titl rawCorpus: This is the task title normalCorpus: thi the task titl *********************** 3. row ************************* id: 5 documentID: 1981 fieldKey: body rawCorpus: This is the task description. normalCorpus: thi the task descript 3 rows in set (0.00 sec) ``` Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18497	2017-08-30 11:28:30 -07:00
epriestley	4005a465f7	Make Ferret indexing more robust (UTF8, exception handling) Summary: Ref T12819. Two minor improvements from live data: - Tokenize in a UTF8-aware way. - When one document fails to index, kill the transaction explicitly (rather than leaving it hanging) so we don't cause other failures later. Test Plan: Created some UTF8 documents locally, indexed them, got clean results. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819 Differential Revision: https://secure.phabricator.com/D18487	2017-08-28 15:49:57 -07:00
epriestley	f97157e7ed	Build a prototype fulltext engine ("Ferret") using only basic MySQL primitives Summary: Ref T12819. I gave this stuff a sweet code name because all the terms related to "fulltext" and "search" already mean 5 different things. It, uh, ferrets out documents for you? I'm building this to work a lot like the existing ngram index, which seems to work pretty well. If this sticks, it will auto-resolve the join issue (in T12443) by letting us do the entire thing locally in a JOIN and thus dodge a lot of mess. This index gets built alongside other indexes, but only shows up in the UI if you have prototypes enabled. If you do, it appears under the existing fulltext field in Maniphest. No existing functionality is affected or disrupted. NOTE: The query engine half of this is still EXTREMELY primitive, and this probably performs worse than the existing field for now. If this doesn't show obvious signs of being awful on `secure` I'll improve that in followup changes. Test Plan: Indexed my tasks, ran some simple queries, got the results I wanted, even for queries "ko", "k", "v0.1". {F5147746} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12819, T12443 Differential Revision: https://secure.phabricator.com/D18484	2017-08-28 14:52:59 -07:00
epriestley	47da632a22	Separate saved queries in applications into "personal" and "global" queries Summary: Ref T12956. UI changes: - Administrators get a new `[X] Save as global query` option when saving a query. - "Edit Queries..." is split into "Personal" and "Global" sections. For administrators, each section can be edited. For non-admins, only the top section can be edited, but any query can be pinned. A couple notes: - This doesn't support "pin for everyone by default". New users just get the first query from the bottom set. That seems reasonable for now. - Reordering is currently a little buggy (it works if you've reordered before, but not if you're reordering for the first time), but I need to migrate before I can fix / test that properly. So that'll get cleaned up in the next change or two. Test Plan: - As an admin and non-admin, viewed, edited, disabled, saved-as-personal and saved-as-global various queries. {F5098581} {F5098582} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12956 Differential Revision: https://secure.phabricator.com/D18426	2017-08-24 15:24:34 -07:00
epriestley	58b889c5b0	Make the default ApplicationSearch query explicit, not just the first item in the list Summary: Ref T12956. Currently, when you visit `/maniphest/` (or any other ApplicationSearch application) we execute the first query in the list by default. In T12956, I plan to make changes so that personal queries are always first, then global/builtin queries. Without changing the "default query" rule, this will make it harder to have, for example, some custom queries in Differential but still run a global query like "Active" by default. To make this work, you'd have to save a personal copy of the "Active" query, then put it at the top. This feels a bit cumbersome and this rule is kind of implicit and a little weird anyway. To make this work a little better as we make changes here, add an explicit pinning action, like the one we have in Project ProfileMenus. You can now explicitly choose a query to make default. Test Plan: - Browsed without pinning anything, saw normal behavior. - Pinned queries, viewed `/maniphest/`, saw a non-initial query selected by default. - Pinned a query, deleted it, nothing exploded. {F5098484} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12956 Differential Revision: https://secure.phabricator.com/D18422	2017-08-24 15:21:00 -07:00
Chad Little	748725a47d	Don't select disabled menu items as default Summary: Fixes T12969. If you disable "Home" but leave it at the top, we still load it. Test Plan: Disabled "Home". Move Dashboard into first position, see correct home layout. Reviewers: epriestley Reviewed By: epriestley Spies: Korvin Maniphest Tasks: T12969 Differential Revision: https://secure.phabricator.com/D18455	2017-08-23 09:40:30 -07:00
epriestley	8c3243ef68	Lightly modernize NamedQueryQuery Summary: Ref T12956. No real behavioral changes here, just slightly more modern code. Test Plan: Reviewed named queries in Maniphest and "Edit Queries...". Reviewers: chad Reviewed By: chad Maniphest Tasks: T12956 Differential Revision: https://secure.phabricator.com/D18420	2017-08-14 09:07:11 -07:00
epriestley	cfb86dddd2	Warn users that compound terms separated by apostrophes don't work in the MySQL FULLTEXT index either Summary: Ref T12928. Like `v0.1`, terms in the form `yo's` (sequences of two or fewer characters separated by apostrophes) do not get indexed. Test Plan: {F5078192} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12928 Differential Revision: https://secure.phabricator.com/D18324	2017-08-02 16:06:08 -07:00
Chad Little	ea0db5aa9d	Clean up dropdown carets Summary: Adds dropdown carets to buttons more universally that are actually dropdowns. Test Plan: Differential, Application Search, Diffusion. Mobile and Desktop. Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D18292	2017-07-28 15:11:25 -07:00
epriestley	018d1b77bf	Identify compound short search tokens in the form "xx.yy" as unqueryable in the search UI Summary: Ref T12928. The index doesn't work for these, so show the user that there's a problem and drop the terms. This doesn't fix the problem, but makes the behavior more clear. Test Plan: {F5053703} {F5053704} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12928 Differential Revision: https://secure.phabricator.com/D18254	2017-07-20 14:24:09 -07:00
Aviv Eyal	d1f144b214	Fix Search Application Config Summary: Fix T12924. Looks like this melted in D17384, and nobody noticed yet. Test Plan: Visit page, see fancy table. Reviewers: epriestley, 20after4, #blessed_reviewers Reviewed By: epriestley, 20after4, #blessed_reviewers Subscribers: Korvin Maniphest Tasks: T12924 Differential Revision: https://secure.phabricator.com/D18230	2017-07-18 17:44:56 +00:00
epriestley	b46e2bb4cc	Convert cluster/projects config options to newer modular structure Summary: Ref T12845. Converts the cluster and project config options to the new stuff; this is mostly just shifting boilerplate around. Test Plan: Edited, deleted, and mangled these options from the web UI and CLI. Reviewers: chad, amckinley Reviewed By: amckinley Maniphest Tasks: T12845 Differential Revision: https://secure.phabricator.com/D18166	2017-06-27 12:35:54 -07:00
epriestley	a198590533	Degrade more gracefully when ProfileMenu dashboards fail to render Summary: Ref T12871. This replaces a dead end UI (user totally locked out) with one where the menu is still available, if the default menu item is one which generates a policy exception (e.g., because users can't see the dashboard). Really, we should do better than this and not select this item as the default item if the viewer can't see it, but there is currently no reliable way to test for "can the viewer see this item?" so this is a more involved change. I'm thinking we get this minor improvement into the release, then pursue a more detailed fix afterward. Test Plan: - Added a dashboard as the top item in the global menu. - Changed the dashboard to be visible to only user B. - Viewed Home as user A. - Before patch: entire page is a policy exception dialog. - After patch, things are better: {F5014179} Reviewers: chad, amckinley Reviewed By: amckinley Maniphest Tasks: T12871 Differential Revision: https://secure.phabricator.com/D18152	2017-06-23 12:31:36 -07:00
epriestley	f704f905d2	Let PhabricatorSearchCheckboxesField survive saved query data with mismatched types Summary: Fixes T12851. This should fix the error I'm seeing, which is: * `Argument 1 passed to array_fuse() must be of the type array, boolean given` There may be a better way to patch this up than overriding the getValue() method, however. Test Plan: - Changed the default "Tags" filter to specify `true` instead of `array('self')`, then viewed that filter in the UI. - Before patch: fatal. - After patch: page loads. Note that `true` is not interpreted as `array('self')`, but the page isn't broken, which is a big improvement. Reviewers: #blessed_reviewers, 20after4, chad, amckinley Reviewed By: #blessed_reviewers, amckinley Subscribers: Korvin Maniphest Tasks: T12851 Differential Revision: https://secure.phabricator.com/D18132	2017-06-23 12:29:47 -07:00
Chad Little	00400ae6f9	Search and Replace calls to setShade Summary: grep for setShade and update to setColor. Add deprecated warning. Test Plan: Diffusion, Workboards, Maniphest, Project tags, tokenizer, uiexamples Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin, O14 ATC Monitoring Differential Revision: https://secure.phabricator.com/D17995	2017-05-22 18:59:53 +00:00
epriestley	f880000eb0	Stem fulltext tokens before filtering them for stopwords Summary: Fixes T12596. A query for a token (like "having") which stems to a stopword (like "have") currently survives filtering. Stem it first so it gets caught. Also, for InnoDB, a custom stopword table can be configured. If it is, read that instead of the default stopword list (I configured it locally, but the default list is reasonable so we never formally recommended installs configure it). Test Plan: Queried for words that stem to stopwords, saw them filtered: {F4915843} Queried for the original problem query and saw "having" caught with "have" in the stopword list: {F4915844} Fiddled with local InnoDB stopword table config and saw the stopword list get loaded correctly. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12596 Differential Revision: https://secure.phabricator.com/D17728	2017-04-19 10:02:21 -07:00
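The ordering fix in f880000eb0 (stem first, then filter stopwords) can be sketched like this. The sketch is Python, not Phorge's PHP, and everything in it is illustrative: `TOY_STEMS` stands in for a real stemmer, `STOPWORDS` is a made-up list, and the function names are hypothetical.

```python
# Hypothetical stopword list and stem table, for illustration only.
STOPWORDS = {"have", "the", "a"}
TOY_STEMS = {"having": "have"}


def stem(token):
    """Toy stemmer: look the token up in a fixed table, else return it as-is."""
    return TOY_STEMS.get(token, token)


def filter_without_stemming(tokens):
    """Old behavior: compare the raw token against the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def filter_with_stemming(tokens):
    """New behavior: stem the token first, then check the stopword list."""
    return [t for t in tokens if stem(t) not in STOPWORDS]


query = ["having", "trouble"]
print(filter_without_stemming(query))  # "having" survives the filter
print(filter_with_stemming(query))     # "having" stems to "have" and is dropped
```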
epriestley	6052bc1933	Extend "fulltext" and "ngrams" interfaces from "indexable" interface Summary: Ref T8788. See D17702. This allows `bin/search index` to index stuff which only implements `Ngrams`, not `Fulltext`. Test Plan: Kinda poked around `bin/search index` a bit, yell if you hit more issues deeper down the stack? Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T8788 Differential Revision: https://secure.phabricator.com/D17704	2017-04-17 12:59:41 -07:00
Chad Little	5587abf04c	Remove recentParticipants from ConpherenceThread Summary: We no longer display this any more in the UI, so go ahead and remove the callsites and db column. Test Plan: New Room, with and without participants. Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D17683	2017-04-13 13:55:08 -07:00
Chad Little	2c5ee2a225	Fix Durable Column CSS-Overload Summary: This moves the count on the Conpherence Menu Item into a phui-list-item-count, and removes the CSS call to the entire Conpherence stack when durable column is open. Test Plan: Test with and without the chat column, and a menu with a count Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D17677	2017-04-13 11:29:30 -07:00
epriestley	ada9046e31	Fix a fulltext search issue where finding token length and stopwords could fail Summary: Ref T12137. If a database is missing the InnoDB or MyISAM table engines, the big combined query to get both will fail. Instead, try InnoDB first and then MyISAM. (I have both engines locally so this worked until I deployed it.) Test Plan: Faked an InnoDB error like `secure`, got a MyISAM result. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12137 Differential Revision: https://secure.phabricator.com/D17673	2017-04-12 19:22:46 -07:00
epriestley	3245e74f16	Show users how fulltext search queries are parsed and executed; don't query stopwords or short tokens Summary: Depends on D17670. Fixes T12137. Fixes T12003. Ref T2632. This shows users a readout of which terms were actually searched for. This also drops those terms from the query we submit to the backend, dodging the weird behaviors / search engine bugs in T12137. This might need some design tweaking. Test Plan: {F4899825} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12137, T12003, T2632 Differential Revision: https://secure.phabricator.com/D17672	2017-04-12 19:07:54 -07:00
epriestley	cb49acc2ca	Update Phabricator to use intermediate tokens from the query compiler Summary: Depends on D17669. Ref T12137. Ref T12003. Ref T2632. Ref T7860. Converts Phabricator to the new parse + compile workflow with intermediate tokens. Also fixes a bug where searches for `cat"` or similar (unmatched quotes) wouldn't produce a nice exception. Test Plan: - Fulltext searched. - Fulltext searched in Conpherence. - Fulltext searched with bad syntax. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12137, T12003, T7860, T2632 Differential Revision: https://secure.phabricator.com/D17670	2017-04-12 19:07:33 -07:00
epriestley	4bf968148c	Fix pagination of fulltext search results Summary: Fixes T8285. Fulltext search relies on an underlying engine which can not realistically use cursor paging. This is unusual and creates some oddness. Tweak a few numbers -- and how offsets are handled -- to separate the filtered offset and unfiltered offset. Test Plan: - Set page size to 2. - Ran a query. - Paged forward and backward through results sensibly, seeing the full result set. Reviewers: chad Reviewed By: chad Maniphest Tasks: T8285 Differential Revision: https://secure.phabricator.com/D17667	2017-04-12 17:57:46 -07:00
Chad Little	6bf595b951	Check if viewer is a participant before showing count Summary: In Conpherence ProfileMenuItem we show an unread count if you're a participant, but all message count if you're not. Just remove that. Test Plan: Log out of room in Conpherence, leave messages on second account, check menu item on both accounts. Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D17664	2017-04-12 13:27:07 -07:00
Chad Little	75303567b3	Add a Conpherence Profile Menu Item Summary: Builds a Conpherence Profile Menu Item, complete with counts for the unreads. This allows pinning to home as well as swapping out thread list in Conpherence for pinning eventually. Test Plan: Add a menu item, chat in room, log into other account, see room count. Room count disappears after viewing. Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D17662	2017-04-12 13:07:44 -07:00
epriestley	7e6f37fffb	Rename "ElasticSearch" filenames to "Elasticsearch" (2/2) Sometimes git does some odd magic on case-insensitive filesystems, try to trick it. Auditors: chad	2017-04-02 14:59:36 -07:00
epriestley	a9e2732a5c	Spell "Elasticsearch" correctly, not "ElasticSearch" Summary: Ref T12450. These are like 95% my fault, but Elastic appears to spell the name "Elasticsearch" consistently in their branding. Test Plan: `grep ElasticSearch` Reviewers: chad, 20after4 Maniphest Tasks: T12450 Differential Revision: https://secure.phabricator.com/D17601	2017-04-02 14:58:59 -07:00
epriestley	0f144d29e9	When "cluster.search" changes, don't trust the old index versions Summary: Ref T12450. We track a "document version" for updating search indexes, so that if a document is rapidly updated many times in a row we can skip most of the work. However, this version doesn't consider "cluster.search" configuration, so if you add a new service (like a new ElasticSearch host) we still think that every document is up-to-date. When you run `bin/search index` to populate the index (without `--force`), we just do nothing. This isn't necessarily very obvious. D17597 makes it more clear, by printing "everything was skipped and nothing happened" at the end. Here, fix the issue by considering the content of "cluster.search" when computing fulltext document versions: if you change `cluster.search`, we throw away the version index and reindex everything. This is slightly more work than we need to do, but changes to "cluster.search" are rare and this is much easier than trying to individually track which versions of which documents are in which services, which probably isn't very useful anyway. Test Plan: - Ran `bin/search index --type project`, saw everything get skipped. - Changed `cluster.search`. - Ran `search index` again, saw everything get updated. - Ran a third time without changing `cluster.search`, everything was properly skipped. Reviewers: chad, 20after4 Reviewed By: 20after4 Maniphest Tasks: T12450 Differential Revision: https://secure.phabricator.com/D17598	2017-04-02 13:45:48 -07:00
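The idea in 0f144d29e9 of folding the `cluster.search` configuration into the fulltext document version can be sketched as below. This is a Python illustration under stated assumptions, not the actual implementation: `document_index_version` is a hypothetical name, and hashing a canonical JSON dump of the config with SHA-256 is an assumed mechanism chosen for the example.

```python
import hashlib
import json


def document_index_version(object_version, cluster_search_config):
    """Combine the object's own version with a digest of the search config.

    Any change to the config changes the digest, so previously stored
    versions stop matching and every document is treated as stale.
    """
    canonical = json.dumps(cluster_search_config, sort_keys=True)
    config_digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return f"{object_version}:{config_digest}"


old_config = [{"type": "mysql"}]
new_config = [{"type": "mysql"}, {"type": "elasticsearch"}]

stored = document_index_version(7, old_config)

# Same object version, same config: the document can be skipped.
print(stored == document_index_version(7, old_config))

# Same object version, new service added: the document must be reindexed.
print(stored == document_index_version(7, new_config))
```

As the commit notes, this reindexes more than strictly necessary when the config changes, in exchange for not tracking versions per service.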
epriestley	bd93978200	Count and report skipped documents from "bin/search index" Summary: Ref T12450. There's currently a bad behavior where inserting a document into one search service marks it as up to date everywhere. This isn't nearly as obvious as it should be because `bin/search index` doesn't make it terribly clear when a document was skipped because the index version was already up to date. When running `bin/search index` without `--force` or `--background`, keep track of updated vs not-updated documents and print out some guidance. In other configurations, try to provide more help too. Test Plan: {F4452134} Reviewers: chad, 20after4 Reviewed By: 20after4 Maniphest Tasks: T12450 Differential Revision: https://secure.phabricator.com/D17597	2017-04-02 13:45:30 -07:00
epriestley	6d81675032	Remove "url" from Elasticsearch index Summary: Ref T12450. This was added a very very long time ago (D2298). I don't want to put this in the upstream index anymore because I don't want to encourage third parties to develop software which reads the index directly. Reading the index directly is a big skeleton key which bypasses policy checks. This was added before much of the policy model existed, when that wasn't as much of a concern. On a technical note, this also doesn't update when `phabricator.base-uri` changes. This can be written as a search index extension if an install relies on it for some bizarre reason, although none should and I'm unaware of any actual use cases in the wild for it, even at Facebook. Test Plan: Indexed some random stuff into ElasticSearch. Reviewers: chad, 20after4 Reviewed By: chad Maniphest Tasks: T12450 Differential Revision: https://secure.phabricator.com/D17600	2017-04-02 13:26:45 -07:00
epriestley	64234535e3	Remove FIELD_KEYWORDS, index project slugs as body content Summary: D17384 added a "keywords" field but only partially implemented it. - Remove this field. - Index project slugs as part of the document body instead. Test Plan: - Ran `bin/search index PHID-PROJ-... --force`. - Found project by searching for a unique slug. Reviewers: chad, 20after4 Reviewed By: chad Differential Revision: https://secure.phabricator.com/D17596	2017-04-02 09:36:32 -07:00
Mukunda Modell	cb1d904654	Make sure writes go to the right cluster Summary: Two little issues 1. there was an extra call to getHostForWrite, 2. The engine instance was shared between multiple service definitions so it was overwriting the list of writable hosts from one service with hosts from another. Test Plan: tested in wikimedia production with multiple services defined like this: ```language=json [ { "hosts": [ { "host": "search.svc.codfw.wmnet", "protocol": "https", "roles": { "read": true, "write": true }, "version": 5 } ], "path": "/phabricator", "port": 9243, "type": "elasticsearch" }, { "hosts": [ { "host": "search.svc.eqiad.wmnet", "protocol": "https", "roles": { "read": true, "write": true }, "version": 5 } ], "path": "/phabricator", "port": 9243, "type": "elasticsearch" } ] ``` Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: epriestley Differential Revision: https://secure.phabricator.com/D17581	2017-03-30 18:08:05 +00:00
Mukunda Modell	67a1c40476	Set content-type to application/json Summary: Elasticsearch really wants a raw json body and it fails to accept the request as of es version 5.3 Test Plan: Tested with elasticsearch 5.2 and 5.3. Before this change 5.2 worked but 5.3 failed with `HTTP/406 "Content-Type header [application/x-www-form-urlencoded] is not supported"` [1] After this change, both worked. [1] https://phabricator.wikimedia.org/P5158 Reviewers: epriestley, #blessed_reviewers Reviewed By: epriestley, #blessed_reviewers Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D17580	2017-03-30 18:07:47 +00:00
Mukunda Modell	654f0f6043	Make messages translatable and more sensible. Summary: These exception messages & comments didn't quite match reality. Fixed and added pht() around a couple of them. Test Plan: I didn't test this :P Reviewers: epriestley, #blessed_reviewers Reviewed By: epriestley, #blessed_reviewers Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D17578	2017-03-28 23:17:35 +00:00
epriestley	5f939dcce0	Re-run config validation from `bin/search` Summary: Ref T12450. Normally, we validate config when: - You restart the webserver. - You edit it with `bin/config set ...`. - You edit it with the web UI. However, you can also change config by editing `local.json`, `some_env.conf.php`, a `SiteConfig` class, etc. In these cases, you may miss config warnings. Explicitly re-run search config checks from `bin/search`, similar to the additional database checks we run from `bin/storage`, to try to produce a better error message if the user has made a configuration error. Test Plan: ``` $ ./bin/search init Usage Exception: Setting "cluster.search" is misconfigured: Invalid search engine type: elastic. Valid types are: elasticsearch, mysql. ``` Reviewers: chad, 20after4 Reviewed By: 20after4 Maniphest Tasks: T12450 Differential Revision: https://secure.phabricator.com/D17574	2017-03-28 14:53:26 -07:00
epriestley	c22693ff29	Remove PhabricatorSearchEngineTestCase Summary: Ref T12450. This is now pointless and just asserts that `cluster.search` has a default value. We might restore a fancier version of this eventually, but get rid of this for now. Test Plan: Scrutinized the test case. Reviewers: chad, 20after4 Reviewed By: 20after4 Maniphest Tasks: T12450 Differential Revision: https://secure.phabricator.com/D17573	2017-03-28 13:57:55 -07:00
epriestley	e7c76d92d5	Make `bin/search init` messaging a little more consistent Summary: Ref T12450. This mostly just smooths out the text a little to improve consistency. Also: - Use `isWritable()`. - Make the "skipping because not writable" message more clear and tailored. - Try not to use the word "index" too much to avoid confusion with `bin/search index` -- instead, talk about "initialize a service". Test Plan: Ran `bin/search init` with a couple of different (writable / not writable) configs, saw slightly clearer messaging. Reviewers: chad, 20after4 Reviewed By: 20after4 Maniphest Tasks: T12450 Differential Revision: https://secure.phabricator.com/D17572	2017-03-28 13:57:37 -07:00
Mukunda Modell	699228c73b	Address some New Search Configuration Errata Summary: [ ] Write an "Upgrading: ..." guidance task with narrow instructions for installs that are upgrading. [ ] Do we need to add an indexing activity (T11932) for installs with ElasticSearch? [ ] We should more clearly detail exactly which versions of ElasticSearch are supported (for example, is ElasticSearch <2 no longer supported)? From T9893 it seems like we may //only// have supported ElasticSearch <2 before, so are the two regions of support totally nonoverlapping and all ElasticSearch users will need to upgrade? [ ] Documentation should provide stronger guidance toward MySQL and away from Elastic for the vast majority of installs, because we've historically seen users choosing Elastic when they aren't actually trying to solve any specific problem. [ ] When users search for fulltext results in Maniphest and hit too many documents, the current behavior is approximately silent failure (see T12443). D17384 has also lowered the ceiling for ElasticSearch, although previous changes lowered it for MySQL search. We should not fail silently, and ideally should build toward T12003. [ ] D17384 added a new "keywords" field, but MySQL does not search it (I think?). The behavior should be as consistent across MySQL and Elastic as we can make it. Likely cleaner is giving "Project" objects a body, with "slugs" and "description" separated by newlines? [ ] `PhabricatorSearchEngineTestCase` is now pointless and only detects local misconfigurations. [ ] It would be nice to build a practical test suite instead, where we put specific documents into the index and then search for them. The upstream test could run against MySQL, and some `bin/search test` could run against a configured engine like ElasticSearch. This would make it easier to make sure that behavior was as uniform as possible across engine implementations. [ ] Does every assigned task now match "user" in ElasticSearch? 
[x] `PhabricatorElasticFulltextStorageEngine` has a `json_encode()` which should be `phutil_json_encode()`. [ ] `PhabricatorSearchService` throws an untranslated exception. [ ] When a search cluster is down, we probably don't degrade with much grace (unhandled exception)? [ ] I haven't run bin/search init, but bin/search index doesn't warn me that I may want to. This might be worth adding. The UI does warn me. [ ] bin/search init warns me that the index is "incorrect". It might be more clear to distinguish between "missing" and "incorrect", since it's more comforting to users to see "everything is as we expect, doing normal first-time setup now" than "something is wrong, fixing it". [ ] CLI message "Initializing search service "ElasticSearch"" does not end with a period, which is inconsistent with other UI messages. [ ] It might be nice to let bin/search commands like init and index select a specific service (or even service + host) to act on, as bin/storage --ref ... now does. You can generally get the result you want by fiddling with config. [ ] When a service isn't writable, bin/search init reports "Search cluster has no hosts for role "write".". This is accurate but does not provide guidance: it might be more useful to the user to explain "This service is not writable, so we're skipping index check for it.". [x] Even with write off for MySQL, bin/search index --type task --trace still updates MySQL, I think? I may be misreading the trace output. But this behavior doesn't make sense if it is the actual behavior, and it seems like reindexAbstractDocument() uses "all services", not "writable services", and the MySQL engine doesn't make sure it's writable before indexing. [x] Searching or user fails to find task Grant users tokens when a mention is created, suggesting that stemming is not working. [x] Searching for users finds that task, but fails to find a task containing "per user per month" in a comment, also suggesting that stemming is not working. 
[x] Searching for maniphest fails to find task maniphest.query elephant, suggesting that tokenization in ElasticSearch is not as good as the MySQL tokenization for these words (see D17330). [x] The "index incorrect" warning UI uses inconsistent title case. [x] The "index incorrect" warning UI could format the command to be run more cleanly (with addCommand(), I think). refs T12450 Test Plan: * Stared blankly at the code. * Disabled 'write' role on mysql fulltext service. * Edited a task, ran search indexer, verified that the mysql index wasn't being updated. Reviewers: epriestley, #blessed_reviewers Reviewed By: epriestley, #blessed_reviewers Subscribers: Korvin Maniphest Tasks: T12450 Differential Revision: https://secure.phabricator.com/D17564	2017-03-28 20:19:38 +00:00
epriestley	7d3956bec1	Correct spelling of "Dasbhoard" Summary: Before the speling pollice lock us in prisun. Test Plan: Used a dicationairey. Reviewers: chad, jmeador Reviewed By: jmeador Differential Revision: https://secure.phabricator.com/D17570	2017-03-28 10:04:26 -07:00
Mukunda Modell	9e2f263bb4	Add repositories to fulltext search index. Summary: This implements a simplistic `PhabricatorRepositoryFulltextEngine`. Currently only the repository name, description, timestamps and status are indexed. Note: I had to change the `search index` workflow to disambiguate PhabricatorRepository from PhabricatorRepositoryCommit. Test Plan: * ran `./bin/search index --type PhabricatorRepository --force` * searched for some repositories. Saw reasonable results matching on either title or description. * Edited a repository in the web ui * Added unique key words to the repo description. * I was then able to find that repo by searching for the new keywords. Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: Korvin Tags: #search, #diffusion Differential Revision: https://secure.phabricator.com/D17300	2017-03-28 07:58:22 +00:00
Mukunda Modell	e41c25de50	Support multiple fulltext search clusters with 'cluster.search' config Summary: The goal is to make fulltext search back-ends more extensible, configurable and robust. When this is finished it will be possible to have multiple search storage back-ends and potentially multiple instances of each. Individual instances can be configured with roles such as 'read', 'write' which control which hosts will receive writes to the index and which hosts will respond to queries. These two roles make it possible to have any combination of: * read-only * write-only * read-write * disabled This 'roles' mechanism is extensible to add new roles should that be needed in the future. In addition to supporting multiple elasticsearch and mysql search instances, this refactors the connection health monitoring infrastructure from PhabricatorDatabaseHealthRecord and utilizes the same system for monitoring the health of elasticsearch nodes. This will allow Wikimedia's phabricator to be redundant across data centers (mysql already is, elasticsearch should be as well). The real-world use-case I have in mind here is writing to two indexes (two elasticsearch clusters in different data centers) but reading from only one. Then toggling the 'read' property when we want to migrate to the other data center (and when we migrate from elasticsearch 2.x to 5.x) Hopefully this is useful in the upstream as well. Remaining TODO: * test cases * documentation Test Plan: (WARNING) This will most likely require the elasticsearch index to be deleted and re-created due to schema changes. 
Tested with elasticsearch versions 2.4 and 5.2 using the following config:
```lang=json
"cluster.search": [
  {
    "type": "elasticsearch",
    "hosts": [
      {
        "host": "localhost",
        "roles": { "read": true, "write": true }
      }
    ],
    "port": 9200,
    "protocol": "http",
    "path": "/phabricator",
    "version": 5
  },
  {
    "type": "mysql",
    "roles": { "write": true }
  }
]
```
Also deployed the same changes to Wikimedia's production Phabricator instance without any issues whatsoever. Reviewers: epriestley, #blessed_reviewers Reviewed By: epriestley, #blessed_reviewers Subscribers: Korvin, epriestley Tags: #elasticsearch, #clusters, #wikimedia Differential Revision: https://secure.phabricator.com/D17384	2017-03-26 08:16:47 +00:00
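To illustrate how the 'read' and 'write' roles described in this commit might select hosts, here is a small Python sketch run against the config from the test plan. It is not the Phabricator implementation (which is PHP). The helper names `hosts_for_role` and `services_for_role` are hypothetical, and so are three assumptions baked into it: host-level roles override service-level roles, a service without a "hosts" list counts as one implicit host, and a role that is not listed is treated as disabled. The real defaults may differ.

```python
import json

# The service list from the commit's test plan (the value of "cluster.search").
CLUSTER_SEARCH = json.loads("""
[
  {
    "type": "elasticsearch",
    "hosts": [
      {"host": "localhost", "roles": {"read": true, "write": true}}
    ],
    "port": 9200,
    "protocol": "http",
    "path": "/phabricator",
    "version": 5
  },
  {"type": "mysql", "roles": {"write": true}}
]
""")


def hosts_for_role(service, role):
    """Return the hosts of one service entry that have `role` enabled.

    Assumptions (not confirmed by the source): host-level "roles" override
    service-level "roles"; a service with no "hosts" key is one implicit
    host; an unlisted role counts as disabled.
    """
    defaults = service.get("roles", {})
    selected = []
    for host in service.get("hosts", [{}]):
        roles = {**defaults, **host.get("roles", {})}
        if roles.get(role, False):
            selected.append(host.get("host", "(implicit)"))
    return selected


def services_for_role(config, role):
    """List (service type, hosts) pairs for services with a host in `role`."""
    result = []
    for service in config:
        hosts = hosts_for_role(service, role)
        if hosts:
            result.append((service["type"], hosts))
    return result


if __name__ == "__main__":
    print("write:", services_for_role(CLUSTER_SEARCH, "write"))
    print("read:", services_for_role(CLUSTER_SEARCH, "read"))
```

Under these assumptions, an indexer would send documents only to the services returned for "write", and queries would go only to those returned for "read". That gives the four combinations the commit lists: read-only, write-only, read-write, and disabled.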
Chad Little	2921bad1ff	Add an action to adding Panels from ApplicationSearch Summary: Ref T5307. This adds an additional action to Use Results for creating a panel from the query. Test Plan: Navigate to Maniphest, select dropdown for Use Results. Try any of the following: - Try to set a panel without a name (fail) - Muck up query or engine (fail) - Set a fake Dashboard ID (fail) Give panel a name and select a dashboard I have edit permissions to, get taken to dashboard. Reviewers: epriestley Subscribers: Korvin Maniphest Tasks: T5307 Differential Revision: https://secure.phabricator.com/D17516	2017-03-20 14:15:31 -07:00
Chad Little	de4e8728b2	Add ActionIcon to PHUIListItemView, use in Dashboards Summary: Extends PHUIListItemView to take an icon, link as an "Action Item" that displays on the right side of the menu link. Does not display on Favorites. This allows for adding edit, external, or other links (documentation?) to any menu item. Right now the secondary link is only visible when the item is selected. This feels right, but if we offer it in other ways, users may always want it visible. We could look at making it onhover. Test Plan: Add a bunch of random global and personal dashboards to my menu. Add a menu to Favorites, see no link. Test mobile, link works. {F4136699} Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D17505	2017-03-16 11:32:16 -07:00
epriestley	d6d3ad6f80	Allow administrators to get a list of users who don't have MFA configured Summary: Fixes T12400. Adds a "Has MFA" filter to People so you can figure out who you need to harass before turning on "require MFA". When you run this as a non-admin, you don't currently actually hit the exception: the query just doesn't work. I think this is probably okay, but if we add more of these it might be better to make the "this didn't work" more explicit since it could be confusing in some weird edge cases (like, an administrator sending a non-administrator a link which they expect will show the non-administrator some interesting query results, but they actually just get no constraint). The exception is more of a fail-safe in case we make application changes in the future and don't remember this weird special case. Test Plan: - As an administrator and non-administrator, used People and Conduit to query MFA, no-MFA, and don't-care-about-MFA. These queries worked for an admin and didn't work for a non-admin. - Viewed the list as an administrator, saw MFA users annotated. - Viewed config help, clicked link as an admin, ended up in the right place. {F4093033} {F4093034} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12400 Differential Revision: https://secure.phabricator.com/D17500	2017-03-15 17:49:01 -07:00
epriestley	d73df58cc6	Prevent use of the "quality" constraint in the Badge search API Summary: Ref T12270. This just drops the constraint for now, rather than dealing with all the typecasting stuff and putting us in a position which will almost certainly require backward compatibility breaks in the future. Also renames "badges." to "badge." for consistency (all other methods are singular: token., project., differential.revision.*, etc). Test Plan: Saw "qualities" now "Not Supported", while other constraints continue to work: {F3887194} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12270 Differential Revision: https://secure.phabricator.com/D17487	2017-03-09 12:26:58 -08:00
epriestley	4948a21959	Allow tasks to be searched by subtype Summary: Ref T12314. Allow tasks to be queried by subtype using a typeahead. Open to a better default icon. I'll probably let you configure them later. Just hide this constraint if there's only one subtype. Test Plan: - Searched for subtypes. - Verified that the control hides if there is only one subtype. {F3492293} Reviewers: chad Reviewed By: chad Maniphest Tasks: T12314 Differential Revision: https://secure.phabricator.com/D17444	2017-03-02 04:20:38 -08:00
Chad Little	3f1ee67972	Add a tooltip option to Link menu items Summary: Ref T12174. Lets users add a tooltip to LinkProfileMenuItem. Test Plan: Add a tooltip, remove tooltip. Menu appears as expected. Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin Maniphest Tasks: T12174 Differential Revision: https://secure.phabricator.com/D17437	2017-03-01 11:16:25 -08:00
Chad Little	54059b0a9d	Add fulltext search results panel back for dashboards Summary: Ref T12324. Adds back this query for search results in dashboards. Test Plan: Use panel in Dashboard. Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin Maniphest Tasks: T12324 Differential Revision: https://secure.phabricator.com/D17428	2017-02-27 12:45:17 -08:00

718 commits