mirror of https://we.phorge.it/source/phorge.git synced 2025-03-09 19:04:48 +01:00
phorge-phorge/src/applications/search/ferret
epriestley 7ea6de6e9c Split Ferret engine strings for tokenization on any sequence of whitespace
Summary:
Ref T12819. Currently, strings are split only on spaces, but newlines (and, if they exist, tabs) should also split strings.

Without this, we can fail to get the proper term boundary tokens for words which begin at the start of a line or end at the end of a line.

Test Plan: Reindexed a document with "xyz\nabc", saw `"yz "` and `" ab"` term boundary tokens generate properly.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T12819

Differential Revision: https://secure.phabricator.com/D18579
2017-09-08 09:39:57 -07:00
__tests__ Reduce the number of magic strings in the Ferret implementation 2017-09-05 11:57:35 -07:00
PhabricatorFerretEngine.php Split Ferret engine strings for tokenization on any sequence of whitespace 2017-09-08 09:39:57 -07:00
PhabricatorFerretInterface.php Build a prototype fulltext engine ("Ferret") using only basic MySQL primitives 2017-08-28 14:52:59 -07:00
PhabricatorFerretMetadata.php Support Ferret engine for searching users 2017-09-07 13:22:12 -07:00
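
The whitespace fix described in the commit message above can be illustrated with a small sketch. This is not the code from PhabricatorFerretEngine.php. It is a hypothetical Python illustration that assumes term boundary tokens are three-character windows taken over each term after padding it with one space on each side. The function names (`split_terms`, `boundary_trigrams`) are invented for this example.

```python
import re


def split_terms(text, any_whitespace=True):
    """Split text into terms.

    With any_whitespace=True, split on any run of whitespace
    (spaces, tabs, newlines). With False, split on spaces only,
    which mimics the behavior the commit message describes as broken.
    """
    if any_whitespace:
        parts = re.split(r"\s+", text)
    else:
        parts = text.split(" ")
    return [p for p in parts if p]


def boundary_trigrams(text, any_whitespace=True):
    """Pad each term with a space on both sides and emit every
    3-character window (an assumed tokenization, for illustration)."""
    tokens = []
    for term in split_terms(text, any_whitespace):
        padded = " " + term + " "
        for i in range(len(padded) - 2):
            tokens.append(padded[i:i + 3])
    return tokens


fixed = boundary_trigrams("xyz\nabc")
broken = boundary_trigrams("xyz\nabc", any_whitespace=False)

print(fixed)
print(broken)
```

Under these assumptions, splitting on any whitespace yields the `"yz "` and `" ab"` tokens mentioned in the test plan, while splitting on spaces alone treats `"xyz\nabc"` as a single term and produces windows containing the newline instead.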