Guide to search terms
The Hoover search engine supports a rich query syntax, borrowed from Elasticsearch, to narrow down search results. Here are some examples.
The simplest search is to enter a few words:
offshore company lawyer
Several words that must appear together, in order:
"lawyer contract"
Will find "sign the lawyer contract with George" but not "get a contract for the lawyer George".
Search on similar words, e.g. names that are spelled in several ways:
Gillian~
Will find both Gillian and Jillian.
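Fuzzy matching of this kind is usually based on edit distance: the number of single-character changes needed to turn one word into another. The following is a minimal Python sketch of that idea, not Hoover's or Elasticsearch's implementation. The function name `edit_distance` is made up for illustration, and the engine's own matching may differ in detail (for example in how swapped letters are counted or how many edits are allowed).

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute (or keep if equal)
            ))
        prev = curr
    return prev[-1]


# The two spellings differ by a single substituted letter.
print(edit_distance("gillian", "jillian"))
```

Because the two names are only one edit apart, a fuzzy search on either spelling can match the other.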
Several words that must appear close together, and the order doesn't matter:
"George Costanza"~3
Will find "talk to Costanza P George about the video". The number 3 specifies how far apart the search words can be.
To filter out documents that contain a term, prefix the term with ! or -:
George -Costanza
George !Costanza
Will find "George Bush" but not "George Costanza".
There are many ways to search for phone numbers, but a nice trick is to filter by country code:
Joe Smith 31*
Will turn up phone numbers from the Netherlands.
During indexing, several fields are extracted so that they can be searched directly, in the form field:value:
from:[email protected]
Will return emails sent by [email protected].
- md5 - md5 checksum
- sha1 - sha1 checksum
- path-parts:"/path/to/folder" - the path should be exact and it will find the documents under that path, including folders and containers. Please copy/paste the "Path" variable from the directory's Meta field into the quotes.
- path-text:"*some part/of_the path//you remember*" - search for the partial path some part/of_the path//you remember. Does not find partial words, so only search for full file names.
- filename - filename of the document, e.g. invoice.pdf
- lang - language (not yet available)
- text - all the text found in the file
- subject - email subject
- from - email sender
- to - email recipients (to, cc, bcc)
- message-id - unique email identifier
- in-reply-to - for reply emails, this is the message-id of the original email
- thread-index - unique identifier for all emails in a thread (not used consistently)
- references - similar to in-reply-to but contains a chain of message-id values
- date - modification date for documents and emails
- date-created - creation date for documents
- ocr:true - will only select documents with OCR data present
Example: This searches for the text "Johnny Cash" in the OCR scans of .jpg files larger than 200KB.
filename:*.jpg size:>200000 ocr:true Johnny Cash
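When queries like this are assembled by a script rather than typed by hand, a small helper can keep the field:value pairs consistent. The sketch below is only an illustration: `build_query` is a hypothetical helper, not part of Hoover, and it only produces the query string without sending it anywhere.

```python
def build_query(*terms: str, **fields: str) -> str:
    """Join field filters and free-text terms into one search string.

    Keyword names use underscores (date_created) because Python identifiers
    cannot contain hyphens; they are converted to the hyphenated field names
    (date-created) used in the search syntax.
    """
    parts = [f"{name.replace('_', '-')}:{value}" for name, value in fields.items()]
    parts.extend(terms)
    return " ".join(parts)


# Rebuilds the example query above from its parts.
query = build_query("Johnny", "Cash", filename="*.jpg", size=">200000", ocr="true")
print(query)  # filename:*.jpg size:>200000 ocr:true Johnny Cash
```

Keyword arguments keep their order in Python 3.7 and later, so the filters appear in the order they were passed.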
Find documents (e.g. emails or PDFs) that were created in a certain time period:
date:[2016-03-01 TO 2016-04-01}
Will return only documents created in March 2016. Square brackets ([ and ]) are for closed intervals, while curly brackets ({ and }) are for open intervals, so the query above includes documents from March 1 but not from April 1.
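The bracket rules can be restated as a small Python check. This is a sketch of the interval semantics only, not of how the search engine evaluates dates; the function `in_range` and its parameter names are assumptions made for the illustration.

```python
from datetime import date


def in_range(d: date, start: date, end: date,
             include_start: bool = True, include_end: bool = False) -> bool:
    """Mimic [start TO end}: '[' or ']' includes the bound, '{' or '}' excludes it."""
    after_start = d >= start if include_start else d > start
    before_end = d <= end if include_end else d < end
    return after_start and before_end


start, end = date(2016, 3, 1), date(2016, 4, 1)

# date:[2016-03-01 TO 2016-04-01} keeps March 1 and drops April 1.
print(in_range(date(2016, 3, 1), start, end))   # True
print(in_range(date(2016, 3, 31), start, end))  # True
print(in_range(date(2016, 4, 1), start, end))   # False
```

Mixing a square bracket on the left with a curly bracket on the right is a convenient way to select a whole month without having to know its last day.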
Filter on file type:
filetype:email
Will search in emails only. Available options are email, pdf, doc, xls, ppt (these include the OpenOffice versions), text, html, folder.