Noise Words - CV Text Search

Created by Martin Parkinson, Modified on Sun, 02 Jul 2023 at 03:19 PM by Martin Parkinson

When you use the CV Text search, Influence searches using the Windows index service.

Common words such as 'the', 'and' and 'out' are ignored when searching; these are known as noise words. It is not possible to search for these words, and they act as placeholders in phrase queries.


A document that contains the text "wag the dog" is stored in the index with "wag" at occurrence 1 and "dog" at occurrence 3. The phrase query "wag dog" does not match, but the phrase query "wag a dog" does, because the occurrence information matches. The phrase "wag purple dog" does not match because "purple" is not found in the index at occurrence 2. However, a query for "wag the dog" returns documents that contain "wag purple dog" because there is no way to efficiently determine whether the document had a non-noise word between "wag" and "dog."
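
The placeholder behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the Windows index service is actually implemented: the noise-word list, the function names (tokenize, build_index, phrase_matches) and the tokenising rule are all hypothetical.

```python
import re

# Hypothetical noise-word list; the real list used by the index service is longer.
NOISE_WORDS = {"the", "and", "out", "a"}

def tokenize(text):
    """Split text into lower-case words (a simplified, assumed rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(text):
    """Map each non-noise word to the set of 1-based positions where it occurs."""
    index = {}
    for position, word in enumerate(tokenize(text), start=1):
        if word in NOISE_WORDS:
            continue  # not stored, but it still uses up a position
        index.setdefault(word, set()).add(position)
    return index

def phrase_matches(index, phrase):
    """Return True if the phrase matches, treating noise words as placeholders."""
    # Keep (offset, word) for the non-noise words of the query only.
    terms = [(offset, word)
             for offset, word in enumerate(tokenize(phrase))
             if word not in NOISE_WORDS]
    if not terms:
        return False  # a query made only of noise words cannot be searched
    first_offset, first_word = terms[0]
    for start in index.get(first_word, ()):
        base = start - first_offset
        if all(base + offset in index.get(word, ()) for offset, word in terms):
            return True
    return False

index = build_index("wag the dog")
print(index)
print(phrase_matches(index, "wag dog"))
print(phrase_matches(index, "wag a dog"))
print(phrase_matches(index, "wag purple dog"))
print(phrase_matches(build_index("wag purple dog"), "wag the dog"))
```

In this sketch the query words are compared only by their relative positions, so a noise word in the query simply leaves a gap that any word in the document can fill.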


For example, the word “Out” is a noise word. This means that if you search for the phrase “Fit Out”, the search will actually look for the word “Fit” followed by any other word.


Once Microsoft has returned a list of what it considers “matching” candidates (i.e. those with the word “Fit” somewhere on their CV), Influence goes through the text displayed on the “CV” page and tries to highlight the phrase “Fit Out” in yellow. As a result, some CVs will have the word “Out” highlighted, because the way Influence looks for text to highlight differs from the way Microsoft finds the results in the first place.
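
The mismatch between searching and highlighting can be illustrated with a small Python sketch. The article does not describe how Influence highlights text, so the sketch assumes that each word of the search phrase is highlighted wherever it appears. The function name highlight_words and the [[...]] markers (standing in for yellow) are hypothetical.

```python
import re

def highlight_words(text, phrase):
    """Wrap each occurrence of any query word in [[...]] markers.

    Assumption: words are highlighted individually, including noise
    words, which the index-based search itself ignores.
    """
    words = [re.escape(word) for word in phrase.split()]
    if not words:
        return text
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda match: "[[" + match.group(0) + "]]", text)

# A CV like this could be returned for "Fit Out" because it contains
# "fit" followed by another word, even though the phrase never appears.
cv_text = "Carried out a shop fit for a retailer"
print(highlight_words(cv_text, "Fit Out"))
```

Under that assumption, the CV above is shown with “out” and “fit” highlighted separately even though it never contains the phrase “Fit Out”.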


If a particular phrase containing a noise word needs to be searched for often, it is best to add a key code/attribute for that phrase.

