Match Queries

Match queries are auto-generated queries based on an input document (document URI) or URL.

Matching based on an already indexed document

When matching based on an already indexed document, construct the document URI in the following format: doc://environmentName/searchEngine/documentID/[extractorName]. The square brackets mark extractorName as optional.

Example: doc://mycompany_cv/mycandidates/12345/extractor1

The components of the URI are described below. Consult Textkernel Support for the correct values to use in your match queries.

URI Component	Description
environmentName	The name of the environment in which the source document is located. You will receive an error if you try to reference an environment that you do not have access to.
searchEngine	The name of the searcher (within the environment) in which the source document is located.
documentID	The ID of the source document.
extractorName	Optional. The name of the query extractor in the target environment. Different query extractors can be used for different Match query templates. By default, the target environment's default query extractor is used.
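
As an illustration of the URI format described above, the following minimal sketch assembles a document URI from its components. The function name `build_doc_uri` and its validation rules are hypothetical and not part of any Textkernel client library. The sketch also assumes that the trailing slash is omitted when no extractor name is given, and that components contain no slashes.

```python
from typing import Optional


def build_doc_uri(environment_name: str, search_engine: str, document_id: str,
                  extractor_name: Optional[str] = None) -> str:
    """Assemble doc://environmentName/searchEngine/documentID/[extractorName]."""
    parts = [environment_name, search_engine, str(document_id)]
    if extractor_name:
        # The extractor is optional; without it the target environment's
        # default query extractor is used.
        parts.append(extractor_name)
    for part in parts:
        # Assumption: components are plain, non-empty names without slashes.
        if not part or "/" in part:
            raise ValueError(f"invalid URI component: {part!r}")
    return "doc://" + "/".join(parts)


print(build_doc_uri("mycompany_cv", "mycandidates", "12345", "extractor1"))
# doc://mycompany_cv/mycandidates/12345/extractor1
```

The actual component values still have to come from Textkernel Support, as noted above.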

Match Fingerprint

Auto-generated Match queries may contain a special fingerprint to improve the relevance of the Match results. The text_fingerprint captures important nuances from the source document (candidate or job profile). It boosts results containing salient keywords over results containing only common keywords.

The text_fingerprint is never displayed in the Search UI. If the user manually removes all visible query parts one by one, so that the text_fingerprint is the only remaining query part, it is automatically removed from the query.

Match Pre-Filtering

Match query results are pre-filtered by default. If a query is recognized as a Match query, an additional filter component is created and applied, independent of the condition types of the query parts. Hence, even with only nice-to-have and should-have query parts, results that have no chance of being relevant are removed from the result set. Match pre-filtering is based on three components of the query:

the job title
the normalized job group ID
the skills

A Match pre-filter is created and applied only if the query contains all three components. It filters out results that match none of the three components. For the skills component, only a reasonable fraction of the requested skills needs to match. First, only the skill field with the most query parts in the Match query is considered, which is either professional skills or computer skills. Then the number of skills from that field that must match is determined: 1 for a query with fewer than 4 skills, 2 for a query with fewer than 7 skills, and at least 3 for any higher number. (Note that the skill component of the filter exists only to also admit results that satisfy neither the job title nor the job group component.)
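
The skill thresholds and the pass/fail decision described above can be sketched as follows. This is an illustrative model only, not the engine's implementation. The function names and the boolean/count inputs are hypothetical, and `num_skills` is assumed to be the number of skill query parts in the dominant skill field.

```python
def required_skill_matches(num_skills: int) -> int:
    """Number of skills that must match for the skills component to be satisfied."""
    if num_skills < 1:
        # The pre-filter is only created when a skills component is present.
        raise ValueError("a Match pre-filter needs at least one skill")
    if num_skills < 4:
        return 1
    if num_skills < 7:
        return 2
    return 3


def passes_prefilter(matches_job_title: bool, matches_job_group: bool,
                     matched_skills: int, num_skills: int) -> bool:
    """A result is kept if it satisfies at least one of the three components."""
    skills_ok = matched_skills >= required_skill_matches(num_skills)
    return matches_job_title or matches_job_group or skills_ok


# A result that matches neither job title nor job group, for a query with 5 skills:
print(passes_prefilter(False, False, matched_skills=1, num_skills=5))  # False
print(passes_prefilter(False, False, matched_skills=2, num_skills=5))  # True
```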

Currently, the following Match fields are recognized for the different components:

Component	Field Names
job title	`jobtitlesonlyrecent`, `recent_jobtitles`, `vacancytitle`, `jobtitle`
job group	`jobgroupidsonlyrecent`, `recent_profession_groups`, `profession_group`
skills	`compskills`, `profskills`
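
To make the field-to-component mapping concrete, the sketch below checks whether a query contains all three components and picks the skill field with the most query parts. The representation of a query as a plain list of field names is an assumption made for illustration, as are the function names. How the engine resolves a tie between the two skill fields is not documented here, so the alphabetical tie-break is also an assumption.

```python
from collections import Counter

# Field names as listed in the table above.
COMPONENT_FIELDS = {
    "job title": {"jobtitlesonlyrecent", "recent_jobtitles", "vacancytitle", "jobtitle"},
    "job group": {"jobgroupidsonlyrecent", "recent_profession_groups", "profession_group"},
    "skills": {"compskills", "profskills"},
}


def components_present(fields):
    """Return the set of pre-filter components covered by the given field names."""
    return {component
            for component, names in COMPONENT_FIELDS.items()
            if any(field in names for field in fields)}


def prefilter_applies(fields):
    """The pre-filter is only created when all three components are present."""
    return components_present(fields) == set(COMPONENT_FIELDS)


def dominant_skill_field(fields):
    """Skill field with the most query parts, or None if there are no skills."""
    counts = Counter(f for f in fields if f in COMPONENT_FIELDS["skills"])
    if not counts:
        return None
    # Assumption: ties are broken alphabetically (max keeps the first maximum).
    return max(sorted(counts), key=lambda f: counts[f])


query_fields = ["jobtitle", "profession_group", "profskills", "profskills", "compskills"]
print(prefilter_applies(query_fields))     # True
print(dominant_skill_field(query_fields))  # profskills
```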

Minimum matching criteria

Search results only contain documents that match at least one criterion of the query. This keeps the result set focused on relevant documents.

Visible consequences of this feature:

The total match result count of a query shrinks considerably. Without this feature, queries used to return the entire collection size as match size, even though most documents were completely irrelevant because they didn't match any aspect of the query whatsoever.
Cloud suggestions and aggregation counts are now meaningful, because they only consider relevant results. Without this feature, cloud terms and facet counts covered the entire collection, including many irrelevant results.
Faster query execution for queries not containing must-have criteria

A less intuitive consequence of this feature:

If the current query contains only nice-to-have and should-have parts and another nice-to-have/should-have search term is added, the total match count will increase instead of decrease. However, this is correct behaviour: the query has become broader, so there are more results.

Note that this can also happen when a hidden query is defined with only nice-to-have and should-have parts. In this case, the user doesn't see the hidden query parts, so the behaviour is even harder to understand.
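
The growing match count can be reproduced with a toy model. The sketch below treats every query term as an optional (nice-to-have or should-have) criterion and counts the documents that match at least one of them. The documents, terms, and the `match_count` function are invented for illustration and do not reflect how the search engine evaluates queries internally.

```python
def match_count(documents, query_terms):
    """Count documents that match at least one of the optional query terms."""
    terms = set(query_terms)
    return sum(1 for doc_terms in documents if terms & set(doc_terms))


documents = [
    {"java", "spring"},
    {"python", "django"},
    {"sales", "negotiation"},
]

print(match_count(documents, ["java"]))            # 1
print(match_count(documents, ["java", "python"]))  # 2
```

Adding the second optional term broadens the query, so one more document satisfies the minimum matching criterion and the count goes up.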