I’m searching for short phrases where I may or may not want to respect word order, and where a phrase may cross element boundaries.

 

For example, I have the phrase “Amazon Alexa Spoke” and I want to find any DITA topic whose title text includes “Amazon Alexa Spoke” in that order, or maybe I want those words in any order, depending on my search requirements.

 

When I run this query against my database, I find only occurrences where all three words are in the same parent element, e.g.:

<title>Create a connection record for the <ph>Amazon Alexa spoke</ph></title>

<title>Create a credential record for the <ph>Amazon Alexa spoke</ph></title>

<title>Set up the <ph>Amazon Alexa spoke</ph></title>

But I do not find titles where one of the words has a different parent element. For example, this title is *not* found, even though it is the one I actually want:

<title><ph id="alexa">Amazon Alexa</ph> Spoke</title>

 

The docs for ft:search() make it clear that it searches individual text nodes:

“Returns all text nodes from the full-text index…”

So I think the behavior here is as documented.
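
To make the text-node behavior concrete, here is a minimal Python sketch. It uses only the standard library and is not BaseX code; the function names are hypothetical, and case-insensitive matching is my assumption. It contrasts checking each text node separately with checking the title's whole string value:

```python
import xml.etree.ElementTree as ET

PHRASE = "amazon alexa spoke"  # lowercased; case-insensitive matching is an assumption


def normalize(text):
    """Lowercase and collapse runs of whitespace to single spaces."""
    return " ".join(text.lower().split())


def found_in_a_text_node(title_xml, phrase=PHRASE):
    """True if the phrase sits entirely inside one text node of the title."""
    title = ET.fromstring(title_xml)
    # itertext() yields each text node separately, in document order.
    return any(phrase in normalize(node) for node in title.itertext())


def found_in_string_value(title_xml, phrase=PHRASE):
    """True if the phrase occurs in the concatenated text of the title."""
    title = ET.fromstring(title_xml)
    return phrase in normalize("".join(title.itertext()))


same_parent = "<title>Set up the <ph>Amazon Alexa spoke</ph></title>"
split_title = '<title><ph id="alexa">Amazon Alexa</ph> Spoke</title>'
```

With these inputs, the per-text-node check succeeds for `same_parent` but fails for `split_title`, while the string-value check succeeds for both. That is the difference I am running into.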

Short of creating a separate database that removes the subelements within <title> elements, is there a way to use full text indexing to do the search I want? In particular, I want to be able to turn the ordered/unordered check on or off.

If I always wanted ordered matching, I could just use a regular expression match. That wouldn’t be very efficient, but efficiency is not a concern in this particular case (although I can see where it would be in a more general search-support situation).
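
As a rough illustration of that fallback, here is a Python sketch (standard library only, not BaseX) that matches against the title's concatenated text and lets me switch the order check on or off. The function `title_matches` and its `ordered` parameter are hypothetical names. Treating "ordered" as an adjacent phrase, and matching case-insensitively on exact word tokens, are assumptions on my part:

```python
import re
import xml.etree.ElementTree as ET


def title_matches(title_xml, words, ordered=True):
    """Match exact words against the title's full text, ignoring subelements.

    ordered=True  -> the words must appear adjacently, in the given order
    ordered=False -> every word must appear somewhere, in any order
    """
    title = ET.fromstring(title_xml)
    # Concatenate all descendant text so <ph> boundaries no longer matter.
    value = "".join(title.itertext()).lower()
    words = [w.lower() for w in words]
    if ordered:
        # Words separated by one or more non-word characters (spaces, punctuation).
        pattern = r"\b" + r"\W+".join(re.escape(w) for w in words) + r"\b"
        return re.search(pattern, value) is not None
    tokens = set(re.findall(r"\w+", value))
    return all(w in tokens for w in words)


split_title = '<title><ph id="alexa">Amazon Alexa</ph> Spoke</title>'
```

For example, `title_matches(split_title, ["Amazon", "Alexa", "Spoke"])` accepts the title that the text-node search misses, and passing `ordered=False` also accepts the words in a different order.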

Or am I missing a more obvious solution to this requirement?

Note that in this case I don’t care about finding different word forms. For this particular search I only care about exact word matches.

 

Cheers,

 

E.

_____________________________________________

Eliot Kimber

Sr Staff Content Engineer

O: 512 554 9368

M: 512 554 9368

servicenow.com