Re: [basex-talk] Ignore option for full-text-search

3 Jan 2016


Hi Günter,
...
My long-running project (kleist-digital.de) is based almost entirely on documents (TEI XML), so I work mostly with mixed content.
Thanks for the link to your project. I just remembered the (in)famous
dash in Kleist’s Marquise (»Hier — traf er«), and it was interesting
to look it up in one of the earliest editions of the text…
...
Are there any plans for future versions to implement the ignore option for full-text-search? It would help a lot.
...
From today's perspective, unfortunately, it has a rather low priority.
The implementation would probably raise various questions that would
first need to be conceptually resolved. Right now, as you probably
know, both the input and the query strings are simply atomized:
<a>A B</a> contains text { <b>B</b> }
  →  'A B' contains text 'B'
In order to ignore certain elements, we’d need to extend the
atomization process to consider element borders. Things get even
more complicated if the data is indexed.
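To make the effect of atomization concrete, here is a minimal sketch. The <note> element and the sample sentence are made up for illustration, and the standard copy/modify expression is only a stand-in for a real ignore option:

```xquery
(: Illustration only: the <note> element and the sample text are made up. :)
let $p := <p>Hier <note>n</note>traf er</p>
return (
  (: string value is 'Hier ntraf er', so the phrase is not found: false :)
  $p contains text 'Hier traf',
  (: remove the note from a copy first; string value is 'Hier traf er': true :)
  (copy $c := $p modify delete node $c//note return $c) contains text 'Hier traf'
)
```

This works for small inputs, but it copies the node for every comparison, so it is probably not a good fit for large or indexed data.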
We usually solve this by creating additional index databases, which
only contain the texts that need to be searched. The following XQuery
expression shows how to create an 'index' database that contains all
documents from 'data', with their 'p' elements removed:
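db:create(
  (: name of the new database :)
  'index',
  (: all documents from 'data', with their 'p' elements removed :)
  db:open('data') update delete node .//p,
  (: keep the original document paths :)
  db:open('data')/db:path(.)
)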
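Once such a database exists, a search could run against 'index' and then fetch the matching originals from 'data'. The following is only a sketch: the search phrase is a placeholder, and it assumes that both databases use the same document paths, as in the expression above:

```xquery
(: Sketch: find documents in the reduced 'index' database,
   then return the corresponding originals from 'data'.
   The phrase 'Hier traf' is only a placeholder. :)
for $doc in db:open('index')[. contains text 'Hier traf']
let $path := db:path($doc)
return db:open('data', $path)
```

In more recent BaseX versions, db:open has been renamed to db:get, so the function name may need to be adapted.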
Hope this helps,
Christian
