Hi Lars,
When using ft:extract() on nodes, it seems to clip into the match itself too often. Is it possible to have ft:extract() leave as much before the match as after?
The ft:extract algorithm is an intricate one [1], as the inputs it has to process can vary widely, but we can try to tweak it a little. Could you please provide us with a small sample XML document and query?
Best, Christian
PS: If you, or someone else, would like to tweak the extract code, we are always glad to receive patches!
[1] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/ba...
For example, here are two results for "spise lunsj" (Norwegian for "eat lunch"). The first is as it should be, while the second has half of the matched string clipped. The results are obtained first as a set of hits using a full-text search, [text() contains text {$terms} all]; each hit is then processed with ft:extract($hit, $terms):
... gjerne komme for å spise lunsj med meg på Harrods. Da skal jeg servere hjortetestikler fra eiendommen min i Skottland. Vi trenger nemlig alle store...
... de franske VM-spillerne hjem til Paris. Der ble de mottatt av fans både på Charles de Gaulle-flyplassen og da de ankom Elysee-palasset for å spise...
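To make the requested behaviour concrete, here is a minimal sketch of a snippet that keeps the same amount of context on both sides of the match and never cuts into the match itself. It is written in Python purely as an illustration and is not the BaseX ft:extract algorithm; the function name centered_snippet and the context parameter are hypothetical, and it does a plain case-sensitive substring search rather than a real full-text match.

```python
from typing import Optional


def centered_snippet(text: str, phrase: str, context: int = 40) -> Optional[str]:
    """Return `phrase` with up to `context` characters kept on each side.

    Hypothetical helper for illustration only; it is not part of BaseX.
    """
    start = text.find(phrase)          # plain substring search, case-sensitive
    if start < 0:
        return None                    # phrase does not occur in the text
    end = start + len(phrase)
    left = max(0, start - context)     # never run past the start of the text
    right = min(len(text), end + context)
    prefix = "..." if left > 0 else ""           # mark clipped leading text
    suffix = "..." if right < len(text) else ""  # mark clipped trailing text
    return prefix + text[left:right] + suffix


if __name__ == "__main__":
    hit = "gjerne komme for å spise lunsj med meg på Harrods."
    print(centered_snippet(hit, "spise lunsj", context=10))
```

Because the window is computed from the match boundaries outwards, the matched phrase is always returned whole; only the surrounding context is shortened when the match sits near the start or end of the text.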
Regards, Lars G Johnsen National Library of Norway
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk