Re: [basex-talk] Matching multiple names across a list of sequences of names

4 Apr 2016


Hi Graydon,
...
I can't give you a real example because it's the client's health care data,
No problem, your example looks fine.
...
let $found := //*[@name eq $match(1)]
    [./descendant::*[@name eq $match(2)]
        [./descendant::*[@name eq $match(3)]]]
Right. You could try to rewrite this for index access:
1. You’ll have to mark the generated arrays as string arrays:
let $composedNames as array(xs:string)* :=
      for $x in $composed//composed
      (: '.' is a regex wildcard in tokenize(), so the dot must be escaped :)
      return array { tokenize($x/string(), '\.') }
2. You need to replace "eq" with "=", and you can simplify the
predicates a little:
let $found := //*[@name = $match(1)]
    [descendant::*[@name = $match(2)]
        /descendant::*/@name = $match(3)]
You indicated that you’ll have thousands of paths. What do they look
like? Could you add some more examples (besides
"class.operation.specifier")? Are some parts of the paths more
specific than others? E.g...
   A.A.A
   A.A.B
   A.A.C
   A.B.D
   A.B.E
   A.B.F
   ...
In this case, it could make sense to only look for the last path
segment via the index. You could also try to group your results by the
first segment, then do the search on the second segment, etc. See my
attached query as an example (I’m sure it needs to be revised to work
properly, because I have only run it with your simple example file).
Does this help?
Christian
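A minimal sketch of the grouping idea described above (group by the first segment, then search on the second, and so on). This is not the attached query, which is not included here. The names $doc and $triplets and the sample element names are hypothetical stand-ins for the real data, and whether an attribute index is actually used for the lookups depends on the optimizer.

```xquery
(: Hypothetical sample data; the real database and name list will differ. :)
declare variable $doc := document {
  <root>
    <node name="A">
      <node name="x">
        <node name="B"><node name="C"/></node>
      </node>
    </node>
  </root>
};
declare variable $triplets := ('A.B.C', 'A.B.D', 'A.E.F');

for $t in $triplets
let $p := tokenize($t, '\.')
group by $class := $p[1]
(: one lookup per distinct class name, not one per triplet :)
let $classHits := $doc//*[@name = $class]
where exists($classHits)
return
  for $t2 in $t
  let $p2 := tokenize($t2, '\.')
  group by $op := $p2[2]
  (: only search below the elements that matched the class :)
  let $opHits := $classHits/descendant::*[@name = $op]
  where exists($opHits)
  return
    for $t3 in $t2
    let $spec := tokenize($t3, '\.')[3]
    let $found := $opHits/descendant::*[@name = $spec]
    where exists($found)
    return <match path="{$t3}" count="{count($found)}"/>
```

With this sample data, only A.B.C should be reported: A.B.D has no matching specifier, and A.E.F is dropped as soon as the operation E is not found below A. Triplets that share a class or operation reuse the same intermediate result, which is where the savings would come from if many paths share leading segments.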
...
This works, but it's going over the entire database for every three part
class-operation-specifier compound name.  I can't shake the feeling that
there's a more efficient way to do this, but I can't see what it might be.
Thanks!
Graydon
On Fri, Apr 1, 2016 at 12:04 PM, Christian Grün christian.gruen@gmail.com
wrote:
...
Hi Graydon,
Do you think there’d be a chance for us to get a minimized,
self-contained example, which demonstrates the n^2 solution?
Thanks in advance,
Christian
On Fri, Apr 1, 2016 at 5:24 PM, Graydon Saunders graydonish@gmail.com
wrote:
...
Hello -
I've got a problem I'm not sure how to best approach.
I've got triplets of names -- class.operation.specifier -- that I need
to match against much longer sequences of names. (Which are in
attributes in an XML hierarchy; each sequence of names derives from a
path to a leaf element.)
If there is a match (as there usually is not), one of the names in the
sequence of names will match the class, a subsequent name will match the
operation, and a name subsequent to that will match the specifier. (All
simple string values.)
The naive n^2 version is much too slow for the amount of data involved.
Is there an efficient way to do this kind of matching?
Thanks!
Graydon
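
To make the matching rule above concrete, here is a small sketch of the ordered test on a plain sequence of names. The function name local:matches-in-order and the sample names are hypothetical; it only illustrates the "class, then later operation, then later specifier" rule and does nothing about the n^2 cost.

```xquery
(: True if every name in $wanted occurs in $names in the same order,
   with any number of other names in between. :)
declare function local:matches-in-order(
  $names  as xs:string*,
  $wanted as xs:string*
) as xs:boolean {
  if (empty($wanted)) then true()
  else
    (: position of the first occurrence of the next wanted name :)
    let $pos := index-of($names, head($wanted))[1]
    return
      if (empty($pos)) then false()
      else local:matches-in-order(
        subsequence($names, $pos + 1), tail($wanted))
};

(: class, operation and specifier appear in order: true :)
local:matches-in-order(
  ('root', 'class', 'x', 'operation', 'y', 'specifier'),
  ('class', 'operation', 'specifier')),
(: operation appears before class, so no ordered match: false :)
local:matches-in-order(
  ('operation', 'class', 'specifier'),
  ('class', 'operation', 'specifier'))
```

Taking the earliest occurrence of each name is enough here, because matching as early as possible never rules out a later match for the remaining names.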
