Hi Christian --
Your version, slightly tweaked because it wanted to return xs:double rather than xs:integer:

declare function local:realPos(
  $current as element(table:table-cell)
) as xs:integer {
  let $cells := $current/preceding-sibling::table:table-cell
  let $repeated := $cells/@table:number-columns-repeated
  let $total := count($cells) + 1 + sum($repeated) - count($repeated)
  return xs:integer($total)
};

ran, on 8.4.4, in 54036.0 ms. (This is comparable to the previous version where everything was just [4] and I didn't know about the collapsed cells.)
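As far as I can tell, the xs:double comes from fn:sum: without a schema the attribute values are xs:untypedAtomic, and fn:sum casts those to xs:double before adding. A minimal sketch of that, using a made-up two-cell row and assuming the usual ODF table namespace URI (the thread never shows the declaration):

```xquery
declare namespace table = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";

(: hypothetical sample row; constructed attributes are xs:untypedAtomic :)
let $row :=
  <table:table-row>
    <table:table-cell table:number-columns-repeated="3"/>
    <table:table-cell/>
  </table:table-row>
let $repeated := $row/table:table-cell/@table:number-columns-repeated
return (
  (: untyped values are cast to xs:double by fn:sum :)
  sum($repeated) instance of xs:double,
  (: casting each value first keeps the sum an xs:integer :)
  sum($repeated ! xs:integer(.)) instance of xs:integer
)
```

Both items should come back true, so casting the attributes individually is an alternative to wrapping the total in xs:integer(); which one reads better is a matter of taste.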
My previous attempt took 1.011152923E7 ms (about 2.8 hours) to complete.
I'm going to remember that hint about "bind expressions to variables"! Two whole orders of magnitude improvement.
(And yes, this will always be called on table:table-cell.)
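For the record, a small self-contained check of the arithmetic. The row is hypothetical (columns A to C collapsed into one cell, label in D, description in E), and the namespace URIs are the usual ODF ones, assumed here rather than taken from the real file:

```xquery
declare namespace table = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
declare namespace text  = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

declare function local:realPos(
  $current as element(table:table-cell)
) as xs:integer {
  let $cells := $current/preceding-sibling::table:table-cell
  let $repeated := $cells/@table:number-columns-repeated
  let $total := count($cells) + 1 + sum($repeated) - count($repeated)
  return xs:integer($total)
};

(: hypothetical row: A-C collapsed, label in D, description in E :)
let $row :=
  <table:table-row>
    <table:table-cell table:number-columns-repeated="3"/>
    <table:table-cell><text:p>label</text:p></table:table-cell>
    <table:table-cell><text:p>description</text:p></table:table-cell>
    <table:table-cell table:number-columns-repeated="4"/>
  </table:table-row>
return (
  (: first spreadsheet column covered by each cell: 1, 4, 5, 6 :)
  $row/table:table-cell ! local:realPos(.),
  (: the cell that starts at column D (4): "label" :)
  $row/table:table-cell[local:realPos(.) = 4]/text:p/string()
)
```

Note that the function gives the first column a cell covers, so a collapsed cell that merely spans column D without starting there would not match the "= 4" test. That is fine for the label and description cells, which carry values and so are not collapsed.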
Thank you very much! Graydon
On Tue, May 24, 2016 at 4:29 PM, Christian Grün <christian.gruen@gmail.com> wrote:
Hi Graydon,
This might give you faster results (provided that it does the same as your function, I’m not 100% sure…):
declare function local:realPos(
  $current as element(table:table-cell)
) as xs:integer {
  let $cells := $current/preceding-sibling::table:table-cell
  let $repeated := $cells/@table:number-columns-repeated
  return count($cells) + 1 + sum($repeated) - count($repeated)
};
Some general hints: It’s advisable to bind expressions to variables if they will be accessed more than once. Next, fn:path is pretty expensive indeed (I guess you’ve already noticed that yourself anyway). Finally, if type definitions in function signatures are specified as strictly as possible, you will get better error messages, and it helps others to understand the code (is it correct that your function will always be called with elements named table:table-cell?).
Hope this helps, Christian
On Tue, May 24, 2016 at 9:47 PM, Graydon Saunders <graydonish@gmail.com> wrote:
Hi all --
So in Open Document Spreadsheets, columns with empty cells may be collapsed:

<table:table-row table:style-name="ro2">
  <table:table-cell table:number-columns-repeated="3" table:style-name="ce454" />
  <table:table-cell table:style-name="ce255" />
  <table:table-cell table:style-name="ce255" office:value-type="string">
    <text:p>label</text:p>
  </table:table-cell>
  <table:table-cell table:style-name="ce255" office:value-type="string">
    <text:p>description</text:p>
  </table:table-cell>
  <table:table-cell table:number-columns-repeated="4" table:style-name="ce455" />
</table:table-row>

Not all of the rows have these collapsed empty cells; in some rows, all the cells are present because they've got values in them. And while I wouldn't care in the example row, sometimes more than one table:table-cell represents a collapsed column in a position I care about.
All the labels are in column D and all the descriptions in column E (in spreadsheet terms), whether all the table:table-cell elements are present in the XML representation of the sheet or not, and I need to reliably find them so as to be able to convince myself that the XSLT transform is getting correct answers. (The spreadsheet is about 40 MiB; not so much by recent list standards, but still more than I can hope to read. :)
declare function local:realPos($current as node()) as xs:integer {
  (: the horror, the horror :)
  xs:integer(
    path($current) ! tokenize(., '/')[last()] ! replace(., '.*\[(\p{Nd}+)\]', '$1') ! xs:integer(.)
    + sum($current/preceding-sibling::table:table-cell/@table:number-columns-repeated)
    - count($current/preceding-sibling::table:table-cell/@table:number-columns-repeated)
  )
};

works, in the sense of "will get the cell corresponding to the intended spreadsheet column even when some of the columns have been collapsed with @table:number-columns-repeated".
It's horribly slow and it hurts just to look at, though, so -- is there a better way?
(The XSLT transform pre-processes all the collapsed table:table-cell elements back into place, among other tidying. I'm not at all sure that's a better way with XQuery.)
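For comparison, a rough sketch of what that "put the collapsed cells back" approach might look like in XQuery. The function name local:expand and the sample row are made up for illustration, the namespace URIs are the usual ODF ones, and the sketch only copies table:table-cell children, so anything else in a row would be dropped:

```xquery
declare namespace table = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
declare namespace text  = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

(: hypothetical helper: repeat each collapsed cell as many times as it claims :)
declare function local:expand(
  $row as element(table:table-row)
) as element(table:table-row) {
  element { node-name($row) } {
    $row/@*,
    for $cell in $row/table:table-cell
    (: a missing attribute means the cell stands for exactly one column :)
    let $n := ($cell/@table:number-columns-repeated ! xs:integer(.), 1)[1]
    for $i in 1 to $n
    return element { node-name($cell) } {
      $cell/@* except $cell/@table:number-columns-repeated,
      $cell/node()
    }
  }
};

(: hypothetical row: A-C collapsed, label in D, description in E :)
let $row :=
  <table:table-row>
    <table:table-cell table:number-columns-repeated="3"/>
    <table:table-cell><text:p>label</text:p></table:table-cell>
    <table:table-cell><text:p>description</text:p></table:table-cell>
  </table:table-row>
(: after expansion a plain positional [4] finds column D :)
return local:expand($row)/table:table-cell[4]/text:p/string()
```

With the cells expanded, plain positional predicates line up with spreadsheet columns again, at the cost of building a new row for every row inspected.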
Thanks! Graydon