…great, that was perfectly easy to reproduce.
If the EXPORT command is used, an alternative option needs to be used … And it’s called EXPORTER [1]. The following script should do the job:
SET CHOP false
SET EXPORTER indent=no,omit-xml-declaration=no
CREATE DB oshb-morphology morphhb/wlc
RUN xquery/oshb-use-qere.xq
EXPORT out/oshb
Some notes (just ignore those that you are already aware of, or that may not matter to you):
• If the options are specified in the script, the .basex file can be kept untouched
• As the input files have an XML declaration, I have added the omit-xml-declaration parameter
• DROP can be… skipped, as CREATE will remove an existing database
• If the initial input is specified in the CREATE command, things will be slightly faster
I think we should merge the options SERIALIZER and EXPORTER in a future version of BaseX, as they have already been a source of confusion in the past.
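To make the distinction concrete, here is a minimal command-script sketch, assuming the same database and output directory as above. SERIALIZER governs how query results are serialized, while EXPORTER governs what the EXPORT command writes. The XQUERY line and its wildcard path are only an illustration and are not part of the original script.

```
# Affects the serialized result of queries (XQUERY, RUN)
SET SERIALIZER indent=no
XQUERY (db:open("oshb-morphology")//*:verse)[1]

# Affects the files written by EXPORT
SET EXPORTER indent=no,omit-xml-declaration=no
EXPORT out/oshb
```

Setting only SERIALIZER, as in the .basex file quoted below, leaves the exported files at the exporter's defaults, which would explain the indented output.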
> I really appreciate all your help with this!
You are welcome!
Christian
[1] https://docs.basex.org/wiki/Options#EXPORTER
On Fri, Jul 16, 2021 at 8:29 PM Jonathan Robie jonathan.robie@gmail.com wrote:
Sure. As I said, I am using these options in .basex:
CHOP = false
SERIALIZER = indent=no
I am using data from the wlc subdirectory of this repo:
https://github.com/openscriptures/morphhb
Here is my .bxs file:
DROP DB oshb-morphology
CREATE DB oshb-morphology
ADD ./morphhb/wlc
RUN ./xquery/oshb-use-qere.xq
EXPORT ./out/oshb
This is the query (oshb-use-qere.xq):
declare default element namespace "http://www.bibletechnologies.net/2003/OSIS/namespace";
declare default function namespace "http://www.w3.org/2005/xquery-local-functions";

declare function local:get-ketiv($base, $catchword) {
  let $prev := $base/preceding-sibling::*[1]
  let $prevstring := fn:string($prev)
  where $prev and fn:ends-with($catchword, $prevstring)
  return (
    $prev,
    if ($prevstring != $catchword) then
      get-ketiv($prev, fn:substring($catchword, 1, fn:string-length($catchword) - fn:string-length($prevstring)))
    else ()
  )
};

declare updating function local:mark-ketiv($variant) {
  for $ketiv in get-ketiv($variant, $variant/catchWord)
  return (
    delete node $ketiv/@type,
    insert node attribute type { fn:string-join(($ketiv/@type, "x-ketiv"), " ") } into $ketiv
  )
};

let $oshb := db:open("oshb-morphology")
for $verse in $oshb//verse[note[@type='variant']]
for $variant in $verse/note[@type='variant']
return mark-ketiv($variant)
I really appreciate all your help with this!
Jonathan
On Fri, Jul 16, 2021 at 1:55 PM Christian Grün christian.gruen@gmail.com wrote:
Hi Jonathan,
Could you provide us with a little step-by-step description that allows us to reproduce your use case?
Thanks in advance, Christian
On Fri, Jul 16, 2021 at 7:42 PM Jonathan Robie jonathan.robie@gmail.com wrote:
I tried adding these options to .basex:
# Local Options
CHOP = false
SERIALIZER = indent=no
It still seems to be putting elements on individual lines, as above, and not just for elements that have been modified. Is there a way to prevent this?
Jonathan
On Fri, Jul 16, 2021 at 1:33 PM Jonathan Robie jonathan.robie@gmail.com wrote:
Hmmm, the original repo puts elements smack dab together on the same line to avoid whitespace issues, perhaps using CSS. When I do the update, it puts the updated elements on separate lines:
< <w lemma="1121 a" morph="HNcmsc" id="01PQe">בֶּן</w><seg type="x-maqqef">־</seg><w lemma="3967" morph="HAcbsa" id="01Exo">מֵאָ֥ה</w>
<w lemma="1121 a" morph="HNcmsc" id="01PQe">בֶּן</w>
<seg type="x-maqqef">־</seg>
<w lemma="3967" morph="HAcbsa" id="01Exo">מֵאָ֥ה</w>
Jonathan
On Fri, Jul 16, 2021 at 11:25 AM Jonathan Robie jonathan.robie@gmail.com wrote:
Yes, that is shorter and more readable. Thanks!
And if I don't have to worry about setting options, that's nicely convenient. Again, thanks!
Jonathan
On Fri, Jul 16, 2021 at 8:53 AM Christian Grün christian.gruen@gmail.com wrote:
Thanks, Jonathan, for the code snippet.
> replace value of node $ketiv/@type with fn:string-join(($ketiv/@type, "x-ketiv"), " ")
This statement should be completely safe, no matter which options you have set. If you want to avoid if/then/else, you can also do the following (but it’s not much shorter):
delete node $ketiv/@type,
insert node attribute type { string-join(($ketiv/@type, "x-ketiv"), " ") } into $ketiv
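The delete/insert pair can be tried without touching the database. The following is a minimal sketch, assuming two made-up <w> elements (hypothetical sample data, not taken from the OSHB files) and using the copy/modify/return expression of XQuery Update so that the changes apply to in-memory copies only.

```xquery
(: Hypothetical sample elements: one with a type attribute, one without. :)
for $sample in (<w type="x-foo">a</w>, <w>b</w>)
return
  copy $ketiv := $sample
  modify (
    (: Deleting an empty sequence does nothing, so a missing @type is fine. :)
    delete node $ketiv/@type,
    (: $ketiv/@type is read before any update is applied, so the old value survives. :)
    insert node attribute type {
      string-join(($ketiv/@type, "x-ketiv"), " ")
    } into $ketiv
  )
  return $ketiv
```

The first element should come back with the old value followed by "x-ketiv" in its type attribute, and the second with a new type attribute containing only "x-ketiv". This is why the pair needs no if/then/else for elements that have no type attribute yet.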