I have a BaseX-backed website which stores all the pieces of the page in the DB. There is a significant amount of processing to be done. Pretty much any piece of the page can have effective or expiration dates (so that page parts can come and go on schedule). Some pages have dynamic elements, such as a Twitter feed. The page is built from objects which are wrapped in frames (for adding temporal and styling parameters), which are inserted into columns, which go into rows, which make up a page (which itself is wrapped in a template). So there is a lot of recursion.
I do the recursion using local functions.
It works well, but the page-building query is getting enormous, most of it being these functions that insert parts of the page, often recursively.
I am wondering what the best practice would be for this. I'm working in Scala, and I could break the process up into repeated passes, hitting the DB with a different query for each, but it seems to me that one big query will almost certainly be faster. But including all the local functions in the query makes it huge.
Is there some way to compile and load the functions once on startup, and then just run a small XQuery that calls functions which call functions, etc.?
Any ideas (or resources) for ways to optimize the query? Everything gets reused, so there is very little nesting -- mostly references are passed. For example:
<pages>
  <page>
    <id>1</id>
    <rows>
      <row id="2"/>
      <row id="3"/>
    </rows>
  </page>
</pages>
<rows>
  <row>
    <id>2</id>
    <columns>
      <column width="18">
        <frames>
          <frame id="4"/>
          <frame id="5"/>
        </frames>
      </column>
      <column width="6">
        <frames>
          <frame id="6"/>
        </frames>
      </column>
    </columns>
  </row>
</rows>
<frames>
  <frame>
    <id>4</id>
    <contents>
      <content id="7"/>
    </contents>
  </frame>
</frames>
<contents> <content type="TEXT"/> <id>7</id> <body> <p>Some text here.</p> </body> </content> <content type="WIDGET"/> <id>7</id> <!-- Widget parameters here --> </content> </contents>
Thanks! Chas.
Hi Chas,
using the following documents in the collection "Chas":
<pages> <page id="1"> <rows> <row ref="2"/> <row ref="3"/> </rows> </page> </pages>
<rows> <row id="2"> <columns> <column width="18"> <frames> <frame ref="4"/> <frame ref="5"/> </frames> </column> <column width="6"> <frames> <frame ref="6"/> </frames> </column> </columns> </row> <row id="3"> <columns> <column width="18"> <frames> <frame ref="4"/> <frame ref="5"/> </frames> </column> </columns> </row> </rows>
<frames> <frame id="4"> <contents> <content ref="7"/> </contents> </frame> <frame id="5"> <contents> <content ref="8"/> </contents> </frame> <frame id="6"> <contents> <content ref="7"/> <content ref="8"/> </contents> </frame> </frames>
and
<contents> <content type="TEXT" id="7"> <body> <p>Some text here.</p> </body> </content> <content type="WIDGET" id="8"> <!-- Widget parameters here --> </content> </contents>
the following query:
declare function local:getIt($n as node()) as node() {
  if ($n/@ref) then
    local:getIt(collection('Chas')//*[node-name() eq node-name($n)
                                      and @id eq $n/@ref])
  else
    element { node-name($n) } {
      $n/@*,
      for $child in $n/*
      return
        if ($child instance of element())
        then local:getIt($child)
        else $child
    }
};
local:getIt(<page ref="1"/>)
gives me this result:
<page id="1"> <rows> <row id="2"> <columns> <column width="18"> <frames> <frame id="4"> <contents> <content type="TEXT" id="7"> <body> <p/> </body> </content> </contents> </frame> <frame id="5"> <contents> <content type="WIDGET" id="8"/> </contents> </frame> </frames> </column> <column width="6"> <frames> <frame id="6"> <contents> <content type="TEXT" id="7"> <body> <p/> </body> </content> <content type="WIDGET" id="8"/> </contents> </frame> </frames> </column> </columns> </row> <row id="3"> <columns> <column width="18"> <frames> <frame id="4"> <contents> <content type="TEXT" id="7"> <body> <p/> </body> </content> </contents> </frame> <frame id="5"> <contents> <content type="WIDGET" id="8"/> </contents> </frame> </frames> </column> </columns> </row> </rows> </page>
Is that what you are looking for? (I've replaced the @id attributes with @ref attributes and the <id/> elements with @id attributes and fixed up the <content/> elements, but otherwise it's roughly the same.) It doesn't have all the bells and whistles yet and doesn't even check if an element can be found in the database, but these things can be added later.
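One caveat about the query above: the for clause iterates over $n/*, which selects element children only, so text nodes are silently dropped; that is why <p>Some text here.</p> comes back as an empty <p/> in the result. A variant that also copies text (and comment) nodes would iterate over node() instead; this is a sketch along the same lines, not tested beyond the sample data:

declare function local:getIt($n as node()) as node() {
  if ($n/@ref) then
    local:getIt(collection('Chas')//*[node-name() eq node-name($n)
                                      and @id eq $n/@ref])
  else
    element { node-name($n) } {
      $n/@*,
      (: node() instead of * keeps text and comment children :)
      for $child in $n/node()
      return
        if ($child instance of element())
        then local:getIt($child)
        else $child
    }
};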
Regards,
Huib Verweij.
Hey, Huib, this is great. One of the things I was asking was how others would do it. I'll compare this to my own function and see what I can learn.
The other question I had was whether it was possible to pre-load the embedded database with the functions, rather than having to load them all at the time the XQuery is run. Sort of the way functions or triggers are preloaded into an RDBMS. Do you happen to know if that's possible?
Thanks much for the function! I'll study it carefully. It looks a bit different from my own, so I'm sure I can learn from it.
Chas.
Hi Chas, (and hi Huib),
thanks for your input!
At the moment preloading & compiling queries ahead of time is not possible. The main reason for this is that, from our experience, queries are usually rather short and cheap to compile.
Providing an infrastructure that takes care of precompiled queries might present quite a challenge, yet it should not be impossible.
You two probably knew this already, but it might be of interest to the list as well: there's a very basic benchmark mechanism:
SET RUNS N
Now every query you issue is run N times, including parsing & compiling. You will get average duration values for each step if you enable the Query Info view.
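For example, in the console (using the "Chas" collection from earlier in this thread; the query itself is just an illustration):

SET RUNS 10
XQUERY count(collection('Chas')//frame)

The timings reported afterwards are averages over the ten runs.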
This is not representative, but on my machine the 40 KB functx library [1] takes an average of ~13 ms to parse and ~7 ms to compile, so the overhead introduced by reparsing/recompiling should be rather low in general.
In case you experience particular problems with specific queries, feel free to contact us. Often, tweaking the query a little so that possible index optimizations are recognized correctly by our compiler will speed up queries considerably.
I hope this helps, feel free to ask for more! Feedback is very welcome :-)
Kind regards,
Michael
On 12/29/2010 7:40 AM, Charles F. Munat wrote:
The other question I had was whether it was possible to pre-load the embedded database with the functions, rather than having to load them all at the time the XQuery is run. Sort of the way functions or triggers are preloaded into an RDBMS. Do you happen to know if that's possible?

[1] http://www.xqueryfunctions.com/xq/download.html
Hi Chas,
perhaps the RUN command can help you (if you don't use it already): the whole query is stored in a text file, so the size of the query shouldn't be a problem.
run path/to/xquery.xq
Kind regards, Andreas
Good to know. I'm not worried about performance that much; I'm sure I can optimize the queries once I get them all figured out. What would have made the ability to store functions nice is that many of them are reused in different queries, so currently I have to keep multiple copies. Then if I make a change, I have to remember to make it in all of them. Not a big deal, just a matter of efficiency. But I can work around this on my end by storing the functions in one spot and then just inserting them programmatically as necessary into the query.
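Something like this sketch in Scala is what I have in mind (the names are illustrative and the declarations are shortened, not from real code):

// Keep every XQuery function declaration in one place, and prepend the
// ones a given query needs before sending it to BaseX.
object XQueryFunctions {
  private val declarations: Map[String, String] = Map(
    "local:getIt" ->
      "declare function local:getIt($n as node()) as node() { $n };" // body shortened
  )

  // Prepend the named declarations to a query body.
  def assemble(body: String, names: String*): String =
    names.map(declarations).mkString("", "\n", "\n") + body
}

// Usage: only the small top-level expression changes per page.
val query = XQueryFunctions.assemble("local:getIt(<page ref='1'/>)", "local:getIt")

That way each function lives in exactly one place, and a change propagates to every query that uses it.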
When I get to optimizing, I will gladly take advantage of your offer. Huib has already given me some good ideas.
Chas.
Hi,

it seems like a client-side library manager would be an option, possibly with components like:

1. library storage (central? some sort of source control system, e.g. Mercurial?)
2. merger: takes the "live" XQuery and merges in any function from the library which seems appropriate
   1. real-time: having the libraries somewhere at hand and inserting them on the fly
      1. if you use REST access to BaseX, it could be implemented as a filter at the web server
      2. or, as I use the Python API, I would modify the library to intercept any XQuery command and enhance the XQuery content at that point
   2. preprocessing: building the final XQuery once, at the moment the source code is written, before it is run multiple times
Just some ideas.
Jan
I'm currently using 2.2. Works fine, but I'd like to chop the queries into reusable pieces. Will try that soon.
Chas.
A little while ago I asked something similar. Even when it's not a performance concern, it very quickly becomes a developer's concern when each query is preceded by a huge number of unused functions.
What I did was the following: functions are stored in <function> elements:
<function name='ar:A'>
  declare function ar:A ( $arg as xs:string ) as xs:string {
    let $result := collection()//variable[@name=$arg]/@value
    return (
      if ($result)
      then $result/string()
      else fn:concat('"', fn:concat($arg, ' not found"'))
    )
  }
</function>
<function name='ar:B'>
  <use name="ar:A"/>
  declare function ar:B ( $arg as xs:string ) as xs:string {
    ar:A($arg)
  }
</function>
and then when I call it:
<query>
  <use name="ar:B"/>
  ar:B('test')
</query>
And then I have an XSLT script that, at the time the XML file is read in, transforms the above by replacing each <use> element with the text node of the <function> element of the same name. This is just an unrealistically simple example, of course. It works for me because my queries are stored in an XML file; otherwise you may have to invent a similar method for your particular need.
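A minimal sketch of such a transform, assuming the <function>/<use> layout above (the stylesheet itself is illustrative, not my exact script):

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Copy everything through unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Replace each <use> with the body of the <function> of the same
       name; a <use> nested inside that body expands recursively. -->
  <xsl:template match="use">
    <xsl:apply-templates
        select="//function[@name = current()/@name]/node()"/>
  </xsl:template>
  <!-- Don't copy the library definitions themselves into the output. -->
  <xsl:template match="function"/>
</xsl:stylesheet>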
You may also want to check out the improved error-reporting I posted a few threads back.
Mark Boon
(I hope GMail didn't mangle the formatting too much)
This is an interesting idea, if a bit complex. I'll try it out, but I can also simply create the various functions in the business layer, and then assemble queries as necessary.
Thanks!
Chas.
On 30 Dec 2010, at 04:28, Charles F. Munat wrote:
This is an interesting idea, if a bit complex.
Mark's idea does not seem too complex to me; I think it depends on the environment you're working in. I use a similar trick, storing the XQueries in XSLT templates and calling these when needed. For example:
<xsl:template name="auth"> declare function {...}; </xsl:template>
and then when compiling the query using XSLT:
<query>
  declare namespace ... ;
  <xsl:call-template name="auth"/>
</query>
and then I send it off to the XML database (eXist or BaseX) using HTTP. Using Cocoon this is very easy, and it gives me a lot of control and no maintenance costs (compared to managing stored queries, for instance). There are drawbacks, of course, not the least of which is that oXygen doesn't recognize my XQueries stored inside an XSLT stylesheet, so it doesn't do syntax highlighting etc.
As was said earlier, I wouldn't worry too much about creating large XQueries: the compiler is fast, and as time goes by and the dataset grows, most of the time will be spent executing the query. OTOH, sometimes every millisecond counts.
Huib.
This is nice. I like the changes you made: @ref makes more sense, and moving the id element to an attribute is still clear. Thanks! Very nice.
Chas.