Hi Guys,
I was trying to use the fn:collection function today, but I am having trouble understanding the implementation. In my past experience, fn:collection($uri), where $uri points to a document containing a list of documents, has returned a sequence of all the listed documents. It appears that BaseX does not implement it this way.
I have a database in which I am storing a large number of documents, and I would like to have several collections inside it, each a subset of these documents. I need to open these subsets quickly, and I thought fn:collection would allow me to do this. Is there any way to easily open these subsets using fn:collection (or a similar high-performance function)? Right now I am solving the problem by using a for loop to traverse the documents, but this is not fast enough for my needs.
Thanks,
Jeremy
On Thu, Oct 17, 2013 at 1:44 AM, Christian Grün <christian.gruen@gmail.com> wrote:
Hi Jeremy,
It’s somewhat surprising, but XQuery itself has no database semantics. This is why the behavior of fn:collection is largely implementation-defined [1]. In BaseX, the function first checks whether the specified URI matches a database. If it does not, the URI is resolved against the file system. I’m not sure why it didn’t behave that way in your scenario, but I’m pretty sure you’ll find some answers in our Wiki article on BaseX databases [2]. If not, feel free to give us some more feedback.
Christian
[1] http://www.w3.org/TR/xpath-functions/#func-collection
[2] http://docs.basex.org/wiki/Databases
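To illustrate the behavior described above, here is a minimal sketch of how fn:collection can address a BaseX database. The database name "db" and the path "reports" are hypothetical and only serve the example. It assumes that the database exists, so the URI is not resolved against the file system.

```xquery
(: Sketch only: "db" and "reports" are hypothetical names. :)

(: All documents stored in the database "db" :)
let $all := collection("db")

(: Only the documents stored under the path "reports" in "db" :)
let $some := collection("db/reports")

(: Compare the sizes of the two sequences :)
return (count($all), count($some))
```

Path-based addressing like this only helps when the wanted subset shares a common path.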
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
Hey Christian,
Thanks for the response. fn:collection behaves as specified in the BaseX documentation, so I guess my question becomes: is there a way to open a subset of documents (not distinguishable by path) in the database with performance similar to calling db:open($db_name)?
Thanks,
Jeremy
On Thu, Oct 17, 2013 at 10:18 AM, Christian Grün <christian.gruen@gmail.com> wrote:
Hi Jeremy,
> Is there a method to open a subset of documents (not distinguishable by path) in the database with performance similar to calling db:open($db_name)?
Are there any criteria for the documents you want to open? If you simply want to choose the first 10 documents, you could try a position filter:
collection("db")[position() = 1 to 10]
Talking about performance: fn:collection and db:open are based on the same code, so there shouldn’t be any difference.
Hope this helps, Christian
Yes, sorry, I should have specified the criteria. I have a list of the documents in the database that need to be opened, which is a subset of all documents (I can store this list in any form necessary), but I am experiencing performance problems because I have to iterate over the list to filter or choose which documents to open.
Thanks,
Jeremy
On Thu, Oct 17, 2013 at 10:41 AM, Christian Grün <christian.gruen@gmail.com> wrote:
Hi Jeremy,
If the list is more or less arbitrary, then you’ll indeed have to browse all your documents in order to find the ones that are relevant to you. One approach could be to specify a filtering predicate:
let $paths := ("a.xml", "b.xml")
return db:open("db")[db:path(.) = $paths]
If this is too slow, string comparisons can be sped up by using a map, as recently proposed on this list:
let $paths := ("a.xml", "b.xml")
let $map := map:new($paths ! { . : true() })
return db:open("db")[$map(db:path(.))]
How many documents are stored in your database?
Best, Christian
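The map-based filter above is written in the map syntax that BaseX supported at the time of this thread. A minimal sketch of the same idea in XQuery 3.1 syntax might look as follows. It assumes a database named "db" (hypothetical) and a BaseX version that still provides db:open; newer releases appear to have renamed this function to db:get, so check the documentation of the version in use.

```xquery
(: Sketch only: "db", "a.xml" and "b.xml" are placeholder names. :)
let $paths := ("a.xml", "b.xml")

(: Build a lookup map with one entry per wanted path :)
let $wanted := map:merge($paths ! map { . : true() })

(: Keep only those documents whose database path is a key of the map :)
return db:open("db")[map:contains($wanted, db:path(.))]
```

As with the original version, every document in the database is still visited once. Only the per-document comparison against the path list is replaced by a map lookup.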
I am dealing with collections of documents in the 2500+ range. The code you suggested is similar to what I have already, but much cleaner. I'm assuming the lookup within the map is done in constant time?
Cheers,
Jeremy
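Since the list of wanted paths is already known, another option that may be worth trying is to open each listed document directly by its path instead of filtering the whole database. This is a minimal sketch, assuming the two-argument form db:open($db, $path) and the hypothetical database name "db". Whether it is actually faster for 2500+ documents is not confirmed anywhere in this thread and would need to be measured.

```xquery
(: Sketch only: "db", "a.xml" and "b.xml" are placeholder names. :)
let $paths := ("a.xml", "b.xml")

(: Open each listed document by its database path, one lookup per path :)
for $path in $paths
return db:open("db", $path)
```

With this variant the work grows with the length of the path list rather than with the total number of documents, provided the database can resolve a path without scanning all documents.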