Hi. I'm trying to use the BaseX QueryProcessor with Hadoop to process very large XML files. I use an XmlInputFormat to split the XML files, and process each node with BaseX.
I prefer to use QueryProcessor for this and not the server (each individual element is not very large and I don't want to store the input files on a BaseX server).
The problem is that when I use modules in the query, BaseX can't open the module files in the distributed filesystem. I get "[XQST0059] Could not retrieve module …".
I'm passing the query using "new QueryProcessor(query_string, basex_context);", and binding the input with the bind method.
Is there a way to include the module programmatically so the QueryProcessor doesn't try to open the xquery file? Maybe with org.basex.query.util.pkg.ModuleLoader or QueryContext.module, but I haven't figured out how to do it yet.
Regards
Hi Tiago,
query modules need to be placed either in the file system, or in the repository [1], of a local machine, and (as you already noticed) cannot be passed on by our client APIs. One of the reasons is that many modules have additional dependencies, which would all have to be passed on by a client.
To begin with, I would recommend simply embedding the query module functions in your original query. If there are reasons why this seems impracticable to you, please let me know.
Christian
[1] http://docs.basex.org/wiki/Repository
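
For illustration, the embedding suggested above could be done on the query string before it is handed to the QueryProcessor. The following is only a rough sketch, not part of BaseX: the class ModuleInliner, its inline method and the example namespace URI are hypothetical names. It assumes the module has no imports of its own, that the import being replaced is the last namespace or import declaration in the main query's prolog, and it uses regular expressions rather than a real XQuery parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a BaseX class): embeds a library module
// into the main query so that no module file has to be resolved.
final class ModuleInliner {
  // Matches: module namespace PREFIX = "URI";
  private static final Pattern MODULE_DECL = Pattern.compile(
      "module\\s+namespace\\s+([\\w.-]+)\\s*=\\s*[\"']([^\"']*)[\"']\\s*;");
  // Matches: import module namespace PREFIX = "URI" [at "location"];
  private static final Pattern IMPORT = Pattern.compile(
      "import\\s+module\\s+namespace\\s+([\\w.-]+)\\s*=\\s*[\"']([^\"']*)[\"'][^;]*;");

  static String inline(String mainQuery, String moduleSource) {
    Matcher decl = MODULE_DECL.matcher(moduleSource);
    if (!decl.find()) {
      throw new IllegalArgumentException("No module declaration found");
    }
    String modulePrefix = decl.group(1);
    String uri = decl.group(2);
    // Everything after the module declaration: the module's declarations.
    String declarations = moduleSource.substring(decl.end()).trim();

    Matcher imp = IMPORT.matcher(mainQuery);
    while (imp.find()) {
      if (!imp.group(2).equals(uri)) continue; // import of another module
      String importPrefix = imp.group(1);
      StringBuilder sb = new StringBuilder();
      // Replace the import by a plain namespace declaration.
      sb.append("declare namespace ").append(importPrefix)
        .append(" = \"").append(uri).append("\";\n");
      // The module may use a different prefix for the same namespace.
      if (!importPrefix.equals(modulePrefix)) {
        sb.append("declare namespace ").append(modulePrefix)
          .append(" = \"").append(uri).append("\";\n");
      }
      sb.append(declarations);
      return mainQuery.substring(0, imp.start()) + sb
          + mainQuery.substring(imp.end());
    }
    throw new IllegalArgumentException("No import found for " + uri);
  }

  public static void main(String[] args) {
    // Example inputs; the URI and function are made up for this sketch.
    String module =
        "module namespace m = \"http://example.org/m\";\n"
      + "declare function m:twice($x as xs:integer) as xs:integer { 2 * $x };\n";
    String query =
        "import module namespace m = \"http://example.org/m\" at \"m.xqm\";\n"
      + "m:twice(21)";
    System.out.println(inline(query, module));
  }
}
```

The returned string no longer contains an import, so it can be passed as query_string to "new QueryProcessor(query_string, basex_context);" as before. In a Hadoop job the module text would have to be read once (for example from the job configuration) rather than from a file path inside the query.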
On Tue, Aug 7, 2012 at 3:58 AM, Tiago Freitas coolkcah@gmail.com wrote:
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk