Hello,
I've got a serious problem here. Any ideas? Thanks.
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk(a)mailman.uni-konstanz.de
Version: BaseX 9.0.2 beta
Java: Oracle Corporation, 1.8.0_151
OS: Linux, amd64
Stack Trace:
java.lang.ArrayIndexOutOfBoundsException: 4288
at org.basex.io.random.TableDiskAccess.read1(TableDiskAccess.java:151)
at org.basex.data.Data.kind(Data.java:294)
at org.basex.query.value.node.DBNode$4.next(DBNode.java:332)
at org.basex.query.value.node.DBNode$4.next(DBNode.java:323)
at org.basex.query.expr.path.IterStep$1.next(IterStep.java:38)
at org.basex.query.expr.path.IterStep$1.next(IterStep.java:32)
at org.basex.query.QueryContext.next(QueryContext.java:392)
at org.basex.query.expr.path.IterPath$1.next(IterPath.java:50)
at org.basex.query.expr.path.IterPath$1.next(IterPath.java:34)
at org.basex.query.expr.ParseExpr.item(ParseExpr.java:58)
at org.basex.query.expr.ParseExpr.atomItem(ParseExpr.java:84)
at org.basex.query.func.fn.FnConcat.item(FnConcat.java:20)
at org.basex.query.expr.ParseExpr.value(ParseExpr.java:71)
at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:177)
at org.basex.query.expr.gflwor.GFLWOR.value(GFLWOR.java:72)
at org.basex.query.expr.path.CachedPath.nodeIter(CachedPath.java:36)
at org.basex.query.expr.path.AxisPath.iter(AxisPath.java:69)
at org.basex.query.expr.gflwor.For$1.next(For.java:107)
at org.basex.query.expr.gflwor.OrderBy$1.sort(OrderBy.java:73)
at org.basex.query.expr.gflwor.OrderBy$1.next(OrderBy.java:54)
at org.basex.query.expr.gflwor.GFLWOR.value(GFLWOR.java:72)
at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:177)
at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:87)
at org.basex.query.QueryContext.next(QueryContext.java:392)
at org.basex.query.scope.MainModule$1.next(MainModule.java:122)
at org.basex.core.cmd.AQuery.query(AQuery.java:94)
at org.basex.core.cmd.XQuery.run(XQuery.java:22)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
at org.basex.server.ClientListener.run(ClientListener.java:140)
Hello.
Briefing:
I want to implement distributed processing with BaseX in Hadoop using Apache Spark. Data processing will be divided into the following stages:
1) Splitting XML into chunks
2) Parallel parsing and filling the database
3) Executing queries to make the table (Apache Spark Dataset<Row>)
Stage 1 is a simple algorithmic problem. It will compose a HashMap of (ChunkNumber -> List<Xml_Path>). Each chunk contains no more than 128 MB of data.
Stage 2. A standalone instance of BaseX will be initialized on each node of the cluster. Every instance will receive files or lines from HDFS as input. The resulting XML database for each chunk will be serialized to HDFS.
Stage 3. When a query request is received, each XML database will be deserialized in turn and the query applied to it. A table will be composed from the results.
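A minimal sketch of the Stage 1 chunking, assuming the file sizes are already known (for example from an HDFS directory listing) and that files are assigned greedily in input order. The class name `XmlChunker`, the method `chunk` and the sample paths are hypothetical; only the 128 MB limit and the (ChunkNumber -> List<Xml_Path>) shape come from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class XmlChunker {
    /** Upper bound per chunk, as stated in the text: 128 MB. */
    static final long MAX_CHUNK_BYTES = 128L * 1024 * 1024;

    /**
     * Greedily assigns files to numbered chunks in iteration order.
     * fileSizes maps an XML path to its size in bytes.
     * A single file larger than maxBytes ends up alone in its own chunk.
     */
    static Map<Integer, List<String>> chunk(Map<String, Long> fileSizes, long maxBytes) {
        Map<Integer, List<String>> chunks = new HashMap<>();
        int chunkNumber = 0;
        long used = 0;
        for (Map.Entry<String, Long> e : fileSizes.entrySet()) {
            long size = e.getValue();
            // Start a new chunk if the current one is non-empty and would overflow.
            if (used > 0 && used + size > maxBytes) {
                chunkNumber++;
                used = 0;
            }
            chunks.computeIfAbsent(chunkNumber, k -> new ArrayList<>()).add(e.getKey());
            used += size;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Hypothetical paths and sizes, for illustration only.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("hdfs:///data/a.xml", 100L * 1024 * 1024);
        sizes.put("hdfs:///data/b.xml", 50L * 1024 * 1024);
        sizes.put("hdfs:///data/c.xml", 20L * 1024 * 1024);
        System.out.println(chunk(sizes, MAX_CHUNK_BYTES));
    }
}
```

Greedy assignment keeps the input order and is simple, but it does not minimize the number of chunks; a bin-packing heuristic could be swapped in if that matters.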
Questions:
1) Send data from HDFS to embedded BaseX:
1.1) Does BaseX support reading data from a URI with a scheme, e.g. `hdfs://home/user/file.xml`?
1.2) Can I send XML from RAM to BaseX?
1.3) Can I send XML lines (line by line) to BaseX?
2) Can I obtain a database held in RAM in order to serialize it to HDFS?
3.1) Do I need to store XML in a persistent path to query it in the future?
3.2) When executing a query on XML in HDFS, can I read it line by line if BaseX does not know how to work with it directly?
Best regards,
Andrei Iatsuk.
Hello,
I used the BaseX GUI to create a database from the following test file:
<main xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="folder1/file1.xml" />
<xi:include href="folder2/file1.xml" />
</main>
The result shown in the BaseX GUI is correct:
<main xmlns:xi="http://www.w3.org/2001/XInclude">
<data xml:base="folder1/file1.xml">5555</data>
<data xml:base="folder2/file1.xml">6666</data>
</main>
Now, after editing the values, I would like to export the contents of the database to XML files, recreating the same folders and files (I see this information stored in the value of xml:base).
However, using the export function available in the GUI, I am only able to obtain a single file containing the result shown above. What should I do?
Thank you
Cheers
Marco Randazzo
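One possible approach, sketched outside BaseX in plain Java (JDK only) rather than as a BaseX feature: post-process the single exported file and write each child of the root back to the path recorded in its xml:base attribute. The class name `XmlBaseSplitter` and its methods are hypothetical, and the sketch assumes that every included file contributed exactly one element that is a direct child of the root, as in the example above.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class XmlBaseSplitter {
    /** Maps each xml:base value on a child of the root to that child's serialized XML. */
    static Map<String, String> split(String exportedXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // needed so xml:base is visible via getAttributeNS
        Document doc = dbf.newDocumentBuilder()
            .parse(new ByteArrayInputStream(exportedXml.getBytes(StandardCharsets.UTF_8)));
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        Map<String, String> files = new LinkedHashMap<>();
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) continue;
            Element e = (Element) n;
            String base = e.getAttributeNS(XMLConstants.XML_NS_URI, "base");
            if (base.isEmpty()) continue; // not an included fragment
            // Work on a copy so the xml:base attribute can be dropped from the output.
            Element copy = (Element) e.cloneNode(true);
            copy.removeAttributeNS(XMLConstants.XML_NS_URI, "base");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(copy), new StreamResult(out));
            files.put(base, out.toString());
        }
        return files;
    }

    /** Writes every fragment below root, creating folders as needed. */
    static void writeAll(Map<String, String> files, Path root) throws Exception {
        for (Map.Entry<String, String> f : files.entrySet()) {
            Path target = root.resolve(f.getKey());
            Files.createDirectories(target.getParent());
            Files.write(target, f.getValue().getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

Calling `XmlBaseSplitter.writeAll(XmlBaseSplitter.split(xml), Paths.get("out"))` would then recreate `out/folder1/file1.xml` and `out/folder2/file1.xml`. The xml:base values are used as relative paths without any validation, so this should only be run on trusted input.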
In the context of trying to do fun things with DITA docs in BaseX I downloaded the latest BaseX (9.0.1) and tried creating a new database and loading docs into it using the BaseX GUI. This is on macOS 10.13.4 with 16GB of hardware RAM available.
My corpus is about 4000 DITA topics totaling about 30MB on disk. They are all in a single directory (not my decision) if that matters.
Using the "parse DTDs" option and default indexing options (no token or full text indexes) I'm finding that even with 12GB of RAM allocated to the JVM the memory usage during load will eventually go to 12GB, at which point the processing appears to stop (that is, whatever I set the max memory to, when it's reached, things stop but I only got out of memory errors when I had much lower settings, like the default 2GB).
I'm currently running a test with 14GB allocated and it is continuing but it does go to 12GB occasionally (watching the memory display on the Add progress panel).
No individual file is that big--the biggest is 150K and typical is 30K or smaller.
I wouldn't expect BaseX to have this kind of memory problem so I'm wondering if maybe there's an issue with memory on macOS or with DITA documents in particular (the DITA DTDs are notoriously large)?
Should I expect BaseX to be able to load this kind of corpus with 14GB of RAM?
Cheers,
E.
--
Eliot Kimber
http://contrext.com
Hello,
Am I right that both XML catalogs and XIncludes are only evaluated at
document import, not at XQuery execution?
If I do
declare base-uri "http://example.com/restxq-framework/src/";
let $file := doc("fragments/xhtml5-page.xhtml")
return $file
and I have an XML Catalog entry like this:
<uri id="RESTXQ-Famework"
name="http://example.com/restxq-framework/src/"
uri="file:///S:/projects/restxqfr/src/"/>
or this:
<rewriteSystem id="RESTXQ-Famework"
systemIdStartString="http://example.com/restxq-framework/src/"
rewritePrefix="file:///S:/projects/restxqfr/src/" />
BaseX still tries to load 'fragments/xhtml5-page.xhtml' from
'example.com/restxq-framework/src/'.
--
Goody Bye, Minden jót, Mit freundlichen Grüßen,
Andreas Mixich
Hi,
Somehow, I managed to lock a (test)-database and now I can't get it
unlocked.
Is it possible to manually remove the lock? If so, how?
Cheers,
Ben Engbers
Hi,
This used to work, but it doesn't anymore.
Setup: Create a db called AppResources and put any xsd in it with the root
id set to 'schema-test-validate'.
.xq:
Expected result: validation error.
Result I get:
Stopped at *path*/xsd-validate-test.xq, 6/30:
[FODC0002] Resource 'test-to-delete.xsd' does not exist.
I checked: the xsd does exist on my file system. I also validated content
against it. The .xsd works and validates content outside of BaseX.
Can you help? Thanks!
--
France Baril
Architecte documentaire / Documentation architect
france.baril(a)architextus.com
Hello,
writing a RESTXQ application, I have the following code in a module:
module namespace page = 'http://localhost/web-page';
declare %rest:path("/list/{$category}")
%rest:GET
%rest:query-param("page:category", "{$category}")
%output:method("xhtml")
%output:omit-xml-declaration("no")
%output:doctype-public("-//W3C//DTD XHTML 1.0 Transitional//EN")
%output:doctype-system("http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd")
function page:list($page:category as xs:string) { () };
I get:
[basex:restxq] Variable $category is not specified as argument.
If I replace *every* '$category' with '$page:category' I get:
[basex:restxq] Variable $page:category is specified more than once.
But the only place I use '$page:category' in this module is at this point.
The rules are, as far as I have understood:
1. In a module, only one namespace can be used for the functions and variables defined therein;
foreign namespaces can be brought in via module imports only.
2. This namespace cannot be the {http://www.w3.org/2005/xquery-local-functions} namespace,
but must be my own namespace, here {http://localhost/web-page}.
3. All variables must be in the relevant namespaces *(global or local only?)*.
I have now tried several approaches. Judging from the XQM modules in the BaseX DBA application,
it seems that '%rest:query-param' is not needed, nor is it necessary to prefix variable/parameter
names that are local to the function. See, for example, 'dba/common.xqm':
module namespace dba = 'dba/common';
(:~
 : Shows a "page not found" error.
 : @param $path path to unknown page
 : @return page
 :)
declare
  %rest:path("/dba/{$path}")
  %output:method("html")
function dba:unknown(
  $path as xs:string
) as element(html) {
  html:wrap(
    <tr>
      <td>
        <h2>Page not found:</h2>
        <ul>
          <li>Page: dba/{ $path }</li>
          <li>Method: { Request:method() }</li>
        </ul>
      </td>
    </tr>
  )
};
So I am rather confused by now. What am I doing wrong here?
--
Goody Bye, Minden jót, Mit freundlichen Grüßen,
Andreas Mixich
Hello,
In a module consisting of many function declarations, how can I
apply `declare copy-namespaces no-preserve, no-inherit;` to a
single function only?
Thanks.
--
Goody Bye, Minden jót, Mit freundlichen Grüßen,
Andreas Mixich
Hi,
Thanks for version 9!
I am trying to get things running in a Docker container, and I see in the
GitHub issues that there have been a few changes, such as the directories
BaseX uses [1], which are not reflected in the README yet.
If you like I can put a few things in a pull request while I figure
things out here :-)
~~Rolf.
[1] https://github.com/BaseXdb/basex/issues/1546