Hi,
I just joined the mailing list due to a problem I'm having displaying and storing special characters.
I created a database from a CSV file that is encoded in UTF-8. However, when I run a query, the special characters come out garbled. I'm using the GUI on Windows 10.
It starts with this in the CSV:
<name>Cañelas</name>
Then ends up with this when I export the query result into a text file:
<name>Ca�las</name>
Help please.
Bit
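One common way to end up with a replacement character like the one above is a mismatch between the encoding the bytes were written in and the encoding used to read them. The following Python sketch only illustrates that mechanism. It assumes (this is not confirmed by the report) that some step wrote the name in a single-byte encoding such as Latin-1 and a later step decoded it as UTF-8; the function name `roundtrip` is made up for the example.

```python
# Illustration only: how an encoding mismatch garbles "ñ".
# Assumption (not confirmed by the report above): one step writes
# Latin-1 bytes and a later step reads them as UTF-8.

NAME = "Cañelas"


def roundtrip(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one encoding and decode it with another."""
    raw = text.encode(written_as)
    # errors="replace" substitutes U+FFFD for undecodable bytes,
    # which many viewers display as a question-mark symbol.
    return raw.decode(read_as, errors="replace")


# Matching encodings: the name survives unchanged.
same = roundtrip(NAME, "utf-8", "utf-8")

# Latin-1 bytes read as UTF-8: the byte for "ñ" is not valid UTF-8.
garbled = roundtrip(NAME, "latin-1", "utf-8")

# UTF-8 bytes read as Latin-1: "ñ" turns into two characters.
doubled = roundtrip(NAME, "utf-8", "latin-1")

if __name__ == "__main__":
    print(same, garbled, doubled)
```

If the CSV really is UTF-8, the same kind of check can be pointed at the export step instead, since a mismatch at either end produces similar symptoms.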
> Java heap space at C:\Users\lourduswamym\oracle_scripts/BaseXClient.pm line
> 213.
How much memory have you assigned to BaseX [1]?
> Actually, the error is in the server log. The insert operation was taking 1+ minutes due to a slow network, and then the error happened. I have been running this server instance continuously for 1 day with continuous inserts from a Perl script.
If you create a new database, you can specify a directory with the
files that are to be initially added to your database. This might save
you a lot of time.
> So I think I should be careful to run only one single process of client at a
> time or make sure memory is enough to handle multiple client operations,
> please let me know your thoughts,
If your documents are small, you shouldn’t usually encounter any
errors. But it might have to do with the large number of documents you
are dealing with. One more option is to distribute your docs across
multiple databases (you can access all of them via a single XQuery
expression).
Best,
Christian
[1] http://docs.basex.org/wiki/Start_Scripts
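The suggestion to spread documents across several databases can be sketched as follows. This is only an illustration of a stable assignment scheme, not something BaseX provides: the database name prefix `docs_`, the shard count and the document IDs are hypothetical.

```python
# Sketch: assign each document to one of several databases.
# The names ("docs_0", ...) and the shard count are hypothetical.
import zlib
from collections import defaultdict


def database_for(doc_id: str, shards: int, prefix: str = "docs_") -> str:
    """Return a stable database name for a document ID."""
    if shards < 1:
        raise ValueError("shards must be at least 1")
    # crc32 is deterministic across runs, unlike Python's built-in hash().
    index = zlib.crc32(doc_id.encode("utf-8")) % shards
    return f"{prefix}{index}"


def group_by_database(doc_ids, shards: int):
    """Group document IDs by the database they are assigned to."""
    groups = defaultdict(list)
    for doc_id in doc_ids:
        groups[database_for(doc_id, shards)].append(doc_id)
    return dict(groups)


if __name__ == "__main__":
    ids = [f"node-{n}" for n in range(10)]
    for name, members in sorted(group_by_database(ids, 4).items()):
        print(name, members)
```

Because the assignment depends only on the ID, a delete followed by an insert for the same node always targets the same database.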
> Thanks again,
> Regards
> Martin Lourduswamy
>
> On Sun, May 20, 2018 at 9:48 AM, Martin Lourduswamy <martin.louis(a)gmail.com>
> wrote:
>>
>> Hi,
>>
>> Thanks for the clarifications. My database is
>>
>> DB Size:
>> ======
>> 1. each XML node => 500 bytes
>> 2. total size of XML DB => 10 million nodes (which will grow continuously as
>> new files are added).
>> 3. I will be having multiple DBs of the same size or larger, as the DB
>> continuously grows
>>
>> I have to update the nodes, as I need to replace them. I chose
>> delete + insert instead of replace for each node through Perl, as this
>> makes sure there are no duplicate nodes left in the database (just a
>> precaution: even if some nodes get duplicated through some other
>> processes, I will be able to remove them by delete).
>>
>> DB Speed:
>> ========
>> I will disable logging as it might speed up the database operations.
>>
>> DB Architecture
>> ============
>> 1 server (Windows or Linux) running BaseX
>> multiple client computers querying the same machine for selects
>> through the fn:doc(...) API to get the data through Perl,
>> then doing a delete + insert from Perl
>>
>> I would like to know whether you think this might work for production, and
>> I welcome any suggestions for data retrieval and updates.
>> Any changes to the architecture that might help are welcome, too.
>>
>> Also, I am thinking of indexing on the node id, as that is like the
>> primary key on which I base my operations. Any suggestions would also be
>> helpful,
>> Thanks again,
>> Regards
>> Martin Lourduswamy
>>
>> On Sun, May 20, 2018 at 9:28 AM, Christian Grün
>> <christian.gruen(a)gmail.com> wrote:
>>>
>>> Hi Martin,
>>>
>>> > I am new to BaseX. I would like to speed up XQuery insert, delete
>>> > and
>>> > replace operations through options.
>>>
>>> Welcome to the list. As there are numerous ways to do updates in
>>> BaseX, feel free to give us more information on your insert and delete
>>> operations. Do you work with large single documents or many small
>>> documents? How large is your database?
>>>
>>> > While I query BaseX through Perl and try to
>>> > connect through the GUI, the Perl connection aborts. Is there a parameter
>>> > for
>>> > parallel connections? Please let me know.
>>>
>>> I can’t tell why your perl connection is interrupted by opening the
>>> GUI (because they should be completely independent from each other). A
>>> step-by-step description on how you proceeded might be helpful.
>>>
>>> Because there is no coupling between GUI and the client/server
>>> architecture, however, you should avoid running updates outside the
>>> GUI. Please check out [1] for more information.
>>>
>>> Best,
>>> Christian
>>>
>>> [1] http://docs.basex.org/wiki/Startup#Concurrent_Operations
>>>
>>>
>>>
>>> >
>>> > Thanks,
>>> > Regards
>>> > Martin Lourduswamy
>>
>>
>
Hi All,
I have to set indent='no' at the database level. How can I achieve this? I
have gone through the documentation and found that we can do:
(1)
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:indent "no";
but the problem is that all my XQuery modules are library modules, so this
throws an error.
Is there any other option?
Regards
Dharmendra Kumar Singh
I'm using the C# driver, but it still doesn't look like a high-level
problem, nor a query problem. That's why I didn't post any code
snippets. To make sure, I just tried to execute the INSPECT command in the
CLI: same exception. db:optimize also fails with the same exception,
although with a different stack trace, which I will post below.
Although the race condition of two processes interfering is a good
point, I'm currently unable to do anything useful with a session
for which a race condition would matter. If there was one, it must have happened
while the import was running alongside the other, non-importing process.
As I mentioned in the first post, while the import process was still
ongoing, the other process already started to throw the exception at a very
late stage of the import. After the import was done, I was still able to
operate on the db as normal, but only with the session in the process which
did the long-running import.
To restore the second process's session state to normal, I restarted the
web application, which closed both sessions. After that, I'm no longer able to
operate on the db, not even directly from the CLI.
It seems to be some sort of persisted data state which the low-level BaseX code
can't handle when trying to reread it into memory.
I've already googled for the same exception in BaseX; it seems it was
an issue in some older versions too.
Here is the stack trace when running db:optimize:
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk(a)mailman.uni-konstanz.de
Version: BaseX 9.0.2 beta
Java: Oracle Corporation, 1.8.0_151
OS: Linux, amd64
Stack Trace:
java.lang.ArrayIndexOutOfBoundsException: 52
at org.basex.util.hash.TokenSet.key(TokenSet.java:128)
at org.basex.data.Data.name(Data.java:388)
at org.basex.io.serial.Serializer.node(Serializer.java:414)
at org.basex.io.serial.Serializer.node(Serializer.java:158)
at org.basex.io.serial.Serializer.node(Serializer.java:345)
at org.basex.io.serial.Serializer.node(Serializer.java:158)
at org.basex.io.serial.Serializer.serialize(Serializer.java:109)
at org.basex.core.cmd.OptimizeAll$DBParser.parse(OptimizeAll.java:200)
at org.basex.build.Builder.parse(Builder.java:77)
at org.basex.build.DiskBuilder.build(DiskBuilder.java:77)
at org.basex.core.cmd.OptimizeAll.optimizeAll(OptimizeAll.java:122)
at org.basex.query.up.primitives.db.DBOptimize.apply(DBOptimize.java:124)
at org.basex.query.up.DataUpdates.apply(DataUpdates.java:175)
at org.basex.query.up.ContextModifier.apply(ContextModifier.java:120)
at org.basex.query.up.Updates.apply(Updates.java:157)
at org.basex.query.QueryContext.iter(QueryContext.java:341)
at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
at org.basex.core.cmd.AQuery.query(AQuery.java:92)
at org.basex.core.cmd.XQuery.run(XQuery.java:22)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
at org.basex.server.ClientListener.run(ClientListener.java:140)
On Wed, May 9, 2018 at 11:45 AM, Alexander Holupirek <alex(a)holupirek.de>
wrote:
> Hi,
>
> (please also respond to the list. Others might have suggestions as well
> ;-)
>
> it still would be great if you could provide even more details. For
> instance, what programming language do you use? Ideally, include a Short,
> Self Contained, Correct (Compilable), Example (SSCCE) [1] in order to let
> people reproduce the behaviour.
>
> From a high level perspective, are you sure the sessions do not
> interfere? Do you additionally work with BaseXGUI on the database, while
> your program is importing?
>
> Cheers,
> Alex
>
>
> [1] http://sscce.org/
>
> > On 9. May 2018, at 09:26, halit tiryaki <halit.tiryaki(a)gmail.com> wrote:
> >
> > hello, thanks for the fast response.
> >
> > It happened during a large import. There were two processes,
> each with its own session to BaseX. The process with the open session
> that did the import was doing fine. The second process, however, started
> throwing the mentioned exception. After stopping both processes and
> restarting the app, no BaseX session is able to operate normally any more.
> >
> > The ArrayIndexOutOfBoundsException already comes in executing the
> command INSPECT.
> >
> > The db-Info returns following:
> >
> >
> > info db:
> > Database Properties
> > NAME: XXX
> > SIZE: 5877 kB
> > NODES: 124051
> > DOCUMENTS: 5497
> > BINARIES: 0
> > TIMESTAMP: 2018-05-09T05:20:44.000Z
> > UPTODATE: false
> >
> > Resource Properties
> > INPUTPATH:
> > INPUTSIZE: 0 b
> > INPUTDATE: 2018-05-09T04:32:10.539Z
> >
> > Indexes
> > TEXTINDEX: true
> > ATTRINDEX: true
> > TOKENINDEX: false
> > FTINDEX: false
> > TEXTINCLUDE: /item_User/node()/User:Email
> > ATTRINCLUDE: /item_UserImage/node()/UserImage:Id
> > TOKENINCLUDE:
> > FTINCLUDE: /item_XXX/node()/XXX:[german]
> > LANGUAGE: English
> > STEMMING: false
> > CASESENS: false
> > DIACRITICS: false
> > STOPWORDS:
> > UPDINDEX: true
> > AUTOOPTIMIZE: false
> > MAXCATS: 100
> > MAXLEN: 96
> > SPLITSIZE: 0
> > Alexander Holupirek <alex(a)holupirek.de> wrote on Wed, 9 May 2018,
> 08:44:
> > Hi,
> >
> > could you please provide some more details?
> > The best would be a reproducible example.
> >
> > Thanks,
> > Alex
> >
> >
> > > On 9. May 2018, at 06:50, halit tiryaki <halit.tiryaki(a)gmail.com>
> wrote:
> > >
> > > Hello,
> > >
> > > got a serious problem here. any ideas? thanks
> > >
> > > Improper use? Potential bug? Your feedback is welcome:
> > > Contact: basex-talk(a)mailman.uni-konstanz.de
> > > Version: BaseX 9.0.2 beta
> > > Java: Oracle Corporation, 1.8.0_151
> > > OS: Linux, amd64
> > > Stack Trace:
> > > java.lang.ArrayIndexOutOfBoundsException: 4288
> > > at org.basex.io.random.TableDiskAccess.read1(TableDiskAccess.java:151)
> > > at org.basex.data.Data.kind(Data.java:294)
> > > at org.basex.query.value.node.DBNode$4.next(DBNode.java:332)
> > > at org.basex.query.value.node.DBNode$4.next(DBNode.java:323)
> > > at org.basex.query.expr.path.IterStep$1.next(IterStep.java:38)
> > > at org.basex.query.expr.path.IterStep$1.next(IterStep.java:32)
> > > at org.basex.query.QueryContext.next(QueryContext.java:392)
> > > at org.basex.query.expr.path.IterPath$1.next(IterPath.java:50)
> > > at org.basex.query.expr.path.IterPath$1.next(IterPath.java:34)
> > > at org.basex.query.expr.ParseExpr.item(ParseExpr.java:58)
> > > at org.basex.query.expr.ParseExpr.atomItem(ParseExpr.java:84)
> > > at org.basex.query.func.fn.FnConcat.item(FnConcat.java:20)
> > > at org.basex.query.expr.ParseExpr.value(ParseExpr.java:71)
> > > at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:177)
> > > at org.basex.query.expr.gflwor.GFLWOR.value(GFLWOR.java:72)
> > > at org.basex.query.expr.path.CachedPath.nodeIter(CachedPath.java:36)
> > > at org.basex.query.expr.path.AxisPath.iter(AxisPath.java:69)
> > > at org.basex.query.expr.gflwor.For$1.next(For.java:107)
> > > at org.basex.query.expr.gflwor.OrderBy$1.sort(OrderBy.java:73)
> > > at org.basex.query.expr.gflwor.OrderBy$1.next(OrderBy.java:54)
> > > at org.basex.query.expr.gflwor.GFLWOR.value(GFLWOR.java:72)
> > > at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:177)
> > > at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:87)
> > > at org.basex.query.QueryContext.next(QueryContext.java:392)
> > > at org.basex.query.scope.MainModule$1.next(MainModule.java:122)
> > > at org.basex.core.cmd.AQuery.query(AQuery.java:94)
> > > at org.basex.core.cmd.XQuery.run(XQuery.java:22)
> > > at org.basex.core.Command.run(Command.java:257)
> > > at org.basex.core.Command.execute(Command.java:93)
> > > at org.basex.server.ClientListener.run(ClientListener.java:140)
> >
>
>
Hi all,
investigating the new features tagged 9.x, I discovered a lot of great
stuff. I have two smaller questions ...
I'd like to know whether the new "-c" flag for the BaseX HTTP Server in
[1] means that one can request a script of commands to be executed once
at startup of the server. Is there a way to do something equivalent
at server shutdown as well?
In [2], about the new services, it is said that one can write a jobs.xml
file inside the database directory. I couldn't find a specification of
the syntax of this file anywhere. Do you have any hints?
Bye,
Marco.
[1] http://docs.basex.org/wiki/Command-Line_Options#HTTP_Server
[2] http://docs.basex.org/wiki/Jobs_Module#Services
Hi,
If we manage to install BaseX on a MacBook, chances are good that we
will use BaseX for our final project.
I know how to install BaseX on Linux, but I have no experience with
Apple. My fellow students know how to use applications but don't know
how to deal with Java applications.
My question is whether BaseX can be used on a MacBook. If so, where can I
find instructions?
Cheers,
Ben Engbers
I thought I would see if I could add the Xerces grammar caching to BaseX, at least to see if it improved things for DITA loading.
I've updated my fork of the basex project to the current version in github.
Using the master branch as the basis for my local feature branch and with no modified files, I get one failing test from "mvn test":
Failed tests:
FnTest.sum:91->AdvancedQueryTest.error:78 Query did not fail:
sum(1, 'x')
[E] Error: err:FORG0006
[F] 1
Tests run: 1578, Failures: 1, Errors: 0, Skipped: 5
I'm also not able to run the BaseXGUI class using an Eclipse run configuration per the documentation on the BaseX site. I get
A bunch of messages about things missing from English.lang:
/lang/English.lang not found.
English.lang: 'port' is missing
... lots more
English.lang: 'h_no_html_parser' is missing
Then this fatal error:
Image not found: /img/text_xml.png
at org.basex.util.Util.stack(Util.java:224)
at org.basex.gui.layout.BaseXImages.url(BaseXImages.java:125)
at org.basex.gui.layout.BaseXImages.get(BaseXImages.java:62)
at org.basex.gui.layout.BaseXImages.icon(BaseXImages.java:109)
at org.basex.gui.layout.BaseXImages.<clinit>(BaseXImages.java:34)
at org.basex.gui.GUIMacOSX.addDockIcon(GUIMacOSX.java:84)
at org.basex.gui.GUIMacOSX.<init>(GUIMacOSX.java:60)
at org.basex.BaseXGUI.<init>(BaseXGUI.java:58)
at org.basex.BaseXGUI.main(BaseXGUI.java:39)
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.basex.gui.GUIMacOSX.addDockIcon(GUIMacOSX.java:84)
at org.basex.gui.GUIMacOSX.<init>(GUIMacOSX.java:60)
at org.basex.BaseXGUI.<init>(BaseXGUI.java:58)
at org.basex.BaseXGUI.main(BaseXGUI.java:39)
Caused by: java.lang.IllegalArgumentException: input == null!
at javax.imageio.ImageIO.read(ImageIO.java:1388)
at org.basex.gui.layout.BaseXImages.get(BaseXImages.java:72)
at org.basex.gui.layout.BaseXImages.get(BaseXImages.java:62)
at org.basex.gui.layout.BaseXImages.icon(BaseXImages.java:109)
at org.basex.gui.layout.BaseXImages.<clinit>(BaseXImages.java:34)
... 4 more
I suspect it's something very simple but no idea what it might be.
Thanks,
Eliot
--
Eliot Kimber
http://contrext.com
Hi,
during development I sometimes use the static file, that contains my
database:
declare variable $app:db :=
doc('file:/S:/Users/dev/Eigene%20Projekte/codingcookbook/src/data/dbsample.xml');
So far, it went without problems, but today something very strange happened:
[basex:doc] Database path
'file:/S:/Users/dev/Eigene%20Projekte/codingcookbook/src/data/dbsample.xml'
yields more than one document.
Now I checked the file in question. It is well formed and has a single
document-root.
One thing may be interesting to note: When I imported that file into a
BaseX database, a few days ago, it resulted in that document being in
the database twice. Once as it was on disk and once it contained only a
single comment, that I had in the file after the <?xml ...> prolog. I
moved the comment to another place, even removed it, just to make sure,
but the error persists, when trying to load the file from disk via `doc()`
Error:
Stopped at S:/Users/dev/Eigene Projekte/xquery-appframework/src/lib/app.xqm, 37/32:
[basex:doc] Database path
'file:/S:/Users/dev/Eigene%20Projekte/codingcookbook/src/data/dbsample.xml'
yields more than one document.
Optimized Query:
declare variable $app:db as item()* :=
doc("file:/S:/Users/dev/Eigene%20Projekte/codingcookbook/src/data/dbsample.xml");
declare function local:autoComplete($search_489 as xs:string,
$record-name_490 as xs:string, $places_491 as element()*, $title_492,
$categories_493) as element(container)* { for $var_494 in
$app:db/descendant-or-self::node()/node()[({http://www.w3.org/1999/xhtml}name
= $record-name_490)]/$places_491 where $var_494/text()[. contains text {
$search_489 } using fuzzy using stemming using language 'English']
return element Q{dev:intermediatexml:unstable}container { (element
Q{dev:intermediatexml:unstable}item { ($var_494/text()) }, element
Q{dev:intermediatexml:unstable}title {
($var_494/preceding::node()[1]/node()[(name() = $title_492)]/text()) },
element Q{dev:intermediatexml:unstable}categories {
($var_494/preceding::node()[1]/node()[(name() =
$categories_493)]/text()) }, element Q{dev:intermediatexml:unstable}id {
(($var_494/parent::{http://www.w3.org/1999/xhtml}entry/@xml:id !
string())) }) } };
local:autoComplete("nested", "cb:entry", ((dc:subject union dc:title
union {http://www.w3.org/1999/xhtml}question)), "dc:title", "cb:categories")
Query:
declare base-uri
  "file:///S:/Users/dev/Eigene%20Projekte/xquery-appframework/src/";
import module namespace app = "dev:app:unstable" at "lib/app.xqm";
declare default element namespace "http://www.w3.org/1999/xhtml";
declare namespace cb = "http://codeblocker.org/ns/codingcookbook/1.0/";
declare namespace dc = "http://purl.org/dc/elements/1.1/";
declare namespace itx = "dev:intermediatexml:unstable";
declare function local:autoComplete(
  $search as xs:string,
  $record-name as xs:string,
  $places as element()*,
  $title,
  $categories
) as element(Q{dev:intermediatexml:unstable}container)* {
  for $var in $app:db//node()[name=$record-name]/$places
  where $var/text()[. contains text {$search} using stemming using fuzzy]
  return
    <itx:container>
      <itx:item>{$var/text()}</itx:item>
      <itx:title>{$var/preceding::node()[1]/node()[name()=$title]/text()}</itx:title>
      <itx:categories>{$var/preceding::node()[1]/node()[name()=$categories]/text()}</itx:categories>
      <itx:id>{$var/parent::entry/@xml:id/string()}</itx:id>
    </itx:container>
};
local:autoComplete("nested","cb:entry",(dc:subject|dc:title|question),"dc:title","cb:categories")
--
Goody Bye, Minden jót, Mit freundlichen Grüßen,
Andreas Mixich
Hi,
when loading my RESTXQ app I get this:
Stopped at C:/Users/dev/.jetty/webapps/basex/fwdev/lib/account.xqm, 20/18:
[XQST0049] Duplicate declaration of static variable $acc:last-visited.
dev@mambo ~/.jetty/webapps/basex
$ grep -R --include="*.*" "\$acc\:last-visited" .
./fwdev/lib/account.xqm:declare variable $acc:last-visited as
xs:dateTime? external := ();
As it seems, there is no other declaration of the variable in the whole
project. What could that be?
--
Goody Bye, Minden jót, Mit freundlichen Grüßen,
Andreas Mixich
Follow up--I tried giving BaseX the full 16GB of RAM and it still ultimately locked up with the memory meter showing 13GB.
I'm thinking this must be some kind of memory leak.
I tried importing the DITA Open Toolkit's documentation source and that worked fine with the max memory being about 2.5GB, but it's only about 250 topics.
Cheers,
E.
--
Eliot Kimber
http://contrext.com
On 5/3/18, 4:59 PM, "Eliot Kimber" <basex-talk-bounces(a)mailman.uni-konstanz.de on behalf of ekimber(a)contrext.com> wrote:
In the context of trying to do fun things with DITA docs in BaseX I downloaded the latest BaseX (9.0.1) and tried creating a new database and loading docs into it using the BaseX GUI. This is on macOS 10.13.4 with 16GB of hardware RAM available.
My corpus is about 4000 DITA topics totaling about 30MB on disk. They are all in a single directory (not my decision) if that matters.
Using the "parse DTDs" option and default indexing options (no token or full text indexes) I'm finding that even with 12GB of RAM allocated to the JVM the memory usage during load will eventually go to 12GB, at which point the processing appears to stop (that is, whatever I set the max memory to, when it's reached, things stop but I only got out of memory errors when I had much lower settings, like the default 2GB).
I'm currently running a test with 14GB allocated and it is continuing but it does go to 12GB occasionally (watching the memory display on the Add progress panel).
No individual file is that big--the biggest is 150K and typical is 30K or smaller.
I wouldn't expect BaseX to have this kind of memory problem so I'm wondering if maybe there's an issue with memory on macOS or with DITA documents in particular (the DITA DTDs are notoriously large)?
Should I expect BaseX to be able to load this kind of corpus with 14GB of RAM?
Cheers,
E.
--
Eliot Kimber
http://contrext.com
Hello.
Briefing:
I want to implement distributed work with BaseX in Hadoop using Apache Spark. Data processing will be divided into the following stages:
1) Splitting XML into chunks
2) Parallel parsing and filling the database
3) Executing queries to make the table (Apache Spark Dataset<Row>)
Stage 1 is a simple algorithmic problem. It will compose a HashMap of (ChunkNumber -> List<Xml_Path>). Each chunk contains no more than 128 MB of data.
Stage 2. On each node of the cluster, a standalone instance of BaseX will be initialized. Every instance of BaseX will receive files / lines from HDFS as input. The resulting XML database for each chunk will be serialized to HDFS.
Stage 3. When a query request is received, each XML database will be deserialized in turn and the query applied to it. A table will be composed from the results.
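As an illustration of stage 1, the chunk map could be built with a simple sequential pass. This is a minimal Python sketch under stated assumptions: file sizes are already known as (path, size) pairs, files are never split, and the function name `build_chunks` is hypothetical. A file that is larger than the limit on its own ends up alone in an oversized chunk.

```python
# Sketch of stage 1: group XML files into chunks of at most 128 MB.
# File names and sizes are supplied by the caller; nothing here
# talks to HDFS or BaseX.
from typing import Dict, Iterable, List, Tuple

CHUNK_LIMIT = 128 * 1024 * 1024  # 128 MB, as stated above


def build_chunks(
    files: Iterable[Tuple[str, int]], limit: int = CHUNK_LIMIT
) -> Dict[int, List[str]]:
    """Map chunk number -> list of XML paths using a next-fit pass.

    `files` yields (path, size_in_bytes) pairs. A file larger than
    `limit` gets a chunk of its own, since it cannot be split here.
    """
    chunks: Dict[int, List[str]] = {}
    current, used = 0, 0
    for path, size in files:
        # Start a new chunk when the current one would overflow.
        if chunks.get(current) and used + size > limit:
            current, used = current + 1, 0
        chunks.setdefault(current, []).append(path)
        used += size
    return chunks


if __name__ == "__main__":
    demo = [("a.xml", 60), ("b.xml", 50), ("c.xml", 40)]
    print(build_chunks(demo, limit=100))
```

A next-fit pass keeps the input order, which is convenient when files are listed per directory; a tighter packing would need sorting by size first.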
Questions:
1) Send data from HDFS to embedded BaseX:
1.1) Does BaseX support reading data via a URI with a custom scheme, e.g. `hdfs://home/user/file.xml`?
1.2) Can I send XML from RAM to BaseX?
1.3) Can I send XML to BaseX line by line?
2) Can I get a database in RAM in order to serialize it to HDFS?
3.1) Do I need to store XML in a persistent path to query it in the future?
3.2) When executing a query on XML in HDFS, can I read it line by line if BaseX does not know how to work with HDFS directly?
Best regards,
Andrei Iatsuk.
Hello,
I used the BaseX GUI to create a database from the following test file:
<main xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="folder1/file1.xml" />
<xi:include href="folder2/file1.xml" />
</main>
The result in the BaseX GUI is correct:
<main xmlns:xi="http://www.w3.org/2001/XInclude">
<data xml:base="folder1/file1.xml">5555</data>
<data xml:base="folder2/file1.xml">6666</data>
</main>
Now, after editing the values, I would like to export the contents of the database to xml files, recreating the same folders and files (I see this information stored in the value of xml:base)
However, using the export function available in the GUI, I am able only to obtain a single file containing the Result shown above. What should I do?
Thank you
Cheers
Marco Randazzo
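One possible workaround for the export question above is to post-process the single exported file outside BaseX and use the xml:base values to recreate the folders. The Python sketch below assumes the exported document looks like the result shown in the message, handles only direct children of the root, and trusts the xml:base values as relative paths. The function name `split_by_xml_base` and the output directory are hypothetical.

```python
# Sketch: split an exported document back into files named by xml:base.
# Works on the exported XML outside BaseX; it is not a BaseX feature.
import copy
import xml.etree.ElementTree as ET
from pathlib import Path

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
XI_NS = "http://www.w3.org/2001/XInclude"


def split_by_xml_base(exported: str, out_dir: Path) -> str:
    """Write each child carrying xml:base to its own file.

    Returns the main document with xi:include elements restored.
    Only direct children of the root are handled in this sketch.
    """
    ET.register_namespace("xi", XI_NS)
    root = ET.fromstring(exported)
    for index, child in enumerate(list(root)):
        href = child.get(XML_BASE)
        if href is None:
            continue  # not an included fragment; leave it in place
        fragment = copy.deepcopy(child)
        del fragment.attrib[XML_BASE]
        fragment.tail = None
        target = out_dir / href
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(ET.tostring(fragment, encoding="unicode"),
                          encoding="utf-8")
        # Put an xi:include back where the fragment was.
        include = ET.Element(f"{{{XI_NS}}}include", {"href": href})
        include.tail = child.tail
        root.remove(child)
        root.insert(index, include)
    return ET.tostring(root, encoding="unicode")
```

The returned string can be written next to the folders as the main file, so that the original XInclude layout is restored.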
Hello,
am I right, that both XML catalogs and XIncludes get only evaluated at
document import, not at XQuery execution?
If I do
declare base-uri "http://example.com/restxq-framework/src/";
let $file := doc("fragments/xhtml5-page.xhtml")
return $file
and I have an XML Catalog entry like this:
<uri id="RESTXQ-Famework"
name="http://example.com/restxq-framework/src/"
uri="file:///S:/projects/restxqfr/src/"/>
or this:
<rewriteSystem id="RESTXQ-Famework"
systemIdStartString="http://example.com/restxq-framework/src/"
rewritePrefix="file:///S:/projects/restxqfr/src/" />
BaseX tries to load the 'fragments/xhtml5-page.xhtml' from
'example.com/restxq-framework/src/'
--
Goody Bye, Minden jót, Mit freundlichen Grüßen,
Andreas Mixich
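For reference, the mapping that the rewrite-style catalog entry above describes can be written down in a few lines. This Python sketch only models the expected behaviour (resolve the relative reference against the base URI, then swap the matching prefix); it does not call BaseX or any catalog resolver, and it says nothing about when BaseX consults a catalog. The function names are made up for the example.

```python
# Sketch of the mapping a rewrite-style catalog entry describes.
# It does not call BaseX or any catalog resolver library.
from urllib.parse import urljoin

# Values copied from the catalog entries above.
START = "http://example.com/restxq-framework/src/"
REWRITE_PREFIX = "file:///S:/projects/restxqfr/src/"


def resolve(base_uri: str, relative: str) -> str:
    """Resolve a relative reference against the static base URI."""
    return urljoin(base_uri, relative)


def rewrite(uri: str, start: str = START, prefix: str = REWRITE_PREFIX) -> str:
    """Replace a matching leading string; leave other URIs untouched."""
    if uri.startswith(start):
        return prefix + uri[len(start):]
    return uri


if __name__ == "__main__":
    absolute = resolve(START, "fragments/xhtml5-page.xhtml")
    print(absolute)
    print(rewrite(absolute))
```

If the catalog were applied, the second printed URI is the file that should be opened; the behaviour described in the message corresponds to the first one being fetched instead.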
Hi,
Somehow, I managed to lock a (test)-database and now I can't get it
unlocked.
Is it possible to manually remove the lock? If so, how?
Cheers,
Ben Engbers
Hi,
This used to work, but it doesn't anymore.
Setup: Create a db called AppResources and put any xsd in it with the root
id set to 'schema-test-validate'.
.xq:
Expected result: validation error.
Result I get:
Stopped at *path*/xsd-validate-test.xq, 6/30:
[FODC0002] Resource 'test-to-delete.xsd' does not exist.
I checked, the xsd does exist on my file system. I also validated content
against it. The .xsd works, it does validate content outside of BaseX.
Can you help? Thanks!
--
France Baril
Architecte documentaire / Documentation architect
france.baril(a)architextus.com
Hello,
writing a RESTXQ application, I have the following code in a module:
module namespace page = 'http://localhost/web-page';
declare %rest:path("/list/{$category}")
%rest:GET
%rest:query-param("page:category", "{$category}")
%output:method("xhtml")
%output:omit-xml-declaration("no")
%output:doctype-public("-//W3C//DTD XHTML 1.0 Transitional//EN")
%output:doctype-system("http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd")
function page:list($page:category as xs:string) { () };
I get:
[basex:restxq] Variable $category is not specified as argument.
If I replace *every* '$category' with '$page:category' I get:
[basex:restxq] Variable $page:category is specified more than once.
But the only place I use '$page:category' in this module is at this point.
The rules are, as far as I have understood:
1. In a module, only one namespace can be used for the functions and variables defined therein,
foreign namespaces can be sourced via module imports only.
2. This namespace can not be in the {http://www.w3.org/2005/xquery-local-functions} namespace,
but must go into my own namespace, here {http://localhost/web-page}.
3. All variables must be in the relevant namespaces *(global or local only?)*.
I have now tried several approaches. As I see in the XQM modules of the BaseX DBA application,
it seems that '%rest:query-param' is not needed, and that there is no need to prefix variable/parameter
names that are local to the function. This can be seen, for example, in 'dba/common.xqm':
module namespace dba = 'dba/common';
(:~
: Shows a "page not found" error.
: @param $path path to unknown page
: @return page
:)
declare
%rest:path("/dba/{$path}")
%output:method("html")
function dba:unknown(
$path as xs:string
) as element(html) {
html:wrap(
<tr>
<td>
<h2>Page not found:</h2>
<ul>
<li>Page: dba/{ $path }</li>
<li>Method: { Request:method() }</li>
</ul>
</td>
</tr>
)
};
So, I am confused by now. What am I doing wrong here?
--
Goody Bye, Minden jót, Mit freundlichen Grüßen,
Andreas Mixich
Hello,
in a module consisting of many function declarations, how can I
apply `declare copy-namespaces no-preserve, no-inherit;` to a single
function only?
Thanks.
--
Goody Bye, Minden jót, Mit freundlichen Grüßen,
Andreas Mixich
Hi,
Thanks for version 9!
I am trying to get things running in a Docker container, and I see in the
GitHub issues that there have been a few changes, such as the directories
BaseX uses [1], which are not reflected in the README yet.
If you like I can put a few things in a pull request while I figure
things out here :-)
~~Rolf.
[1] https://github.com/BaseXdb/basex/issues/1546
Hi,
It was only after I started using my R client implementation in
examples that I noticed there is no 'DELETE' command specified in the
server protocol.
Is this a deliberate omission?
I would guess that implementing such a command would come down to
something like this:
delete = function(name = name) {
writeBin(as.raw(---BYTE---), private$sock)
writeBin(private$raw_terminated_string(name), private$sock)
return(list(info = private$info, success = self$bool_test_sock()))
}
If this is correct, where can I find (a list with) the required byte-codes?
Ben Engbers
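A note on the question above, to be verified against the server protocol page: as far as the documented protocol goes, only a few operations that stream input have a leading byte of their own, while other database commands, DELETE included, appear to be sent as an ordinary command string terminated by a zero byte. The Python sketch below shows that framing only. It is an assumption-based illustration, the function names are hypothetical, and it does not open a socket or parse the server's reply.

```python
# Sketch: frame a plain database command for the server protocol.
# Assumption: commands without a dedicated leading byte are sent as
# a UTF-8 string followed by a single zero byte.


def command_frame(command: str) -> bytes:
    """Return the bytes for a plain command string."""
    if "\x00" in command:
        raise ValueError("command must not contain a zero byte")
    return command.encode("utf-8") + b"\x00"


def delete_frame(path: str) -> bytes:
    """Frame a DELETE command for the given resource path."""
    return command_frame(f"DELETE {path}")


if __name__ == "__main__":
    print(delete_frame("docs/a.xml"))
```

Under that assumption, a client that already has a generic command call would not need a separate byte code for deleting resources.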
Hello,
I am running a XSLT transformation which is producing a wide textual output.
<xsl:output method="text" />
Its output is 300 plus characters wide.
It seems that the BaseX Result pane doesn't have a wrap/unwrap option.
Is it possible to introduce such functionality to the BaseX Result pane?
Regards,
Yitzhak Khabinsky
Technical Services Lead
Millicom International Services LLC
396 Alhambra Circle, Suite 1100
Coral Gables, FL 33134
Skype4B: +1 (305) 445-4172
Tel: (954) 684-8673
yitzhak.khabinsky(a)millicom.com
www.millicom.com
Dear all,
Welcome to our first BaseX 9.0.1 maintenance release:
http://basex.org/
An update is highly recommended: The major release had a critical bug,
regarding the storage of short non-ASCII Unicode strings.
This is the 9.0.1 changelog:
CRITICAL BUG FIXES
* Storage: short strings with extended Unicode characters fixed
* XQuery: nested path optimizations reenabled (e.g. in functions)
* XQuery: map:merge, size computation fixed
* XQuery: node ordering across multiple database instances fixed
IMPROVEMENTS
* GUI: Better Java 9 support (DPI scaling, font rendering)
* XQuery, collections: faster document root tests
* New R client. Thanks Ben Engbers!
* Linux: exec command used in startup scripts
MINOR BUG FIXES
* XQuery: Allow interruption of tail-call function calls
* XQuery, HTTP: parsing of content-type parameters
* XQuery: restrict rewriting of filter to path expression
* GUI: progress feedback when creating databases via double-click
Have fun everyone,
Christian
Hi All,
I can create a database via the GUI, but if I use db:create [1] I get the message "out of main memory": why? Thanks!
db:create("myDB",
"sourceDirectory",
"destinationDirectory",
map{"ftindex": true(), "language": false()}
)
Best,
Giuseppe