Hello Christian,

thanks for your answer. I managed to solve the problem using the latest snapshot, but there are some issues/notes I want to share.
First, it does not seem possible to change the parser options (neither in 7.7.2 nor in 7.8 beta); at least there were no changes in behaviour.
I'm running BaseX using the bin/basexhttp script. If I change the intparse or dtd option using bin/basexclient, they are restored to their defaults when the server is restarted; I'm not sure whether this is desired behaviour or not. But even without a restart, it's not possible to get the XML files in question parsed in 7.7.2.

The second note is that the latest snapshot has some serious concurrency issues which 7.7.2 doesn't have.
I am using a node.js environment to PUT around 10000 XML files to the db. If I start those PUT requests all at once (I have no idea how node queues them internally or whether it fires them all at once on the network), I get exceptions like the following after a few successful PUTs with the latest snapshot:
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 7.8 beta 4cfa54c
Java: Oracle Corporation, 1.7.0_25
OS: Linux, amd64
Stack Trace: 
java.lang.RuntimeException: Data Access out of bounds:
- pre value: 1950001
- #used blocks: 7618
- #total locks: 7618
- access: 7617 (7618 > 7617]
	at org.basex.util.Util.notExpected(Util.java:53)
	at org.basex.io.random.TableDiskAccess.cursor(TableDiskAccess.java:508)
	at org.basex.io.random.TableDiskAccess.read5(TableDiskAccess.java:216)
	at org.basex.data.Data.textOff(Data.java:422)
	at org.basex.data.DiskData.text(DiskData.java:234)
	at org.basex.core.cmd.List.listDB(List.java:132)
	at org.basex.core.cmd.List.run(List.java:50)
	at org.basex.core.Command.run(Command.java:329)
	at org.basex.http.rest.RESTCmd.run(RESTCmd.java:93)
	at org.basex.http.rest.RESTCmd.run(RESTCmd.java:82)
	at org.basex.http.rest.RESTRetrieve.run0(RESTRetrieve.java:51)
	at org.basex.http.rest.RESTCmd.run(RESTCmd.java:61)
	at org.basex.core.Command.run(Command.java:329)
	at org.basex.core.Command.execute(Command.java:94)
	at org.basex.core.Command.execute(Command.java:117)
	at org.basex.http.rest.RESTServlet.run(RESTServlet.java:21)
	at org.basex.http.BaseXServlet.service(BaseXServlet.java:58)
	....

Sometimes the collection is not even accessible via GET afterwards (other collections are).
PUTting the XML files one by one, waiting for each result before sending the next, works fine however.
7.7.2 doesn't have this issue, so is this maybe a regression?
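
For reference, a middle ground between "all at once" and "one by one" on the client side is to cap the number of requests in flight. The following is only a minimal sketch of that idea, not the code from my project: runWithLimit is a hypothetical helper name, and each task is assumed to be a function that sends one REST PUT and returns a promise for its result.

```typescript
// Run async tasks with at most `limit` of them in flight at any time.
// Results are returned in the same order as the tasks were given.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  if (limit < 1 || Math.floor(limit) !== limit) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task that has not been started yet

  // Each worker repeatedly picks the next unstarted task until none are left.
  // If a task rejects, the returned promise rejects as well; tasks already
  // started by other workers are not cancelled.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workers: Promise<void>[] = [];
  for (let i = 0; i < Math.min(limit, tasks.length); i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

With a limit of 1 this behaves like the sequential upload that works fine for me; larger limits allow some parallelism without firing all 10000 requests at once. It only changes the client side and does not address the exception itself.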

best,
Martin


On 28.01.2014 23:59, Christian Grün wrote:
An update: I noticed that external entity references were resolved by
the parser even if DTD parsing was switched off, leading to long
waiting times. The issue is resolved in the very latest snapshot, both
with the internal and Java’s default parser. If you still want to
parse all entities, simply activate DTD parsing.


On Tue, Jan 28, 2014 at 6:44 PM, Christian Grün
<christian.gruen@gmail.com> wrote:
Hi Martin,

thanks for your feedback. The problem should be solved with Version
7.8 of BaseX. The official version will be out soon, but you are
invited to check out the latest stable snapshot [1].

If you want to use BaseX 7.7.2, you can also switch to Java’s default
parser (via SET INTPARSE false, or by deactivating "Use internal XML
parser" in the "Database" → "New…" dialog and the "Parsing" tab).

Hope this helps,
Christian

[1] http://files.basex.org/releases/latest/


On Tue, Jan 28, 2014 at 6:36 PM, Martin Reckziegel
<reckziegel@informatik.uni-leipzig.de> wrote:
Hello everybody,

I'm using BaseX 7.7.2 in a university-based project. I'm trying to store TEI
XML files in the database, but there is an error storing certain valid files.
Using a REST PUT request to store a file starting like this:

<?xml version="1.0"?>
<!DOCTYPE TEI.2 PUBLIC "-//TEI P4//DTD Main DTD Driver File//EN"
"http://www.tei-c.org/Guidelines/DTD/tei2.dtd" [
<!ENTITY % TEI.XML "INCLUDE">
<!ENTITY % PersProse PUBLIC "-//Perseus P4//DTD Perseus Prose//EN"
"http://www.perseus.tufts.edu/DTD/1.0/PersProse.dtd" >
%PersProse;
]>
<TEI.2>
<teiHeader type="text" status="new">
....

results in this error:
"tlg0003.xml.xml" (Line 5): ']' expected, '<' found.
(Line 5 is %PersProse;)
I have no clue how to interpret the error, since none of the mentioned
characters are in that line. Maybe this results from some internal
replacement?
Anyway, deleting line 5 resolves the error (but of course does not solve my
problem, since I don't want to alter the files).
The problematic files are all valid, at least according to
http://www.validome.org/xml/validate/ and http://validator.w3.org/check, so I
wonder why they are rejected by BaseX?

kind regards,
Martin Reckziegel



