I love BaseX for the simplicity it brings to XML handling. But this is a problem I have not encountered before.
I am creating a DB from about 17,000 small zipfiles, each containing a directory structure and somewhere within each, some XML. BaseX chokes on one of these files giving the error: "invalid entry size (expected 0 but got 11083 bytes)".
So clearly one or more of the zips is invalid - but which one(s)?
Is there any way that BaseX can echo to me the file that is causing the error? DEBUG is set to TRUE, but I don't get any more verbose output. I am using the GUI, and running a BXS script too.
Thanks in advance, C.
________________________________
Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The Netherlands, Registration No. 33156677, Registered in The Netherlands.
Have you tried using try/catch?
for zip in zips return try { unzip() } catch ...
Hmm, it is surprisingly hard to get some useful logging information out of this.
Here, for example, I fall foul of an XQuery Update rule: "all expressions must be updating or return an empty sequence."
for $zip in file:list($hm)
return try {
  db:add("myDB", $hm || '\' || $zip, "", map { 'intparse': true() })
} catch * { $err:code }
Can anyone help me rephrase this so when BaseX halts on a bad zipfile, I at least know what the name of the zipfile is?
C.
I may be wrong, but I think you need to wrap the $err:code in db:output(): see http://docs.basex.org/wiki/XQuery_Update#Returning_Results
Tim
-- Tim A. Thompson Metadata Librarian (Spanish/Portuguese Specialty) Princeton University Library
db:output certainly looked promising, but I get no output from a try/catch clause, alas:
for $zip in file:list($hm)
return try {
  db:add("myDB", $hm || '\' || $zip, "", map { 'intparse': true() })
} catch * {
  db:output($zip)
}
Any other thoughts, BaseX users?
C.
Hi Constantine,
I guess that try/catch won't help you here, because the files will only be unzipped in the update phase (after actual query evaluation).
I am not sure when the error is generated. Could you send me the full stack trace (triggered via -d)?
Thanks, Christian
Hi Christian,
I was able to work around this in the following way (as usual, with BaseX, there is a way):
(: TEST ZIP FILES FOR VALIDITY:)
for $zip in file:list($hm)
return try {
  if (zip:entries($hm || '\' || $zip)/@href) then () else $zip
} catch * {
  $zip, $err:code, $err:description
}
It turned out there was indeed something wrong with the zip header of one of the files:
experr:ZIP0003 Operation failed: invalid CEN header (bad signature).
Kind regards, Constantine
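Putting the two observations from this thread together (updates only run after query evaluation, while zip:entries fails during evaluation), the check and the import could be combined into one query. This is only a sketch, not something anyone in the thread posted: the directory value in $hm is a made-up placeholder, and the query assumes that zip:entries rejects every archive that db:add would later choke on, which the thread does not confirm. It uses the ZIP Module because that is what worked above; newer BaseX releases may offer only the Archive Module instead.

```xquery
(: Hypothetical directory holding the zip files; it must end with a separator. :)
declare variable $hm := 'C:\zips\';

for $zip in file:list($hm)
let $path := $hm || $zip
(: Read-only check. It is evaluated before any pending update is applied,
   so an error raised here can still be caught by try/catch. :)
let $readable := try { exists(zip:entries($path)) } catch * { false() }
where $readable
(: Only archives that passed the check are scheduled for import. :)
return db:add("myDB", $path, "", map { 'intparse': true() })
```

Running the validity test on its own first still makes sense, since this variant skips unreadable archives silently instead of naming them.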
basex-talk@mailman.uni-konstanz.de