Hi, what is the current status of schema validation in BaseX? I have a Java web service that validates XML files against schemas using Xerces2. I want to move some of this work into my XQuery scripts in BaseX, so I can reduce the bandwidth used for large files (the XML files are retrieved from a remote repository). I put xercesImpl.jar in the lib directory and the validation does run, but I'm not sure about two things:
1) Is the newly added version of Xerces being used, or the default one bundled with Java? Maybe a function could be added that returns the library in use?
2) My Java service (which uses Xerces2) runs the validation in about 5 seconds; the same validation takes 32 seconds in BaseX 8.5.1.
Any ideas or tips are welcome.
Thanks, George.
Hi George,
Just on point #1: I think BaseX does not ship with Xerces. Entering the line below in the GUI will tell you the version bundled with the JDK:
Q{java:com.sun.org.apache.xerces.internal.impl.Version}getVersion()
For me this returns: Xerces-J 2.7.1
If you have manually added Xerces to the classpath, then you can get its version with:
Q{java:org.apache.xerces.impl.Version}getVersion()
/Andy
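For checking the same thing from plain Java (for example inside the existing web service), here is a minimal sketch. It only reports what a bare JVM resolves through JAXP and whether the standalone Xerces factory class can be loaded; it does not reproduce how BaseX itself picks a factory. The class and method names (`SchemaFactoryProbe`, `resolvedFactoryClass`, `loadedFrom`, `standaloneXercesOnClasspath`) are made up for this illustration.

```java
import java.security.CodeSource;
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

final class SchemaFactoryProbe {

    // Class name of the standalone Apache Xerces schema factory (as quoted in this thread).
    static final String XERCES_FACTORY = "org.apache.xerces.jaxp.validation.XMLSchemaFactory";

    /** Returns the class of the SchemaFactory that JAXP resolves for XML Schema. */
    static Class<?> resolvedFactoryClass() {
        return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).getClass();
    }

    /** Returns where a class was loaded from, or a note if the JVM reports no location. */
    static String loadedFrom(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source reported; typically a JDK-internal class)";
        }
        return source.getLocation().toString();
    }

    /** Returns true if the standalone Xerces factory class can be loaded. */
    static boolean standaloneXercesOnClasspath() {
        try {
            Class.forName(XERCES_FACTORY);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Class<?> factory = resolvedFactoryClass();
        System.out.println("JAXP factory:      " + factory.getName());
        System.out.println("Loaded from:       " + loadedFrom(factory));
        System.out.println("Standalone Xerces: " + standaloneXercesOnClasspath());
    }
}
```

Running it once with and once without xercesImpl.jar on the classpath should show whether the extra jar changes anything for that JVM.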
On 19 July 2016 at 12:46, George Sofianos gsf.greece@gmail.com wrote:
[...]
Thanks, it looks like it's on the classpath. But is it actually used? I can't be sure; I have seen some strange things happen with Xerces versions in the past with Saxon.
Anyway, it would be great if BaseX had a way to change the validation options. Should I open a BaseX ticket about it, or is there already a way to set these?
https://xerces.apache.org/xerces2-j/features.html
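For reference, this is roughly how such a feature is set on the JAXP side in plain Java. It is only a sketch of the standard `SchemaFactory.setFeature` call, not a BaseX option: `ValidationFeatures` and `trySetFeature` are hypothetical names, and whether a given feature is accepted depends on the factory implementation in use, so the code reports the outcome instead of assuming it.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

final class ValidationFeatures {

    // One of the features listed on the Xerces2 features page.
    static final String FULL_CHECKING =
        "http://apache.org/xml/features/validation/schema-full-checking";

    /** Tries to set a feature and returns a short status instead of throwing. */
    static String trySetFeature(SchemaFactory factory, String feature, boolean value) {
        try {
            factory.setFeature(feature, value);
            return "set";
        } catch (SAXNotRecognizedException e) {
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            return "not supported";
        }
    }

    public static void main(String[] args) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        System.out.println(factory.getClass().getName());
        System.out.println(FULL_CHECKING + " -> " + trySetFeature(factory, FULL_CHECKING, false));
    }
}
```

Features have to be set before `newSchema` is called for them to affect how the schema is compiled.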
On 7/19/2016 3:05 PM, Andy Bunce wrote:
[...]
Looks like it wants to use it [1]. You could try running the line below in the GUI:
Q{java:org.basex.util.Reflect}find("org.apache.xerces.jaxp.validation.XMLSchemaFactory")
/Andy
[1] https://github.com/BaseXdb/basex/blob/b8c1ae7738664aa3912ade783b8a01a0a2285d...
On 19 July 2016 at 13:10, George Sofianos gsf.greece@gmail.com wrote:
[...]
Indeed, looking at the code, it seems it's already using it. I wonder what causes the delay, though. I will have to investigate it a bit, probably by debugging BaseX, unless someone already knows. Could SAXSource vs. StreamSource be the issue, or does that not affect performance? If my question is stupid, just ignore it ;)
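One way to test the SAXSource vs. StreamSource question in isolation is a small stand-alone timing run, sketched below. Everything in it is an assumption made for illustration: the tiny inline schema, the generated document, and the names `SourceTiming`, `sampleXml`, `compile` and `validateNanos`. A single run like this also includes JIT warm-up, so the numbers are only a rough hint, and the real schema and files would be needed for a meaningful comparison.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.InputSource;

final class SourceTiming {

    // Deliberately simple schema: a list of integer items.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='items'><xs:complexType><xs:sequence>"
        + "<xs:element name='item' type='xs:int' maxOccurs='unbounded'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Builds a document with the given number of item elements. */
    static String sampleXml(int count) {
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < count; i++) {
            sb.append("<item>").append(i).append("</item>");
        }
        return sb.append("</items>").toString();
    }

    /** Compiles the inline schema with whatever factory JAXP resolves. */
    static Schema compile() throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(new StreamSource(new StringReader(XSD)));
    }

    /** Validates one source and returns the elapsed time in nanoseconds. */
    static long validateNanos(Schema schema, Source source) throws Exception {
        Validator validator = schema.newValidator();
        long start = System.nanoTime();
        validator.validate(source); // throws if the document is invalid
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        Schema schema = compile();
        String xml = sampleXml(100_000);
        long stream = validateNanos(schema, new StreamSource(new StringReader(xml)));
        long sax = validateNanos(schema, new SAXSource(new InputSource(new StringReader(xml))));
        System.out.println("StreamSource: " + stream / 1_000_000 + " ms");
        System.out.println("SAXSource:    " + sax / 1_000_000 + " ms");
    }
}
```

Timing `compile()` separately in the same way would also show how much of the total is spent loading the schema rather than validating the document.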
Thanks, George
On 7/19/2016 4:40 PM, Andy Bunce wrote:
[...]
Just an update on this, if anyone is interested. I forked BaseX and basically removed the current implementation. I tried several things: changing the StreamSource, manually creating a SAXParser, setting various features, setting content handlers, and disabling everything BaseX does until it returns the errors to the screen (for example, the recursive exception cause logging). I still got the delay, so for now I can only guess that it comes from the complex schema the XML files use, since this is the only thing the other application does differently (and maybe it's faster because it is not as accurate). So I kind of give up on this :)
The only real discovery I made is that this file: https://github.com/BaseXdb/basex/blob/master/basex-core/etc/basexgui.bat doesn't include the libraries on Windows 10... I had to change the line
for /R "%LIB%" %%a in (*.jar) do set CP=!CP!;%%a
to
set CP=!CP!;%LIB%/*
As a side note, I would like to experiment more with the BaseX code. I wonder if anyone has managed to use the latest Eclipse / WindowBuilder with it; it usually crashed when I tried in the past.
The only real discovery I made is that this file: https://github.com/BaseXdb/basex/blob/master/basex-core/etc/basexgui.bat doesn't include the libraries on Windows 10... I had to change the line
for /R "%LIB%" %%a in (*.jar) do set CP=!CP!;%%a
to
set CP=!CP!;%LIB%/*
Thanks for the note. We’ll try to find a solution that works with both Windows 10 and older versions.
basex-talk@mailman.uni-konstanz.de