I'll have a look at this. If tagsoup is present on a Debian system, it should be detected automatically. If not, it's a fault of the package and I'll fix it.
On 22.02.2012, at 12:24, Christian Grün wrote:
Tagsoup needs to be embedded in your classpath (which is the case if BaseX is downloaded from our homepage). If you have installed BaseX via the Debian package manager, you'll have to manually embed the tagsoup.jar in the BaseX start scripts.
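As a rough sketch of what "embedding tagsoup.jar in the start script" could look like: the helper below builds a classpath string and appends the TagSoup jar only if the file exists. The jar locations under /usr/share/java and the helper name basex_classpath are assumptions for illustration, not something taken from the Debian package; check where your packages actually install the jars.

```shell
#!/bin/sh
# Assumed jar locations (hypothetical defaults); override via the environment.
BASEX_JAR="${BASEX_JAR:-/usr/share/java/basex.jar}"
TAGSOUP_JAR="${TAGSOUP_JAR:-/usr/share/java/tagsoup.jar}"

# Hypothetical helper: print a classpath made of the first jar,
# plus the second jar if that file exists on disk.
basex_classpath() {
  cp="$1"
  if [ -f "$2" ]; then
    cp="$cp:$2"
  fi
  printf '%s\n' "$cp"
}

# Usage in a start script (not executed here):
#   java -cp "$(basex_classpath "$BASEX_JAR" "$TAGSOUP_JAR")" org.basex.BaseX
```

If the TagSoup jar is missing, the classpath falls back to the BaseX jar alone, so the start script keeps working but HTML input would still be parsed as XML.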
Hope this helps, Christian
Well all I know is that http://docs.basex.org/wiki/Parsers should mention what to do to read HTML, and on my machine there is

$ apt-cache search tagsoup-java
libtagsoup-java - SAX-compliant parser for real-life HTML
libtagsoup-java-doc - API Documentation for TagSoup
Mainly it is tags like <img ...> without /> that throw BaseX off track.

_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Alexander Holupirek
|-- http://www.informatik.uni-konstanz.de/~holupire
|-- Database & Information Systems Group, U Konstanz
`-- Room E 221, 0049 7531 88 2188 (phone) 3577 (fax)