How do I prevent BaseX from insisting on adding these attributes?
<a shape="rect"
<br clear="none">
What more do I need to tell it?
declare option db:parser "html";
declare option db:htmlopt "method=html,nons=true";
declare option output:method "html";
declare option output:version "4.01";
declare option output:doctype-public "-//W3C//DTD HTML 4.01//EN";
declare option output:doctype-system "http://www.w3.org/TR/html4/strict.dtd";
declare option output:include-content-type "yes";
Doesn't it know that those are invalid attributes in 4.01? Although I appreciate BaseX's effort in making my life more wonderful, perhaps there is a way to discourage it from all this attribute-adding assistance?
On 09.05.2012, at 15:52, jidanni@jidanni.org wrote:
> How do I prevent BaseX from insisting on adding these attributes?
> [...]
It's not clear to me what you are trying to achieve. Which attributes is BaseX adding, and to which elements?
Please post a small snippet or example so that we can reproduce the problem.
Thanks, Alex
"AH" == Alexander Holupirek alexander.holupirek@uni-konstanz.de writes:
AH> Please post a small snippet or example, so that we are able to test the problem.
Taking the example from the Debian basex man page, we add an innocent <br> and <a>:
cat > bad.html <<\EOF
<html>
<ul>
<li>A<a href="o">z</a>
<li>B<br>
</ul>
</html>
EOF
basex -c 'set parser html; set htmlopt method=html,nons=true; create db htmldb bad.html'
basex -q "doc('htmldb')"
<html>
  <body>
    <ul>
      <li>A<a shape="rect" href="o">z</a> HORRIBLE
      </li>
      <li>B<br clear="none"/> TERRIBLE
      </li>
    </ul>
  </body>
</html>
How can I stop basex from insisting on adding such atrocious junk?
It isn’t the fault of BaseX. The parser (TagSoup, which is used when you choose HTML parsing) inserts the default values for attributes. You should be able to suppress this by adding nodefaults=true to HTMLOPT.
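For reference, here is a minimal sketch of that suggestion applied to the example from earlier in the thread. It assumes the same bad.html file and htmldb database name, and that the installed BaseX passes nodefaults=true through to TagSoup; the HTMLOPT shell variable and the PATH check are only illustrative additions, and the result has not been verified here.

```shell
# Recreate the sample input from the thread.
cat > bad.html <<\EOF
<html>
<ul>
<li>A<a href="o">z</a>
<li>B<br>
</ul>
</html>
EOF

# Same HTMLOPT as before, with nodefaults=true appended as suggested in the reply.
# (HTMLOPT is just a shell variable used here for readability.)
HTMLOPT='method=html,nons=true,nodefaults=true'

# Run only if BaseX is installed; otherwise report and skip.
if command -v basex >/dev/null 2>&1; then
  basex -c "set parser html; set htmlopt $HTMLOPT; create db htmldb bad.html"
  basex -q "doc('htmldb')"
else
  echo "basex not found on PATH; skipping" >&2
fi
```

If the option behaves as described, the shape and clear attributes should no longer show up in the serialized output.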
On 2012-05-09 17:07, jidanni@jidanni.org wrote:
> Taking the example from the Debian basex man page, we add an innocent <br> and <a>:
> [...]
> How can I stop basex from insisting on adding such atrocious junk?
_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk