Would it also help us with converting HTML5, or is it a general suggestion? ;)
Unfortunately no, it was a general suggestion :( In our projects, though, we use https://jsoup.org/ and it works well and is very easy to use. I still prefer XPath over CSS selectors.
Out of interest: Where would this come into play? When using http:send-request, or also at other places?
I'm talking about calls made from XQuery via doc("http://randomhost.rn/random.xml"). I'm not sure whether they request gzipped files; I think I tested it once and it didn't. For example, fetching a 233MB XML file with gzip compression only requires transferring 27.8MB (this is a random file; the compression ratio will vary for different XML files). We work with files that can be over 1GB, so it can make a difference in bandwidth and execution (compilation) time.
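To make the idea concrete, here is a minimal Java sketch of what a client would have to do: send "Accept-Encoding: gzip" and unpack the response only if the server actually answers with "Content-Encoding: gzip". This is not how doc() is implemented (I haven't checked the code); the class name GzipFetch and its helper methods are made up for illustration, and only standard JDK classes are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of any existing library.
class GzipFetch {

    // Wrap the body in a gunzip stream only if the server declared it as gzipped.
    static InputStream decode(InputStream body, String contentEncoding) throws IOException {
        if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(body);
        }
        return body; // server ignored the request header, body is plain
    }

    // Ask for a gzipped response and return a stream of the uncompressed bytes.
    static InputStream open(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Accept-Encoding", "gzip");
        return decode(conn.getInputStream(), conn.getContentEncoding());
    }

    // Compress bytes in memory; only used here to try decode() without a server.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Offline round trip: compress a tiny document, then decode it again.
        byte[] xml = "<root><item/></root>".getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = decode(new ByteArrayInputStream(gzip(xml)), "gzip").readAllBytes();
        System.out.println(Arrays.equals(xml, roundTrip));
    }
}
```

Since the result is still a stream, a large file could be handed to the XML parser while it is being downloaded and decompressed, so nothing needs to be buffered in full. Whether the server compresses at all is of course up to the server.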