Hi Michael,
Speaking for myself, I think a better heuristic than dropping all whitespace-only text nodes and removing leading and trailing whitespace would be dropping whitespace-only text nodes only if every text-node seen so far as a child of this parent has been whitespace-only, and stripping leading whitespace only after a start-tag and trailing whitespace only after an end-tag.
Some more time has passed, and I finally tried to translate your proposal into XQuery code to test its implications. When executing it, I end up with the following result:
<p>This <em>is</em><strong>IMPORTANT</strong></p>
Maybe it’s because the removal of leading or trailing whitespace can also lead to a zero-length text node? Maybe I should simply spend some more time thinking about it? ;)
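To make the zero-length case concrete, here is a rough Python sketch of how I read the heuristic. It is not the attached XQuery, only an illustration under a few assumptions: "leading whitespace after a start-tag" is taken to mean the first child of its parent, "trailing whitespace" is taken to mean the last child directly before the parent's end-tag, and a text node that becomes empty after stripping is dropped. The names `strip_whitespace` and `normalize` are made up for this sketch.

```python
from xml.dom import minidom

XML_WS = " \t\r\n"  # whitespace characters as defined by XML


def strip_whitespace(elem):
    """Apply the heuristic (as I read it) to elem's children, recursively."""
    children = list(elem.childNodes)  # copy, since nodes may be removed
    seen_text = False  # has a non-whitespace text child been seen so far?
    for i, node in enumerate(children):
        if node.nodeType == node.ELEMENT_NODE:
            strip_whitespace(node)
            continue
        if node.nodeType != node.TEXT_NODE:
            continue
        text = node.data
        if text.strip(XML_WS) == "":
            if not seen_text:
                # Only whitespace-only text seen so far: drop this node.
                elem.removeChild(node)
                continue
        else:
            seen_text = True
        if i == 0:
            # Assumption: directly after the parent's start-tag.
            text = text.lstrip(XML_WS)
        if i == len(children) - 1:
            # Assumption: directly before the parent's end-tag.
            text = text.rstrip(XML_WS)
        if text == "":
            # Stripping produced a zero-length text node: drop it.
            elem.removeChild(node)
        else:
            node.data = text


def normalize(xml):
    """Parse an XML string, apply the heuristic, and serialize the root."""
    doc = minidom.parseString(xml)
    strip_whitespace(doc.documentElement)
    return doc.documentElement.toxml()
```

With this reading, a made-up input such as `<p>\n  This <em>is</em> <strong>IMPORTANT</strong>\n</p>` comes out as `<p>This <em>is</em> <strong>IMPORTANT</strong></p>`: the space between the two inline elements survives, while the trailing newline first shrinks to a zero-length text node and is then dropped. A different reading of where trailing whitespace gets stripped would give a different result.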
I have attached a mini example to this mail; suggestions (from everyone) are welcome.
Christian