But the indentation is quite different from what I see in Saxon or oXygen output when I indent. You see this with more complex examples.
That’s true; every query processor uses its own indentation algorithm, and the specification leaves a lot of freedom here [1]. If indentation matters, it’s advisable to either preserve the original formatting or use xml:space='preserve' for mixed-content sections.
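To illustrate why mixed content is the sensitive case, here is a minimal sketch in Python. It is not BaseX, Saxon, or oXygen; it assumes the standard library's xml.etree.ElementTree.indent (Python 3.9+) as a stand-in indenter, and the sample document is made up. It shows that indenting an element whose children sit directly next to each other changes the element's text content.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two inline elements with no whitespace between them.
MIXED = "<p><b>A</b><i>B</i></p>"


def text_content(root: ET.Element) -> str:
    """Concatenate all text nodes below root, in document order."""
    return "".join(root.itertext())


# Text content of the document as parsed.
original = text_content(ET.fromstring(MIXED))

# Indent a fresh copy in place, then read the text content again.
root = ET.fromstring(MIXED)
ET.indent(root)  # stand-in indenter; other processors lay things out differently
indented = text_content(root)

print(repr(original))
print(repr(indented))
print(ET.tostring(root, encoding="unicode"))
```

The second printed value contains line breaks and spaces that were not in the input, which is exactly the kind of change that preserving the original formatting or marking the section with xml:space='preserve' is meant to avoid in a processor that honors it.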
DOH!
I should be using xml:space="preserve". But is there no way to declare that when I import a file to the database? Sometimes I don't want to change the original file, but I do want to preserve whitespace.
I’ll never be happy with the decision in XML to lump together indentation of structure and content.
On standards groups, we always spent a LOT more time discussing whitespace than character content; it took up enormous amounts of time. Part of the reason is that there's not really a good way in XML to distinguish indentation from whitespace content. What would you have done differently? If there's an obvious, simple way this could have been improved, I'd be curious what it is.
Jonathan