Hi,

Running it directly on the database led to the same result: out of memory.

By pruning I mean deleting branches, or nodes in our case. I learned the term from an AI course.


I would really like to see a solution that creates a new XML document instead. I expect that to be much faster and more memory-friendly, since it traverses fewer nodes, whereas deleting nodes from the original is slower and has more data to remove.

(I'm making a lot of assumptions about how the delete currently works.)
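
To make the idea concrete, here is a rough, untested sketch of what I have in mind: a non-updating query that builds a new tree instead of deleting from the old one. The function name local:rebuild is made up for illustration, and the rules are my reading of the update query quoted below (keep only elements with @mark, drop the attribute, and replace the text with the mark value when it is non-empty). Whether this really needs less memory on a 5 GB file is exactly the assumption I cannot verify.

```xquery
(: Hypothetical helper: returns a rebuilt copy of $e, or nothing if $e has no @mark. :)
declare function local:rebuild($e as element()) as element()? {
  let $m := $e/@mark
  where exists($m)
  return element { node-name($e) } {
    (: copy every attribute except @mark :)
    $e/(@* except @mark),
    (: a non-empty mark value becomes the first child :)
    if (string($m) ne '') then string($m) else (),
    for $n in $e/node()
    return typeswitch ($n)
      (: unmarked child elements are dropped together with their subtrees :)
      case element() return local:rebuild($n)
      (: original text is kept only when the mark value is empty :)
      case text() return if (string($m) ne '') then () else $n
      default return $n
  }
};

local:rebuild(/*)
```

Note that an unmarked root element would yield an empty result here, just as the delete-based query would remove it.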




From: Christian Grün <christian.gruen@gmail.com>
Sent: 9 July 2016 21:19
To: Henning Phan
Cc: BaseX
Subject: Re: [basex-talk] Removing the xmlns attribute and/or adding the prefix
 
> Yes, you did get it right, and I did try the query, though sadly Java runs out of memory.

In that case, you could try to create a database of your document and
run the update operations directly on that database:

1. open your database
2. run the query:

  for $e in //*
  let $m := $e/@mark
  return if($m) then (
    delete node $m,
    (: [.] is same as: where $d ne "" :)
    for $d in data($m)[.]
    return (
      delete node $e/text(),
      insert node $d as first into $e
    )
  ) else (
    delete node $e
  )

If that doesn’t help…

1. delete all nodes without @mark attribute
2. optimize your database and
3. rewrite nodes with the @mark attribute
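
The three steps could be sketched roughly as follows, each run as a separate query with the database opened. This is only an outline: the database name 'mydb' is a placeholder, and the third query is simply the marked-element branch of the query above.

```xquery
(: Step 1: delete every element that has no @mark attribute :)
delete node //*[not(@mark)]
```

```xquery
(: Step 2: optimize the database; true() requests a full optimization.
   'mydb' is a placeholder name. :)
db:optimize('mydb', true())
```

```xquery
(: Step 3: rewrite the remaining elements, which all carry @mark :)
for $e in //*[@mark]
let $m := $e/@mark
return (
  delete node $m,
  for $d in data($m)[.]
  return (
    delete node $e/text(),
    insert node $d as first into $e
  )
)
```

Splitting the work this way keeps the pending update list of each run smaller than in the single combined query, which may or may not be enough for a document of this size.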


> I'd rather not prune my XML because it's around 5 GB.

What do you mean by pruning?