Hello,
I constructed the following XML file for another test of the software “BaseX 9.7”.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test_data>
  <info>
    <id>12</id>
    <topics>
      <topic>Demo1</topic>
      <topic>Demo2</topic>
    </topics>
  </info>
  <info>
    <id>23</id>
    <topics>
      <topic>Demo1</topic>
      <topic>Demo2</topic>
    </topics>
  </info>
  <info>
    <id>34</id>
    <topics>
      <topic>Test1</topic>
      <topic>Test2</topic>
      <topic>Test3</topic>
    </topics>
  </info>
  <info>
    <id>45</id>
    <topics>
      <topic>Test1</topic>
      <topic>Test2</topic>
      <topic>Test3</topic>
    </topics>
  </info>
  <info>
    <id>56</id>
    <topics>
      <topic>Test1</topic>
      <topic>Test2</topic>
      <topic>Test3</topic>
    </topics>
  </info>
  <info>
    <id>67</id>
    <topics>
      <topic>Probe1</topic>
    </topics>
  </info>
</test_data>
I then tried out the following XQuery script on it.
declare option output:method "csv";
declare option output:csv "header=yes, separator=|";

for $x in //test_data/info
group by $topics := string-join($x/topics/topic/data(), "*")
let $incidence := count($topics)
order by $incidence descending
return
  <csv>
    <record>
      <topic_combination>{$topics}</topic_combination>
      <incidence>{$incidence}</incidence>
    </record>
  </csv>
Corresponding test result:
topic_combination|incidence
Demo1*Demo2|1
Test1*Test2*Test3|1
Probe1|1
For this kind of data analysis, I would like to see the numbers “2” and “3” instead of “1” at the end of the first two rows. I would appreciate further advice for this use case.
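One variant that might be worth trying, as a sketch only: after a "group by" clause, the grouping key $topics is a single atomic value per group, so count($topics) is always 1, while the non-grouping variable $x is rebound to the sequence of all "info" elements that fall into the group. Counting $x instead should therefore give the group sizes. The rest of the script is assumed to stay unchanged.

```xquery
declare option output:method "csv";
declare option output:csv "header=yes, separator=|";

for $x in //test_data/info
(: one group per distinct topic combination :)
group by $topics := string-join($x/topics/topic/data(), "*")
(: $x now holds every info element of the group, so this is the group size :)
let $incidence := count($x)
order by $incidence descending
return
  <csv>
    <record>
      <topic_combination>{$topics}</topic_combination>
      <incidence>{$incidence}</incidence>
    </record>
  </csv>
```

With the test data above, I would expect the incidence column to reflect how many "info" elements share each topic combination, with the largest group listed first.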
Regards, Markus