Thanks for looking at it.
 
As I mentioned earlier in this email chain, you will see the output below. Even though we group (in the innermost group by) by the @NAME attribute, we still get duplicate values such as "Spec" and "Features":
 
<CID cidid="384">
  <PROPVALS name="Features"/>
  <PROPVALS name="Model"/>
  <PROPVALS name="Spec"/>
  <PROPVALS name="Features"/>
  <PROPVALS name="Model"/>
  <PROPVALS name="Spec"/>
  <PROPVALS name="Features"/>
  <PROPVALS name="Manufacturer Warranty"/>
  <PROPVALS name="Model"/>
  <PROPVALS name="Spec"/>
</CID>
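A likely cause of the duplicates: $propgprs is computed once per RECORD before the outer group by, so after grouping by $cid it becomes the concatenation of the per-record groupings, and each property name reappears once per record in the group. A sketch of a rewrite that performs the inner grouping after the outer one, over all records of a group (same input structure assumed, untested):

```xquery
let $prod := doc('textxquery.xml')
for $p in $prod//RECORD
let $cid := $p/PROP[@NAME eq "SubCategoryId"]/PVAL
group by $cid
order by $cid
return
  <CID cidid="{$cid}">{
    (: after "group by $cid", $p holds all RECORDs of this group :)
    for $prop in $p/PROPGROUP/PROP
    let $propname := string($prop/@NAME)
    group by $propname
    order by $propname
    return <PROPVALS name="{$propname}">{$prop/PVAL}</PROPVALS>
  }</CID>
```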

On Sat, May 28, 2011 at 2:37 PM, Michael Seiferle <michael.seiferle@uni-konstanz.de> wrote:
Hi Erol,

I received your file and ran a minimally modified version (added: let $prod := doc('textxquery.xml')) of your testquery.

The results looked fine to me, so I tested it with two other implementations; both produced the same results as BaseX.

What is the expected output of your query? Maybe we can work it out the "other way round" :-)

Kind regards
Michael


> let $prod := doc('textxquery.xml')
> for $p in $prod//RECORD
> let $cid := $p/PROP[@NAME eq "SubCategoryId"]/PVAL
> let $allprops := $p/PROPGROUP/PROP
> let $propgprs :=
>   for $prop in $allprops
>   let $propname := $prop/@NAME
>   let $propvals := $prop/PVAL
>   group by $propname
>   order by $propname
>   return <PROPVALS name="{$propname}">{$propvals}</PROPVALS>
> group by $cid
> order by $cid
> return <CID cidid="{$cid}">{$propgprs}</CID>


Am 27.05.2011 um 20:29 schrieb Erol Akarsu:

> Michael ,
>
> Just run the previous XQuery script I sent over testxquery.xml; you will see it.
>
> Thanks
>
>
> On Fri, May 27, 2011 at 1:44 PM, Michael Seiferle <michael.seiferle@uni-konstanz.de> wrote:
> Hi Erol,
> Thanks for your report. Could you (off list) provide me with a small part of the source file for reproducing this issue?
>
> Kind regards
> Michael
>
> --
> Mit freundlichen Grüßen
> Michael Seiferle
>
>
> Am 27.05.2011 um 16:38 schrieb Erol Akarsu <eakarsu@gmail.com>:
>
>> I am having an issue with the "group by" operator.
>>
>> If I use it inside another "group by", the nested "group by" does not work as expected in the following query: $propvals in the nested "group by" does not include the grouped values.
>>
>> let $list :=
>>   for $p in $prods//RECORD
>>   let $cid := $p/PROP[@NAME eq "SubCategoryId"]/PVAL
>>   let $allprops := $p/PROPGROUP/PROP
>>   let $propgprs :=
>>     for $prop in $allprops
>>     let $propname := $prop/@NAME
>>     let $propvals := $prop/PVAL
>>     group by $propname
>>     order by $propname
>>     return <PROPVALS name="{$propname}">{$propvals}</PROPVALS>
>>   group by $cid
>>   order by $cid
>>   return <CID cidid="{$cid}">{$propgprs}</CID>
>>
>>
>>
>> Thanks
>>
>>
>> On Fri, May 13, 2011 at 9:08 AM, Erol Akarsu <eakarsu@gmail.com> wrote:
>> Is there a way the BaseX server can send us JSON output instead of XML?
>>
>> Thanks
>>
>> On Wed, May 11, 2011 at 10:28 AM, Erol Akarsu <eakarsu@gmail.com> wrote:
>> I am filtering one big XML database (about 800 MB) that is already stored in BaseX.
>>
>> When I write the filtered data to a file, I get an out-of-memory error.
>>
>> I assume BaseX builds the entire filtered result document in memory before writing it, and that this causes the memory error.
>>
>> Can BaseX write results block by block?
>>
>> Thanks
>>
>>
>> On Wed, May 4, 2011 at 9:38 AM, Erol Akarsu <eakarsu@gmail.com> wrote:
>> I tried to delete some nodes with the "delete nodes ..." XQuery Update command. It did what was requested.
>>
>> However, the database size is still the same as before. The deleted nodes contained a lot of data, so I would expect the database indexes to be adjusted accordingly.
>>
>> Then I exported the database as an XML file and recreated the database from it; the new database has the expected size.
>>
>> My question is: why does the database not adjust its indexes when nodes are deleted?
>>
>> Thanks
>>
>> Erol Akarsu
>>
>>
>> On Fri, Apr 29, 2011 at 9:23 AM, Erol Akarsu <eakarsu@gmail.com> wrote:
>> Hi,
>>
>> Have we thought about clustering BaseX servers so that we can partition XML documents?
>>
>> Here, I am only interested in global partitioning:
>>
>> Say we have an XML document like the one below. I would like to partition the RECORDS content so that each host holds an equal number of RECORD elements; we would then aggregate the results from the individual hosts.
>> Can we implement this simple clustering framework with BaseX?
>>
>>
>> <RECORDS>
>>    <RECORD>
>> .....
>>   </RECORD>
>> <RECORD>
>> .....
>>   </RECORD>
>> </RECORDS>
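>> A per-host slice of the RECORD elements could be sketched directly in XQuery (the file name, host count, and host index below are hypothetical):

```xquery
(: partition /RECORDS/RECORD across $hosts hosts; this host gets slice $host (1-based) :)
let $recs  := doc('records.xml')/RECORDS/RECORD
let $hosts := 4
let $host  := 1
let $size  := xs:integer(ceiling(count($recs) div $hosts))
return $recs[position() gt ($host - 1) * $size and position() le $host * $size]
```

>> Each host would run the same query with its own $host value, and a coordinator would merge the per-host results.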
>>
>> On Mon, Apr 11, 2011 at 12:35 PM, Erol Akarsu <eakarsu@gmail.com> wrote:
>> Ok,
>>
>> I was able to run full-text search with another XML database.
>>
>> I am primarily interested in how BaseX handles a big XML file such as the Wikipedia dump.
>>
>> Creating the database from the Wikipedia dump works fine, but when I add the full-text search indexes for it, it always throws an out-of-memory error.
>>
>> I have set -Xmx to 6 GB, which is still not enough to generate the indexes for Wikipedia.
>>
>> Can you help me generate the indexes on a machine that can give the BaseX process 6 GB?
>>
>> Thanks
>>
>> Erol Akarsu
>>
>>
>>
>> On Sun, Apr 10, 2011 at 4:07 AM, Andreas Weiler <andreas.weiler@uni-konstanz.de> wrote:
>> Hi,
>>
>> the following query could work for you:
>>
>> declare namespace w="http://www.mediawiki.org/xml/export-0.5/";
>> for $i in doc("enwiki-latest-pages-articles")//w:sitename
>> return $i[. contains text "Wikipedia"]/..
>>
>> -- Andreas
>>
>> Am 09.04.2011 um 20:43 schrieb Erol Akarsu:
>>
>>> Hi
>>>
>>> I am having difficulty running the full-text operators. The following script returns the siteinfo element shown below:
>>> declare namespace w="http://www.mediawiki.org/xml/export-0.5/";
>>> let $d := fn:doc("enwiki-latest-pages-articles")
>>> return ($d//w:siteinfo)[1]
>>>
>>> But return $d//w:siteinfo[w:sitename contains text 'Wikipedia'] does NOT return the same node.
>>> Why does the "contains text" operator behave incorrectly? I remember it working fine. I just dropped and recreated the database and turned on all indexes. Can you help me?
>>> The query info is below:
>>>
>>> Query: declare namespace w="http://www.mediawiki.org/xml/export-0.5/";
>>> Compiling:
>>> - pre-evaluating fn:doc("enwiki-latest-pages-articles")
>>> - adding text() step
>>> - optimizing descendant-or-self step(s)
>>> - removing path with no index results
>>> - pre-evaluating (())[1]
>>> - binding static variable $res
>>> - pre-evaluating fn:doc("enwiki-latest-pages-articles")
>>> - binding static variable $d
>>> - adding text() step
>>> - optimizing descendant-or-self step(s)
>>> - removing path with no index results
>>> - simplifying flwor expression
>>> Result: ()
>>> Timing:
>>>  - Parsing:  0.46 ms
>>>  - Compiling:  0.42 ms
>>>  - Evaluating:  0.17 ms
>>>  - Printing:  0.1 ms
>>>  - Total Time:  1.15 ms
>>> Query plan:
>>> <sequence size="0"/>
>>>
>>>
>>>
>>> <siteinfo xmlns="http://www.mediawiki.org/xml/export-0.5/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>>>   <sitename>Wikipedia</sitename>
>>>   <base>http://en.wikipedia.org/wiki/Main_Page</base>
>>>   <generator>MediaWiki 1.17wmf1</generator>
>>>   <case>first-letter</case>
>>>   <namespaces>
>>>     <namespace key="-2" case="first-letter">Media</namespace>
>>>     <namespace key="-1" case="first-letter">Special</namespace>
>>>     <namespace key="0" case="first-letter"/>
>>>     <namespace key="1" case="first-letter">Talk</namespace>
>>>     <namespace key="2" case="first-letter">User</namespace>
>>>     <namespace key="3" case="first-letter">User talk</namespace>
>>>     <namespace key="4" case="first-letter">Wikipedia</namespace>
>>>     <namespace key="5" case="first-letter">Wikipedia talk</namespace>
>>>     <namespace key="6" case="first-letter">File</namespace>
>>>     <namespace key="7" case="first-letter">File talk</namespace>
>>>     <namespace key="8" case="first-letter">MediaWiki</namespace>
>>>     <namespace key="9" case="first-letter">MediaWiki talk</namespace>
>>>     <namespace key="10" case="first-letter">Template</namespace>
>>>     <namespace key="11" case="first-letter">Template talk</namespace>
>>>     <namespace key="12" case="first-letter">Help</namespace>
>>>     <namespace key="13" case="first-letter">Help talk</namespace>
>>>     <namespace key="14" case="first-letter">Category</namespace>
>>>     <namespace key="15" case="first-letter">Category talk</namespace>
>>>     <namespace key="100" case="first-letter">Portal</namespace>
>>>     <namespace key="101" case="first-letter">Portal talk</namespace>
>>>     <namespace key="108" case="first-letter">Book</namespace>
>>>     <namespace key="109" case="first-letter">Book talk</namespace>
>>>   </namespaces>
>>> </siteinfo>
>>>
>>> On Mon, Apr 4, 2011 at 7:31 AM, Erol Akarsu <eakarsu@gmail.com> wrote:
>>> I imported  wikipedia xml into basex and tried to search it.
>>>
>>> But searching it takes longer.
>>>
>>> I tried to search one element that is first child of whole document and it took 52 sec.
>>> I know the XML file is very big 31GB. How can I optimize the search?
>>>
>>> declare namespace w="http://www.mediawiki.org/xml/export-0.5/";
>>>
>>> let $d := fn:doc("enwiki-latest-pages-articles")//w:siteinfo
>>> return $d
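>>> One thing worth trying: replace the descendant step with a direct child path, so the whole 228-million-node table does not need to be scanned. A sketch (the root element name w:mediawiki is an assumption based on the usual MediaWiki dump format):

```xquery
declare namespace w = "http://www.mediawiki.org/xml/export-0.5/";
(: rooted child path instead of //w:siteinfo :)
doc("enwiki-latest-pages-articles")/w:mediawiki/w:siteinfo
```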
>>>
>>> Database info:
>>>
>>> > open enwiki-latest-pages-articles
>>> Database 'enwiki-latest-pages-articles' opened in 778.49 ms.
>>> > info database
>>> Database Properties
>>>  Name: enwiki-latest-pages-articles
>>>  Size: 23356 MB
>>>  Nodes: 228090153
>>>  Height: 6
>>>
>>> Database Creation
>>>  Path: /mnt/hgfs/C/tmp/enwiki-latest-pages-articles.xml
>>>  Time Stamp: 03.04.2011 12:29:15
>>>  Input Size: 30025 MB
>>>  Encoding: UTF-8
>>>  Documents: 1
>>>  Whitespace Chopping: ON
>>>  Entity Parsing: OFF
>>>
>>> Indexes
>>>  Up-to-date: true
>>>  Path Summary: ON
>>>  Text Index: ON
>>>  Attribute Index: ON
>>>  Full-Text Index: OFF
>>> >
>>>
>>>
>>> Timing info:
>>>
>>> Query: declare namespace w="http://www.mediawiki.org/xml/export-0.5/";
>>> Compiling:
>>> - pre-evaluating fn:doc("enwiki-latest-pages-articles")
>>> - optimizing descendant-or-self step(s)
>>> - binding static variable $d
>>> - removing variable $d
>>> - simplifying flwor expression
>>> Result: element siteinfo { ... }
>>> Timing:
>>>  - Parsing:  1.4 ms
>>>  - Compiling:  52599.0 ms
>>>  - Evaluating:  0.28 ms
>>>  - Printing:  0.62 ms
>>>  - Total Time:  52601.32 ms
>>> Query plan:
>>> <DBNode name="enwiki-latest-pages-articles" pre="5"/>
>>>
>>>
>>>
>>> Result of query:
>>>
>>> <siteinfo xmlns="http://www.mediawiki.org/xml/export-0.5/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>>>   <sitename>Wikipedia</sitename>
>>>   <base>http://en.wikipedia.org/wiki/Main_Page</base>
>>>   <generator>MediaWiki 1.17wmf1</generator>
>>>   <case>first-letter</case>
>>>   <namespaces>
>>>     <namespace key="-2" case="first-letter">Media</namespace>
>>>     <namespace key="-1" case="first-letter">Special</namespace>
>>>     <namespace key="0" case="first-letter"/>
>>>     <namespace key="1" case="first-letter">Talk</namespace>
>>>     <namespace key="2" case="first-letter">User</namespace>
>>>     <namespace key="3" case="first-letter">User talk</namespace>
>>>     <namespace key="4" case="first-letter">Wikipedia</namespace>
>>>     <namespace key="5" case="first-letter">Wikipedia talk</namespace>
>>>     <namespace key="6" case="first-letter">File</namespace>
>>>     <namespace key="7" case="first-letter">File talk</namespace>
>>>     <namespace key="8" case="first-letter">MediaWiki</namespace>
>>>     <namespace key="9" case="first-letter">MediaWiki talk</namespace>
>>>     <namespace key="10" case="first-letter">Template</namespace>
>>>     <namespace key="11" case="first-letter">Template talk</namespace>
>>>     <namespace key="12" case="first-letter">Help</namespace>
>>>     <namespace key="13" case="first-letter">Help talk</namespace>
>>>     <namespace key="14" case="first-letter">Category</namespace>
>>>     <namespace key="15" case="first-letter">Category talk</namespace>
>>>     <namespace key="100" case="first-letter">Portal</namespace>
>>>     <namespace key="101" case="first-letter">Portal talk</namespace>
>>>     <namespace key="108" case="first-letter">Book</namespace>
>>>     <namespace key="109" case="first-letter">Book talk</namespace>
>>>   </namespaces>
>>> </siteinfo>
>>>
>>>
>>>
>>> Thanks
>>>
>>> Erol Akarsu
>>>
>>>
>>> _______________________________________________
>>> BaseX-Talk mailing list
>>> BaseX-Talk@mailman.uni-konstanz.de
>>> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> BaseX-Talk mailing list
>> BaseX-Talk@mailman.uni-konstanz.de
>> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
>