Hi,
For those following this discussion who find it an interesting case, here is an example of the problem I’m facing.
The example throws an error: Static variable depends on itself: ……
The example concerns a 3-step cyclic construct:
“BASE-TYPE.xqm” -> “SUPER-TYPE.xqm” -> “SUPER-SUPER-TYPE.xqm” -> “BASE-TYPE.xqm”,
starting from “test.xq”.
I need the solution to comply with four rules:
- The module functions must be %private and may only be addressed from outside the module through a public module $variable. This way, inheritance between modules can be implemented simply through variables.
- The internal structure of a module function must be completely hidden from code outside the module. So if the module function calls another function, that function or its name may not be passed in the module function’s parameter list. That is why I can’t use Leo’s solution here.
- The 3-step cycle structure cannot be changed.
- I need a generic solution.
The real solution is, of course, that it just works; Christian is having a look at it. But I thought the case interesting enough to share, and maybe somebody will come up with a solution that complies with these rules as a temporary 😉 workaround. My own solution for the time being violates my first rule.
Have fun,
Rob Stapper
PS: To use the example, extract it and copy the extracted directory into BaseX’s src directory. Start “test.xq” from within BaseX and it should work.
Hi Leo,
Thank you for your reply.
I’m aware of these kinds of solutions and use them frequently, although I have to admit that they slipped my mind here. Great solutions, by the way.
But getting the code working is not my problem. What I’m looking for is a generic solution for some sort of inheritance mechanism (see attachment for an example). The code snippet is just meant to supply the BaseX team with a basic executable example of the bare issue. I’m not even sure if it is a bug, because I don’t know if it is behavior by design. But now the BaseX team is aware of the situation.
I probably should have said that the functions can be activated cyclically instead of just recursively, where cyclical can be interpreted as indirectly recursive. In that case the solution doesn’t hold.
Best,
Rob
From: Leo Studer
Sent: Monday, October 19, 2020 7:19 PM
To: Rob Stapper
Subject: Re: [basex-talk] recursively used variables
… and you do not even need the variable
declare %private function local:test( $i, $test) { if ( $i > 0) then ($i, $test( $i - 1, $test)) else() } ;
local:test( 10, local:test#2)
Cheers
Leo
On 8 Oct 2020, at 14:17, Rob Stapper <r.stapper(a)lijbrandt.nl> wrote:
Hi,
The code[1] below, also sent as an attachment, generates an error message: “Static variable depends on itself: $Q{http://www.w3.org/2005/xquery-local-functions}test”.
I use these variables to refer to the private functions in my modules, so I can easily refer to them in an inheritance situation.
It’s not a big problem for me, but I was wondering whether raising the error is justified or whether it should work.
[1]===========================================
declare variable $local:test := local:test#1 ;
declare %private function local:test( $i) { if ( $i > 0) then $local:test( $i - 1) } ;
$local:test( 10)
===========================================
Kind regards,
Rob Stapper
<test.xq>
Dear Christian,
I’d be happy to chime in on the quality of BaseX’s Chinese-language full-text capabilities. Chinese sources are my primary research area. What exactly do you have in mind?
Greetings
Duncan
Ceterum censeo exist-db.org esse conriganda
> Message: 1
> Date: Wed, 14 Oct 2020 12:30:59 +0200
> From: Philippe Pons <philippe.pons(a)college-de-france.fr>
> To: basex-talk(a)mailman.uni-konstanz.de
> Subject: Re: [basex-talk] stemming chinese texts
>
> Hi Christian,
>
> I suppose some of my colleagues would be able to judge the quality of
> your full-text search results.
>
> On the other hand, at the code level, I'm not sure I know how to implement
> an additional class that extends the abstract Tokenizer class.
>
> Thank you for your help
> Philippe
Hi Hugo -
[responding to the list, too, because I'm well-known for missing the
obvious! :)]
On Thu, Oct 15, 2020 at 4:26 PM Silamphre <hugoscheithauer(a)gmail.com> wrote:
> Hi Bridger,
>
> Thank you for your kind reply and your assistance. I have tried using that
> command directly in the Ubuntu terminal, but unfortunately it works only
> when I set the value to 2, which is way too big for an interface aha. It
> seems to only be working with integers.
> I'll try editing the script, but since it refers to a java program and I'm
> not familiar with it, do you know if I can just copy that command line in
> the script, or if I need to write a more specific command line ?
>
On a different OS, I had problems with JDK font rendering, so I added
flags to BaseX in the `basexgui` script like so:
```shell
# Run code
exec java -cp "$CP" -Dsun.java2d.uiScale=2.0 $BASEX_JVM org.basex.BaseXGUI "$@"
```
i.e. adding the flag *after* the `-cp`/classpath flag in the command.
I just tried this on my current Linux system, and by bumping the number a
bit, I did increase some of the scaling, but only in the editor pane of the
GUI; i.e. I noticed the scaling particularly in the tabs. To be honest, I'm
not sure what to make of that, or how to take a different approach. Maybe
Christian or another user can give us some insights.
> Thank you again.
>
You are most welcome. Sorry that we couldn't get it completely solved.
> Kind regards,
> Hugo
>
Best,
Bridger
> On Thu, Oct 15, 2020 at 19:32, Bridger Dyson-Smith <bdysonsmith(a)gmail.com>
> wrote:
>
>> Hi Silamphre,
>>
>> I'm not sure if this will help or not, but editing the `basexgui` script
>> to include `-Dsun.java2d.uiScale=1.25` might help[1]? I confess that I
>> don't have UI scaling enabled on my unix-like system, or maybe you've
>> already tried that approach.
>> Hope that helps!
>> Best,
>> Bridger
>>
>> [1]
>> https://stackoverflow.com/questions/58699877/how-to-fix-scaling-of-a-java-b…
>>
>> On Thu, Oct 15, 2020 at 12:06 PM Silamphre <hugoscheithauer(a)gmail.com>
>> wrote:
>>
>>> Hello everyone,
>>>
>>> I started using BaseX for a class project a few days ago. I'd like to
>>> run it on my Ubuntu 20.04. I've installed openjdk version 14 to do so.
>>>
>>> I've set my Ubuntu display settings at 125% fractional scaling, but it
>>> appears that BaseX does not scale with this setting. The GUI consequently
>>> appears really tiny, almost unusable.
>>>
>>> Does someone have a way to solve this issue and make BaseX scale
>>> according to my Ubuntu display settings, please?
>>>
>>> Thank you for your help.
>>>
>>> Kind regards,
>>> Hugo
Hi all -
I hope you don't mind a question about serializing distinct XPaths. I'm
trying to generate some reports for documentation and the available
built-in and library functions[1] aren't quite what I need.
The output I'm after is:
test/aaa/bbb[@type="foo"]
test/aaa/bbb[@type="foo"][@enc="bar"]
test/aaa/bbb[@type="bzz"][@enc="bar"]
test/aaa/bbb[@type="qux"][@enc="bar"][@key="yes"][@point="start"]
test/aaa/bbb[@type="qux"][@enc="bar"][@key="yes"][@point="end"]
I have a couple of functions that are getting me close, but I can't quite
manage the output strings, and multiple children are causing me trouble
(entirely too much like real life). Any help or suggestions would be
greatly appreciated. I've created a gist[2] with the following example:
Thanks very much for your time and trouble.
Best,
Bridger
```xquery
xquery version "3.1";
declare variable $input :=
<test>
<aaa>
<bbb type="foo">bbb content</bbb>
</aaa>
<aaa>
<bbb type="foo" enc="bar">bbb content</bbb>
</aaa>
<aaa>
<bbb type="bzz" enc="bar">bbb content</bbb>
<bbb type="qux" enc="bar" key="yes" point="start">bbb content</bbb>
<bbb type="qux" enc="bar" key="yes" point="end">bbb content</bbb>
</aaa>
</test>;
declare function local:elem(
$nodes as node()*
) as xs:string* {
for $node in $nodes
return(
string-join(
if ($node/@*)
then (string-join((name($node) || string-join(for $att in
$node/@* return local:atty($att))), "/"), local:elem($node/child::*))
else if ($node/child::*)
then (for $child in $node/child::* return
local:elem($child), local:elem($node/child::*))
else (name($node) || "/" || local:elem($node/child::*))
)
)
};
declare function local:atty(
$att as attribute()
) as xs:string* {
"[@" || name($att) || "='" || data($att) || "']"
};
local:elem($input)
(:
this currently returns
test/aaa/bbb[@type='foo']/
/
/aaa/bbb[@type='foo'][@enc='bar']/
/
/aaa/bbb[@type='bzz'][@enc='bar']/
/bbb[@type='qux'][@enc='bar'][@key='yes'][@point='start']/
/bbb[@type='qux'][@enc='bar'][@key='yes'][@point='end']/
/
:)
```
[1] fn:path, and the related functions from functx
(functx:distinct-element-paths, functx:path-to-node, and
functx:path-to-node-with-pos). The functx functions are really close and
awesome, but I need to incorporate attributes into my output.
[2] https://gist.github.com/CanOfBees/8cfb435ac06986c9b0b0c215a786f4d7
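
One possible way to get output of the kind described above is to walk up from each leaf element instead of recursing down from the root. The following is only a minimal sketch, not the code from the gist: `local:step` and `local:paths` are hypothetical helper names, only leaf elements are reported, and the order of the attribute predicates relies on the processor returning attributes in source order, which the XQuery specification leaves implementation-dependent.

```xquery
xquery version "3.1";

(: Hypothetical helper: one path step, i.e. the element name followed by
   one [@name="value"] predicate per attribute. :)
declare function local:step($e as element()) as xs:string {
  name($e) || string-join($e/@* ! ('[@' || name(.) || '="' || string(.) || '"]'))
};

(: Hypothetical helper: the distinct root-to-leaf paths below $root.
   Only elements without child elements are treated as leaves. :)
declare function local:paths($root as element()) as xs:string* {
  distinct-values(
    for $leaf in $root/descendant-or-self::*[not(*)]
    return string-join($leaf/ancestor-or-self::* ! local:step(.), '/')
  )
};

local:paths(
  <test>
    <aaa><bbb type="foo">bbb content</bbb></aaa>
    <aaa><bbb type="foo" enc="bar">bbb content</bbb></aaa>
  </test>
)
```

Because each path is assembled from `ancestor-or-self::*` of a single leaf, several `bbb` children under one `aaa` each get their own complete line, and `distinct-values` collapses repeated paths.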
Hello everyone,
I started using BaseX for a class project a few days ago. I'd like to run
it on my Ubuntu 20.04. I've installed openjdk version 14 to do so.
I've set my Ubuntu display settings at 125% fractional scaling, but it
appears that BaseX does not scale with this setting. The GUI consequently
appears really tiny, almost unusable.
Does someone have a way to solve this issue and make BaseX scale
according to my Ubuntu display settings, please?
Thank you for your help.
Kind regards,
Hugo
Hi Christian,
I suppose some of my colleagues would be able to judge the quality of
your full-text search results.
On the other hand, at the code level, I'm not sure I know how to implement
an additional class that extends the abstract Tokenizer class.
Thank you for your help
Philippe
On 14/10/2020 at 11:00, Christian Grün wrote:
> Hi Philippe,
>
> Thanks for your mail in private, in which I already gave you a little
> assessment on what might be necessary to include the CJK tokenizers in
> BaseX:
>
> The existing Apache code can be adapted and embedded into the BaseX
> tokenizer infrastructure. On code level, an additional class needs to
> be implemented that extends abstract Tokenizer class [1].
>
> As far as I can judge, the 3 Lucene CJK analyzers could all be applied
> to traditional and simplified Chinese. If we found someone who could
> rate the linguistic quality of our full-text search results, that’d
> surely be helpful.
>
> Hope this helps,
> Christian
>
> [1] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/b…
>
>
>
> On Tue, Oct 13, 2020 at 12:32 PM Philippe Pons
> <philippe.pons(a)college-de-france.fr> wrote:
>> Dear Christian,
>>
>> Thank you very much for this quick and enlightening response.
>>
>> Without having had (yet) the opportunity to test it, I have indeed read the Japanese text tokenizer.
>> Supporting Chinese tokenization would also be a great help.
>>
>> I have never tested what Lucene offers, especially since I have to manage texts in traditional Chinese and simplified Chinese (without reading either one myself).
>> I would like to test Lucene's analyzers, but I don't know how to do it in BaseX?
>>
>> Best regards,
>> Philippe Pons
>>
>>
>>
>> On 12/10/2020 at 12:01, Christian Grün wrote:
>>
>> Dear Philippe,
>>
>> As the Chinese language rarely uses inflection, there is usually no
>> need to perform stemming on texts. However, tokenization will be
>> necessary indeed. Right now, BaseX provides no tokenizer/analyzer for
>> Chinese texts. It should be possible indeed to adopt code from Lucene,
>> as we’ve already done for other languages (our software licenses allow
>> that).
>>
>> Have you already worked with tokenization of Chinese texts in Lucene?
>> If yes, which of the 3 available analyzers [1] have proven to yield
>> the best results?
>>
>> As you may know, one of our users, Toshio HIRAI, has contributed a
>> tokenizer for Japanese texts in the past [2]. If we decide to include
>> support for Chinese tokenization, it might as well be interesting to
>> compare the results of the Apache tokenizer with our internal
>> tokenizer.
>>
>> Kind regards,
>> Christian
>>
>> [1] https://lucene.apache.org/core/7_2_0/analyzers-common/org/apache/lucene/ana…
>> [2] https://docs.basex.org/wiki/Full-Text:_Japanese
>>
>>
>>
>> On Mon, Oct 12, 2020 at 11:37 AM Philippe Pons
>> <philippe.pons(a)college-de-france.fr> wrote:
>>
>> Dear BaseX Team,
>>
>> I'm currently working on Chinese texts in TEI.
>> I would like to know if stemming Chinese text is possible in BaseX, as we can do with other languages (like English or German)?
>> Or maybe there is a way to add this functionality with Lucene?
>>
>> Best regards,
>> Philippe Pons
>>
>> --
>> Research engineer in charge of publishing digital corpora
>> Centre de recherche sur les civilisations de l'Asie Orientale
>> CRCAO - UMR 8155 (Collège de France, EPHE, CNRS, PSL Research University, Univ Paris Diderot, Sorbonne Paris Cité)
>> 49bis avenue de la Belle Gabrielle
>> 75012 Paris
>> https://cv.archives-ouvertes.fr/ppons
>>
>>
Dear BaseX Team,
I'm currently working on Chinese texts in TEI.
I would like to know if stemming Chinese text is possible in BaseX, as
we can do with other languages (like English or German)?
Or maybe there is a way to add this functionality with Lucene?
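
For reference, this is roughly what is meant by stemming for a supported language: a minimal XQuery Full Text sketch, assuming an English stemmer is available in the installation. The sample sentence is made up for illustration.

```xquery
(: XQuery Full Text: compare on word stems rather than surface forms.
   The language code "en" selects the English stemmer. :)
"The houses were sold" contains text "house"
  using stemming using language "en"
```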
Best regards,
Philippe Pons
--
Research engineer in charge of publishing digital corpora
Centre de recherche sur les civilisations de l'Asie Orientale
CRCAO - UMR 8155 (Collège de France, EPHE, CNRS, PSL Research University, Univ Paris Diderot, Sorbonne Paris Cité)
49bis avenue de la Belle Gabrielle
75012 Paris
https://cv.archives-ouvertes.fr/ppons
> I'm currently working on Chinese texts in TEI.
> I would like to know if stemming Chinese text is possible in BaseX, as
> we can do with other languages (like English or German)?
> Or maybe there is a way to add this functionality with Lucene?
>
> Best regards,
> Philippe Pons
>
Dear Philippe,
If by stemming you mean the removal of prefixes and suffixes to arrive at normalized word stems, the concept simply doesn’t apply to Chinese, so no, it can’t be done.
What you are most likely looking for is the ability to tokenize strings into n-grams, which Lucene can do.
https://lucene.apache.org/core/6_6_1/analyzers-common/org/apache/lucene/ana…
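
To make the idea of n-grams concrete, here is a minimal sketch in plain XQuery. It does not use Lucene or any BaseX tokenizer; `local:ngrams` is a hypothetical helper that simply slides a window of n characters over the string.

```xquery
xquery version "3.1";

(: Hypothetical helper: all overlapping n-grams of a string.
   Working on code points keeps characters outside the BMP intact. :)
declare function local:ngrams($text as xs:string, $n as xs:integer) as xs:string* {
  let $cps := string-to-codepoints($text)
  for $i in 1 to count($cps) - $n + 1
  return codepoints-to-string(subsequence($cps, $i, $n))
};

local:ngrams("中华人民共和国", 2)
(: yields 中华, 华人, 人民, 民共, 共和, 和国 :)
```

A string shorter than n yields the empty sequence, since the range `1 to 0` (or lower) is empty.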
Greetings
Duncan