Differences in serialization of arrays with JSON vs. adaptive methods


Joe Wicentowski

10 Aug 2017

10:35 a.m.

Hi all,

First, I'm just back from DH2017, where Clifford Anderson and I taught two workshops on XQuery using BaseX, along with eXist and Saxon. BaseX performed like a champ. We were able to configure the GUI window to show just the query and results windows—perfect when you're projecting the screen in a large room and want everyone to see. Many thanks for such a great teaching tool! (Our materials are at https://github.com/CliffordAnderson/XQuery4Humanists.)

Back to the topic of this post, though, I noticed a slight difference between BaseX's serialization of arrays when using JSON vs. adaptive methods: with JSON, the array's items are separated by newlines, whereas with adaptive, the items are separated by spaces. This is interesting since the serialization spec notes that the adaptive method delegates the handling of the "indent" parameter to JSON. Some code to reproduce this is below.

I'm curious to know - is there a particular reason for this difference?

Thanks, Joe

serialization-test.xq

```xquery
xquery version "3.1";

declare namespace output="http://www.w3.org/2010/xslt-xquery-serialization";
let $array := ["Cheapside","London","Dean Prior","Devon"]
for $method in ("json", "adaptive")
let $serialization-parameters :=
  <output:serialization-parameters>
    <output:method>{$method}</output:method>
    <output:indent>yes</output:indent>
  </output:serialization-parameters>
return
  fn:serialize($array, $serialization-parameters)
```

serialization-test_results.txt

```txt
[
  "Cheapside",
  "London",
  "Dean Prior",
  "Devon"
]
["Cheapside", "London", "Dean Prior", "Devon"]
```
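For comparison, the same test can be sketched with the map form of the serialization parameters, which `fn:serialize` accepts in XQuery 3.1. This is only an illustrative variant and not part of the original attachment; the `indent` entry takes a boolean here rather than the string `yes`, and the output still depends on the processor.

```xquery
xquery version "3.1";

(: Illustrative variant of serialization-test.xq: the parameters are
   passed as a map instead of an output:serialization-parameters element. :)
let $array := ["Cheapside", "London", "Dean Prior", "Devon"]
for $method in ("json", "adaptive")
return
  serialize($array, map { "method": $method, "indent": true() })
```

Running both forms side by side is a quick way to check whether a processor treats the two parameter styles identically.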


Giuseppe Celano

10 Aug 2017

10:50 a.m.


Hi Joe,

I am happy to hear you are also spreading the word! XQuery has a most clean data model, and BaseX has implemented and extended the language so efficiently and elegantly.

Best, Giuseppe

Universität Leipzig
Institute of Computer Science, Digital Humanities
Augustusplatz 10
04109 Leipzig
Deutschland
E-mail: celano@informatik.uni-leipzig.de
E-mail: giuseppegacelano@gmail.com
Web site 1: http://www.dh.uni-leipzig.de/wo/team/
Web site 2: https://sites.google.com/site/giuseppegacelano/


Christian Grün

1:37 p.m.


Dear Joe,

Thanks for the kind feedback. I am glad to hear BaseX was useful in your DH 2017 workshops.

> the serialization spec notes that the adaptive method delegates the handling of the "indent" parameter to JSON.

Could you possibly point me to this rule in the spec? I remember there was a lot of discussion about the adaptive serialization method in the W3C Working Group. As it was difficult to define rules that cover the requirements of all members, the initial version differs quite a lot from the final proposal, and various details were left to the implementation (because it was assumed that the method would mostly be used for debugging). I looked up the final version of the serialization spec [1], which states in 10.1.4 that:

“The indent and suppress-indentation parameters are not directly applicable to the Adaptive output method.”

In BaseX, the parameter is indeed considered when serializing maps and arrays (and other data types as well), but there are various differences between the two serialization methods. Consider the following example (which should also work with other XQuery processors):

xquery version "3.1";
for $method in ('adaptive', 'json')
return (
  "METHOD: " || $method,
  "OUTPUT: " || (
    try {
      serialize(
        map { 'functions': [ false#0, true#0 ]},
        map { 'method': $method }
      )
    } catch * {
      $err:description
    }
  )
)

The adaptive method can be used to serialize items of any type, whereas the json method is restricted to types that can be represented in JSON.

Does this help? Christian

[1] https://www.w3.org/TR/xslt-xquery-serialization-31/#ADAPTIVE_INDENT


Joe Wicentowski

3:51 p.m.


Hi Christian,

Thanks for your reply. I agree that the spec is not entirely clear here, but my understanding of the spec was based on the interpretations advanced by Michael Kay and Liam Quin on this xquery-talk thread about the question of indentation under the adaptive method:

http://markmail.org/message/dixi7e7qq2ttde74

Joe


Christian Grün

5:04 p.m.


Hi Joe,

Thanks for the link. So I noticed that you were quoting exactly the same phrase of the spec as I did. ;)

I just checked what Saxon does: It seems to ignore the value of the indent parameter when serializing arrays with the adaptive method.

So I guess that every implementation of XQuery 3.1 serializes arrays slightly differently, and the spec is probably too fuzzy to give a more precise answer.

In general, I would have been happy if the adaptive method had been renamed to 'debug', and if another method had been added to the spec that is similar to our custom 'basex' method (which allows users to serialize all items – including maps, arrays and attributes – in a flavor that does not look like debugging output). In fact, the initial version of the 'adaptive' method was more similar to ours (for example, strings were output without quotes). It changed a lot over time, and we eventually decided to include a custom method.

Well, it’s easy to ask for new features, and much more demanding to write specifications that satisfy everyone.

Christian


Joe Wicentowski

5:40 p.m.


Hi Christian,

I actually quite like the adaptive serialization method and have made it the default in eXide. From my perspective in teaching XQuery, showing an xs:string item in quotes (and integers sans quotes) helps reinforce the concept of data types. It feels to me like a datatype-sensitive view of results, rather than a debug method. Besides string handling, though, are there other aspects of "adaptive" that you dislike compared to the default "basex" method?

Have you considered adding a preference or toggle for selecting the default serialization method used in the GUI's results? For comparison/inspiration, you might see the screenshots of the serialization method dropdown and indent checkbox I added to eXide: https://github.com/wolfgangmm/eXide/pull/168#issuecomment-307592370 - which in my mind makes eXide quite a powerful tool for quickly experimenting with different serializations of results - a particularly time-saving feature given how verbose the boilerplate is for specifying serialization methods.

Joe


Christian Grün

6:40 p.m.


Hi Joe,

> Have you considered adding a preference or toggle for selecting the default serialization method used in the GUI's results?

Sounds like an enticing idea! Something similar is embedded in our Database Export dialog (see menu items 'Database', 'Export…'). I haven’t touched it for years, and it could surely be revised as well. I will definitely think about adding something like this to our Result View [1].

> From my perspective in teaching XQuery, showing an xs:string item in quotes (and integers sans quotes) helps reinforce the concept of data types.

This is a good thought indeed.

> Besides string handling, though, are there other aspects of "adaptive" that you dislike compared to the default "basex" method?

I would say that both methods (now) serve different purposes:

• Our 'basex' method was included because BaseX is used in many different contexts, and we were looking for a single serialization method that can be used for as many use cases as possible at the same time. If BaseX is used on the command line, it can be convenient if the textual output (usually XML, strings, numbers) can be passed on directly to other commands or saved in text files. If the GUI is used, text from the result view can be copied and pasted to other tools (such as CSV output, which can be pasted in Excel, etc.).

• The 'adaptive' method simplifies the reuse of results in other XQuery expressions. I agree it also helps users to understand the differences between data types. I find it a bit confusing, however, that some items are output with a constructor function, whereas others are simply output as strings. Some examples:

1, xs:double(1), 'a"b', xs:anyURI('a"b'), xs:QName('xml:x'), <x a='b'/>/@a

…will be serialized as…

1
1
"a""b"
xs:anyURI("a""b")
Q{http://www.w3.org/XML/1998/namespace}x
a="b"

It would probably have been more consistent to create output that can always be reused, and that always contains the datatype:

xs:integer(1)
xs:double(1)
xs:string("a""b")
xs:anyURI("a""b")
xs:QName("xml:x")
attribute a { "b" }

Well, maybe the type could have been omitted for xs:integer and xs:string, but as constructors are added for many types, I believe that any ambiguities should have been avoided.

There are surely many things that would need to be considered (for example, a namespace of a prefix might not be declared; anonymous functions could only be re-used if the full function body was serialized as well; etc).
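To try the comparison above on a given processor, the example items can be passed to `fn:serialize` with the adaptive method, as in the following minimal sketch. It assumes nothing beyond standard XQuery 3.1; the exact text produced (for example, how the xs:double is rendered) is left to each implementation, so no expected output is given here.

```xquery
xquery version "3.1";

(: Serialize the example items with the adaptive method to see which
   ones are output with a constructor and which as plain literals. :)
serialize(
  (
    1,
    xs:double(1),
    'a"b',
    xs:anyURI('a"b'),
    xs:QName('xml:x'),
    <x a='b'/>/@a
  ),
  map { 'method': 'adaptive' }
)
```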

Just my two cents, Christian

[1] https://github.com/BaseXdb/basex/issues/1484

Christian Grün

11:29 p.m.


Hi there,

To begin with: I forgot to mention that you can change the serialization method by launching the command "set serializer method=adaptive" in the command input text field on top of the BaseX window.

I agree this is not very comfortable, so I have now added a new interaction component for changing all serialization parameters in the GUI. I have decided to move the input components to the Preferences dialog (Ctrl-Shift-P, Visualization panel), as I’d like to keep the main interface clean. If more people ask for it, I might add a dropdown menu for the serialization method on top of the Result View, let’s see.

The new component for adjusting the serialization parameters is now also available in the Export Database dialog. A new stable snapshot is available [1], and the next minor release will be available around the end of August.

Looking forward to your feedback, Christian

[1] http://files.basex.org/releases/latest/


Joe Wicentowski

11 Aug 2017

2:19 p.m.


Hi Christian,

I just gave it a try, and the new serialization preferences work like a charm.

It's pretty amazing to see a request go from idea to implementation so quickly - thank you! I think it's a nice touch that serialization options specified in a query are respected, so the preferences only affect the defaults that apply to a query - they don't override anything in a query.
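As a rough illustration of what "serialization options specified in a query" can look like, here is a minimal sketch that declares the method and indentation in the query prolog using standard XQuery 3.1 option declarations. The chosen values are only an example; per the behavior described above, such declarations would take precedence over the GUI defaults.

```xquery
xquery version "3.1";

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

(: Serialization options declared in the query itself. :)
declare option output:method "json";
declare option output:indent "yes";

["Cheapside", "London", "Dean Prior", "Devon"]
```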

Your list of ambiguities in the adaptive method is certainly valid. The quote escaping does bother me, so it does come at a cost, I guess. I do find myself quickly switching away to JSON or text serialization to avoid some of these quirks when I don't like the look of adaptive.

And, to close the loop, I agree that the spec remains ambiguous in the area of indentation of maps and arrays, and while BaseX in no way violates the spec here, I personally would advocate for consistency in the way maps and arrays are indented and serialized across the various map- and array-aware methods. But this is a terribly minor point, and folks who want uniformity can always use another tool to pretty-print their data.

Finally, I just remembered an observation that came up during the class - the BaseX GUI editor's syntax highlighting sometimes breaks down in odd places, such as the middle of a function name, e.g., in `fn:format-date()`, the `fn` and `-date` are colored black, but `format` is colored blue. I've attached a screenshot showing the phenomenon. Again, this isn't a major issue, and syntax highlighting probably can't always be perfect, but I thought I'd mention it since it came up.

Thanks again for all of your kind assistance! Joe

On Thu, Aug 10, 2017 at 11:29 PM, Christian Grün christian.gruen@gmail.com wrote:

...

Hi there,

To begin with: I forgot to mention that you can change the serialization method by launching the command "set serializer method=adaptive" in the command input text field on top of the BaseX window.

I agree this is not very comfortable, so I have now added a new interaction component for changing all serialization parameters in the GUI. I have decided to move the input components to the Preferences dialog (Ctrl-Shift-P, Visualization panel), as I’d like to keep the main interface clean. If more people ask for it, I might add a dropdown menu for the serialization method on top of the Result View, let’s see.

The new component for adjusting the serialization parameters is also available now in the Export Database dialog. A new stable snapshot is available [1], and the next minor release will be available around end of August.

Looking forward to your feedback, Christian

[1] http://files.basex.org/releases/latest/

On Fri, Aug 11, 2017 at 12:40 AM, Christian Grün christian.gruen@gmail.com wrote:

...
Hi Joe,

...
Have you considered adding a preference or toggle for selecting the

default

...
...
serialization method used in the GUI's results?

Sounds like an enticing idea! Something similar is embedded in our Database Export dialog (see menü items 'Database', 'Export…'). I haven’t touched it for years, and it could surely be revised as well. I will definitely think about adding something like this in to our Result View [1].

...
From my perspective in teaching XQuery, showing an xs:string item in quotes (and integers sans quotes) helps reinforce the concept of data types.

This is a good thought indeed.

...
Besides string handling, though, are there other aspects of "adaptive" that you dislike compared to the default "basex" method?

I would say that both methods (now) serve different purposes:

• Our 'basex' method was included because BaseX is used in many different contexts, and we were looking for a single serialization method that covers as many use cases as possible at the same time. If BaseX is used on the command line, it can be convenient if the textual output (usually XML, strings, numbers) can be passed directly to other commands or saved in text files. If the GUI is used, text from the result view can be copied and pasted into other tools (such as CSV output, which can be pasted into Excel, etc.).

• The 'adaptive' method simplifies the recycling of results in other XQuery expressions. I agree it also helps users to understand the differences between data types. I find it a bit confusing, however, that some items will be output with a constructor function, whereas others will simply be output as strings. Some examples:

1, xs:double(1), 'a"b', xs:anyURI('a"b'), xs:QName('xml:x'), <x a='b'/>/@a

…will be serialized as…

1 1 "a""b" xs:anyURI("a""b") Q{http://www.w3.org/XML/1998/namespace}x a="b"

It would probably have been more consistent to create output that can always be reused, and that always contains the datatype:

xs:integer(1) xs:double(1) xs:string("a""b") xs:anyURI("a""b") xs:QName("xml:x"), attribute a { "b" }

Well, maybe the type could have been omitted for xs:integer and xs:string, but as constructors are added for many types, I believe that any ambiguities should have been avoided.

There are surely many things that would need to be considered (for example, a namespace of a prefix might not be declared; anonymous functions could only be re-used if the full function body was serialized as well; etc).

Just my two cents, Christian

[1] https://github.com/BaseXdb/basex/issues/1484
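
To try the adaptive examples discussed above, the same items can be passed one by one to `fn:serialize` with a parameter map. This is a hedged sketch: it assumes an XQuery 3.1 processor that accepts a map as the second argument, and the exact output may vary between processors and versions, so none is shown here.

```xquery
xquery version "3.1";

(: The sample items from the message above. :)
let $items := (
  1,
  xs:double(1),
  'a"b',
  xs:anyURI('a"b'),
  xs:QName('xml:x'),
  <x a='b'/>/@a
)
(: Serialize each item separately so the results are easy to compare. :)
for $item in $items
return fn:serialize($item, map { 'method': 'adaptive' })
```

Each returned string shows how one item is rendered, which makes it easy to see which types get a constructor function and which do not.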

Christian Grün

2:42 p.m.


Hi Joe,

...

Thanks again for all of your kind assistance!

…and thanks for your helpful feedback.

...

Finally, I just remembered an observation that came up during the class - the BaseX GUI editor's syntax highlighting sometimes breaks down in odd places, such as the middle of a function name, e.g., in `fn:format-date()`, the `fn` and `-date` are colored black, but `format` is colored blue.

Oh yes, that’s true. Our custom renderer was optimized for performance. It’s possible to open and highlight pretty large XML documents, but the highlighting is very basic. The function 'format-date' is split into three tokens (format, -, date), and as 'format' is detected as a keyword, it is the only token that gets highlighted.

I’ve just decided to add all tokens that can be created from function names to our highlighter, and all pre-declared namespace prefixes. The rendered queries will even be “bluer” than before, but the overall appearance will hopefully be more consistent. Yet another snapshot is online [1].

...

I thought I'd mention it since it came up.

Always appreciated! Christian

[1] http://files.basex.org/releases/latest/
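
The token splitting described above can be imitated in XQuery itself. This is only a rough sketch of the idea, not the actual BaseX highlighter: the keyword list is hypothetical, and the regular expression is an assumption about how a simple tokenizer might split on non-letters.

```xquery
xquery version "3.1";

(: Hypothetical keyword list; the real highlighter's list is not shown in this thread. :)
declare variable $keywords := ('format', 'for', 'let', 'return');

(: Split the name into runs of letters (matches) and everything else (non-matches). :)
for $part in fn:analyze-string('format-date', '[a-z]+')/*
let $text := fn:string($part)
return
  if ($part/self::fn:match and $text = $keywords)
  then $text || ' -> keyword (highlighted)'
  else $text || ' -> plain'
```

With a list like this, only `format` is treated as a keyword, which mirrors the uneven coloring Joe described. Adding the remaining function-name tokens to the list, as Christian proposes, would color all parts alike.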

Joe Wicentowski

2:51 p.m.


Hi Christian

...

Our custom renderer was optimized for performance. It’s possible to open and highlight pretty large XML documents, but the highlighting is very basic.

Ah, I see. Very interesting.

The rendered queries will even be “bluer” than before, but the overall appearance will hopefully be more consistent.

Having tried out the snapshot, I can confirm the queries are bluer!

Best, Joe

Christian Grün

13 Aug 13 Aug

5:48 a.m.


...

Having tried out the snapshot, I can confirm the queries are bluer!

;) Talking about colors, I have slightly revised our color schemes (for XQuery, XML, JSON and JavaScript files) and made them more subtle.

Real input parsing is somewhere on our agenda as well. I noticed that the highlighting in eXide works excellently; I doubt we can compete with that in the near future.


basex-talk@mailman.uni-konstanz.de
