Hi,
I have noticed different speeds when running the following functions (from slowest to fastest):
parse-json(unparsed-text('example.txt'))
json-doc("example.txt")
parse-json(file:read-text('example.txt'))
similarly for documents on the web:
json-doc('http://example.com/text')
parse-json(fetch:text('http://example.com/text'))
Does this make sense to you? Is there any recommendation to follow? Despite the runtime speed difference, I admit I love writing one single function (json-doc) to get map conversion. Everything is so immediate :)
Best, Giuseppe
Universität Leipzig
Institute of Computer Science, Digital Humanities
Augustusplatz 10
04109 Leipzig
Deutschland
E-mail: celano@informatik.uni-leipzig.de
E-mail: giuseppegacelano@gmail.com
Web site 1: http://www.dh.uni-leipzig.de/wo/team/
Web site 2: https://sites.google.com/site/giuseppegacelano/
Hi Giuseppe,
The semantics of these functions differ slightly, but I would expect the differences in performance to be rather marginal. Just in case: have you done any in-depth comparisons that I could have a look at? Did you work with large input files, or did you run the queries repeatedly and look at the average times?
Cheers, Christian
Hi Christian,
The latter option. I just opened a file and ran the same query repeatedly. It is not an in-depth comparison at all, but the times shown in the Query Info were clearly different (even if only by a few ms).
Best, Giuseppe
Hi Giuseppe,
I did some tests on the command line:
basex -v -z -r100 "parse-json(unparsed-text('example.json'))"
basex -v -z -r100 "parse-json(file:read-text('example.json'))"
basex -v -z -r100 "json-doc('example.json')"
I tested the calls with a small and a large file (10 KB, 1.5 MB), and evaluation times were very similar, so I guess I need some more input to reproduce your results.
Best, Christian
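The three calls above can be wrapped in a small loop so that every variant runs with identical flags. This is only a sketch: FILE and RUNS are placeholder values, the helper names queries and run_all are made up for illustration, and the script simply skips the run when no basex executable is found on the PATH.

```shell
#!/bin/sh
# Sketch: run each query variant with the same flags so timings are comparable.
# FILE and RUNS are placeholders; adjust them to your own setup.
FILE="example.json"
RUNS=100

# Print the three XQuery expressions from the thread, one per line.
queries() {
  printf '%s\n' \
    "parse-json(unparsed-text('$FILE'))" \
    "parse-json(file:read-text('$FILE'))" \
    "json-doc('$FILE')"
}

# Run every query in turn; do nothing if basex is not installed.
run_all() {
  if ! command -v basex >/dev/null 2>&1; then
    echo "basex not found on PATH, skipping" >&2
    return 0
  fi
  queries | while IFS= read -r q; do
    echo "== $q"
    # Same flags as in the message above: -v, -z and -r<runs>.
    # stdin is redirected so basex cannot consume the remaining queries.
    basex -v -z "-r$RUNS" "$q" </dev/null
  done
}

run_all
```

Running all variants back to back from one script keeps the conditions (file, number of runs, JVM settings) the same for each call.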
Hi Christian,
I confirm that the commands you used always output very similar timings, even when I repeat them manually 3 or 4 times, i.e., running them without "-r". However, when I manually run each of the queries a few times from the GUI, I get differences (total time from the Query Info Panel, for a 333 KB file):
parse-json(file:read-text('text.txt')) = around 40 ms
json-doc('text.txt') = around 80 ms
parse-json(unparsed-text('text.txt')) = around 120 ms
Looking at the values reported in the Query Info Panel, the "compiling time" seems to be responsible for that.
Best, Giuseppe
Hi Giuseppe,
Please note that the evaluation times displayed in the GUI are not really suitable for serious performance tests. Various side effects influence the timings of single query executions.
However, you can enter the "SET RUNS ..." command in the command input panel. If the chosen number of runs is big enough (e.g. 1000), the returned runtimes get more reliable.
If you want more fine-grained timing output on the command line, you can use -V.
Hope this helps, Christian
PS: If it turns out that you get similar performance results, feel free to pass your JSON files on to me.
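As a rough sketch of the two suggestions above, the RUNS option and the -V flag can also be combined in a single command-line call. The message itself describes entering SET RUNS in the GUI's command input panel; passing it through -c with semicolon-separated commands is an assumption here, and QUERY and RUNS are placeholder values.

```shell
#!/bin/sh
# Sketch: averaged timings via the RUNS option, with detailed output from -V.
# QUERY and RUNS are placeholders; adjust them to your own setup.
QUERY="json-doc('example.json')"
RUNS=1000

# BaseX command string: set the number of runs, then evaluate the query.
commands="SET RUNS $RUNS; XQUERY $QUERY"
echo "$commands"

# Only call basex if it is actually installed.
if command -v basex >/dev/null 2>&1; then
  # Assumption: -c accepts several commands separated by semicolons.
  basex -V -c "$commands" </dev/null
  # The same measurement with the -r and -z flags used earlier in the thread.
  basex -V -z "-r$RUNS" "$QUERY" </dev/null
fi
```

With a large enough number of runs, one-off effects of a single execution should matter less in the reported averages.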
Hi Giuseppe,
As you indicated in your last e-mail (in private), it was the debugging output that slowed down the execution of the unparsed-text(...) function (the full text string was added to both the query optimization info and the query plan). I have introduced a size limit [1].
Thanks for sharing your observations, Christian
[1] http://files.basex.org/releases/latest/
basex-talk@mailman.uni-konstanz.de