Dear BaseX people,
I have a feature request: a function which does the same as file:list, but returns the full paths. (In the explanation, I call it "file:list2".)
Rationale:
(a) It is the full paths that we need in order to parse the files (doc(), json:doc(), csv:doc(), html:doc()).
(b) The combination of file listing and parsing in a single expression is of matchless elegance and expressiveness. Just think about how we can extend this into a single list+parse+navigate+report expression, like:
    file:list2($folder, true(), $fname) ! doc(.) ! //returnCode[. ne 0]/concat(base-uri(.), ': ', .)
(Wow! Is there any language or tool competing with this compactness and readability?)
Currently, unless I am overlooking something, I have to do this:
    file:list($dir, false(), $fname) ! concat($dir, '/', .) ! doc(.) => count()
I emphasize the psychological aspect. Every time I have the pleasure of giving a workshop about XQuery, I show off how elegant it is to parse a whole file system tree into a forest of nodes in a single expression. And I am always embarrassed about this clumsy, fumbling assembly of a path. It's as if Mozart had fallen asleep while writing his XQuery sonata and knocked over the inkwell, and now there is this splash smeared across the score. You would be my heroes if you could spare us that.
Kind regards,
Hans-Jürgen
On Fri, 2022-02-11 at 18:40 +0000, Hans-Juergen Rennau wrote:
> (a) It is the full paths what we need in order to parse the files (doc(), json:doc(), csv:doc(), html:doc()).
Why?
A relative path should work fine; if necessary you can use resolve-uri() to turn a relative path into a full URI.
> (b) The combination of file listing and parsing in a single expression is of matchless elegance and expressiveness
> Currently, if I don't overlook something, I have to do this: file:list($dir, false(), $fname) ! concat($dir, '/', .) ! doc(.) => count()
Probably, you should use resolve-uri($dir, base-uri(.)) although i suppose you also need to watch for filenames containing colons...
I’ve run into the same thing in my Validation Dashboard: I need to go from files within a directory to their full paths because I’m translating file system paths into repo-local paths and whatnot.
As Hans-Juergen shows, it’s not hard to combine the path you just passed to file:list() with the name, but it’s annoying to have to do it.
Cheers,
E.
Well, Liam, as I said - it can be done; my request is for supporting elegance. Here's the litmus test. How would you rewrite this little function ...
    declare function f:docs($dir, $fname, $deep) as document-node()* {
      file:list($dir, $deep, $fname)
      ! concat(file:resolve-path($dir), '/', .)
      ! (try { doc(.) } catch * {})
    };
Such a function is a great thing - imagine I want to know the XSLT versions used in a project, so all I have to do is ...
    f:docs("foo-project", "*.xsl", true())/*/@version => distinct-values() => sort()
The goal is to allow the user to specify $dir relative to the current working dir (not the static base URI of the query, as doc() expects ...). Using resolve-uri() is even more cumbersome, because you would require the absolute URI corresponding to $dir, which you can only get by first retrieving the full path of $dir (using file:resolve-path()) and then transforming the full path into a URI, as the result of file:resolve-path() is not accepted by resolve-uri(). So the alternative to this "concat fiddling" would be an even uglier line:
    file:list($dir, false(), $fname) ! resolve-uri(., file:path-to-uri(file:resolve-path($dir))) ! doc(.)
Then, *please* let us have this instead:
    file:list2($dir, false(), $fname) ! doc(.)
Kind regards,
Hans-Jürgen
On Fri, Feb 11, 2022 at 08:12:22PM +0000, Hans-Juergen Rennau scripsit:
> Well, Liam, as I said - it can be done, my request is for supporting elegance. Here's the litmus test. How would you rewrite this little function ...
> declare function f:docs($dir, $fname, $deep) as document-node()* { file:list($dir, true(), $fname) ! concat(file:resolve-path($dir), '/', .) ! (try {doc(.)} catch * {}) };
It's entirely likely I'm missing something here, but isn't the intent of this pretty much
db:create("DB", file:list($dir, true(), $fname), ())
where that third param to db:create lets you remap the path if required?
At which point the use case becomes db:open('DB')/*/@version => distinct-values() => sort()
In general, I don't want to have to think about files; it's great that the file module is there for those occasions where this cannot be avoided, but in general I want the XML loaded into a database (where someone with much better test cases has written the functions!) much more than I want to think about file system details.
What am I missing?
Hi Graydon, that's interesting - my perspective is 100% different. I've been working with XQuery for 15+ years, day in, day out, and I work with files 99.9% of the time and with databases 0.1%. (And that's a cautious estimate.) I work with constantly changing sets of documents, and I simply cannot afford the fuss of creating and deleting databases all the time. (It would be a downright waste of time - BaseX is so fast when dealing with the file system that the difference would hardly be noticeable.)
And how about tool writing? Would you ask the users to prepare tool use by loading files into a database? Or would you want to clutter up tools with the creation and deletion of auxiliary databases nobody needs, only in order to answer a question or report about what is there in the file system?
This shows that there are different perspectives.
Hans-Jürgen
On Fri, 2022-02-11 at 20:12 +0000, Hans-Juergen Rennau wrote:
> Well, Liam, as I said - it can be done, my request is for supporting elegance.
Fair. So, file:list-with-full-path() or something maybe.
To some extent i expect the interface to the operating system to be somewhat messy, i suppose.
Liam
Unnecessary and non-standard ( file: functions are not XQuery standard, but are EXPath functions).
Use: file:children( file:resolve-path( $folder ) )
If you give file:children an absolute path, that will be included in results, so just resolve the relative folder path first.
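A minimal sketch of this suggestion, assuming BaseX (where the file: prefix is predeclared) and a hypothetical folder named 'data' that holds XML files. The suffix filter is an addition for illustration only and is not part of the suggestion itself:

```xquery
(: Hypothetical folder name, resolved against the current working directory. :)
let $folder := 'data'
(: file:children returns full paths when it is given an absolute path. :)
for $path in file:children(file:resolve-path($folder))
(: Keep regular files whose path ends in .xml; subdirectories are skipped. :)
where file:is-file($path) and ends-with($path, '.xml')
return doc($path)
```

Note that doc() raises an error for a file that is not well-formed, so a try/catch around it may still be wanted.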
— Steve M.
Excellent, I had overlooked this possibility. However, file:children and file:descendants (which are not part of the EXPath standard) currently do not support a file name pattern, nor do they support the deep flag. So the addition of these two parameters would be a considerable improvement, but it would still not offer the solution I wish for, because I want (what I regard as) a key operation of XQuery (aggregated document access) to be represented in the most straightforward way.
Concerning the insistence on keeping to the XQuery standard, you must then be indifferent to the following BaseX modules:
Archive Module https://docs.basex.org/wiki/Archive_Module
Binary Module https://docs.basex.org/wiki/Binary_Module
Conversion Module https://docs.basex.org/wiki/Conversion_Module
Cryptographic Module https://docs.basex.org/wiki/Cryptographic_Module
CSV Module https://docs.basex.org/wiki/CSV_Module
Database Module https://docs.basex.org/wiki/Database_Module
Geo Module https://docs.basex.org/wiki/Geo_Module
Hashing Module https://docs.basex.org/wiki/Hashing_Module
HTML Module https://docs.basex.org/wiki/HTML_Module
Inspection Module https://docs.basex.org/wiki/Inspection_Module
JSON Module https://docs.basex.org/wiki/JSON_Module
Lazy Module https://docs.basex.org/wiki/Lazy_Module
Process Module https://docs.basex.org/wiki/Process_Module
Profiling Module https://docs.basex.org/wiki/Profiling_Module
Random Module https://docs.basex.org/wiki/Random_Module
Request Module https://docs.basex.org/wiki/Request_Module
SQL Module https://docs.basex.org/wiki/SQL_Module
Strings Module https://docs.basex.org/wiki/Strings_Module
Validation Module https://docs.basex.org/wiki/Validation_Module
Web Module https://docs.basex.org/wiki/Web_Module
XQuery Module https://docs.basex.org/wiki/XQuery_Module
> Then, *please* let us have this instead: file:list2($dir, false(), $fname) ! doc(.)
This is what I would write and recommend:
    file:descendants($dir)[file:name(.) = $fname] ! doc(.)
> Excellent, I had overlooked this possibility, however - currently file:children and file:descendants (which are not part of the EXPath standard) do not support a file name pattern, nor do they support the deep flag.
The two functions are candidates for the next version of the EXPath spec. The deep flag is not required, as we have two functions for that. In hindsight, I would have discarded file:list, as the standard XQuery string functions are much more flexible and powerful, and as the "glob syntax" is used in no other module.
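As a rough sketch of the point about string functions (not taken from the thread; the directory name 'foo-project' and the regular expression are assumptions for illustration, and BaseX is assumed as the processor), a name filter can be expressed with matches() instead of a glob pattern:

```xquery
(: Hypothetical project folder, relative to the current working directory. :)
let $dir := 'foo-project'
return distinct-values(
  for $path in file:descendants($dir)
  (: A regular expression on the file name takes the place of the glob "*.xsl". :)
  where file:is-file($path) and matches(file:name($path), '\.xsl$')
  return doc($path)/*/@version
) => sort()
```

This reports the distinct version attributes of the stylesheets found below the folder, the use case described earlier in the thread; files that are not well-formed would still make doc() fail unless it is wrapped in try/catch.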
basex-talk@mailman.uni-konstanz.de