Dear BaseX people,
functions file:descendants and file:children return relative paths, if input parameter $dir is a relative folder path. I regard this as a bug - relative folder paths should be resolved against the current working directory, so that a relative path and the corresponding absolute path are equivalent parameters. The functions should always return absolute paths, as the documentation promises.
After all, a key use case is to find files to access, and the accessor functions (doc(), json:doc(), unparsed-text() etc.) do *not* resolve relative paths against the current working directory.
Kind regards, Hans-Jürgen
On Fri, Sep 23, 2022 at 08:01:12PM +0200, Hans-Jürgen Rennau scripsit:
functions file:descendants and file:children return relative paths, if input parameter $dir is a relative folder path. I regard this as a bug - relative folder paths should be resolved against the current working directory, so that a relative path and the corresponding absolute path are equivalent parameters. The functions should always return absolute paths, as the documentation promises.
The ability to get a file listing by relative path makes it simple to transfer a directory structure from one location to another location, mirroring in the result the structure of the source. This is valued and useful and I've always assumed it was in there on purpose. I would certainly wish it to be kept in the behaviour of the file functions.
Yes, the core accessor functions use absolute paths, but there are file functions for constructing those from a base and a relative part; this facilitates the "transfer structure" use case and is generally helpful for the broader use case of "must construct a path".
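A minimal sketch of the "transfer structure" use case, assuming the BaseX File Module. The function name local:mirror and the directories data/in/ and data/out/ are hypothetical, and both paths are assumed to end with a slash. The sketch relies on file:list returning paths relative to the listed directory, so the same relative path can be appended to the source and to the target:

```xquery
(: Sketch: copy the tree below $source to $target, keeping its structure.
   Both arguments are assumed to end with a directory separator. :)
declare function local:mirror(
  $source as xs:string,
  $target as xs:string
) as empty-sequence() {
  (: file:list with true() walks the tree and returns relative paths :)
  for $rel in file:list($source, true())
  let $from := $source || $rel
  let $to := $target || $rel
  return if (file:is-dir($from))
    then file:create-dir($to)
    else (
      (: make sure the parent directory exists before copying the file :)
      file:create-dir(file:parent($to)),
      file:copy($from, $to)
    )
};

(: hypothetical directories, relative to the current working directory :)
local:mirror('data/in/', 'data/out/')
```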
Maybe this is a misunderstanding, Graydon: the ability to use relative paths *as parameter value*, and to have them automatically resolved against the current working directory, is certainly essential. But the *result* should be independent of whether the parameter was supplied as a relative or an absolute path. It is simply a bug. Use the file:list() function if you want to get relative paths.
Kind regards, Hans-Jürgen
On Sat, Sep 24, 2022 at 06:22:41PM +0000, Hans-Juergen Rennau scripsit:
Maybe this is a misunderstanding, Graydon: the ability to use relative paths *as parameter value*, and to have it automatically resolved against the current working directory, is certainly essential. But the *result* should be independent of whether the parameter was supplied as relative or absolute path. It is simply a bug. Use the file:list() function if you want to get relative paths.
It's entirely possible I'm miscomprehending something.
In the 10.1 GUI, file:base-dir() gives a result. Per https://docs.basex.org/wiki/File_Module#file:list that result is the current working directory. It looks like that value is the last directory I saved a file in from the GUI.
concat(file:base-dir(),'../xslt') => file:descendants() works. But of course that's an absolute path.
file:resolve-path('../xslt',file:base-dir()) => file:children() works. That's what I'd think of as providing a relative path.
file:descendants('../xslt') does not work: 'Resource "../xslt" not found' is no directory.
I wouldn't expect it to work, but that might be an error of expectation rather than properly reflecting what should happen.
I think I'm misunderstanding what you mean by providing the parameter value to file:children or file:descendants as a relative path. Could you provide an example of the bug?
-- Graydon
Hi, Graydon, you ask what I mean by "Providing the parameter value to file:children as a relative path".
With "the parameter value" I mean the value passed to file:children() or file:descendants(), and it's clear what is meant by a relative path (a path not starting at the root of the file system). For this reason, your statement: "file:resolve-path('../xslt',file:base-dir()) => file:children() works. That's what I'd think of as providing a relative path." confuses me, because you do provide an absolute path, as the argument is an invocation of file:resolve-path().
As you bring function file:base-dir() into play, let me point out one remarkable fact: whereas all (?) functions of the file module resolve relative paths against the current working directory, this function does not return the current working directory (for that we have file:current-dir()), but the "parent directory of the static base URI", in other words: the directory containing the XQuery module containing the code.
Therefore note that if file:descendants('../xslt') yields an error, it means that the current working directory does not have a sibling xslt directory - not that the directory containing the XQuery module does not have that sibling.
Finally, let me summarize my point:
(1) The functions file:children() and file:descendants() take a single parameter which is a path which may be specified as absolute path or relative path to be resolved against the current working directory.
(2) Irrespective of how the argument was supplied, the functions must always return absolute paths.
(3) If you want a file listing to return relative paths, use function file:list().
Kind regards, Hans-Jürgen
On Sun, Sep 25, 2022 at 10:04:51AM +0000, Hans-Juergen Rennau scripsit:
Hi, Graydon,
Hello!
you ask what I mean by "Providing the parameter value to file:children as a relative path".
With "the parameter value" I mean the value passed to file:children() or file:descendants(), and it's clear what is meant by a relative path (a path not starting at the root of the file system). For this reason, your statement: "file:resolve-path('../xslt',file:base-dir()) => file:children() works. That's what I'd think of as providing a relative path." confuses me, because you do provide an absolute path, as the argument is an invocation of file:resolve-path().
I do provide an absolute path, but via a relative path that has to be resolved against a base. I think that's how relative paths work because I've never been able to get a true relative path to work in practice, and have put this down to miscomprehending something.
I now think that I was miscomprehending what the docs say about what file:base-dir() returns; I took that to mean the returned value would be the location of the BaseX executable, not the location of the XQuery module. (thank you! the below was usefully clear about this.)
As you bring function file:base-dir() into play, let me point out one remarkable fact: whereas all (?) functions of the file module resolve relative paths against the current working directory, this function does not return the current working directory (for that we have file:current-dir()), but the "parent directory of the static base URI", in other words: the directory containing the XQuery module containing the code.
If I run: (file:base-dir(),file:current-dir(),file:resolve-path('..'),file:children('..'))
in the GUI, without saving it, and having closed everything else, I get:
/home/graydon/xcential/git/Oregon-DPMS/modelling/amending/ixml/xquery/
/home/graydon/
/home/
../graydon/
That first path is the last place I saved a query from the GUI.
If I close that tab, close the GUI, restart the GUI, and paste in that same bit of XPath, I get the same result. So I think the value for file:base-dir persists. (I'm a bit surprised by file:current-dir(), too, but will put that down to starting BaseX from a panel shortcut.)
It's not immediately clear to me that the file:base-dir value should persist. It seems like this could produce a lot of surprise.
Finally, let me summarize my point: (1) The functions file:children() and file:descendants() take a single parameter which is a path which may be specified as absolute path or relative path to be resolved against the current working directory. (2) Irrespective of how the argument was supplied, the functions must always return absolute paths. (3) If you want a file listing to return relative paths, use function file:list().
No argument from me on these points.
Though I will note that my recollection is that they never have.
"It's not immediately clear to me that the file:base-dir value should persist."
I agree. The question at hand is how the "static base URI" is defined:
(1) So far, so clear: inside an XQuery module, it is the base URI of that module.
(2) Apparent rule: when called on the command line, it is the current working directory.
(3) Apparent rule: when called in the GUI, it is - I don't know.
I think that in cases (2) and (3) it has an implementation-defined meaning, which should be documented.
Kind regards, Hans-Jürgen
Thanks for your discussion. I’ll give some comments to…
1. Returned paths of file:descendants and file:children
The paths returned by these functions start with the directory argument specified by the user. The argument may point to a relative or absolute path. Code can often be simplified by replacing file:list with file:children or file:descendants:
let $root := 'a/b/c/' for $path in file:list($root) return file:read-text($root || $path)
for $path in file:children('a/b/c') return file:read-text($path)
I have revised the wording in the documentation and added some examples; I hope it is now clearer what the functions are supposed to do.
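The difference between the three call forms discussed in this thread can be sketched as follows. The directory a/b/c is hypothetical, and the comments describe the behavior as reported in this discussion rather than a guarantee for every BaseX version or operating system:

```xquery
(: 'a/b/c' is a hypothetical directory below the current working directory. :)
let $dir := 'a/b/c'
return (
  (: file and directory names relative to $dir :)
  file:list($dir),
  (: paths that start with the argument as supplied, here 'a/b/c' :)
  file:children($dir),
  (: absolute paths, because the argument was made absolute first :)
  file:children(file:resolve-path($dir))
)
```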
2. The Base vs. Current Working Directory
The concept of the »current working directory« goes back to the early times of the specification of the File Module (which was defined together with guys from Zorba/28msec). There were various reasons why we did not resort to the base URI as default location:
• The static base URI may be empty, and it might point to a non-local resource
• If a function of the File Module is located in a library module, we regarded the behavior as counterintuitive if a file function used this module’s URI to resolve local files.
From today’s perspective, I believe it would have been easier to get rid of the »current working directory« and to exclusively work with the static base URI. In particular, it would have been easier to exchange functions from the File Module and other functions.
In more complex code, it’s usually a good choice to rewrite relative input paths to absolute native paths at the very beginning. The file:path-to-native function can be used for that if the addressed path exists. Beyond what file:resolve-path does, it’s expected to resolve symbolic links and return filenames in their actual case on Windows systems (e.g., “C:\users\” might be rewritten to “C:\Users\”). We didn’t specify the exact behavior in the spec, as it may depend on the operating system. For example, UNIX-based systems are case-sensitive, so there may be two files “a” and “A” in the same directory. And so on.
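One possible way to normalize an input path once, up front, is sketched below. The external variable $input and the error name local:no-dir are hypothetical; the fallback to file:resolve-path reflects the point that file:path-to-native only works for existing paths:

```xquery
(: Hypothetical input: a path supplied by the user, possibly relative
   to the current working directory. :)
declare variable $input as xs:string external := '.';

(: Normalize once. file:path-to-native needs an existing path,
   so fall back to file:resolve-path otherwise. :)
declare variable $dir as xs:string :=
  if (file:exists($input)) then file:path-to-native($input)
  else file:resolve-path($input);

(: From here on, only the absolute $dir is passed around. :)
if (file:is-dir($dir)) then file:children($dir)
else error(xs:QName('local:no-dir'), 'Not a directory: ' || $dir)
```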
To answer Hans-Jürgen’s question,
(3) Apparent rule: when called in the GUI, it is - I don't know.
The static base URI is set to a non-existing resource (named "file") in the directory in which an editor file was stored most recently. In earlier versions of BaseX, it was undefined as long as the edited file was not stored. As you can guess, it’s not advisable to use file:base-dir as long as you don’t know where your query will be stored.
Thank you for this clarification, Christian.
In my opinion, it is a pity that the functions file:children and file:descendants do not return absolute paths - I see only disadvantages. When the argument is a relative path, you get back relative paths which only make sense in the context of the current working directory. If you pass them on to a file:* function, that is fine (as your example shows); if you pass them on to other functions, especially the key functions returning node trees - doc(), json:doc(), csv:doc() - they do not work and first have to be resolved to absolute paths, e.g. using file:path-to-native(). The function result thus must be used differently depending on which function will consume it.
Another shortcoming: if the file paths are wanted as paths, e.g. written into a report, relative paths to be resolved against the current working directory will probably be useless, so again they must first be resolved before being used. Last and least, the paths are often ugly, like "../xslt".
Paths relative to the current working directory are very important as user input, hence their consistent use in the file module as input parameter makes much sense, especially when writing tools. But they are very rarely (if ever) to be preferred over absolute paths in other situations. (This is in sharp contrast to relative paths to be resolved against the containing resource's base URI.) I cannot imagine a situation where I would appreciate getting a CWD-based relative path from a function, rather than an absolute path.
However, there is probably a rationale, or an important use case, behind the current behaviour which I overlooked.
Kind regards, Hans-Jürgen
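The extra resolution step described above might look like the following sketch. The function name local:report, the element name file and the directory a/b/c are hypothetical; file:path-to-native is used as suggested in the message, and handing a native path directly to doc() is assumed to be accepted by BaseX:

```xquery
(: Sketch: list the XML files below $dir with their absolute paths
   and the names of their root elements. :)
declare function local:report($dir as xs:string) as element(file)* {
  for $path in file:descendants($dir)
  where file:is-file($path) and ends-with($path, '.xml')
  (: resolve before handing the path to a non-file function or a report :)
  let $abs := file:path-to-native($path)
  return <file path="{ $abs }" root="{ name(doc($abs)/*) }"/>
};

(: hypothetical directory, relative to the current working directory :)
local:report('a/b/c')
```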
Hi Hans-Jürgen,
[…] if passing them on to other functions, especially the key functions returning node trees - doc(), json:doc(), csv:doc() - they do not work and have first to be resolved to absolute paths, e.g. using file:path-to-native().
Back then, we even thought about disallowing any relative paths to most of the functions of the File Module, but that seemed too rigorous to us.
If there’s no need for relative paths in your applications, maybe you can go back one step in your code and check why you pass on relative paths to file:children and file:descendants at all. I believe it is better to normalize all paths before calling any utility function in BaseX that works with file paths. That way, you can e.g. decide if you want to use file:path-to-native or file:resolve-path (each of them has advantages and drawbacks that should be considered individually).
By the way, if you want to write code that does not rely on a specific XQuery processor, it may be cleaner anyway to convert each absolute path to a valid URI. For example, Saxon won’t allow you to run doc('C:\file.xml'); it rejects the argument as »invalid relative URI«. doc(file:path-to-uri('C:\file.xml')) works with both processors (provided that the applied version of Saxon comes with support for the File Module).
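A small sketch of that conversion, wrapped in a helper so that every path goes through file:path-to-uri before it reaches doc(). The function name local:load and the file name file.xml are hypothetical:

```xquery
(: Hypothetical helper: load a document from a native or relative file path
   by converting the path to a file: URI first. :)
declare function local:load($path as xs:string) as document-node() {
  doc(file:path-to-uri($path))
};

(: 'file.xml' stands for any existing XML file :)
local:load('file.xml')
```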
Best, Christian
basex-talk@mailman.uni-konstanz.de