Hello --
BaseX will happily consume zip archives; this is just splendid for loading up a bunch of docx files.
Now I find myself wanting the name of the docx file -- the original name of the archive -- and I don't know how to retrieve it (or whether it can be retrieved at all!). But I think it must be there somewhere, because db:path repeats the standard OOXML file paths:
[Content_Types].xml word/document.xml word/footnotes.xml word/footer1.xml word/endnotes.xml word/theme/theme1.xml word/settings.xml docProps/custom.xml customXml/itemProps2.xml docProps/app.xml customXml/item2.xml customXml/itemProps1.xml word/fontTable.xml customXml/item1.xml customXml/item3.xml customXml/itemProps3.xml customXml/item4.xml customXml/itemProps4.xml word/numbering.xml word/styles.xml word/webSettings.xml docProps/core.xml word/people.xml
over and over. If identically named files were all stored at exactly one path, there would be only one copy of each, yet the content of all several hundred docx files is there. (db:list-details tells me about > 4000 individual xml files.)
If I can get the name of the original archive, how do I do that?
Thanks! Graydon
Ok, so it looks like:
1. find where BaseX is really getting its config files ($HOME in my case); http://docs.basex.org/wiki/Configuration#Configuration_Files says "Q{org.basex.util.Prop}USERHOME()", which is exceedingly helpful!
2. add the necessary config switch to .basex (NOT .basexgui), AFTER the "# Local Options" line
3. the config switch is "ARCHIVENAME = true"
4. restart BaseX and recreate the DB, and you'll see a ticky-box option for "add the archive name to the path"
This gets me the behaviour I was expecting would happen, but I'm still curious if there's a way to get the archive name back in the default case, because it does look like BaseX is in no way confused about which of those identically named files belong together.
Thanks! Graydon
On Mon, Sep 18, 2017 at 8:55 AM, Graydon Saunders graydonish@gmail.com wrote:
> [...]
Hi Graydon,
> the config switch is "ARCHIVENAME = true"
Exactly, that’s the option you’ll need to enable to get the archive names included in your database paths. It can also be passed on to XQuery functions (db:create, db:add, etc.).
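For illustration, a minimal sketch of passing the option from XQuery rather than from the .basex file. The database name and the input directory are hypothetical, and the lower-case option key is an assumption based on how create options are usually passed in an options map. It has not been run against any particular BaseX version.

```xquery
(: Create a database from a directory of .docx archives and ask BaseX
   to include each archive's file name in the stored document paths. :)
db:create(
  'docx',                         (: hypothetical database name :)
  '/path/to/docx-files/',         (: hypothetical input directory :)
  '',                             (: no additional target path :)
  map { 'archivename': true() }   (: counterpart of ARCHIVENAME = true :)
)
```

Once the archive names are part of the paths, they can be read back with db:path. The query below assumes that exactly one path segment ends in ".docx"; the exact layout of the stored paths is an assumption and should be checked with db:list first. (In BaseX 10 and later, db:open is called db:get.)

```xquery
(: For every main document part, return the path segment that names
   the archive it came from. :)
for $doc in db:open('docx')
let $path := db:path($doc)
where ends-with($path, 'word/document.xml')
return tokenize($path, '/')[ends-with(., '.docx')]
```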
> This gets me the behaviour I was expecting would happen, but I'm still curious if there's a way to get the archive name back in the default case, because it does look like BaseX is in no way confused about which of those identically named files belong together.
By default, the archive name will be ignored. In BaseX, it’s possible to have several documents with the same name (this provides better performance if document paths are irrelevant), and db:list-details simply returns all document names in the order in which the documents were added.
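As a small sketch of what this means in practice, assume a database named docx (a hypothetical name) that was built with the default settings. Addressing a path then yields every document stored under it, not a single file:

```xquery
(: With archive names ignored, all archives contribute a document at
   the same path, so this counts one document per loaded .docx file. :)
count(db:open('docx', 'word/document.xml'))
```

The documents remain separate nodes in the database, but nothing in their paths records which archive each one came from.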
Hope this helps, Christian
Thank you!
Someday I will get it through my head that it's not really a file system down there. :)
On Mon, Sep 18, 2017 at 11:00 AM, Christian Grün christian.gruen@gmail.com wrote:
> [...]