Hi all,
Is it possible to bypass the following restriction? I cannot change "ep_ins_med_q.xsd", as it is part of the Central Bank XBRL schema.
BaseX 9.1
Command: CREATE DB bfo D:/portal/xbrl/CBR/final_1_3_1/Taxonomy_1_3_1/www.cbr.ru/xbrl/bfo/
Error: "D:/portal/xbrl/CBR/final_1_3_1/Taxonomy_1_3_1/www.cbr.ru/xbrl/bfo/rep/2018-03-31/ep/ep_ins_med_q.xsd""ep_ins_med_q.xsd" (Line 21): Too many distinct namespaces (limit: 256).
Best regards, Sergei.
Hi Sergei,
The corresponding issue will turn 5 next March: https://github.com/BaseXdb/basex/issues/902
If you are an XML developer who wants to index all the XML, XSLT, XProc, RNG, XSD, Schematron, etc. files on your hard disk in an XML database, chances are that you’ll need more than 256 namespaces.
I’m willing to shell out up to 1,200 Euros (plus VAT) out of my own pocket for this feature. Any other funders? Christian, how much do we need to raise collectively for you to prioritize a storage layout redesign?
Gerrit
Hi Gerrit,
thanks for your generous offer to sponsor the requested feature. I am ashamed to confirm that this issue has been open for a long time and is still not closed. You are asking how much money would be required to get it fixed. I’m not sure, to be honest. Maybe I would rather ask for 3 or 4 weeks of “leisure time”, or, even better, get a proposal into my hands for how this could be resolved without compromising backward compatibility.
Some more details: The current storage layout is fixed at 16 bytes per node. One byte (8 bits) is reserved for the namespace reference, which is where the limit of 256 distinct namespaces comes from. The other 15 bytes (minus a few unused bits) are reserved for other references and flags. We could extend the storage to 24 or 32 bytes per node. As a result, the central main table of the database would get larger, so this would affect both old databases (which would need to be re-imported) and the overall performance of the system. If we decide to take this step, we could indeed overcome several of the current limitations.
Any volunteers out there who are ready for the challenge? Christian
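To make the arithmetic behind the limit concrete, here is a minimal Java sketch. It is not BaseX's actual storage code: the class name, the position of the namespace byte within the record, and the use of a ByteBuffer as the main table are all assumptions made for illustration. It only shows why a one-byte slot in a 16-byte node record can tell apart at most 256 namespace references.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a fixed-size node record; not taken from BaseX.
class NodeRecordSketch {
    static final int RECORD_SIZE = 16;   // bytes per node, as described in the thread
    static final int NS_OFFSET = 0;      // assumed position of the namespace byte
    static final int MAX_NS = 1 << 8;    // 8 bits can encode 256 distinct references

    // Stores a namespace reference in the single byte reserved for it.
    static void putNamespaceRef(ByteBuffer table, int node, int nsRef) {
        if (nsRef < 0 || nsRef >= MAX_NS) {
            throw new IllegalArgumentException(
                "Too many distinct namespaces (limit: " + MAX_NS + ")");
        }
        table.put(node * RECORD_SIZE + NS_OFFSET, (byte) nsRef);
    }

    // Reads the reference back; the mask undoes Java's signed byte.
    static int getNamespaceRef(ByteBuffer table, int node) {
        return table.get(node * RECORD_SIZE + NS_OFFSET) & 0xFF;
    }

    public static void main(String[] args) {
        ByteBuffer table = ByteBuffer.allocate(4 * RECORD_SIZE);
        putNamespaceRef(table, 2, 255);
        System.out.println(getNamespaceRef(table, 2)); // prints 255
        try {
            putNamespaceRef(table, 3, 256);            // the 257th reference does not fit
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Widening the slot means either borrowing bits from the other fields or enlarging every record, which is the trade-off described above.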
One approach to avoid migration and backward compatibility issues would be to support a standard storage and an extended storage side by side. The storage and query functions would know beforehand which layout variant the current database uses, and they could call the appropriate optimized functions. If dynamic lookup of these layout-specific functions proves too costly, providing two separate binaries (classic and extended storage) might be an option. I cannot fathom how many pieces of code would need to be modified in order to maintain a common codebase for both layouts. I’m certainly naïve in this regard because back in the day, I thought: How hard can it be to move from a 16 bit architecture to a 32 bit architecture?
Gerrit
-- Gerrit Imsieke Geschäftsführer / Managing Director le-tex publishing services GmbH Weissenfelser Str. 84, 04229 Leipzig, Germany Phone +49 341 355356 110, Fax +49 341 355356 510 gerrit.imsieke@le-tex.de, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig Registernummer / Registration Number: HRB 24930
Geschäftsführer / Managing Directors: Gerrit Imsieke, Svea Jelonek, Thomas Schmidt
How hard can it be to move from a 16 bit architecture to a 32 bit architecture?
Yes, that can be tricky. Just one example that affects Java and BaseX: Java arrays are limited to 2^31 - 1 entries. Although 64-bit CPUs became popular a long time ago, I don’t believe this limit will ever be lifted in a future version of Java. Right now, we use simple integer offsets to address nodes in main-memory database instances. If we decide to introduce support for more than 2 billion nodes per database, we’d need to use additional redirections (which is indeed possible, but requires more effort than replacing int with long values).
Maybe we could start a little new database project from scratch? ;) Various issues could then be solved in a more portable way (but the switch to 128-bit architectures is probably still far away, so we could probably stick with 64 bits…).
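One way to picture the "additional redirections" mentioned above is a long-indexed store built from several int arrays. The following Java sketch is hypothetical and not how BaseX addresses nodes: the class name, the page size of 2^20 entries, and the lazy allocation of pages are all assumptions chosen for illustration.

```java
// Hypothetical long-indexed int store; each access goes through one extra lookup.
final class PagedIntArray {
    private static final int PAGE_BITS = 20;              // 2^20 entries per page (arbitrary)
    private static final int PAGE_SIZE = 1 << PAGE_BITS;
    private static final int PAGE_MASK = PAGE_SIZE - 1;

    private final int[][] pages;   // first level: one slot per page
    private final long size;

    PagedIntArray(long size) {
        if (size < 0) throw new IllegalArgumentException("negative size");
        this.size = size;
        int pageCount = Math.toIntExact((size + PAGE_SIZE - 1) >>> PAGE_BITS);
        pages = new int[pageCount][];  // pages are allocated on first write
    }

    int get(long index) {
        check(index);
        int[] page = pages[(int) (index >>> PAGE_BITS)];
        return page == null ? 0 : page[(int) (index & PAGE_MASK)];
    }

    void set(long index, int value) {
        check(index);
        int p = (int) (index >>> PAGE_BITS);
        if (pages[p] == null) pages[p] = new int[PAGE_SIZE];
        pages[p][(int) (index & PAGE_MASK)] = value;
    }

    long size() {
        return size;
    }

    private void check(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    public static void main(String[] args) {
        // More entries than a single Java array can hold.
        PagedIntArray a = new PagedIntArray(3_000_000_000L);
        a.set(2_999_999_999L, 42);
        System.out.println(a.get(2_999_999_999L)); // prints 42
    }
}
```

Every read and write now costs a shift, a mask and a second array access, which hints at why such a change is more work than swapping int for long.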
Hi,
Is there a BaseX function for converting a string date in the form of “March 2017” to xs:date or xs:dateTime?
Thanks, Ron
Hi Ron,
If your timestamp is available in IETF format, you can use fn:parse-ietf-date [1]. Otherwise, you’ll need to write a simple function yourself:
declare variable $MONTHS := (
  'January', 'February', 'March', 'April', 'May', 'June',
  'July', 'August', 'September', 'October', 'November', 'December'
);
declare function local:to-date($string) {
  let $m := index-of($MONTHS, substring-before($string, ' '))
  let $y := xs:integer(substring-after($string, ' '))
  return xs:date(string-join((
    format-number($y, '0000'),
    format-number($m, '00'),
    '01'
  ), '-'))
};
local:to-date('March 2017')
Best, Christian
[1] https://www.w3.org/TR/xpath-functions-31/#func-parse-ietf-date
Hi Christian,
Much appreciated! I hardened the code (see below) since the dates (from CT.gov) occasionally also include the day of the month (e.g., “March 21, 2014”). Currently the function drops the day in such cases, but I will look into capturing it in a future iteration.
Best, Ron
(: Assumes $MONTHS is declared as in Christian's example and that the FunctX library is imported. :)
declare function local:to-date($string) {
  if (fn:matches($string, '[A-Za-z]+ [0-9]+') or
      fn:matches($string, '[A-Za-z]+ [0-9]+, [0-9]+'))
  then
    let $m := index-of($MONTHS, substring-before($string, ' '))
    let $y := xs:integer(functx:substring-after-last($string, ' '))
    return xs:date(string-join((
      format-number($y, '0000'),
      format-number($m, '00'),
      '01'
    ), '-'))
  else ()
};
Hi Ron,
You might find Ryan Grimm's date-parser library module useful if you have a larger range of date formats to handle:
https://github.com/marklogic-community/commons/blob/master/dates/date-parser...
While it was written with some MarkLogic-specific code, I adapted it for use with eXist (but haven't tested it with BaseX):
https://github.com/HistoryAtState/twitter/blob/master/modules/date-parser.xq...
Best, Joe
Hi Joe,
Thanks for sharing that. I tried adapting your eXist port to BaseX and ran into issues with namespaces. At this point I don’t have the time or expertise to complete this but hopefully someone else will take up the challenge.
Best, Ron
Hi Ron,
I took a quick look at Joe's module and made the following changes:
1. Change datetime:format-date(..,"yyyy-MM-dd") to format-date(..,"[YYYY]-[MM]-[DD]")
2. Change xdt:dayTimeDuration("P1D") -> xs:dayTimeDuration
3. Change function in namespace local: to dates:
And saved as a gist [1]. Then in BaseX
import module namespace dates = "http://xqdev.com/dateparser" at "https://gist.githubusercontent.com/apb2006/ee0effdd53ca80daf4f1b3e99794ed89/...";
dates:parseDate("March 2017")
Returns
<date resolution="month">
  <range>
    <start>2017-03-01</start>
    <end>2017-03-31</end>
  </range>
  <value>2017-03-01</value>
</date>
Joe, I think all these changes are compatible with eXist too? [2]
Hope this helps. /Andy
[1] https://gist.github.com/apb2006/ee0effdd53ca80daf4f1b3e99794ed89
[2] https://sourceforge.net/p/exist/mailman/message/34988159/
Hi Andy,
Thanks for the quick solution! I imported the new module and was able to use the new function as follows:
xs:date(dates:parseDate($trial/completion_date)//value/text()) <= fn:current-date()
I verified the correctness for ~300 dates and the logic appears solid (for the kind of dates discussed below).
Any chance this could find its way into a BaseX release?
Best, Ron
We are working on a website that will facilitate the management of new external modules for BaseX. Users will be able to upload new modules, search for existing modules, and install them in local BaseX instances with a few clicks. This module will definitely be a good candidate for this website.
On Tue, Nov 6, 2018 at 1:35 AM Ron Katriel rkatriel@mdsol.com wrote:
Hi Andy,
Thanks for the quick solution! I imported the new module and was able to use the new function as follows:
xs:date(dates:parseDate($trial/completion_date)//value/text()) <= fn:current-date()
I verified the correctness for ~300 dates and the logic appears solid (for the kind of dates discussed below).
Any chance this could find its way into a BaseX release?
Best, Ron
On November 5, 2018 at 10:27:48 AM, Andy Bunce (bunce.andy@gmail.com) wrote:
Hi Ron,
I took a quick look at Joe's module and made the following changes:
Change datetime:format-date(..,"yyyy-MM-dd") to format-date(..,"[YYYY]-[MM]-[DD]") Change xdt:dayTimeDuration("P1D") -> xs:dayTimeDuration Change function in namespace local: to dates:
And saved as a gist [1]. Then in BaseX
import module namespace dates = "http://xqdev.com/dateparser" at "https://gist.githubusercontent.com/apb2006/ee0effdd53ca80daf4f1b3e99794ed89/...";
dates:parseDate("March 2017")
Returns
<date resolution="month"> <range> <start>2017-03-01</start> <end>2017-03-31</end> </range> <value>2017-03-01</value> </date>
Joe, I think all these changes are compatible with eXist too? [2]
Hope this helps. /Andy
[1] https://gist.github.com/apb2006/ee0effdd53ca80daf4f1b3e99794ed89 [2] https://sourceforge.net/p/exist/mailman/message/34988159/
On Mon, 5 Nov 2018 at 02:03, Ron Katriel rkatriel@mdsol.com wrote:
Hi Joe,
Thanks for sharing that. I tried adapting your eXist port to BaseX and ran into issues with namespaces. At this point I don’t have the time or expertise to complete this but hopefully someone else will take up the challenge.
Best, Ron
On November 3, 2018 at 12:19:49 AM, Joe Wicentowski (joewiz@gmail.com) wrote:
Hi Ron,
You might find Ryan Grimm's date-parser library module useful if you have a larger range of date formats to handle:
https://github.com/marklogic-community/commons/blob/master/dates/date-parser...
While it was written with some MarkLogic-specific code, I adapted it for use with eXist (but haven't tested it with BaseX):
https://github.com/HistoryAtState/twitter/blob/master/modules/date-parser.xq...
Best, Joe
On Fri, Nov 2, 2018 at 6:48 PM Ron Katriel rkatriel@mdsol.com wrote:
Hi Christian,
Much appreciated! I hardened the code (see below) since the dates (from CT.gov) occasionally also have the day of the month (e.g., “March 21, 2014”). Currently the function is dropping the day in such cases but I will look into capturing it in a future iteration.
Best, Ron
declare function local:to-date($string) {
  if (fn:matches($string, '[A-Za-z]+ [0-9]+') or
      fn:matches($string, '[A-Za-z]+ [0-9]+, [0-9]+')) then
    let $m := index-of($MONTHS, substring-before($string, ' '))
    let $y := xs:integer(functx:substring-after-last($string, ' '))
    return xs:date(string-join((
      format-number($y, '0000'),
      format-number($m, '00'),
      '01'
    ), '-'))
  else ()
};
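A minimal sketch of how the day could be captured as well, assuming the only inputs are "Month YYYY" and "Month D, YYYY". The name local:to-date-with-day is hypothetical, and $MONTHS is repeated here only so the snippet runs on its own:

```xquery
declare variable $MONTHS := (
  'January', 'February', 'March', 'April', 'May', 'June', 'July',
  'August', 'September', 'October', 'November', 'December'
);

(: Hypothetical helper, not part of BaseX or of the modules in this thread. :)
declare function local:to-date-with-day($string as xs:string) as xs:date? {
  (: Accept "Month YYYY" or "Month D, YYYY"; anything else yields (). :)
  if (fn:matches($string, '^[A-Za-z]+ ([0-9]{1,2}, )?[0-9]{4}$')) then
    let $m := index-of($MONTHS, substring-before($string, ' '))
    let $rest := substring-after($string, ' ')
    let $has-day := contains($rest, ', ')
    (: Default to the first of the month when no day is given. :)
    let $d := if ($has-day) then xs:integer(substring-before($rest, ',')) else 1
    let $y := xs:integer(if ($has-day) then substring-after($rest, ', ') else $rest)
    let $iso := string-join((
      format-number($y, '0000'),
      format-number($m, '00'),
      format-number($d, '00')
    ), '-')
    (: Unknown month names and impossible days (e.g. "February 30") yield (). :)
    return if (exists($m) and $iso castable as xs:date) then xs:date($iso) else ()
  else ()
};

local:to-date-with-day('March 21, 2014'),
local:to-date-with-day('March 2017')
```

The two calls should return 2014-03-21 and 2017-03-01. Returning the empty sequence for anything unexpected is a design choice of this sketch; raising an error may suit other data better.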
On November 2, 2018 at 4:20:41 PM, Christian Grün (christian.gruen@gmail.com) wrote:
Hi Ron,
If your timestamp is available in IETF format, you can use fn:parse-ietf-date [1]. Otherwise, you’ll need to write a simple function yourself:
declare variable $MONTHS := (
  'January', 'February', 'March', 'April', 'May', 'June', 'July',
  'August', 'September', 'October', 'November', 'December'
);

declare function local:to-date($string) {
  let $m := index-of($MONTHS, substring-before($string, ' '))
  let $y := xs:integer(substring-after($string, ' '))
  return xs:date(string-join((
    format-number($y, '0000'),
    format-number($m, '00'),
    '01'
  ), '-'))
};

local:to-date('March 2017')
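For comparison, a small sketch of the fn:parse-ietf-date route, assuming an XQuery 3.1 processor. The input string is an illustrative value taken from the function's specification, not from the data discussed in this thread:

```xquery
(: Illustrative IETF-style timestamp; not taken from the CT.gov data. :)
fn:parse-ietf-date('Wed, 06 Jun 1994 07:29:35 GMT')
(: yields xs:dateTime('1994-06-06T07:29:35Z') :)
```

A string such as 'March 2017' is not in IETF format, so the function raises an error (FORG0010) for it, which is why the hand-written function is needed here.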
Best, Christian
[1] https://urldefense.proofpoint.com/v2/url?u=https-3A__www.w3.org_TR_xpath-2Df...
On Fri, Nov 2, 2018 at 9:09 PM Ron Katriel rkatriel@mdsol.com wrote:
Hi,
Is there a BaseX function for converting a string date in the form of “March 2017” to xs:date or xs:dateTime?
Thanks, Ron
Perfect. As always appreciate the great support and quick turnaround!
Best, Ron
On Fri, 2018-11-02 at 17:25 +0100, Christian Grün wrote:
did this ever happen?
Some more details: The current storage layout per node has been fixed to 16 bytes. One byte (8 bits) is reserved for the namespace reference.
Here are a couple of hacky approaches in the spirit of brainstorming ;)
* reserve 7 bits for the namespace, and 1 bit for "uses extended namespace". In the top-bit-set objects only, add an extra byte.
* Use the first 2 bytes of the element name :)
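To make the first idea concrete, here is a rough sketch of the arithmetic only, written in XQuery for lack of a better common language on this list. The function names are hypothetical and this is not BaseX's storage code; where the extra byte would live in the fixed 16-byte record is left open:

```xquery
(: Hypothetical encoding: ids 0..127 fit in one byte with the top bit clear.
   Larger ids (up to 32767) set the top bit, keep the low 7 bits in the
   first byte, and put the remaining bits into one extra byte. :)
declare function local:encode-ns($id as xs:integer) as xs:integer+ {
  if ($id lt 128) then
    $id
  else
    (128 + ($id mod 128), $id idiv 128)
};

(: Inverse of local:encode-ns: the top bit of the first byte tells
   whether an extra byte has to be read. :)
declare function local:decode-ns($bytes as xs:integer+) as xs:integer {
  if ($bytes[1] lt 128) then
    $bytes[1]
  else
    ($bytes[1] - 128) + 128 * $bytes[2]
};

(: Round-trip check for a few ids, including the boundaries. :)
for $id in (0, 127, 128, 300, 32767)
return local:decode-ns(local:encode-ns($id)) eq $id
```

Every comparison in the final loop should be true. With one extra byte this scheme would address 128 * 256 = 32768 namespaces instead of 256, at the cost of variable-length entries for the nodes that use the extension.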
Liam
Hi Liam,
did this ever happen?
What exactly? ;)
- reserve 7 bits for the namespace, and 1 bit for "uses extended namespace". In the top-bit-set objects only, add an extra byte.
- Use the first 2 bytes of the element name :)
There are a few online resources with details on our chosen storage layout [1,2,3]. I think that we pretty much use every available bit for storing references and flags (except e.g. for text nodes, which have less metadata than elements or attributes), but maybe there’s indeed something specific that we could optimize? More suggestions are welcome.
Cheers Christian
[1] http://files.basex.org/publications/Gruen%20%5B2010%5D,%20Storing%20and%20Qu...
[2] http://docs.basex.org/wiki/Storage_Layout
[3] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/ba...
On Wed, 2019-01-16 at 15:31 +0100, Christian Grün wrote:
Hi Liam,
did this ever happen?
What exactly? ;)
sorry! support for more than 256 namespaces in one db.