Hi Naval,
> BaseX XQuery CSV module is for dealing with CSV files which we don't have.
With this module, you can serialize XML data as CSV (and parse CSV back to XML). But that’s certainly just one way to do it.
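To illustrate the two directions mentioned here, the following is a minimal sketch written as a BaseX command script. The record and element names are hypothetical and not taken from StratML; it assumes the CSV Module functions `csv:serialize` and `csv:parse` with the `header` option, and the module's default `<csv>`/`<record>` layout for the XML input.

```xml
<commands>
  <!-- Hypothetical sample data: one <record> per row, child element names become column headers -->
  <xquery><![CDATA[
    let $xml :=
      <csv>
        <record><Name>Plan A</Name><Goals>3</Goals></record>
        <record><Name>Plan B</Name><Goals>5</Goals></record>
      </csv>
    (: XML to CSV text, with a header line built from the element names :)
    let $text := csv:serialize($xml, map { 'header': true() })
    return (
      $text,
      (: CSV text back to XML, reading the first line as header :)
      csv:parse($text, map { 'header': true() })
    )
  ]]></xquery>
</commands>
```

Saved with a `.bxs` suffix, a script like this should be runnable by passing it to the `basex` command-line client. XML that is not already shaped as flat records would first need an XQuery step that maps it into this layout.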
Hi Owen,
> However, I wonder how BaseX might deal with more than a million records.
Feel free to get back to us once you encounter any limits.
Best, Christian
Naval Sarda nsarda@epicomm.net wrote on Thu, Oct 3, 2024, 14:45:
Hi Owen, Christian,
We are working with StratML files, which are in XML format. The BaseX XQuery CSV module is for dealing with CSV files, which we don't have.
We can scale BaseX up by adding more servers and splitting the data among them if the record count gets high.
Naval
On 02/10/24 10:54 pm, Owen Ambur wrote:
Naval, can you answer Christian's question?
My sense is that a lot of features have been built into BaseX that we could use but I don't have the knowledge or expertise to understand how best to do so. At least, I hope we're not reinventing capabilities that are already available to address our technical objectives https://aboutthem.info/SQS.xml as well as our broader conceptual objectives https://aboutthem.info/ATI.xml.
Christian, this morning we had a very encouraging Zoom meeting with folks at GSA who are finally getting around to figuring out how to help Uncle Sam's agencies comply with section 10 https://www.linkedin.com/pulse/trustworthy-institutions-owen-ambur/ of the GPRA Modernization Act. I encouraged them not to reinvent the StratML schemas and it would be great if the BaseX community could help us demonstrate the benefits of using them, not only by U.S. federal agencies but agencies at all levels of government, worldwide, as well as tax-favored organizations and others whose plans and reports should be matters of public record.
There are >5.8K plans in the StratML collection https://stratml.us/drybridge/index.htm and indexed in the query service. Their URLs are listed in sitemap format at https://stratml.us/docs/sitemap.xml
I am contemplating whether to hire someone to convert the relevant elements of the IRS Form 990 database https://www.irs.gov/charities-non-profits/tax-exempt-organization-search-bulk-data-downloads to StratML format, using the templates at https://stratml.us/drybridge/index.htm#MPR4CO. However, I wonder how BaseX might deal with more than a million records.
Owen Ambur https://www.linkedin.com/in/owenambur/
On Wednesday, October 2, 2024 at 12:18:16 PM EDT, Christian Grün christian.gruen@gmail.com wrote:
Hi Owen,
Is it based on the BaseX XQuery CSV Module?
Best, Christian
Owen Ambur owen.ambur@verizon.net wrote on Wed, Oct 2, 2024, 18:12:
I'm not sure how this exchange might relate but my developer has provided CSV export capabilities for query results listings at StratML https://search.aboutthem.info/
Any comments or suggestions on how we might enhance the functionality of our StratML-enabled query service would be most welcome.
Owen Ambur https://www.linkedin.com/in/owenambur/
On Tuesday, October 1, 2024 at 04:42:12 AM EDT, Christian Grün <christian.gruen@gmail.com> wrote:
Hi Omar,
Thanks for the observation. For the SERIALIZER option, there was a corresponding note which I have added to the EXPORTER option, along with a little example for exporting CSV [1].
Best, Christian
[1] https://docs.basex.org/main/Options#exporter
On Mon, Sep 30, 2024 at 4:04 PM Omar Siam Omar.Siam@oeaw.ac.at wrote:
Hi,
I just tried, after some time, to script a CSV export with a BXS file, and I had forgotten how to set multiple options for the CSV serialization. In the end, this worked:
<set option="EXPORTER">method=csv,csv=header=true,,lax=false,,quotes=true</set>
I cannot find it on docs.basex.org. Perhaps this should be added and explained.
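For context, here is a minimal sketch of how that line might sit in a complete BXS command script. The database name `plans` and the target directory `export/` are hypothetical, and the sketch assumes the stored documents already have a record-like shape that the CSV serializer can handle. As far as the working line suggests, single commas separate the top-level serialization parameters (`method`, `csv`), while the commas between the sub-options inside the `csv` value are escaped by doubling them.

```xml
<commands>
  <!-- "plans" is a hypothetical database name -->
  <open name="plans"/>
  <!-- Single comma: next serialization parameter; double comma: next sub-option inside csv=... -->
  <set option="EXPORTER">method=csv,csv=header=true,,lax=false,,quotes=true</set>
  <!-- "export/" is a hypothetical target directory for the exported resources -->
  <export path="export/"/>
  <close/>
</commands>
```

The export command writes the resources of the opened database to the given path, using whatever the EXPORTER option holds at that point, so the `<set>` element has to come before `<export>`.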
Best regards
--
Mag. Ing. Omar Siam
Austrian Center for Digital Humanities and Cultural Heritage
Österreichische Akademie der Wissenschaften | Austrian Academy of Sciences
Stellvertretende Behindertenvertrauensperson | Deputy representative for disabled persons
Bäckerstraße 13, 1010 Wien, Österreich | Vienna, Austria
T: +43 1 51581-7295
omar.siam@oeaw.ac.at | www.oeaw.ac.at/acdh
basex-talk@mailman.uni-konstanz.de