Florian,
The xsl-excel-engine project might help you get started working with xlsx files:
https://github.com/foglcz/xsl-excel-engine
xsl-excel-engine is for writing XML files, so it does not do what you are asking, but the wiki documentation provides an introduction to the Excel file format.
It includes scripts to parse sharedStrings.xml, which you might be able to use.
Vincent
From: basex-talk-bounces@mailman.uni-konstanz.de [mailto:basex-talk-bounces@mailman.uni-konstanz.de]
On Behalf Of Dirk Kirsten
Sent: Wednesday, April 06, 2016 6:44 AM
To: Florian Eckey <florian.eckey@googlemail.com>; basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] XLSX to XML
Hello Florian,
Please remember to always include the list when replying, as it allows
others to benefit from our exchange and also allows them to
help you.
I just want to point out, again, that it doesn't make sense to say
"convert the Excel file to XML", because it already is XML. Yes, there
are multiple XML files and they reference each other, but that is
perfectly normal for XML, and for any reasonably complex system.
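To make the point above concrete, here is a minimal sketch (in Python rather than XQuery, purely for illustration) that lists the parts inside an xlsx package. The three-part package built in memory is a hypothetical stand-in; a real workbook contains more parts, such as [Content_Types].xml, styles and relationship files.

```python
import io
import zipfile


def list_parts(xlsx_source):
    """Return the names of all parts stored in an xlsx package.

    xlsx_source may be a file path or a file-like object, because an
    xlsx file is an ordinary zip archive.
    """
    with zipfile.ZipFile(xlsx_source) as package:
        return package.namelist()


def build_demo_package():
    """Build a tiny, hypothetical stand-in for an xlsx file in memory.

    The part names follow the usual layout, but the contents are
    placeholders and Excel would not open this package.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("xl/workbook.xml", "<workbook/>")
        package.writestr("xl/sharedStrings.xml", "<sst/>")
        package.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_parts(build_demo_package()):
        print(name)
```

Passing the path of a real workbook to list_parts should show the same kind of listing, only longer.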
So I guess what you really want is an XQuery module which allows you to
easily manipulate xlsx files without having to worry about internal
OOXML details like shared strings. This of course makes a lot of
sense! However, as the format is ridiculously complicated, it is a hard
task to write a general-purpose library for all kinds of manipulations.
As Christian indicated, we wrote some small helper functions for ourselves
which do the things we need in our projects, but they are very far from
covering the complete xlsx standard.
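The shared-strings indirection mentioned above can be sketched as follows. This is not the BaseX helper code referred to in this mail, only a rough Python illustration under stated assumptions: the two XML snippets are hypothetical, heavily trimmed examples of xl/sharedStrings.xml and a worksheet part, and the function names are made up for this sketch. Phonetic runs, inline strings, dates, formulas and styles are all ignored.

```python
import xml.etree.ElementTree as ET

# Namespace of the SpreadsheetML main schema used by worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"

# Hypothetical, trimmed-down example parts (not taken from a real workbook).
DEMO_SHARED_STRINGS = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>Name</t></si>"
    "<si><r><t>Flo</t></r><r><t>rian</t></r></si>"
    "</sst>"
)
DEMO_SHEET = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
    '<row r="1"><c r="A1" t="s"><v>0</v></c></row>'
    '<row r="2"><c r="A2" t="s"><v>1</v></c><c r="B2"><v>42</v></c></row>'
    "</sheetData>"
    "</worksheet>"
)


def load_shared_strings(shared_strings_xml):
    """Return the shared strings as a list, in document order.

    Rich-text entries are flattened by joining the text of all their runs.
    """
    root = ET.fromstring(shared_strings_xml)
    return [
        "".join(t.text or "" for t in si.iter(NS + "t"))
        for si in root.findall(NS + "si")
    ]


def cell_values(sheet_xml, shared_strings):
    """Map cell references (e.g. "A1") to their textual values.

    Cells with t="s" hold an index into the shared strings table;
    all other cells are returned with their raw <v> content.
    """
    values = {}
    for cell in ET.fromstring(sheet_xml).iter(NS + "c"):
        v = cell.find(NS + "v")
        if v is None or v.text is None:
            continue  # empty cell
        if cell.get("t") == "s":
            values[cell.get("r")] = shared_strings[int(v.text)]
        else:
            values[cell.get("r")] = v.text
    return values


if __name__ == "__main__":
    strings = load_shared_strings(DEMO_SHARED_STRINGS)
    print(cell_values(DEMO_SHEET, strings))
```

The same lookup should translate fairly directly to XQuery: read the shared strings part once, then replace the value of each cell marked t="s" with the string at that position.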
Cheers
Dirk
On 04/06/2016 12:35 PM, Florian Eckey wrote:
> Hello Dirk,
>
> thanks. That was my idea as well. But the xlsx format is really complicated, because the content of the cells (sheet1.xml) references another document (sharedStrings.xml) by index. I guess someone has already implemented a simple XQuery to convert the Excel file to XML?
> But if nobody has done that before, I will have to spend time implementing it on my own. :)
>
> Thanks, best,
> Florian
>
>
>
>
> On 06.04.16, 12:26, "Dirk Kirsten" <dk@basex.org> wrote:
>
>> Hello Florian,
>>
>> xlsx is just a zip file containing many XML files. You can simply unzip
>> the xlsx (e.g. by using the BaseX ZIP module), modify the XML files
>> inside using standard XQuery and zip it up again as xlsx.
>>
>> Cheers
>> Dirk
>>
>> On 04/06/2016 12:18 PM, Florian Eckey wrote:
>>> Hi guys,
>>>
>>> are there any ideas how to convert Excel's xlsx (not xls) files to XML
>>> with XQuery, or with a Java library which can be imported? It looks
>>> like BaseX has no built-in functions for this, unlike for instance MarkLogic.
>>>
>>> Any ideas or example implementations to do that in XQuery or Java?
>>>
>>> Best,
>>> Florian
>> --
>> Dirk Kirsten, BaseX GmbH, http://basexgmbh.de
>> |-- Firmensitz: Blarerstrasse 56, 78462 Konstanz
>> |-- Registergericht Freiburg, HRB: 708285, Geschäftsführer:
>> | Dr. Christian Grün, Dr. Alexander Holupirek, Michael Seiferle
>> `-- Phone: 0049 7531 91 68 276, Fax: 0049 7531 20 05 22
>>