Option one is definitely cleaner (and works much faster than my "dirty" version). Thanks so much Christian - you're a star :)
Noam
On Thu, May 28, 2015 at 12:09 AM, Christian Grün christian.gruen@gmail.com wrote:
Attached are two more solutions that might show you how it could work as well.
On Wed, May 27, 2015 at 10:51 PM, Noam Green green.noam@gmail.com wrote:
OK. Solved it.
Below is the final query:
declare variable $in external;
declare variable $out external;
declare variable $vendor external;

file:write-text-lines($out, 'Name,Host,Path,Count,Time'),
let $text := file:read-text($in)
let $xml := csv:parse($text, map { 'header': true() })
for $x in $xml//record[contains(vendors, string($vendor))]
let $count := string($x/ATTACKCOUNT)
let $name := $x/attackname
let $time := string($x/TIME_STAMP)
let $path := string($x/path)
let $host := string($x/host)
let $result := concat($name, ',', $host, ',', $path, ',', $count, ',', $time)
return file:append-text-lines($out, $result)
I played with it so long, I'm not even sure what I finally did. But it works, so I'm not touching it :)
Thanks again Christian for all your help!
Noam
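For comparison, here is a minimal sketch of how the same export might look when the CSV Module also writes the output. It is not one of the attached solutions; it assumes that csv:serialize with the 'header' option takes the header line from the element names of the first record, and the element names Name, Host, Path, Count and Time are chosen only to match the header used in the query above.

```xquery
declare variable $in external;
declare variable $out external;
declare variable $vendor external;

(: Parse the CSV input; each row becomes a <record> element :)
let $xml := csv:parse(file:read-text($in), map { 'header': true() })
(: Build a new <csv> tree that holds only the wanted columns :)
let $result :=
  <csv>{
    for $x in $xml//record[contains(vendors, string($vendor))]
    return
      <record>
        <Name>{ string($x/attackname) }</Name>
        <Host>{ string($x/host) }</Host>
        <Path>{ string($x/path) }</Path>
        <Count>{ string($x/ATTACKCOUNT) }</Count>
        <Time>{ string($x/TIME_STAMP) }</Time>
      </record>
  }</csv>
(: Assumption: 'header': true() emits the element names as the first line :)
return file:write-text($out, csv:serialize($result, map { 'header': true() }))
```

Because the file is written once, there is no need for a separate header write followed by one append per record, and the serializer takes care of quoting values that contain commas.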
On Wed, May 27, 2015 at 11:21 PM, Noam Green green.noam@gmail.com wrote:
Hi Christian,
The input file is quite large, so I tried to edit it and leave only 4 lines. The strangest thing happened: when trying to run the shortened file in the editor, I now get the same error "Content is not allowed in prolog.".
Below is my query code:

declare variable $in external;
declare variable $out external;
declare variable $vendor external;

(: Trying to work with CSV
let $text := file:read-text($in)
let $xml := csv:parse($text, map { 'header': true() })
let $inputcsv := csv:serialize($xml, map { 'header': true() } )
:)

file:write-text-lines($out, 'Name,Host,Path,Count,Time'),
for $x in doc($in)//record[contains(vendors, string($vendor))]
let $count := string($x/ATTACKCOUNT)
let $name := $x/attackname
let $time := string($x/TIME_STAMP)
let $path := string($x/path)
let $host := string($x/host)
let $result := concat($name, ',', $host, ',', $path, ',', $count, ',', $time)
let $nothing := string('')
return file:append-text-lines($out, $result)
I tried adding some CSV manipulation lines (see the commented-out block above), but once I remove the comment markers, the editor complains: "Unexpected end of query: 'with CSV let $..."
It doesn't help even if I add a comma after the serialize command.
I know I'm missing something stupid, but can't seem to get around it.
Cheers, Noam
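The "Unexpected end of query" message is consistent with how FLWOR expressions are built: a run of let clauses is only complete once it ends in a return clause, so a trailing comma cannot close it. A minimal sketch, using made-up inline data and a hard-coded vendor string instead of the real export file:

```xquery
(: Hypothetical two-row input; the column names follow the query above :)
let $text := "attackname,vendors&#10;probe,IID&#10;scan,ACME"
let $xml := csv:parse($text, map { 'header': true() })
(: The let clauses feed straight into the for clause ... :)
for $x in $xml//record[contains(vendors, 'IID')]
(: ... and the single return clause completes the whole expression :)
return string($x/attackname)
```

Here the let clauses, the for clause and the return clause form one expression, which is why the CSV lines have to sit in front of the for clause rather than stand on their own.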
On Wed, May 27, 2015 at 10:42 PM, Christian Grün christian.gruen@gmail.com wrote:
> Why does it work in the editor [...]
This surprises me. Could you send me your input file and the query?
In general, the input of doc() must always be an XML document. However, you can use csv:parse for that (once again, please check out our Wiki for an example).
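A minimal sketch of that difference, with a made-up CSV string standing in for export_new.csv (the column names Name and Count are placeholders): doc() expects XML, whereas csv:parse turns CSV text into an XML tree that can be queried with ordinary path expressions.

```xquery
(: Hypothetical CSV content; &#10; is a newline character :)
let $text := "Name,Count&#10;foo,1&#10;bar,2"
(: With 'header': true(), the first line supplies the element names,
   and every following row becomes a <record> element :)
let $xml := csv:parse($text, map { 'header': true() })
return $xml//record/Name/string()
```

For a file on disk, the string would come from file:read-text($in) instead of a literal.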
On Wed, May 27, 2015 at 10:33 PM, Christian Grün christian.gruen@gmail.com wrote:
> C:\Temp>basex -b in=export_new.csv -b out=output.csv -b vendor=IID CSV_Query.xq
It looks as if you are trying to parse a CSV file (export_new.csv) as XML. Is this really what you wanna do?
Christian
> I get the following error:
> Stopped at C:/Temp/CSV_Query.xq, 6/15:
> [FODC0002] "C:/Temp/export_new.csv" (Line 1): Content is not allowed in prolog.
>
> This seems to be failing on the basic for $x in doc($in) command.
>
> What am I doing wrong?
>
> Thanks,
> Noam
>
> On Wed, May 27, 2015 at 10:02 PM, Christian Grün christian.gruen@gmail.com wrote:
>>
>> Hi Noam,
>>
>> > I missed the option of adding a comma after the initial file:write command
>> > (the editor was constantly asking for a return command).
>>
>> In XQuery, multiple expressions can always be separated with commas.
>> For example, the following XQuery expression returns 4 items as results:
>>
>> 1, "string", <xml/>, file:read-text('abc.xml')
>>
>> As XQuery itself, due to its functional nature, provides no guarantee
>> that the four expressions of the above query will be evaluated one
>> after another, the XQuery Scripting Extension [1] was proposed. It
>> offers a semicolon to separate expressions with side-effects (such as
>> file functions). Due to its complexity, however, it was not
>> implemented by many XQuery implementations.
>>
>> At least in BaseX (and I think in Saxon, eXist-db and Zorba as well),
>> you can be assured that expressions separated with commas will always
>> be evaluated one after another.
>>
>> Well, this was probably more than you were asking for, but maybe it's
>> of some interest anyway ;)
>>
>> Christian
>>
>> [1] http://www.w3.org/TR/xquery-sx-10/
>>
>> > Thanks again. It worked perfectly (although I must admit I used the dirty
>> > option, as the CSV examples are mainly on adapting CSV into XML, while I
>> > need the other direction).
>> >
>> > Noam
>> >
>> > On Wed, May 27, 2015 at 9:40 PM, Christian Grün christian.gruen@gmail.com wrote:
>> >>
>> >> Hi Noam,
>> >>
>> >> > let $csv := csv:serialize($result)
>> >> > return file:write-text($out, $csv)
>> >> >
>> >> > The CSV that comes out only includes one line [...]
>> >>
>> >> As there are unlimited ways to represent XML nodes as CSV, there is no
>> >> way to automatically choose a representation that always works best. For
>> >> more information on creating an XML representation that will yield good
>> >> results as CSV, please check out the documentation on our CSV Module [1].
>> >>
>> >> > Now this works, but I can't seem to find a way to add the headers to the
>> >> > first line of the file.
>> >>
>> >> Obviously, I would recommend you to use the existing CSV features,
>> >> because it will take care of all the usual nifty details. However, here
>> >> is one simple way to let your file start with a header line:
>> >>
>> >> file:write-text-lines($out, 'Name,Host,Path,Count,Time'),
>> >> let $result := concat($name,',',$host,',',$path,',',$count,',',$time)
>> >> return file:append-text-lines($out, $result)
>> >>
>> >> Hope this helps,
>> >> Christian
>> >>
>> >> [1] http://docs.basex.org/wiki/CSV_Module