Hi all,
Sorry to bother you with a probably trivial question, but I haven't managed
to figure this out from the XLE documentation (I'm probably missing
something).
I'd like to set up regression tests and/or small benchmark testsuites. I've
made a file of test sentences which I can evaluate using run-syn-testsuite
or parse-testfile.
I'd now like to run these and have XLE record the differences, so that I
can monitor progress/regressions in the grammar's performance.
Basically, I'd like to run the testsuite on new output, be notified of
changes in the f-structures, and decide for each change whether it is an
improvement (in which case the new output becomes the standard to measure
against) or a regression (in which case the old standard is retained).
Can anyone tell me how to do this? I've tried playing around with the
-outputPrefix and -goldPrefix options, but it is not clear to me exactly how
to use them. I've also tried looking into benchmarking, but wasn't able to
figure out how to apply it to building a benchmark for syntactic parsing
(i.e. parse the sentences, select the right f-structure, and store it for
evaluation).
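For reference, the kind of invocation I've been experimenting with inside the XLE shell looks roughly like this; the file names are placeholders and I'm guessing at the exact syntax, since the two prefix options are the part I haven't understood:

```tcl
# my-testfile.lfg = the test sentences; "out" and "gold" are guesses
# at how the prefixes name the comparison files
parse-testfile my-testfile.lfg -outputPrefix out -goldPrefix gold
```

If someone could show a known-good invocation of this, that alone would probably get me unstuck.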
Any help would be greatly appreciated!
Thanks!
Antske