Hi all,

Sorry to bother you with what is probably a trivial question, but I haven't managed to figure this out from the XLE documentation (I'm probably missing something).

I'd like to set up regression tests and/or small benchmark test suites. I've made a file with test sentences, which I can evaluate using run-syn-testsuite or parse-testfile.
I'd now like to run these and have XLE record the differences, so that I can monitor progress/regressions in the grammar's performance.

Basically, I'd like to run the test suite on new data, be notified of changes in the f-structures, and decide whether these are improvements (in which case the standard is updated for future comparisons) or regressions (in which case the old standard is retained).

Can anyone tell me how to do this? I've played around with the -outputPrefix and -goldPrefix options, but it is not clear to me exactly how to use them. I've also looked into benchmarking, but I wasn't able to figure out how to apply it to creating a benchmark for syntactic parsing (i.e. parse sentences, select the right f-structure, and store it for evaluation).

Any help would be greatly appreciated!

Thanks!
Antske