New subject: [ParGram] Fwd: Re: ParGram sentences

19 Jan 2015

Hi,

At the upcoming ParGram Meeting in Warsaw, we would like to experiment 
with a new way to do our usual structure comparison: We would like to 
use INESS (http://iness.uib.no/) and ParGramBank directly.

There are a couple of advantages to this. One obvious advantage is that 
ParGramBank will grow. But we can also easily switch between 
languages/sentences, and there is no need to create a humongous PDF file 
with structures in it (last year, our structure handout had 234 pages).

Paul Meurer (www.uib.no/persons/Paul.Meurer) has kindly implemented a 
couple of new features in INESS. I attach an email from Paul below. 
Please follow his instructions when adding to ParGramBank.

To be able to upload structures to INESS (in Prolog format), you need to 
have an account there. Please follow the instructions on the INESS 
homepage to create your account. Once the account is created, Paul can 
give you the necessary user rights to upload files.

Lastly, I attach the sentences in a text file to this email. As Paul 
says in his email, please use a consistent naming scheme for the 
structures; you should use something like urd-051-fs (i.e., ISO 639-3 
language code - sentence number - fs). Start the sentence number at 51 
(there are already 50 sentences in ParGramBank).
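
The naming scheme can be sketched as a small helper. This is only an
illustration: the function name, the regular expression, and the
three-digit zero padding are assumptions read off the urd-051-fs
example, and the optional letter suffix follows the deu-050a-fs example
for alternative translations in Paul's email below.

```python
import re

# Assumed pattern: ISO 639-3 code (three lowercase letters), a three-digit
# sentence number, an optional letter for an alternative translation, "fs".
NAME_RE = re.compile(r"([a-z]{3})-(\d{3})([a-z]?)-fs")


def structure_name(lang, number, variant=""):
    """Build a structure name such as 'urd-051-fs' (hypothetical helper)."""
    name = f"{lang}-{number:03d}{variant}-fs"
    if NAME_RE.fullmatch(name) is None:
        raise ValueError(f"name does not follow the scheme: {name!r}")
    return name


print(structure_name("urd", 51))       # urd-051-fs
print(structure_name("deu", 50, "a"))  # deu-050a-fs
```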

If you have more questions, please don't hesitate to email Paul or me. 
Also, Agnieszka, Paul, and others: if I forgot anything, feel free to 
jump in.

Best,
Jani

-------- Forwarded Message --------
Subject: 	Re: ParGram sentences
Date: 	Thu, 15 Jan 2015 11:48:57 +0100
From: 	Paul Meurer <paul.meurer@uni.no>
To: 	Agnieszka Patejuk <agnieszka.patejuk@googlemail.com>
CC: 	Sebastian Sulger <sebastian.sulger@uni-konstanz.de>, Adam
Przepiórkowski <adamp@ipipan.waw.pl>, Miriam Butt
<Miriam.Butt@uni-konstanz.de>, Victoria Rosén <victoria@uib.no>

Hi,
...
BTW: I went through previous correspondence related to the meeting and
have gathered the following points to cover:
PARGRAM MEETING (3 days)
• structure comparison:
 – how: traditional or using INESS?
I think now everything needed to make INESS suitable for structure
comparison is in place. Here is what is new:

* I have implemented uploading of Prolog files (one by one, or as a
gzipped archive).
* Once the sentences (not structures) are aligned, it is easy to switch
between treebanks/languages. It is enough for the sentences to be
aligned to a pivot language (e.g., Urdu). You can test this in the
ParGram treebanks.

What people should do:

* Parse their sentences in XLE
* Use a consistent naming scheme for the sentences (e.g., deu-050-fs),
where the number corresponds to the running sentence number. We should
not start with 1, but continue where we stopped last time. I think new
sentences should start at 50. Alternative translations could be called
deu-050a-fs etc.
* Upload the sentences using the _upload files_ link on the Treebank
overview page (either one by one or as a gzipped archive; no other
archiving program will work).
* Disambiguate the sentences in INESS.
* Add glosses, as described in the documentation.
* Align the sentences, with English as a minimum, using the Alignment
tool. Jani or I could do this for them.
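
For the upload step in the list above, bundling the Prolog files could
look roughly like the sketch below. It rests on assumptions not stated
in this email: that "gzipped archive" means a gzip-compressed tar file,
and that the XLE Prolog exports carry a .pl extension. The function name
and paths are hypothetical; check the INESS documentation for the format
it actually expects.

```python
import tarfile
from pathlib import Path


def make_upload_archive(src_dir, archive_path):
    """Pack all *.pl files from src_dir into a gzip-compressed tar archive.

    Assumption: INESS accepts a .tar.gz of Prolog files; this email only
    says "gzipped archive".
    """
    files = sorted(Path(src_dir).glob("*.pl"))
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for f in files:
            # Store each file under its bare name, without directories.
            tar.add(str(f), arcname=f.name)
    return [f.name for f in files]
```

A call such as make_upload_archive("structures", "deu-structures.tar.gz")
returns the list of file names that went into the archive, which can be
checked against the naming scheme before uploading.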

Alternatively, for languages whose grammar is in INESS, the sentences
could be parsed in INESS directly.

Does this sound feasible?

I think the sentences should be ready quite soon, for people to be able
to do all this before ParGram.

--
Best wishes,
Paul

-- 
Sebastian Sulger
FB Sprachwissenschaft
Universität Konstanz
http://ling.uni-konstanz.de/pages/home/sulger
