
24 Apr 2014


      ParGram Meeting Notes, Spring Meeting 2014
                 Nuance
Monday 24.2.2014
----
XLE STATUS
John Maxwell was able to attend from PARC so we began the day with a
discussion on the current status of XLE.
The open source version is available, but many people do not know how
to obtain it.  Remedy: put a note on the XLE Wiki site about how to
access the open source version.
There have also been requests for accessing the precompiled version
and the English grammar.  Notes on how to do this also need to be
added.
The English grammar should be put on a separate repository for people
to download. It would in fact be nice if this could be done via the
INESS site.  Jani to check with Paul on this.
It would be good if Konstanz got a cfsm license; Konstanz could then
compile the FSM binaries for different platforms and distribute them with
XLE.
At the time of the discussion, Jani and John had together managed to
get XLE compiled on Jani's computer.  This is the first time XLE had
been compiled successfully outside of PARC.  However, the transfer
system could still not be compiled.  Subsequently, Jani managed to
compile everything on his computer (on a late Sunday night).  The
problem lay with some of the Boost C++ libraries, which keep changing.
After Jani downloaded the newest version, everything was fixed.
We are currently working on getting the compilation complete with
everything (tcl, etc.) and testing it at Konstanz.
Suggestion for the ParGram Wiki: list of available grammars, maybe get
them to be automatically downloadable from the INESS website (see above).
At the last ParGram meeting, it was suggested to use a Creative Commons
license for the grammars like the Polish site does.  Even if
resources are freely available, people are often not sure if they can
use them or under what conditions.  So it would be good to make them
officially available via a license.
Other action items:
Move XLE documentation to Konstanz, part of XLE redmine.  Integrate
Polish addition.
Put note on starter grammars on ParGram Wiki in a prominent place.
Also common features and common templates.
----
DEBUGGING AND DOCUMENTATION ISSUES
Ron discussed debugging issues he is having with XLE.
It was acknowledged that learning how to debug with XLE is an acquired
skill and that some things could be more helpful.
In particular, there is a concern with coming into a large grammar and
understanding the history and design decisions that are embedded in
it.
The key to solving this appears to be ample documentation within an
individual grammar, but also easy accessibility to the overall
decisions taken within ParGram over the decades and a summary of the
state of the discussion for certain phenomena.  Everybody is again
encouraged to contribute to the ParGram Wiki:
http://typo.uni-konstanz.de/redmine/projects/pargram/wiki/
Some topics have already been put up.  It would be great if the
community could contribute to knowledge sharing by putting up brief
entries for further topics.
------
PROSODY-SYNTAX AND MODULARITY DISCUSSION
Tina presented issues she had been working on with respect to the
prosody-syntax interface.  In particular, there was a question about
how modularity in LFG is to be understood.  The upshot of the
discussion was that given the projection architecture of LFG, the
question arises with respect to how much information is being passed
back and forth and how much of that is in a sense duplicative.
John and Ron continue to advocate an FST approach as in the Boegel et
al. (2010) paper.
-----
F-STRUCTURE COMPARISON
Testsuites:
The f-structure comparison this year focused on the treatment of
negation (suggested by Gyuri at the last ParGram meeting) and
possessive (suggested by Jani).  We also had the first few lines of
"The Gruffalo".
The intention is to add parts of this meeting's testsuite to the
ParGramBank.  For this, however, the analyses in all the grammars will
need to be made parallel.
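As a rough illustration of what checking for parallelism could involve,
here is a minimal sketch in Python.  It assumes f-structures have been
exported into nested dicts (with set-valued attributes such as ADJUNCT as
lists of dicts); this representation, the function names and the toy
analyses are assumptions made for the example and are not XLE's own
format or output.

```python
# Hypothetical representation: an f-structure as a nested dict, with
# set-valued attributes (e.g. ADJUNCT) as lists of dicts.

def attribute_paths(fstr, prefix=()):
    """Return the set of attribute paths (tuples of names) in an f-structure."""
    paths = set()
    for attr, value in fstr.items():
        path = prefix + (attr,)
        paths.add(path)
        if isinstance(value, dict):
            paths |= attribute_paths(value, path)
        elif isinstance(value, list):  # set-valued attribute
            for member in value:
                paths |= attribute_paths(member, path)
    return paths


def compare_grammars(analyses):
    """Map each grammar name to the paths it lacks relative to the union."""
    per_grammar = {name: attribute_paths(f) for name, f in analyses.items()}
    union = set().union(*per_grammar.values())
    return {name: sorted(union - paths) for name, paths in per_grammar.items()}


# Toy analyses of one negated sentence, invented for illustration only.
analyses = {
    "grammar_a": {
        "PRED": "be",
        "CLAUSE-TYPE": "decl",
        "ADJUNCT": [{"PRED": "not", "ADJUNCT-TYPE": "neg"}],
    },
    "grammar_b": {
        "PRED": "be",
        "STMT-TYPE": "decl",
        "ADJUNCT": [{"PRED": "not", "ADV-TYPE": "neg"}],
    },
}

differences = compare_grammars(analyses)
for name, missing in differences.items():
    print(name, "lacks:", [".".join(p) for p in missing])
```

A report of this kind only flags differing attribute names; whether a
difference is a real divergence or a justified language-specific choice
still has to be decided by the grammar writers.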
While constructing the testsuites, we also tried to avoid introducing
extraneous issues primarily caused by unwise choice of vocabulary.
That is, we tried to use vocabulary that was likely to exist in all
languages, e.g., "chicken" instead of "gopher".  This turns out not
to be an easy problem to solve.  For example, with Wolof we ran into
trouble with "girl" and "boy".  These always introduce an extra
relative clause as "girl" translates to "child who is a woman" and
"boy" translates to "child who is a man".
The problem of crosslinguistic testsuite construction is not trivial.
We had discussed some issues at the Debrecen meeting and some further
ones came up here.  It seems that it would be worth writing a
follow-up paper to the ParGramBank paper discussing these issues in
some detail.
-----
Negation
In looking at the negation f-structures, we reviewed the ParGram Wiki
entry and Gyuri's slides from 2013. These or a version of these need
to be added to the ParGram Wiki (Miriam and Gyuri to do).
One issue was whether one could project an ADJUNCT from a negation
that is expressed morphologically.  This is technically possible, so
there is no reason in principle to resort to a NEG + feature just
because you have morphological negation.
Changes that need to be made to the grammars to ensure parallelism:
Norwegian:  ADV-TYPE neg --> ADJUNCT-TYPE neg
STMT-TYPE --> CLAUSE-TYPE (see guidelines on this)
VFORM into a check feature?
Wolof:  has NEG + feature --- should think about moving to an Adjunct
type analysis (?)
German:  why only constituent negation on "Die Katzen sind/lachen nicht im
Haus."
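The renamings listed above can be pictured with a small Python sketch.
The two rules are taken from the notes; everything else (the nested-dict
f-structure format, the function name, the toy input) is an assumption
made for illustration.  In practice the change would be made in the
grammar files themselves, not on parser output.

```python
# Rules as (old attribute, required value or None, new attribute).
# The dict-based f-structure format is hypothetical.
RULES = [
    ("ADV-TYPE", "neg", "ADJUNCT-TYPE"),  # ADV-TYPE neg --> ADJUNCT-TYPE neg
    ("STMT-TYPE", None, "CLAUSE-TYPE"),   # STMT-TYPE --> CLAUSE-TYPE
]


def rename_features(fstr, rules=RULES):
    """Return a copy of fstr with matching attributes renamed."""
    result = {}
    for attr, value in fstr.items():
        if isinstance(value, dict):
            value = rename_features(value, rules)
        elif isinstance(value, list):  # set-valued attribute, members are dicts
            value = [rename_features(member, rules) for member in value]
        for old, required, new in rules:
            # A rule applies if the name matches and, where a value is
            # required, the value matches as well.
            if attr == old and required in (None, value):
                attr = new
                break
        result[attr] = value
    return result


# Toy input, invented for illustration.
before = {
    "STMT-TYPE": "decl",
    "ADJUNCT": [
        {"PRED": "not", "ADV-TYPE": "neg"},
        {"PRED": "often", "ADV-TYPE": "sadv"},
    ],
}
after = rename_features(before)
print(after)
```

Note that the first rule is value-sensitive: only ADV-TYPE neg is
renamed, other ADV-TYPE values are left alone.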
Some examples in the negation testsuite included NPI items. Overall,
NPI is not dealt with in any of the grammars.
"The cat is not in the house either."
Works okay in English, but not checking for NPI.
The cat is in the house either.  --- parses fine but shouldn't.
Norwegian:
Should "Katten er i huset heller" be good?  I.e., is it doing the
wrong parallel thing like English is?
ADV-TYPE nexus?  What is this?
NPI issue does not arise with Urdu for this, since the "either" is
done with focus clitic "also" plus negation.
--
Neither is the cat in the house.
Not good in English.  "Nor is the cat in the house."
Recommendation: If a grammar has pairs like never/ever, use POL negative
for the negative one.
Urdu:  change nah to not be an ADV-TYPE sadv
German:  Weder/Noch ist die Katze im Haus. ---> doesn't work
"The girl did not buy any apples."
No NPI -- "The girl bought any apples."  works also
Urdu: Why is there number information on "koi"?  Check on whether this is
necessary.  Could also have QUANT-TYPE info.  English
doesn't but Norwegian does. (QUANT-TYPE existential).  German has no
QUANT-TYPE.
For Urdu:  could introduce a "QUANT-TYPE existential"
German does not register the negative factor in "keine".  That would
be difficult for transfer.
Wolof:  QUANT-TYPE negative -- is calculated on the basis of the
any/no item plus negation on the verb. Not really parallel with the
rest, but may be the desirable thing to do for this language. Reason:
also have existential quantifiers and they are different.
In general, the English grammar writers had considered checking for NPI
items and concluded that it was too costly an operation.  The reason is
that one would have to use (inside-out) functional uncertainty
((IO)-FU) and constraints on paths, and those are difficult/complex to
determine.
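To make the idea of the check concrete, here is a deliberately
simplified Python sketch that operates on finished f-structures, not
inside the grammar.  It treats an NPI as licensed if some enclosing
f-structure has a negation adjunct, which loosely mimics looking upward
from the NPI.  The "NPI +" feature, the nested-dict format and the toy
analyses are all hypothetical, and real NPI licensing involves more
licensors and tighter path constraints than this.

```python
def has_negation(fstr):
    """True if the f-structure has an ADJUNCT member with ADJUNCT-TYPE neg."""
    return any(
        isinstance(member, dict) and member.get("ADJUNCT-TYPE") == "neg"
        for member in fstr.get("ADJUNCT", [])
    )


def unlicensed_npis(fstr, licensed=False):
    """Return PRED values of NPIs with no negation in an enclosing f-structure."""
    licensed = licensed or has_negation(fstr)
    found = []
    if fstr.get("NPI") == "+" and not licensed:  # "NPI" is a made-up feature
        found.append(fstr.get("PRED"))
    for value in fstr.values():
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, dict):
                found.extend(unlicensed_npis(child, licensed))
    return found


# Toy f-structures, invented for illustration.
with_negation = {  # "The girl did not buy any apples."
    "PRED": "buy",
    "SUBJ": {"PRED": "girl"},
    "OBJ": {"PRED": "apple", "SPEC": {"QUANT": {"PRED": "any", "NPI": "+"}}},
    "ADJUNCT": [{"PRED": "not", "ADJUNCT-TYPE": "neg"}],
}
without_negation = {  # "The girl bought any apples."
    "PRED": "buy",
    "SUBJ": {"PRED": "girl"},
    "OBJ": {"PRED": "apple", "SPEC": {"QUANT": {"PRED": "any", "NPI": "+"}}},
}

print(unlicensed_npis(with_negation))
print(unlicensed_npis(without_negation))
```

A post-hoc filter of this kind avoids the parsing cost mentioned above,
but it only flags bad analyses after the fact and does not stop the
grammar from producing them.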
Ron was extremely surprised by the fact that the grammar contained
CASE obl.  Nuance  might look into changing CASE obl --> CASE acc for
the English grammar.
-------------------------------------------------------------------
Tuesday  25.2.2014
----
DISCUSSION ON TREEBANKS/CONSTRUCTIONS
INESS --- It would be great if the treebank listings on INESS could be
reorganized so that there is a group of ParGram treebanks and under
that there should be the parallel ParGramBank.
For next time, should do some of the testsuites that are already in
existence and parse them and add them to the ParGramBank.
EAGLES testsuite -- who has it?   Miriam asked Hans Uszkoreit and he said
he is trying to find it as well, he will let Miriam know.
One could also look into using some of DELPH-IN's testsuites.
----
F-STRUCTURE COMPARISON CONTINUED
Possessive f-strs:
Norwegian:  REF + needed?
English: bright red tractor, wrong analysis
bright red tractor:  bright is not a good adjective to use, find a
different one
maybe try light instead of bright.
Wolof:  Adjunct type for "bright" in bright red tractor?
Urdu:  change numbers to not have the stem value as the PRED value,
but the actual number.
Check whether need NUM information in the Urdu numbers.
Wolof:  why is "four" NUM sg and why is it at the SPEC level???
Will probably be removed.
The farmer has a fear of spiders -- different treatment of "of
spiders" across grammars (Adjuncts in English/German, OBL in
Norwegian), so not a good example for ParGramBank.
Jani will try to come up with a sentence that doesn't involve a
complement of the noun.
Norwegian:  why is the NUMBER f-str so complex?  Can some of that go
into CHECK features?
ADJ-GEN in German?  Look up in Steffi's dissertation?
Urdu change:  genitives to the right of N, also change to SPEC POSS
rather than Adjunct analysis
----
Wednesday 26.2.2014
Talks held at CSLI on lexical semantics and complex predicates -- see
the slides.
There was also a discussion led by Lori Levin on CL curriculum and on
establishing standards for resources.  The background is that more and
more standards are being set (e.g., by Google), but the linguistic
standards needed are not necessarily included because there is too
little understanding of the issues involved (i.e., too little
linguistic and grammar engineering knowledge).  Also, it is expensive
to train people in the kind of grammar writing we do.  However, once
people are trained up, they are a rare commodity coveted by
companies.
The discussion revolved around how to fix this.  One method is to
populate the ParGram Wiki with more information on issues and fixes
and why we decided to standardize things the way we did.
PLEASE EVERYBODY HELP WITH THIS.
Another method is to provide courses, for example at the LSA institute or
perhaps via on-line media (iTunes, etc.).  Chris Brew,  Lori Levin and
Annie Zaenen said they would pursue this actively.  Miriam is
recording an on-line course on grammar development in the summer
semester that will be freely available.
-----
Thursday 27.2.2014
Talks held on the Cognition Parser, on parser evaluation and
disambiguation in the Wolof grammar, and on where Nuance is going with
NL at the moment.  See the slides.
-- 
*************************************************************
Miriam Butt
FB Sprachwissenschaft
Universitaet Konstanz
Fach 184		Tel: +49 7531 88 5109
78457 Konstanz		Fax: +49 7531 88 4865
Germany		             +49 7531 88 5115

miriam.butt@uni-konstanz.de
http://ling.uni-konstanz.de/pages/home/butt

"Xander, don't talk Latin in front of the books."
     Superstar, Buffy the Vampire Slayer

*************************************************************