Hi,
At this year's ParGram meeting, we decided we should have 2 meetings in
2014 (spring and fall).
Regarding the spring meeting, Ron Kaplan has volunteered to host it at
Nuance in California. To find a suitable date, I created a Doodle poll
which you can find here:
http://doodle.com/niwz4hgptcn3rpsr
Please participate in the poll so that we can find a date that suits
most of us. By the way: This will be a special occasion since we will be
celebrating 20 years of ParGram!
Cheers,
Jani
--
Sebastian Sulger
FB Sprachwissenschaft
Universität Konstanz
http://ling.uni-konstanz.de/pages/home/sulger
ParGram Meeting Notes, Spring Meeting 2014
Nuance
Monday 24.2.2014
----
XLE STATUS
John Maxwell was able to attend from PARC so we began the day with a
discussion on the current status of XLE.
The open source version is available, but it is not clear to many
people how. Remedy: put a note on the XLE Wiki site about how to
access the open source version.
There have also been requests for accessing the precompiled version
and the English grammar. Notes on how to do this also need to be
added.
The English grammar should be put on a separate repository for people
to download. It would in fact be nice if this could be done via the
INESS site. Jani to check with Paul on this.
It would be good if Konstanz got a cfsm license. Konstanz could then
compile the FSM binaries for different platforms and distribute them with
XLE.
At the time of the discussion, Jani and John had together managed to
get XLE compiled on Jani's computer. This is the first time XLE had
been compiled successfully outside of PARC. However, the transfer
system could still not be compiled. Subsequently, Jani managed to
compile everything on his computer (on a late Sunday night). The
problem lay with some of the Boost C++ libraries, which keep changing.
After Jani downloaded the newest version, everything was fixed.
We are currently working on getting the compilation complete with
everything (tcl, etc.) and testing it at Konstanz.
Suggestion for the ParGram Wiki: list of available grammars, maybe get
them to be automatically downloadable from the INESS website (see above).
At the last ParGram meeting, it was suggested to use a Creative Commons
license for the grammars like the Polish site does. Even if
resources are freely available, people are often not sure if they can
use them or under what conditions. So it would be good to make them
officially available via a license.
Other action items:
Move XLE documentation to Konstanz, part of XLE redmine. Integrate
Polish addition.
Put note on starter grammars on ParGram Wiki in a prominent place.
Also common features and common templates.
----
DEBUGGING AND DOCUMENTATION ISSUES
Ron discussed debugging issues he is having with XLE.
It was acknowledged that learning how to debug with XLE is an acquired
skill and that some things could be more helpful.
In particular, there is a concern with coming into a large grammar and
understanding the history and design decisions that are embedded in
it.
The key to solving this appears to be ample documentation within an
individual grammar, but also easy accessibility to the overall
decisions taken within ParGram over the decades and a summary of the
state of the discussion for certain phenomena. Everybody is again
encouraged to contribute to the ParGram Wiki:
http://typo.uni-konstanz.de/redmine/projects/pargram/wiki/
Some topics have already been put up. It would be great if the
community could contribute to knowledge sharing by putting up brief
entries for further topics.
------
PROSODY-SYNTAX AND MODULARITY DISCUSSION
Tina presented issues she had been working on with respect to the
prosody-syntax interface. In particular, there was a question about
how modularity in LFG is to be understood. The upshot of the
discussion was that given the projection architecture of LFG, the
question arises with respect to how much information is being passed
back and forth and how much of that is in a sense duplicative.
John and Ron continue to advocate an FST approach as in the Boegel et
al. (2010) paper.
-----
F-STRUCTURE COMPARISON
Testsuites:
The f-structure comparison this year focused on the treatment of
negation (suggested by Gyuri at the last ParGram meeting) and
possessive (suggested by Jani). We also had the first few lines of
"The Gruffalo".
The intention is to add parts of this meeting's testsuite to the
ParGramBank. For this, however, all the grammars will need to
parallelize.
While constructing the testsuites, we also tried to avoid introducing
extraneous issues primarily caused by unwise choice of vocabulary.
That is, we tried to use vocabulary that was likely to exist in all
languages. E.g., "chicken" instead of "gopher". This turns out not
to be an easy problem to solve. For example, with Wolof we ran into
trouble with "girl" and "boy". These always introduce an extra
relative clause as "girl" translates to "child who is a woman" and
"boy" translates to "child who is a man".
The problem of crosslinguistic testsuite construction is not trivial.
We had discussed some issues at the Debrecen meeting and some further
ones came up here. It seems that it would be worth writing a
follow-up paper to the ParGramBank paper discussing these issues in
some detail.
-----
Negation
In looking at the negation f-structures, we reviewed the ParGram Wiki
entry and Gyuri's slides from 2013. These or a version of these need
to be added to the ParGram Wiki (Miriam and Gyuri to do).
One issue was whether one could project an ADJUNCT from a negation
that is expressed morphologically. This is technically possible, so
there is no reason in principle to resort to a NEG + feature merely
because you have morphological negation.
Changes that need to be made to the grammars to ensure parallelism:
Norwegian: ADV-TYPE neg --> ADJUNCT-TYPE neg
STMT-TYPE --> CLAUSE-TYPE (see guidelines on this)
VFORM into a check feature?
Wolof: has NEG + feature --- should think about moving to an Adjunct
type analysis (?)
German: why only constituent negation on "Die Katzen sind/lachen nicht im
Haus."
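
As a rough illustration of the Norwegian renamings listed above, here is a
minimal Python sketch. It is not XLE code: the actual change would be made
in the grammar itself. F-structures are modelled as nested dicts with
ADJUNCT as a list, and the PRED values in the example are made up.

```python
def normalize(fs):
    """Return a copy of an f-structure with the agreed feature names.

    Assumption: f-structures are nested dicts, and set-valued
    attributes such as ADJUNCT are lists of dicts.
    """
    if isinstance(fs, list):
        return [normalize(item) for item in fs]
    if not isinstance(fs, dict):
        return fs  # atomic value, e.g. "decl" or "neg"
    out = {}
    for attr, val in fs.items():
        if attr == "STMT-TYPE":
            attr = "CLAUSE-TYPE"
        elif attr == "ADV-TYPE" and val == "neg":
            attr = "ADJUNCT-TYPE"
        out[attr] = normalize(val)
    return out


# Hypothetical f-structure for a negated copular clause.
before = {
    "PRED": "be<SUBJ,PREDLINK>",
    "STMT-TYPE": "decl",
    "ADJUNCT": [{"PRED": "not", "ADV-TYPE": "neg"}],
}
after = normalize(before)
```

Only ADV-TYPE features whose value is neg are renamed here; other ADV-TYPE
values (e.g. sadv) are left alone.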
Some examples in the negation testsuite included NPI items. Overall,
NPI is not dealt with in any of the grammars.
"The cat is not in the house either."
Works okay in English, but not checking for NPI.
The cat is in the house either. --- parses fine but shouldn't.
Norwegian:
Should "Katten er i huset heller" be good? I.e., is it doing the
wrong parallel thing like English is?
ADV-TYPE nexus? What is this?
NPI issue does not arise with Urdu for this, since the "either" is
done with focus clitic "also" plus negation.
--
Neither is the cat in the house.
Not good in English. "Nor is the cat in the house"
Recommendation: If have things like never/ever, have POL negative for
the negative one.
Urdu: change nah to not be an ADV-TYPE sadv
German: Weder/Noch ist die Katz im Haus. ---> doesn't work
"The girl did not buy any apples."
No NPI -- "The girl bought any apples." works also
Urdu: Why is there number information on "koi"? Check on whether this is
necessary. Could also have QUANT-TYPE info. English
doesn't but Norwegian does. (QUANT-TYPE existential). German has no
QUANT-TYPE.
For Urdu: could introduce a "QUANT-TYPE existential"
German does not register the negative factor in "keine". That would
be difficult for transfer.
Wolof: QUANT-TYPE negative -- is calculated on the basis of the
any/no item plus negation on the verb. Not really parallel with the
rest, but may be the desirable thing to do for this language. Reason:
also have existential quantifiers and they are different.
In general, the English grammar writers considered checking NPI items
but judged it too costly an operation. The reason is that one would
have to involve (IO)-FU and constraints on paths, but those are
difficult/complex to determine.
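
To make the idea of an NPI check concrete, here is a toy Python sketch
that works on a finished f-structure rather than inside the grammar. It is
a strong simplification: the NPI + feature and the PRED values are
hypothetical, an NPI counts as licensed only if it sits in or below an
f-structure with a negative adjunct, and other licensors (questions,
conditionals, etc.) are ignored. It does not reproduce the path
constraints that make the real check costly.

```python
def has_negation(fs):
    """True if this f-structure has an adjunct marked ADJUNCT-TYPE neg."""
    return any(
        isinstance(adj, dict) and adj.get("ADJUNCT-TYPE") == "neg"
        for adj in fs.get("ADJUNCT", [])
    )


def unlicensed_npis(fs, licensed=False):
    """Collect PRED values of NPIs that are not under a negation."""
    licensed = licensed or has_negation(fs)
    found = []
    if fs.get("NPI") == "+" and not licensed:
        found.append(fs.get("PRED"))
    for val in fs.values():
        children = val if isinstance(val, list) else [val]
        for child in children:
            if isinstance(child, dict):
                found.extend(unlicensed_npis(child, licensed))
    return found


# Hypothetical f-structure for "The girl bought any apples."
bought = {
    "PRED": "buy<SUBJ,OBJ>",
    "SUBJ": {"PRED": "girl"},
    "OBJ": {"PRED": "apple",
            "SPEC": {"QUANT": {"PRED": "any", "NPI": "+"}}},
}
# The same clause with a negative adjunct: "The girl did not buy any apples."
did_not_buy = dict(bought, ADJUNCT=[{"PRED": "not", "ADJUNCT-TYPE": "neg"}])

bad = unlicensed_npis(bought)
good = unlicensed_npis(did_not_buy)
```

Here bad contains "any" and good is empty, so the unnegated sentence would
be flagged while the negated one would pass.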
Ron was extremely surprised by the fact that the grammar contained
CASE obl. Nuance might look into changing CASE obl --> CASE acc for
the English grammar.
-------------------------------------------------------------------
Tuesday 25.2.2014
----
DISCUSSION ON TREEBANKS/CONSTRUCTIONS
INESS --- It would be great if the treebank listings on INESS could be
reorganized so that there is a group of ParGram treebanks and under
that there should be the parallel ParGramBank.
For next time, should do some of the testsuites that are already in
existence and parse them and add them to the ParGramBank.
EAGLES testsuite -- who has it? Miriam asked Hans Uszkoreit and he said
he is trying to find it as well; he will let Miriam know.
One could also look into using some of DELPH-IN's testsuites.
----
F-STRUCTURE COMPARISON CONTINUED
Possessive f-strs:
Norwegian: REF + needed?
English: bright red tractor, wrong analysis
bright red tractor: bright is not a good adjective to use, find a
different one
maybe try light instead of bright.
Wolof: Adjunct type for "bright" in bright red tractor?
Urdu: change numbers to not have the stem value as the PRED value,
but the actual number.
Check whether need NUM information in the Urdu numbers.
Wolof: why is "four" NUM sg and why is it at the SPEC level???
Will probably be removed.
The farmer has a fear of spiders -- different treatment of "of
spiders" across grammars (Adjuncts in English/German, OBL in
Norwegian), so not a good example for ParGramBank.
Jani will try to come up with a sentence that doesn't involve a
complement of the noun.
Norwegian: why is the NUMBER f-str so complex? Can some of that go
into CHECK features?
ADJ-GEN in German? Look up in Steffi's dissertation?
Urdu change: genitives to the right of N, also change to SPEC POSS
rather than Adjunct analysis
----
Wednesday 26.2.2014
Talks held at CSLI on lexical semantics and complex predicates -- see
the slides.
There was also a discussion led by Lori Levin on CL curriculum and on
establishing standards for resources. The background is that more and
more standards are being set (e.g., by Google), but the linguistic
standards needed are not necessarily included because there is too
little understanding of the issues involved (i.e., too little
linguistic and grammar engineering knowledge). Also, it is expensive
to train people in the kind of grammar writing we do. However, once
people are trained up, they are a rare commodity that is coveted by
companies.
The discussion revolved around how to fix this. One method is to
populate the ParGram Wiki with more information on issues and fixes
and why we decided to standardize things the way we did.
PLEASE EVERYBODY HELP WITH THIS.
One method is to provide courses, for example at the LSA institute or
perhaps via on-line media (iTunes, etc.). Chris Brew, Lori Levin and
Annie Zaenen said they would pursue this actively. Miriam is
recording an on-line course on grammar development in the summer
semester that will be freely available.
-----
Thursday 27.2.2014
Talks held on the Cognition Parser, on parser evaluation and
disambiguation in the Wolof grammar, and on where Nuance is going with
NL at the moment. See the slides.
--
*************************************************************
Miriam Butt
FB Sprachwissenschaft
Universitaet Konstanz
Fach 184 Tel: +49 7531 88 5109
78457 Konstanz Fax: +49 7531 88 4865
Germany +49 7531 88 5115
miriam.butt(a)uni-konstanz.de
http://ling.uni-konstanz.de/pages/home/butt
"Xander, don't talk Latin in front of the books."
Superstar, Buffy the Vampire Slayer
*************************************************************
Hi,
in a subsequent message, I will send out the ParGram meeting notes from
the spring meeting at Nuance. I apologize for only getting to this
now, but there were things... (there always are things).
Overall, the ParGram meeting at Nuance was lightly attended in terms of
international groups. Urdu was heavily represented, Norwegian and Wolof
were represented by Bamba, German by Christian Rohrer. Meladel Mistica
was able to make the meetings, but due to the fact that she had just
moved to the USA (job at Intel), she was not able to contribute
f-structures for Indonesian.
Nevertheless, we did some f-str comparison and the results are noted in
the meeting notes. All grammar writers, please take heed.
We also had an in-depth discussion about where we are with XLE and its
open source status. Again, see the meeting notes.
Damir Cavar (organizer of LFG14) has informed me that there have been
inquiries about a ParGram meeting back-to-back with LFG. We have done
this in the last couple of years because there were no spring/fall
meetings planned (because the California home of XLE had turned
uncertain). At the ParGram meeting in Debrecen in 2013, the decision
was made that we would go back to having a spring and a fall meeting,
with the spring meeting in California at Nuance and the fall meeting in
Europe (as per the old tradition). Bergen has offered to host the fall
meeting.
As such, I had not anticipated having a ParGram meeting at LFG14. I
personally also cannot stay much beyond the 20th and neither can Tracy
(she can't stay at all beyond the 20th). At this point, we have several
options:
1) No ParGram meeting at all with LFG14 (but those of us interested
could get together for dinner and exchange notes and do some grammar
debugging on the side).
2) A one-day or half-day ParGram meeting on the 21st.
3) A longer ParGram meeting as of the 21st (but this would be without me).
For those of you planning to be in Ann Arbor in July, could you send
your preferences to Sebastian (aka Jani) at:
sebastian.sulger(a)uni-konstanz.de.
Then we need to decide what to do about the planned fall meeting in
Bergen. Could people please also write in to Jani stating whether they
would attend a meeting in Bergen in the fall?
Thanks,
Miriam