Thanks Chris for your long and careful message. Thanks Agnieszka for starting this conversation.
I particularly liked Chris' "Things could either have continued like linguistics-as-usual, with everyone doing their own thing, or people could compromise enough that a whole bunch of people could come together and work with a common representation." I am happy that this joining of forces has happened for UD and I hope more joining of concerns and programs can happen from now on.
Most of the people who wrote so far seem in favor of re-starting PARGRAM. I also think this is a great idea and I hope we can do it. But I want more: I would like to (re?)-start ParSEM, which is a whole new kettle of fish. This can be done later on.
As Agnieszka and Lori have mentioned, doing it online should be easier, and maybe just a few hours online every six months will be enough to get momentum! The Bazaar model (as Chris described it) also allows more people to commit only a few resources when they're busy (or bored). I think we have nothing to lose and much to gain if we start an online ParGram. Shall we?
best wishes, Valeria
On Sat, Aug 7, 2021 at 7:22 PM Christopher D. Manning manning@stanford.edu wrote:
Hi Lori and everyone, thanks for writing up the detailed thoughts! Sorry for my slow replies….
*Top-level take away:*
If any ParGram or other LFG people would be interested in contributing to UD, you would be very welcome and you could clearly contribute a lot!
The UD approach and the UD representation certainly have their flaws, and one of the things that UD has suffered from is not enough linguists involved for enough hours. There is certainly still a lot to discuss, document and fix in UD, and any time that linguists, such as Bill Croft, have gotten involved with UD, good things have come out of that. Even when they are people who have a lot of qualms and differences with the way UD does things (Kim Gerdes comes to mind…), they have nevertheless contributed very substantially to the quality of UD, and we appreciate it.
*Longer discussion:*
Thanks for noting some of UD’s strengths. One more that I would add is that by its scaling and (attempt at) uniformity, UD is also enabling an increasing amount of typological linguistics and psycholinguistic work that just was not possible before. If you haven’t seen any of that, it includes papers such as:
Richard Futrell et al. 2015. Large-scale evidence of dependency length minimization …. PNAS. https://doi.org/10.1073/pnas.1502134112
Matías Guzmán Naranjo et al. 2018. Quantitative word order typology with UD. TLT. https://mguzmann89.gitlab.io/pdf/word-order-oslo.pdf
Natalia Levshina. 2019. Token-based typology …. Linguistic Typology. https://www.degruyter.com/document/doi/10.1515/lingty-2019-0025/html
Michael Hahn et al. 2020. Universals of word order …. PNAS. https://www.pnas.org/cgi/doi/10.1073/pnas.1910923117
Himanshu Yadav et al. 2021. Do dependency lengths explain constraints …. Linguistics Vanguard. https://doi.org/10.1515/lingvan-2019-0070
Michael Hahn et al. 2021. Modeling word and morpheme order …. Psychological Review. https://psycnet.apa.org/record/2021-31510-001
Both community sourcing and a sustained collaboration like ParGram have their advantages and disadvantages. I openly admit that. In some ways the tradeoffs are similar to the “The Cathedral and the Bazaar” discussion around software development.
There is no doubt that some of the current UD treebanks are not very good. Our hope is that over time they will be improved, and some of them have been quite actively improved. The main constraints on that are human time, expertise, and commitment. To take one not very good example that has come up, the current Tamil TTB treebank: some of its problems are well known and documented. There is some commitment to fix them, but things haven’t been moving very fast. I’m sure more help would be welcome. Now, a closer collaboration like ParGram can help with creating commitment and sharing expertise, but it doesn’t automatically create more human time. Incidentally, the Tamil TTB was originally created by converting the TamilTB from Prague Dependency Treebank style. I don’t actually know the history or how it was converted very well, but it doesn’t seem like the 1500-page PDT manual was very effective at yielding a high-quality starting point in this case! Nevertheless, in UD’s case, more documentation and tutorials on linguistic tests, as you suggest, would definitely help. It all comes down to someone finding the time to write them. It’s a large task, since it essentially becomes akin to writing language grammars and broad-coverage syntax textbooks.
I think I was the original source of the prediction that there would never be a version 3 of UD — I should be more careful what I say! — but the statement does reflect some of the complications of the scale and loose bazaar collaboration of UD, which does contrast with the small, tight, sustained collaboration of ParGram. The problem is that relatively few people have so far stuck around as long-term collaborators, helpers, and guides in the UD community. Most people work at some point in time to build or convert a treebank for some domain or some language that they are interested in. This makes it very difficult to consider a major revision at this point, because there is a very large number of treebanks, and it is unclear that anyone is willing to spend large amounts of time revising them to fit a new standard. The best current approach seems to be to gradually evolve version 2 by better clarifying, documenting, and making more consistent how various things are annotated. On the other hand, the bazaar model has allowed us to assemble a very large number of treebanks for a large number of languages with reasonable consistency and in a relatively short timespan.
For treating content words consistently as heads, I do have my own misgivings about that one, though I have come to like it more over time. :) I think it is a little bit unfair to say that the policies were made without anticipating the consequences. I mean, as is usual in life, it’s not that we’d thought absolutely everything through and worked out how to analyze every construction in every language. Nevertheless, there was a fair bit of prior work testing out different possibilities: Stanford Dependencies for English supported several variants of partial support for content words as heads where it could be done or not done for prepositions and the copula, while “Stanford Dependencies” for Finnish was the first fully “content words as heads” treebank. And there was quite a bit of further working through analyses and having in person and email discussions on the road to both UD v1 and UD v2. I do agree that there is just a problem with the treatment of a copula linking to a clausal complement, which doesn’t seem to have a satisfying fix within the assumptions assembled. But, hey, if we only have one bad problem, we’re not doing too badly.
I’d also like to mention that, coming from my roughly LFG perspective, adopting content-words-as-heads was a key part of what made UD happen and be a big success. That is: things could either have continued like linguistics-as-usual, with everyone doing their own thing, or people could compromise enough that a whole bunch of people could come together and work with a common representation. Moving to content words as heads, even though it has some issues for languages with real prepositions and copulas, like English, was a big part of getting enough common ground that some LFG people, some Prague School people, and some other European traditional and dependency grammar people could get together and work with a common shared representation. (It also actually makes things a little less English-centric, in that we’re essentially saying "Analyze English as if it were Finnish!".)
Finally, if you’ve read this far, I really recommend Section 5 of the new UD paper. I think it (very briefly!) provides key wisdom about why UD might be better off without argument structure, why it would likely be difficult to community-source a resource of similar scale with full LFG, and why it might well have less impact, even if it could be done.
Chris.
On Jul 30, 2021, at 10:32 AM, Lori Levin levin@andrew.cmu.edu wrote:
Hi Chris,
The CL article is wonderful! I recommend it to everyone on this list.
While having great respect for UD, I stand by my "stone soup" analogy (which I stole from Yorick Wilks's characterization of statistical MT).
For ParGram people, I know this is off the topic of ParGram, so feel free to ignore.
*First, my great respect for UD:*
1. By making UD a community-sourced project, a very large resource was created that has enabled research on joint, cross-lingual, and multilingual models of parsing. This research could not have been done without UD.
2. UD has enabled the development of parsers for low-resource languages, both with supervised training on a UD treebank and with transfer from higher-resourced languages.
3. Community-sourcing and stone-souping allowed all of this to happen quickly, years faster than it could have been done without community-sourcing.
*Second, some points about community-sourcing: *
(I just now remembered that just before the pandemic, I told Joakim that I would help write better documentation to improve the quality of community-sourced UD treebanks as a way of remediating the points below.)
Starting from UD's stated goal of having a collection of treebanks in a uniform, comparable format, to facilitate cross-lingual machine learning:
- Some UD treebanks are written by linguists and are excellent. Others show misunderstandings of the definitions of dependency labels and head-dependent relationships. Thus they do not fulfill the goal of having a uniform, comparable format. The validation scripts for UD are a great resource, but they cannot detect when a treebanker makes a decision that is not quite what a linguist would do given the same documentation. The documentation needs to include procedures for how to tell if recipients in your language are core arguments or obliques, etc.
- I believe that the UD documentation is intentionally brief so that people will not be intimidated, say, in comparison to the 300-page manual of the Penn Treebank or the 1500-page manual for the Prague Dependency Treebank. UD documentation includes a number of constructions (as in your CL paper), but it doesn't have a decision process for deciding whether the thing that translates into a secondary predicate in English really is a secondary predicate in your language, which I think leads to copying English grammar instead of analyzing the grammar of your own language. Again, I think this could be fixed by tutorials on linguistic tests for various things, along with typology lessons, e.g., "There are n things that typologists know about coordinate structures; apply these diagnostics to see which type you are dealing with. Now that you know which of the n types you are dealing with, here are specific treebanking guidelines for that type."
I'm noticing more and more in my teaching that students from other countries equate grammar with English grammar, causing them to misanalyze their own language by assuming that it is like English. For example, students who speak languages with locative possession constructions ("a book is at me") will say that possession in their language is just like English. The UD documentation does not help people with this. There needs to be some pre-treebanking awareness. ParGram people would be great at writing the needed documentation, tutorials, and instructions.
- Further copying of English grammar results from over-interpretation of the goal of having a uniform, comparable format for all treebanks. (I don't mean to pick on Japan. I have discussed this politely with Graham Neubig.) Some groups in Japan (UniDic?) took this to an extreme, which, to my mind, totally defeated the purpose of UD. In your CL article, "ta" is not segmented separately from the word "katta", but some Japanese segmenters treat "ta" as a separate word. In some Japanese UD treebanks, "ta" is an AUX. Why did they think this was a good idea? Because tenses are carried on auxiliary verbs in European languages, so this makes Japanese more like European languages and should be consistent with UD's goal of uniform, comparable trees across languages. They also treated "koto" as an AUX because it seems like an aux when it nominalizes a sentence, but they also treated it as an aux in "noun no koto". One of my students was experimenting with cross-lingual parser training and had to remove Japanese from the training pool, or at least delete all instances of "koto", before training with Japanese.
The UD goals could easily be misunderstood by non-linguists to mean "make everything like English". However, multilingual neural net parsers work by learning what is languagey about language. If you are not a linguist and you do something naive, without experimenting first, because you think it is compatible with UD goals, you are likely to mess up what is languagey about your language. I think this is what happened with over-segmentation in Japanese. Again, ParGram people could help people understand what it means to make your f-structures compatible without copying the grammar of another language.
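On the point above about validation scripts: structural well-formedness is easy to check mechanically, but a linguistically wrong yet well-formed label choice sails straight through. A toy sketch (my own illustration, not the actual UD validation scripts):

```python
# Toy structural validator for a dependency tree, illustrating what
# automatic checks can catch (malformed trees) and what they cannot
# (a wrong but well-formed relation label, e.g. a recipient mislabeled
# as a core argument). This is a sketch, not the real UD validator.

def validate(tokens):
    """tokens: list of (id, head_id, deprel); returns a list of error strings."""
    ids = {t[0] for t in tokens}
    errors = []
    roots = [t for t in tokens if t[1] == 0]
    if len(roots) != 1:
        errors.append("tree must have exactly one root")
    for tid, head, rel in tokens:
        if head != 0 and head not in ids:
            errors.append(f"token {tid}: head {head} does not exist")
        if head == tid:
            errors.append(f"token {tid}: token cannot head itself")
    return errors

# A structurally fine tree passes even if "obj" is linguistically wrong;
# only the broken pointer in bad_tree is flagged.
ok_tree = [(1, 2, "nsubj"), (2, 0, "root"), (3, 2, "obj")]
bad_tree = [(1, 2, "nsubj"), (2, 0, "root"), (3, 9, "obj")]
print(validate(ok_tree))   # -> []
print(validate(bad_tree))  # -> ['token 3: head 9 does not exist']
```

Documentation with linguistic diagnostics, as proposed above, targets exactly the errors that checks like these cannot see.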
*Finally, content words as heads and other policies that were made without anticipating all of the consequences*
Here is where the stone soup comes in most: it would have been good to anticipate the consequences before becoming too big to change.
UD has decided to define itself as dependency relations among content words. I can't complain about this: around 2003-2004 I was in a group called Interlingual Annotation of Multilingual Text Corpora (IAMTC), which was short-lived, but included people like Ed Hovy, Owen Rambow, Bonnie Dorr, and Nizar Habash. IAMTC also advocated content words as heads. We were a little too early for community sourcing and very large-scale annotation, and not nearly as charismatic as the UD people, so we faded away.
Any policy has consequences. Here are some of the consequences of "content words as heads":
The head of "Women in the house and in the senate" is "house" because "is", "in", and "and" can't be heads. A better solution would be a constructional approach, identifying this as an instance of a type of predicative construction.
The UD discussion board has long discussions of sentences like "The answer is nobody knows". "Is" can't be the head, so "knows" is the head. But that leads to a violation of the "one subject" policy. "nobody" is the subject of "knows" but "answer" is also the subject of "knows". This is the kind of thing that linguists anticipate before they try to start the soup using only stones.
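To make the clash concrete, here is a toy rendering of that analysis in Python; the relation labels are my assumption of the usual UD treatment of this sentence, not an excerpt from any released treebank or the discussion board:

```python
from collections import Counter

# Toy UD-style analysis of "The answer is nobody knows".
# Each token: (id, form, head_id, deprel). With the copula demoted,
# "knows" becomes the root, and both subjects end up attached to it.
tokens = [
    (1, "The",    2, "det"),
    (2, "answer", 5, "nsubj"),   # matrix subject, attached to "knows"
    (3, "is",     5, "cop"),     # copula as a dependent, not a head
    (4, "nobody", 5, "nsubj"),   # embedded subject, also on "knows"
    (5, "knows",  0, "root"),
]

# Count nsubj dependents per head: "knows" (id 5) gets two, which is
# exactly the "one subject per predicate" violation described above.
nsubj_counts = Counter(h for (_, _, h, rel) in tokens if rel == "nsubj")
print(nsubj_counts[5])  # -> 2
```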
And now, I see in the UD discussion boards that it may not be possible to move on to version 3 because there has been too much effort and volume in version 2.
--Lori
On Fri, Jul 30, 2021 at 10:08 AM Christopher D. Manning <manning@stanford.edu> wrote:
Of course, I’d describe UD — and its relation to LFG — slightly differently. 😉
At any rate, we do have — and have always had — more than 3 pages; and just now we’ve got an article out in Computational Linguistics:
https://direct.mit.edu/coli/article/47/2/255/98516/Universal-Dependencies
Chris
On Jul 29, 2021, at 8:34 PM, Lori Levin levin@andrew.cmu.edu wrote:
Hi Joan,
I hope it is appropriate for me to chime in, since my day job as a funded project monkey prevented me from attending all but one ParGram meeting in the past.
I'm now trying to get back to doing more linguistics, but with the perspective of the needs of the field of language technologies.
Here is how I see the importance of ParGram:
- Cross-lingual studies of grammatical relations, information structure, constructions in a very broad sense, argument realization, grammaticalization, etc., leading to theoretical insight into the nature of these things.
- Treebanks and parsers that can be used for corpus-based studies in linguistics, and perhaps in some hybrid third-wave neuro-symbolic systems in language technologies, especially in low-resource languages.
- A challenge to UD (Universal Dependencies) and UMR (Uniform Meaning Representation): I think we can learn from UD and UMR how to do things on a larger scale. But at the same time, we can save them from their fate as stone soup in the following sense: they thought they could do something easy (make soup using only "stones" and water, which consisted of three pages of definitions of grammatical relations), but as they progressed, they needed to keep adding "carrots", "onions", and "bones" (serious linguistic decisions). But unlike the story, where the soup turned out good, UD has turned out messy and too big to fail. Sometimes they talk about possibly not going on to Version 3 because Version 2 is too big to change. We can show how to do a UD-like project on a firm foundation.
--Lori
On Thu, Jul 29, 2021 at 12:45 AM Joan Bresnan bresnan@stanford.edu wrote:
I would like to hear some discussion of what the goals would be.
—Joan (my spelling checker keeps rewriting my name as Jian, which is what we call the straight sword in my taiji group— but this is really unintentional)
Sent on the fly
On Jul 28, 2021, at 5:00 AM, Agnieszka Patejuk via LFG-list <lfg-list@mailman.uni-konstanz.de> wrote:
Hi,
I am sending this message to all potentially relevant lists (ParGram, XLE, LFG, ParSem) so as to maximise the chance that it will reach people who are interested in this topic. If you know someone who might be interested but is not subscribed to these lists, please consider letting them know.
The question is: who would be interested in restarting ParGram
meetings?
I am not sure what would be the best way to organise this discussion, so I am suggesting the following:
• if you think your answer might be of interest to many people (for instance, it might spark a discussion), please consider replying to the list(s)
• if not, please reply only to me – I will later go through the responses and post a summary to the relevant lists.
I hope that later we can discuss this topic in more detail with people who have expressed interest.
All best, Agnieszka