Hi Chris,
The CL article is wonderful! I recommend it to everyone on this list.
While I have great respect for UD, I stand by my "stone soup" analogy (which I stole from Yorick Wilks's characterization of statistical MT).
For the ParGram people: I know this is off the topic of ParGram, so feel free to ignore it.
First, my great respect for UD:
1. By making UD a community-sourced project, a very large resource was created that has enabled research on joint, cross-lingual, and multilingual models of parsing. This research could not have been done without UD.
2. UD has enabled the development of parsers for low-resource languages, both with supervised training on a UD treebank and with transfer from higher-resourced languages.
3. Community-sourcing and stone-souping allowed all of this to happen quickly, years faster than it otherwise would have.
Second, some points about community-sourcing:
(I just remembered that, shortly before the pandemic, I told Joakim that I would help write better documentation to improve the quality of community-sourced UD treebanks, as a way of remediating the points below.)
Starting from UD's stated goal of having a collection of treebanks in a uniform, comparable format, to facilitate cross-lingual machine learning:
1. Some UD treebanks are written by linguists and are excellent. Others show misunderstandings of the definitions of dependency labels and of head-dependent relationships, and so do not fulfill the goal of a uniform, comparable format. The validation scripts for UD are a great resource, but they cannot detect when a treebanker makes a decision that is not quite what a linguist would make given the same documentation. The documentation needs to include procedures: how to tell whether recipients in your language are core arguments or obliques, and so on.
2. I believe the UD documentation is intentionally brief so that people will not be intimidated, in comparison, say, to the 300-page manual of the Penn Treebank or the 1500-page manual of the Prague Dependency Treebank. The UD documentation covers a number of constructions (as in your CL paper), but it has no decision process for determining whether the thing that translates into a secondary predicate in English really is a secondary predicate in your language, which I think leads to copying English grammar instead of analyzing the grammar of your own language. Again, I think this could be fixed by tutorials on linguistic tests, along with typology lessons. E.g., "There are n types of coordinate structure that typologists recognize; apply these diagnostics to see which type you are dealing with. Now that you know which of the n types you have, here are specific treebanking guidelines for that type."
I'm noticing more and more in my teaching that students from other countries equate grammar with English grammar, causing them to misanalyze their own language by assuming that it is like English. For example, students who speak languages with locative possession constructions ("a book is at me") will say that possession in their language is just like English. The UD documentation does not help people with this; there needs to be some pre-treebanking awareness. ParGram people would be great at writing the needed documentation, tutorials, and instructions.
3. Further copying of English grammar results from over-interpretation of the goal of having a uniform, comparable format for all treebanks. (I don't mean to pick on Japan; I have discussed this politely with Graham Neubig.) Some groups in Japan (UniDic?) took this to an extreme that, to my mind, totally defeats the purpose of UD. In your CL article, "ta" is not segmented separately from the word "katta", but some Japanese segmenters treat "ta" as a separate word, and in some Japanese UD treebanks "ta" is an AUX. Why did they think this was a good idea? Because tenses are carried on auxiliary verbs in European languages, so this makes Japanese look more like European languages and seems consistent with UD's goal of uniform, comparable trees across languages. They also treated "koto" as an AUX, because it seems aux-like when it nominalizes a sentence, but they then treated it as an AUX in "noun no koto" as well. One of my students experimenting with cross-lingual parser training had to remove Japanese from the training pool, or at least delete all instances of "koto" before training with Japanese.
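To make that cleanup concrete, here is a minimal sketch of the kind of preprocessing involved (my own code, not my student's actual script; it assumes standard CoNLL-U input, and the two toy "sentences" below are made-up illustrations, not real treebank data):

```python
def drop_sentences_with_form(conllu_text, form):
    """Return the CoNLL-U text with every sentence removed that
    contains a token whose FORM (column 2) equals `form`."""
    kept = []
    for sent in conllu_text.strip().split("\n\n"):
        forms = [
            line.split("\t")[1]
            for line in sent.splitlines()
            # skip comment lines and multiword-token / empty-node lines
            if line and not line.startswith("#")
            and line.split("\t")[0].isdigit()
        ]
        if form not in forms:
            kept.append(sent)
    return "\n\n".join(kept)


# Toy input: two tiny pseudo-Japanese sentences, one containing "koto"
sample = """# text = neko da
1\tneko\tneko\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tda\tda\tAUX\t_\t_\t0\troot\t_\t_

# text = neko no koto
1\tneko\tneko\tNOUN\t_\t_\t3\tnmod\t_\t_
2\tno\tno\tADP\t_\t_\t1\tcase\t_\t_
3\tkoto\tkoto\tNOUN\t_\t_\t0\troot\t_\t_"""

filtered = drop_sentences_with_form(sample, "koto")
print(filtered)  # only the 'neko da' sentence survives
```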
The UD goals can easily be misunderstood by non-linguists to mean "make everything like English". However, multilingual neural net parsers work by learning what is languagey about language. If you are not a linguist and you do something naive, without experimenting first, because you think it is compatible with UD goals, you are likely to mess up what is languagey about your language. I think this is what happened with over-segmentation in Japanese. Again, ParGram people could help others understand what it means to make your f-structures compatible without copying the grammar of another language.
Finally, content words as heads, and other policies that were made without anticipating all of the consequences:
Here is where the stone soup comes in most: it would have been good to anticipate the consequences before becoming too big to change.
UD has decided to define itself as dependency relations among content words. I can't complain about this: around 2003-2004 I was in a group called Interlingual Annotation of Multilingual Text Corpora (IAMTC), which was short-lived but included people like Ed Hovy, Owen Rambow, Bonnie Dorr, and Nizar Habash. IAMTC also advocated content words as heads. We were a little too early for community-sourcing and very large-scale annotation, and not nearly as charismatic as the UD people, so we faded away.
Any policy has consequences. Here are some of the consequences of "content words as heads":
The head of "Women in the house and in the senate" is "house" because "is", "in", and "and" can't be heads. A better solution would be a constructional approach, identifying this as an instance of a type of predicative construction.
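As a sketch of how that tree comes out (my reconstruction of the usual UD-style copula analysis, reading the string as the full sentence "Women are in the house and in the senate"; simplified columns, not an official example):

```
1  Women   nsubj  5   # the subject attaches to the content-word head
2  are     cop    5   # the copula cannot be the head
3  in      case   5   # the preposition cannot be the head
4  the     det    5
5  house   root   0   # a locative noun ends up as sentence head
6  and     cc     9
7  in      case   9
8  the     det    9
9  senate  conj   5   # coordinated with "house"
```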
The UD discussion board has long discussions of sentences like "The answer is nobody knows". "Is" can't be the head, so "knows" is the head. But that leads to a violation of the "one subject" policy. "nobody" is the subject of "knows" but "answer" is also the subject of "knows". This is the kind of thing that linguists anticipate before they try to start the soup using only stones.
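Sketched as a tree (again my reconstruction of the analysis being debated, with simplified columns), the double-subject problem looks like this:

```
1  The     det    2
2  answer  nsubj  5   # "answer" must attach to "knows"...
3  is      cop    5   # ...because the copula cannot be the head
4  nobody  nsubj  5   # but "nobody" is already its subject
5  knows   root   0   # two nsubj dependents on one head
```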
And now I see in the UD discussion boards that it may not be possible to move on to version 3, because too much effort and volume has been invested in version 2.
--Lori