Dear ParGram Community,
We are currently preparing an ACL paper on parallel treebanking, incorporating insights from the ParGram project. The long paper deadline for ACL 2013 is February 20th.
The main focus of the treebank would be on its typological diversity and depth of linguistic representation. As of now, we have received c- and f-structures produced by ParGram grammars for 8 languages. The essential argumentation for the paper would be as follows:
- The treebank is still quite small, but includes many languages.
- Its representations are produced automatically by grammars.
- One can rely on parallelism in analyses and features/values, since ParGram has (th|f)ought about these a lot for >10 years.
- The representations are deep.
- The multitude of languages and the depth of representation make the treebank attractive for typologists and syntacticians/morphologists.
Following is a short excerpt from the introduction:
"An obvious application for parallel treebanking is machine translation, where treebank size is a deciding factor for or against choosing a particular treebank. When conducting in-depth linguistic studies of typological features, other factors (such as number of included languages, number of covered phenomena, depth of linguistic analysis) become more important. The treebanking effort reported on in this paper supports work of the latter mentality. We have created a parallel treebank which is yet quite small, but includes a multitude of eight typologically quite diverse languages as well as different constructions."
Of course, one would then try to pick out interesting linguistic issues in the representation, or try to look at how different phenomena are handled across the grammars.
If any of you are interested in contributing to the paper (be it by writing parts of the paper itself, or by sending more structures for your language), please contact us. Also, if anyone has an opinion about the above argumentation, please feel free to share it with us.
That is all for now. Best wishes to all of you,
Jani (from the Urdu ParGram team in Konstanz)