[Scientific Presentation] Integration of Multiword Expression Recognition in Parsers, Matthieu Constant (Univ. Marne-la-Vallée)

December 12, 2014


Matthieu Constant

Associate professor in Computer Science at the Université Paris-Est Marne-la-Vallée, France.

Automatic linguistic analysis faces two major problems inherent in natural languages: ambiguity and multiword expressions (MWEs). Whereas the literature abounds in analyzers that try to deal with ambiguity, few studies have tackled the integration of MWE recognition. Because these expressions are, by definition, non-compositional to some degree (e.g. eau de vie ‘brandy’, perdre la boule ‘to go crazy’), recognizing them is crucial for applications such as Machine Translation.

In this talk, we will focus on the integration of compounds (i.e. a type of contiguous MWE) in parsers. We will tackle this problem with a hybrid approach combining statistical models and symbolic linguistic resources. We will show that such an approach improves not only compound recognition but also overall parsing accuracy. We will consider several strategies for constituency parsing as well as for dependency parsing. In particular, we will experimentally compare joint strategies, in which compounds are recognized during parsing, with pipeline strategies, in which they are recognized in a separate step.
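To make the pipeline idea concrete, here is a minimal sketch of a compound pre-recognition step that could run before a parser. It is not the speaker's system: the function name `merge_compounds`, the toy lexicon, and the greedy longest-match rule are all assumptions chosen for illustration. Contiguous tokens that match a lexicon entry are merged into a single unit, so that a downstream parser would see each compound as one word.

```python
from typing import List, Set, Tuple

def merge_compounds(tokens: List[str],
                    lexicon: Set[Tuple[str, ...]],
                    max_len: int = 4) -> List[str]:
    """Greedy left-to-right, longest-match merge of contiguous compounds.

    `lexicon` is a hypothetical symbolic resource: a set of lowercase
    token tuples, each one a known compound.
    """
    merged = []
    i = 0
    while i < len(tokens):
        match_len = 1  # default: keep the token as a single unit
        # Try the longest candidate span first, down to length 2.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in lexicon:
                match_len = n
                break
        merged.append("_".join(tokens[i:i + match_len]))
        i += match_len
    return merged

# Toy lexicon (illustrative only).
TOY_LEXICON = {("eau", "de", "vie"), ("pomme", "de", "terre")}

sentence = "Il boit une eau de vie".split()
result = merge_compounds(sentence, TOY_LEXICON)
print(result)  # ['Il', 'boit', 'une', 'eau_de_vie']
```

A purely lexicon-driven step like this cannot tell a true compound from an accidental match. A joint strategy, by contrast, would leave that decision to the parser itself.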

Matthieu Constant is an associate professor in Computer Science at the Université Paris-Est Marne-la-Vallée, France. He is a member of the Computational Linguistics team at the Laboratoire d’informatique Gaspard Monge. He has held a Habilitation à Diriger des Recherches since December 2012.