Universal Natural Language Processing

Most of the world's roughly 7,000 languages have little training data available for natural language processing (NLP) systems. At NALA, we investigate how to build NLP systems for these so-called low-resource languages. Techniques we use include, but are not limited to, transfer learning, pretraining, and meta-learning.
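To illustrate the core idea behind transfer learning, here is a minimal sketch on synthetic data: a model is first fit on an abundant "high-resource" task, and its weights are then used to initialize fine-tuning on a related but data-scarce "low-resource" task. All data, dimensions, and the simple linear model are toy assumptions for illustration, not a description of our systems.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretraining": fit weights on abundant high-resource data.
X_hi = rng.normal(size=(1000, 8))
w_true = np.arange(1.0, 9.0)                    # toy ground-truth weights
y_hi = X_hi @ w_true + rng.normal(scale=0.1, size=1000)
w = np.linalg.lstsq(X_hi, y_hi, rcond=None)[0]  # pretrained weights

# "Fine-tuning": a few gradient steps on scarce low-resource data,
# starting from the pretrained weights rather than from scratch.
X_lo = rng.normal(size=(20, 8))
y_lo = X_lo @ (w_true + 0.5)                    # related but shifted target task
loss_before = np.mean((X_lo @ w - y_lo) ** 2)
for _ in range(200):
    grad = X_lo.T @ (X_lo @ w - y_lo) / len(y_lo)
    w -= 0.05 * grad
loss_after = np.mean((X_lo @ w - y_lo) ** 2)
```

Because the pretrained weights already sit close to the target task's solution, the 20 low-resource examples suffice to adapt the model, whereas training from scratch on so little data would be far less reliable.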

Read more here.


Natural Language Processing for Educational Applications

Artificial intelligence and NLP have the potential to revolutionize learning by assisting both students and teachers. At NALA, we investigate how NLP can help make learning and teaching more effective.

Read more here.


Computational Morphology

Words are composed of meaning-bearing units, their morphemes. At NALA, we study how morphemes can be combined into words, and how words can be decomposed to recover the original morphemes. We also investigate how we can leverage subword information to build better NLP systems.
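As a toy illustration of decomposing a word into its morphemes, the sketch below runs a greedy longest-match segmenter over a small hand-written morpheme inventory. The inventory and the greedy strategy are illustrative assumptions only; real morphological analyzers are, of course, far more sophisticated.

```python
# Hypothetical toy inventory of morphemes (an assumption for this sketch).
MORPHEMES = {"un", "happi", "ness", "walk", "ed", "ing", "cat", "s"}

def segment(word, inventory=MORPHEMES):
    """Split a word into morphemes by greedy longest-match from the left.

    Returns the list of morphemes, or None if the word cannot be fully
    segmented with the given inventory.
    """
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest candidate first
            if word[i:j] in inventory:
                pieces.append(word[i:j])
                i = j
                break
        else:
            return None  # no morpheme matches at position i
    return pieces

print(segment("unhappiness"))  # ['un', 'happi', 'ness']
print(segment("walked"))       # ['walk', 'ed']
```

Subword units recovered this way (or learned from data, as in BPE-style tokenization) let models share parameters across related word forms, which is especially valuable for morphologically rich languages.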

Read more here.


Low-resource Machine Translation

While machine translation (MT) works well for many high-resource language pairs, automatic translation remains extremely difficult for most low-resource languages. Beyond data scarcity, many low-resource languages exhibit typological features that differ from those of the languages most frequently studied in NLP, which poses a further challenge. At NALA, we aim to build MT systems for as many languages as possible.

Read more here.


Language Grounding

Models trained only on text arguably cannot truly understand language, since understanding arises from grounding language in real-world experience. At NALA, we explore how to give models the opportunity to interact with and observe the concepts they are expected to communicate about.

Read more here.