publications

Selected publications of NALA members (since 2020).

2026

  1. Does Contextual Informativeness Predict Preschoolers’ Word Learning from Stories?
    Maria Valentini, Julisa Granados, Katharina von der Wense, and Eliana Colunga
    In Proceedings of the Annual Meeting of the Cognitive Science Society, 2026
  2. From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation
    Minh Duc Bui, Xenia Heilmann, Mattia Cerrato, Manuel Mager, and Katharina von der Wense
    In Findings of the Association for Computational Linguistics: ACL 2026, 2026
  3. Large Language Models Are Overconfident in Their Own Responses
    Mario Sanz-Guerrero, Manuel Mager, and Katharina von der Wense
    In Findings of the Association for Computational Linguistics: ACL 2026, 2026
  4. Meenz bleibt Meenz, but Large Language Models Do Not Speak Its Dialect
    Minh Duc Bui, Manuel Mager, Peter Herbert Kann, and Katharina von der Wense
    In Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026), 2026

2025

  1. Mitigating Label Length Bias in Large Language Models
    Mario Sanz-Guerrero and Katharina von der Wense
    In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025
  2. NALA_MAINZ at BLP-2025 Task 2: A Multi-agent Approach for Bangla Instruction to Python Code Generation
    Hossain Shaikh Saadi, Faria Alam, Mario Sanz-Guerrero, Minh Duc Bui, Manuel Mager, and Katharina von der Wense
    In Proceedings of the Workshop on Bangla Language Processing (BLP 2025), 2025
  3. Dialogue Acts as a Lens on Human–LLM Interaction: Analyzing Conversational Norms in Model-Generated Responses
    Arunima Maitra, Dorothea French, and Katharina von der Wense
    In Proceedings of the Workshop on Bridging Human–Computer Interaction and Natural Language Processing (HCI+NLP 2025), 2025
  4. JGU Mainz’s submission to the WMT25 shared task on LLMs with limited resources for Slavic languages: MT and QA
    Hossain Shaikh Saadi, Minh Duc Bui, Mario Sanz-Guerrero, and Katharina von der Wense
    In Proceedings of the Tenth Conference on Machine Translation (WMT 2025), 2025
  5. Molecular String Representation Preferences in Pretrained LLMs: A Comparative Study in Zero- & Few-Shot Molecular Property Prediction
    George Arthur Baker, Mario Sanz-Guerrero, and Katharina von der Wense
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
  6. Large Language Models Discriminate Against Speakers of German Dialects
    Minh Duc Bui, Carolin Holtermann, Valentin Hofmann, Anne Lauscher, and Katharina von der Wense
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
  7. Interdisciplinary Research in Conversation: A Case Study in Computational Morphology for Language Documentation
    Enora Rice, Katharina von der Wense, and Alexis Palmer
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
  8. Mind the Gap: A Closer Look at Tokenization for Multiple-Choice Question Answering with LLMs
    Mario Sanz-Guerrero, Minh Duc Bui, and Katharina von der Wense
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
  9. ReSeeding Latent States for Sequential Language Understanding
    Stéphane Aroca-Ouellette, Katharina von der Wense, and Alessandro Roncone
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
  10. Model-Based Ranking of Source Languages for Zero-Shot Cross-Lingual Transfer
    Abteen Ebrahimi, Adam Wiemerslage, and Katharina von der Wense
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
  11. Linguistic Alignment Predicts Learning in Small Group Tutoring Sessions
    Dorothea French, Robert Moulder, Kelechi Ezema, Katharina von der Wense, and Sidney K. D’Mello
    In Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
  12. Implicitly Aligning Humans and Autonomous Agents through Shared Task Abstractions
    Stéphane Aroca-Ouellette, Miguel Aroca-Ouellette, Katharina von der Wense, and Alessandro Roncone
    In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2025), 2025
  13. On Generalization across Measurement Systems: LLMs Entail More Test-Time Compute for Underrepresented Cultures
    Minh Duc Bui, Kyung eun Park, Goran Glavaš, Fabian David Schmidt, and Katharina von der Wense
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, 2025
  14. Improving Low-Resource Morphological Inflection via Self-Supervised Objectives
    Adam Wiemerslage and Katharina von der Wense
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, 2025
  15. Understanding the Gap: an Analysis of Research Collaborations in NLP and Language Documentation
    Luke Gessler, Alexis Palmer, and Katharina von der Wense
    In Findings of the Association for Computational Linguistics: ACL 2025, 2025
  16. MALAMUTE: A Multilingual, Highly-granular, Template-free, Education-based Probing Dataset
    Sagi Shaier, George Arthur Baker, Chiranthan Sridhar, Lawrence Hunter, and Katharina von der Wense
    In Findings of the Association for Computational Linguistics: ACL 2025, 2025
  17. CLIX: Cross-Lingual Explanations of Idiomatic Expressions
    Aaron Gluck, Katharina von der Wense, and Maria Leonor Pacheco
    In Findings of the Association for Computational Linguistics: ACL 2025, 2025
  18. The Effectiveness of Uncased Tokenization for Clinical Notes
    Cory Paik and Katharina von der Wense
    In Findings of the Association for Computational Linguistics: ACL 2025, 2025
  19. Untangling the Influence of Typology, Data, and Model Architecture on Ranking Transfer Languages for Cross-Lingual POS Tagging
    Enora Rice, Ali Marashian, Hannah Haynie, Katharina von der Wense, and Alexis Palmer
    In Proceedings of the Workshop on Language Models for Underserved Communities (LM4UC 2025), 2025
  20. Corrective In-Context Learning: Evaluating Self-Correction in Large Language Models
    Mario Sanz-Guerrero and Katharina von der Wense
    In Proceedings of the Workshop on Insights from Negative Results in NLP, 2025
  21. Findings of the AmericasNLP 2025 Shared Tasks on Machine Translation, Creation of Educational Material, and Translation Metrics for Indigenous Languages of the Americas
    Ona De Gibert, Robert Pugh, Ali Marashian, Raul Vazquez, Abteen Ebrahimi, Pavel Denisov, Enora Rice, Edward Gow-Smith, Juan Prieto, Melissa Robles, Rubén Manrique, Oscar Moreno, Angel Lino, Rolando Coto-Solano, Aldo Alvarez, Marvin Agüero-Torales, John E. Ortega, Luis Chiruzzo, Arturo Oncevay, Shruti Rijhwani, Katharina von der Wense, and Manuel Mager
    In Proceedings of the Fifth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP), 2025
  22. Multi3Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision–Language Models
    Minh Duc Bui, Katharina von der Wense, and Anne Lauscher
    In Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics, 2025
  23. More Experts Than Galaxies: Conditionally-Overlapping Experts with Biologically-Inspired Fixed Routing
    Sagi Shaier, Francisco Pereira, Katharina von der Wense, Lawrence Hunter, and Matt Jones
    In The Thirteenth International Conference on Learning Representations (ICLR 2025), 2025
  24. Asking Again and Again: Exploring LLM Robustness to Repeated Questions
    Sagi Shaier, Mario Sanz-Guerrero, and Katharina von der Wense
    2025
  25. From Priest to Doctor: Domain Adaptation for Low-Resource Neural Machine Translation
    Ali Marashian, Enora Rice, Luke Gessler, Alexis Palmer, and Katharina von der Wense
    In Proceedings of the 31st International Conference on Computational Linguistics, 2025
  26. Measuring Contextual Informativeness in Child-Directed Text
    Maria R. Valentini, Téa Y. Wright, Ali Marashian, Jennifer M. Ellis, Eliana Colunga, and Katharina von der Wense
    In Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024

  1. Identifying Telescope Usage in Astrophysics Publications: A Machine Learning Framework for Institutional Research Management at Observatories
    Vicente Amado Olivo, Wolfgang Kerzendorf, Brian Cherinka, Joshua V. Shields, Annie Didier, and Katharina von der Wense
    The Astronomical Journal, 2024
  2. Getting The Most Out of Your Training Data: Exploring Unsupervised Tasks for Morphological Inflection
    Abhishek Purushothama, Adam Wiemerslage, and Katharina von der Wense
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
  3. It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension
    Sagi Shaier, Lawrence Hunter, and Katharina von der Wense
    In Findings of the Association for Computational Linguistics: ACL 2024, 2024
  4. TAMS: Translation-Assisted Morphological Segmentation
    Enora Rice, Ali Marashian, Luke Gessler, Alexis Palmer, and Katharina von der Wense
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
  5. Aligning to Adults Is Easy, Aligning to Children Is Hard: A Study of Linguistic Alignment in Dialogue Systems
    Dorothea French, Sidney D’Mello, and Katharina von der Wense
    In Proceedings of the 1st Human-Centered Large Language Modeling Workshop, 2024
  6. Eyes on the Game: Deciphering Implicit Human Signals to Infer Human Proficiency, Trust, and Intent
    Nikhil Hulle, Stéphane Aroca-Ouellette, Anthony Ries, Jake Brawer, Katharina von der Wense, and Alessandro Roncone
    In Proceedings of the 2024 33rd IEEE International Conference on Robot and Human Interactive Communication (ROMAN), 2024
  7. Findings of the AmericasNLP 2024 Shared Task on Machine Translation into Indigenous Languages
    Abteen Ebrahimi, Ona Gibert, Raul Vazquez, Rolando Coto-Solano, Pavel Denisov, Robert Pugh, Manuel Mager, Arturo Oncevay, Luis Chiruzzo, Katharina von der Wense, and Shruti Rijhwani
    In Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024), 2024
  8. Findings of the AmericasNLP 2024 Shared Task on the Creation of Educational Materials for Indigenous Languages
    Luis Chiruzzo, Pavel Denisov, Alejandro Molina-Villegas, Silvia Fernandez-Sabido, Rolando Coto-Solano, Marvin Agüero-Torales, Aldo Alvarez, Samuel Canul-Yah, Lorena Hau-Ucán, Abteen Ebrahimi, Robert Pugh, Arturo Oncevay, Shruti Rijhwani, Katharina von der Wense, and Manuel Mager
    In Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024), 2024
  9. Evaluating LLMs as Tools to Support Early Vocabulary Learning
    Jennifer Weber, Maria Valentini, Téa Wright, Katharina von der Wense, and Eliana Colunga
    In Proceedings of the Annual Meeting of the Cognitive Science Society, 2024
  10. Prompting as Panacea? A Case Study of In-Context Learning Performance for Qualitative Coding of Classroom Dialog
    Ananya Ganesh, Chelsea Chandler, Sidney D’Mello, Martha Palmer, and Katharina von der Wense
    In Proceedings of the International Conference on Educational Data Mining, 2024
  11. Zero-Shot vs. Translation-Based Cross-Lingual Transfer: The Case of Lexical Gaps
    Abteen Ebrahimi and Katharina von der Wense
    In Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2024
  12. Knowledge Distillation vs. Pretraining from Scratch under a Fixed (Computation) Budget
    Minh Duc Bui, Fabian David Schmidt, Goran Glavaš, and Katharina von der Wense
    In Proceedings of the Workshop on Insights from Negative Results in NLP, 2024
  13. The Trade-off between Performance, Efficiency, and Fairness in Adapter Modules for Text Classification
    Minh Duc Bui and Katharina von der Wense
    In Proceedings of the Fourth Workshop on Trustworthy Natural Language Processing, 2024
  14. NLP for Language Documentation: Two Reasons for the Gap between Theory and Practice
    Luke Gessler and Katharina von der Wense
    In Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024), 2024
  15. JGU Mainz’s Submission to the AmericasNLP 2024 Shared Task on the Creation of Educational Materials for Indigenous Languages
    Minh Duc Bui and Katharina von der Wense
    In Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024), 2024
  16. Quantifying the Hyperparameter Sensitivity of Neural Networks for Character-level Sequence-to-Sequence Tasks
    Adam Wiemerslage, Kyle Gorman, and Katharina von der Wense
    In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024
  17. Comparing Template-based and Template-free Language Model Probing
    Sagi Shaier, Kevin Bennett, Lawrence Hunter, and Katharina von der Wense
    In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024
  18. Desiderata For The Context Use Of Question Answering Systems
    Sagi Shaier, Lawrence Hunter, and Katharina von der Wense
    In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

2023

  1. On the Automatic Generation and Simplification of Children’s Stories
    Maria Valentini, Jennifer Weber, Jesus Salcido, Téa Wright, Eliana Colunga, and Katharina von der Wense
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
  2. Emerging Challenges in Personalized Medicine: Assessing Demographic Effects on Biomedical Question Answering Systems
    Sagi Shaier, Kevin Bennett, Lawrence Hunter, and Katharina von der Wense
    In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023
  3. Who Are All The Stochastic Parrots Imitating? They Should Tell Us!
    Sagi Shaier, Lawrence Hunter, and Katharina von der Wense
    In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023
  4. Findings of the CoCo4MT 2023 Shared Task on Corpus Construction for Machine Translation
    Ananya Ganesh, Marine Carpuat, William Chen, Katharina Kann, Constantine Lignos, John E. Ortega, Jonne Saleva, Shabnam Tafreshi, and Rodolfo Zevallos
    In Proceedings of the Second Workshop on Corpus Generation and Corpus Augmentation for Machine Translation, 2023
  5. Neural Machine Translation for the Indigenous Languages of the Americas: An Introduction
    Manuel Mager, Rajat Bhatnagar, Graham Neubig, Ngoc Thang Vu, and Katharina Kann
    In Proceedings of the Third Workshop on NLP for Indigenous Languages of the Americas, 2023
  6. Findings of the AmericasNLP 2023 Shared Task on Machine Translation into Indigenous Languages
    Abteen Ebrahimi, Manuel Mager, Shruti Rijhwani, Enora Rice, Arturo Oncevay, Claudia Baltazar, María Cortés, Cynthia Montaño, John E Ortega, Rolando Coto-Solano, Hilaria Cruz, Alexis Palmer, and Katharina Kann
    In Proceedings of the Third Workshop on NLP for Indigenous Languages of the Americas, 2023
  7. A Survey of Challenges and Methods in the Computational Modeling of Multi-Party Dialog
    Ananya Ganesh, Martha Palmer, and Katharina Kann
    In Proceedings of the 5th Workshop on NLP for Conversational AI, 2023
  8. Mind the Gap between the Application Track and the Real World
    Ananya Ganesh, Jie Cao, Margaret Perkoff, Rosy Southwell, Martha Palmer, and Katharina Kann
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023
  9. Ethical Considerations for Machine Translation of Indigenous Languages: Giving a Voice to the Speakers
    Manuel Mager, Elisabeth Albine Mager, Katharina Kann, and Ngoc Thang Vu
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023
  10. An Investigation of Noise in Morphological Inflection
    Adam Wiemerslage, Changbing Yang, Garrett Nicolai, Miikka Silfverberg, and Katharina Kann
    In Findings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023
  11. A Comparative Analysis of Automatic Speech Recognition Errors in Small Group Classroom Discourse
    Jie Cao, Ananya Ganesh, Jon Cai, Rosy Southwell, Margaret Perkoff, Michael Regan, Katharina Kann, James Martin, Martha Palmer, and Sidney D’Mello
    In Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization, 2023
  12. Navigating Wanderland: Highlighting Off-Task Discussions in Classrooms
    Ananya Ganesh, Michael Chang, Rachel Dickler, Michael Regan, Jon Cai, Kristin Wright-Bettner, James Pustejovsky, James Martin, Jeff Flanigan, Martha Palmer, and Katharina Kann
    In Proceedings of the 24th International Conference on Artificial Intelligence in Education, 2023
  13. Meeting the Needs of Low-Resource Languages: Exploring Automatic Alignments via Pretrained Models
    Abteen Ebrahimi, Arya D. McCarthy, Arturo Oncevay, John E. Ortega, Luis Chiruzzo, Rolando Coto-Solano, Gustavo A. Giménez-Lugo, and Katharina Kann
    In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

2022

  1. Findings of the Second AmericasNLP Competition on Speech-to-Text Translation
    Abteen Ebrahimi, Manuel Mager, Adam Wiemerslage, Pavel Denisov, Arturo Oncevay, Danni Liu, Sai Koneru, Enes Yavuz Ugan, Zhaolin Li, Jan Niehues, Monica Romero, Ivan G Torre, Tanel Alumäe, Jiaming Kong, Sergey Polezhaev, Yury Belousov, Wei-Rui Chen, Peter Sullivan, Ife Adebara, Bashar Talafha, Alcides Alcoba Inciarte, Muhammad Abdul-Mageed, Luis Chiruzzo, Rolando Coto-Solano, Hilaria Cruz, Sofía Flores-Solórzano, Aldo Andrés Alvarez López, Ivan Meza-Ruiz, John E. Ortega, Alexis Palmer, Rodolfo Joel Zevallos Salazar, Kristine Stenzel, Thang Vu, and Katharina Kann
    In Proceedings of the NeurIPS 2022 Competitions Track, 2022
  2. AmericasNLI: Machine translation and natural language inference systems for Indigenous languages of the Americas
    Katharina Kann, Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, John E. Ortega, Annette Rios, Angela Fan, Ximena Gutierrez-Vasques, Luis Chiruzzo, Gustavo A. Giménez-Lugo, Ricardo Ramos, Ivan Vladimir Meza Ruiz, Elisabeth Mager, Vishrav Chaudhary, Graham Neubig, Alexis Palmer, Rolando Coto-Solano, and Ngoc Thang Vu
    Frontiers in Artificial Intelligence, 2022
  3. A Major Obstacle for NLP Research: Let’s Talk about Time Allocation!
    Katharina Kann, Shiran Dudy, and Arya D. McCarthy
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
  4. A Comprehensive Comparison of Neural Networks as Cognitive Models of Inflection
    Adam Wiemerslage, Shiran Dudy, and Katharina Kann
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
  5. CHIA: CHoosing Instances to Annotate for Machine Translation
    Rajat Bhatnagar, Ananya Ganesh, and Katharina Kann
    In Findings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
  6. Generate Me a Bedtime Story: Leveraging Natural Language Processing for Early Vocabulary Enhancement
    Trevor A. Hall, Maria Valentini, Eliana Colunga, and Katharina Kann
    In Proceedings of the Workshop on NLP for Positive Impact, 2022
  7. Machine Translation Between High-resource Languages in a Language Documentation Setting
    Katharina Kann, Abteen Ebrahimi, Kristine Stenzel, and Alexis Palmer
    In Proceedings of the First Workshop on Applying NLP to Field Linguistics, 2022
  8. Response Construct Tagging: NLP-Aided Assessment for Engineering Education
    Ananya Ganesh, Hugh Scribner, Jasdeep Singh, Katherine Goodman, Jean Hertzberg, and Katharina Kann
    In Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications, 2022
  9. Open-domain Dialogue Generation: What We Can Do, Cannot Do, And Should Do Next
    Katharina Kann, Abteen Ebrahimi, Joewie J. Koh, Shiran Dudy, and Alessandro Roncone
    In Proceedings of the 4th Workshop on NLP for Conversational AI, 2022
  10. AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages
    Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, Vishrav Chaudhary, Luis Chiruzzo, Angela Fan, John Ortega, Ricardo Ramos, Annette Rios, Ivan Vladimir Meza Ruiz, Gustavo A. Giménez-Lugo, Elisabeth Mager, Graham Neubig, Alexis Palmer, Rolando Coto-Solano, Thang Vu, and Katharina Kann
    In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022
  11. How Does Multilingual Pretraining Affect Cross-Lingual Transferability?
    Yoshinari Fujinuma, Jordan Lee Boyd-Graber, and Katharina Kann
    In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022
  12. Morphological Processing of Low-Resource Languages: Where We Are and What’s Next
    Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya D. McCarthy, Garrett Nicolai, Eliana Colunga, and Katharina Kann
    In Findings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022
  13. BPE vs. Morphological Segmentation: A Case Study on Machine Translation of Four Polysynthetic Languages
    Manuel Mager, Arturo Oncevay, Elisabeth Mager, Katharina Kann, and Thang Vu
    In Findings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022

2021

  1. The World of an Octopus: How Reporting Bias Influences a Language Model’s Perception of Color
    Cory Paik, Stéphane Aroca-Ouellette, Alessandro Roncone, and Katharina Kann
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
  2. What Would a Teacher Do? Predicting Future Talk Moves
    Ananya Ganesh, Martha Palmer, and Katharina Kann
    In Findings of the 59th Annual Meeting of the Association for Computational Linguistics, 2021
  3. PROST: Physical Reasoning of Objects through Space and Time
    Stéphane Aroca-Ouellette, Cory Paik, Alessandro Roncone, and Katharina Kann
    In Findings of the 59th Annual Meeting of the Association for Computational Linguistics, 2021
  4. How to Adapt Your Pretrained Multilingual Model to 1600 Languages
    Abteen Ebrahimi and Katharina Kann
    In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, 2021
  5. Don’t Rule Out Monolingual Speakers: A Method For Crowdsourcing Machine Translation Data
    Rajat Bhatnagar, Ananya Ganesh, and Katharina Kann
    In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, 2021
  6. Findings of the LoResMT 2021 Shared Task on COVID and Sign Language for Low-resource Languages
    Atul Kr. Ojha, Chao-Hong Liu, Katharina Kann, John Ortega, Sheetal Shatam, and Theodorus Fransen
    In Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021), 2021
  7. Paradigm Clustering with Weighted Edit Distance
    Andrew Gerlach, Adam Wiemerslage, and Katharina Kann
    In Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, 2021
  8. Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering
    Adam Wiemerslage, Arya D. McCarthy, Alexander Erdmann, Garrett Nicolai, Manex Agirrezabal, Miikka Silfverberg, Mans Hulden, and Katharina Kann
    In Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, 2021
  9. Findings of the AmericasNLP 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas
    Manuel Mager, Arturo Oncevay, Abteen Ebrahimi, John Ortega, Annette Rios, Angela Fan, Ximena Gutierrez-Vasques, Luis Chiruzzo, Gustavo Giménez-Lugo, Ricardo Ramos, Ivan Vladimir Meza Ruiz, Rolando Coto-Solano, Alexis Palmer, Elisabeth Mager-Hois, Vishrav Chaudhary, Graham Neubig, Ngoc Thang Vu, and Katharina Kann
    In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, 2021
  10. Coloring the Black Box: What Synesthesia Tells Us about Character Embeddings
    Katharina Kann and Mauro M. Monsalve-Mercado
    In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, 2021
  11. CLiMP: A Benchmark for Chinese Language Model Evaluation
    Beilei Xiang, Changbing Yang, Yu Li, Alex Warstadt, and Katharina Kann
    In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, 2021

2020

  1. Making a Point: Pointer-Generator Transformers for Disjoint Vocabularies
    Nikhil Prabhu and Katharina Kann
    In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 9th International Joint Conference on Natural Language Processing Student Research Workshop, 2020
    Best Paper Award
  2. English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too
    Jason Phang, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Iacer Calixto, Katharina Kann, and Samuel R. Bowman
    In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 9th International Joint Conference on Natural Language Processing, 2020
  3. Tackling the Low-resource Challenge for Canonical Segmentation
    Manuel Mager, Özlem Çetinoğlu, and Katharina Kann
    In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
  4. Acrostic Poem Generation
    Rajat Agarwal and Katharina Kann
    In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
  5. IGT2P: From Interlinear Glossed Texts to Paradigms
    Sarah Moeller, Ling Liu, Changbing Yang, Katharina Kann, and Mans Hulden
    In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
  6. Why Overfitting Isn’t Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries
    Mozhi Zhang, Yoshinari Fujinuma, Michael J. Paul, and Jordan Boyd-Graber
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
  7. The SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion
    Katharina Kann, Arya D. McCarthy, Garrett Nicolai, and Mans Hulden
    In Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, 2020
  8. Frustratingly Easy Multilingual Grapheme-to-Phoneme Conversion
    Nikhil Prabhu and Katharina Kann
    In Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, 2020
  9. The NYU-CUBoulder Systems for SIGMORPHON 2020 Task 0 and Task 2
    Assaf Singer and Katharina Kann
    In Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, 2020
  10. The IMS–CUBoulder System for the SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion
    Manuel Mager and Katharina Kann
    In Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, 2020
  11. Self-Training for Unsupervised Parsing with PRPN
    Anhad Mohananey, Katharina Kann, and Samuel R. Bowman
    In Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies, 2020
  12. Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?
    Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, and Samuel R. Bowman
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
  13. Unsupervised Morphological Paradigm Completion
    Huiming Jin, Liwei Cai, Yihui Peng, Chen Xia, Arya McCarthy, and Katharina Kann
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
  14. Learning to Learn Morphological Inflection for Resource-Poor Languages
    Katharina Kann, Samuel R. Bowman, and Kyunghyun Cho
    In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
  15. Weakly Supervised POS Taggers Perform Poorly on Truly Low-Resource Languages
    Katharina Kann, Ophélie Lacroix, and Anders Søgaard
    In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
  16. Acquisition of Inflectional Morphology in Artificial Neural Networks With Prior Knowledge
    Katharina Kann
    In Proceedings of the Society for Computation in Linguistics, 2020