Ximena Gutierrez-Vasques, Dr.
- Postdoctoral researcher
- TextGroup
I joined the URPP Language and Space in September 2019. My research interests include natural language processing, quantitative linguistics, under-resourced languages, and multilingual NLP.
I am currently working on information-theoretic approaches for measuring linguistic complexity at the morphological level using text corpora. I collaborate on the project "Non-randomness in Morphological Diversity: A Computational Approach Based on Multilingual Corpora".
*Updated email address: ximena.gutierrezvasques@uzh.ch
2022
Tanja Samardzic, Ximena Gutierrez-Vasques, Rob van der Goot, Max Müller-Eberstein, Olga Pelloni and Barbara Plank. On Language Spaces, Scales and Cross-Lingual Transfer of UD Parsers. CoNLL, 2022.
Kann, K., Ebrahimi, A., Mager, M., Oncevay, A., Ortega, J. E., Rios, A., Fan, A., Chiruzzo, L., Ramos, R., Meza Ruiz, I. V., Mager, E., Chaudhary, V., Neubig, G., Palmer, A., & Vu, N. T. (2022). AmericasNLI: Machine translation and natural language inference systems for Indigenous languages of the Americas. Frontiers in Artificial Intelligence, 5. https://doi.org/10.3389/frai.2022.99566
Bentz, Christian, Gutierrez-Vasques, Ximena, Sozinova, Olga and Samardžić, Tanja. "Complexity trade-offs and equi-complexity in natural languages: a meta-analysis" Linguistics Vanguard, 2022. https://doi.org/10.1515/lingvan-2021-0054
Adran Israel Lerma Mayer, Ximena Gutierrez-Vasques, Ernesto Priani Saiso, Hannu Salmi. Underlying Sentiments in 1867: A Study of News Flows on the Execution of Emperor Maximilian I of Mexico in Digitized Newspaper Corpora. Digital Humanities Quarterly (DHQ)
Moran, S., Bentz, C., Gutierrez-Vasques, X., Sozinova, O., & Samardzic, T. TeDDi Sample: Text Data Diversity Sample for Language Comparison and Multilingual NLP. LREC 2022
Book chapter: “Relación tipo-token para contrastar la complejidad morfológica del español-náhuatl”. Ámbitos morfológicos: Descripciones y métodos. UNAM, Mayo, 2022
Gutierrez-Vasques, X., Bentz, C., Sozinova, O., & Samardzic, T. (2021). From characters to words: the turning point of BPE merges. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Ruzsics, T., Sozinova, O., Gutierrez-Vasques, X., & Samardzic, T. (2021). Interpretability for Morphological Inflection: from Character-level Predictions to Subword-level Rules. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Mager, M., Oncevay, A., Ebrahimi, A., Ortega, J., Gonzales, A. R., Fan, A., ... & Kann, K. (2021). Findings of the AmericasNLP 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, NAACL.
Martínez, D. B., Mijangos, V., & Gutierrez-Vasques, X. (2021). Automatic Interlinear Glossing for Otomi language. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, NAACL
Gutierrez-Vasques, X., & Mijangos, V. (2020). Productivity and Predictability for Measuring Morphological Complexity. Entropy, 22(1), 48.
Gutierrez-Vasques, X., Medina-Urrea, A., & Sierra, G. (2019). Morphological segmentation for extracting Spanish-Nahuatl bilingual lexicon. Procesamiento del Lenguaje Natural, 63, 41-48.
Ximena Gutierrez-Vasques and Victor Mijangos. (2018). Comparing morphological complexity of Spanish, Otomi and Nahuatl. In Proceedings of the Workshop on Linguistic Complexity and Natural Language Processing. Association for Computational Linguistics, Santa Fe, New Mexico, pages 30–37.
Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra, and Ivan Meza. (2018). Challenges of language technologies for the indigenous languages of the Americas. Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018).
Ximena Gutierrez Vasques. “Corpus paralelo español-náhuatl y su uso en las tecnologías del lenguaje humano” (Book chapter). In Galina Russell, Isabel; Peña Pimentel, Miriam; Priani Saisó, Ernesto; Barrón Tovar, José Francisco; Domínguez Herbón, David; Álvarez Sánchez, Adriana (Coords), Humanidades digitales: lengua, texto, patrimonio y datos. México, Bonilla Artigas Editores. 2018.
Gutierrez-Vasques, X., & Mijangos, V. (2017). Low-resource bilingual lexicon extraction using graph based word embeddings. arXiv preprint arXiv:1710.02569.
Gutierrez-Vasques, X., Sierra, G., & Pompa, I. H. (2016, May). Axolotl: a web accessible parallel corpus for Spanish-Nahuatl. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16) (pp. 4210-4214).
Gutierrez-Vasques, X. (2015). Bilingual lexicon extraction for a distant language pair using a small parallel corpus. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop (pp. 154-160).
2014-2018 | National Autonomous University of Mexico (UNAM) | PhD in Computational Linguistics
2010-2012 | Charles University, Czech Republic; Free University of Bolzano, Italy | MSc in Computational Linguistics
2004-2010 | National Autonomous University of Mexico (UNAM) | Degree in Computer Engineering
Swiss Government Excellence Scholarship (2019)
Postdoctoral stay
European Commission, Erasmus Mundus Scholarship (September 2010)
Fully funded master's studies