I have been working on my PhD project at the URPP Language and Space since October 2018. The project is part of Tanja Samardžić's SNSF project Non-randomness in Morphological Diversity: A Computational Approach Based on Multilingual Corpora.
I have a background in various fields of linguistics, with a main focus on computational linguistics. My research is inspired by the possibilities that programming and other state-of-the-art tools offer for analyzing language data and understanding language-related phenomena.
Geometry of Linguistic Morphology
The PhD project aims to develop new methods for studying linguistic morphological diversity and for language comparison. In particular, I explore tools from information theory (entropy), fractal geometry (fractal dimension) and graph theory (tree structures) in order to establish a rigorous scientific approach to comparing morphological structures. The expected outcomes of the project are 1) a novel method for studying subword structures language-independently; 2) new knowledge of the morphological systems of a sample of 100 typologically balanced languages.
Supervisor: Tanja Samardžić
Co-supervisor, professor in charge: Martin Volk
Current papers in progress:
Subword geometry: picturing word shapes (extended abstract accepted to SIGTYP 2021)
Fractal dimension as a measure of morphological complexity
2021
Ruzsics, T., O. Sozinova, X. Gutierrez-Vasques and T. Samardzic. (2021). Interpretability for morphological inflection: from character-level predictions to subword-level rules. European Chapter of the Association for Computational Linguistics, Long Papers.
Gutierrez-Vasques, X., C. Bentz, O. Sozinova and T. Samardzic. (2021). From characters to words: the turning point of BPE merges. European Chapter of the Association for Computational Linguistics, Long Papers.
2016
Sozinova, O. (2016). Complex networks-based approach to Russian rhyme history description: linguostatistics and database. In Digital Humanities 2016, Conference Abstracts, Krakow, Poland, 891-893.
Sozinova, O. and M. Khudyakova (2016). Tense switching in narratives by Russian aphasia speakers. In Temas de lingüística clínica. Proceedings of the IV Clinical Linguistics International Congress, Barcelona, Spain, 209. 95-939.
2015
Arkhangel’skii, T. and O. Sozinova (2015). A multimedia corpus of the Yiddish language. Automatic Documentation and Mathematical Linguistics, 49(2), 47-53.
Conference presentations
Bentz, C., O. Sozinova and T. Samardžić (2019). Collecting a corpus for 100 typologically diverse languages (100LC). Workshop on language documentation: multilingual settings and technological advances. Uppsala, Sweden.
Sozinova, O., T. Samardžić and C. Bentz (2019). Measuring inflectional and derivational complexity. Interactive Workshop on Measuring Language Complexity, IWMLC2019. Freiburg, Germany.
Sozinova, O. (2015). Rhyme: psychological experiment. Gasparov Readings 2015. Russian State University for the Humanities. Moscow, Russia.
Sozinova, O. (2015). Rhyme properties in Marina Tsvetaeva’s verse. Structure of Verse workshop. Leiden, Netherlands.
Sozinova, O. (2015). Corpus research on the variation of the reflexive postfix -sja in the Russian subdialect of the Ustya river basin. Norwegian Graduate Student Conference in Linguistics and Philology. Tromsø, Norway.
von Waldenfels, R., N. Dobrushina, M. Daniel, A. Ter-Avanesova, I. Levin, O. Sozinova and V. Zhigul’skaya (2015). Modelling speaker variation and dialect change in Northern Russia. The International Conference on Language Variation in Europe, ICLaVE. Leipzig, Germany.
Grabovskaya, M., P. Kasyanova and O. Sozinova (2015). Semantic analysis of Russian augmentatives using RNC. Conference on computational and corpus linguistics, ConCorT 2015. Educational center “Voronovo”, Russia.
Grabovskaya, M., P. Kasyanova and O. Sozinova (2015). Corpus research on Russian augmentatives. I Student conference at the Institute of Linguistics. Russian State University for the Humanities. Moscow, Russia.
2020 | Tutoring, Processing Non-standard Language, HS2020; Tutoring, Techniques of Semantic Analysis (MA), FS2020; University of Zurich |
2019 | Tutoring, Introduction to Programming (MA), HS2019, University of Zurich |
2018 | Linguist Consultant (Russian), Lionbridge Technologies, Inc. |
2015 – 2016 | Junior Linguist (German), API.AI |
2015 | Laboratory Assistant, Neurolinguistic Laboratory, National Research University Higher School of Economics, Moscow |
2014 – 2015 | Teaching Assistant in German, School of Linguistics, National Research University Higher School of Economics, Moscow |
Main education
2018 – present | PhD in Computational Linguistics. Text Group, URPP Language and Space, University of Zurich. Supervised by Dr. Tanja Samardžić and Prof. Dr. Martin Volk. |
2016 – 2018 | MA in Linguistics, specialized in Historical Linguistics; Minor: Slavic Languages and Literatures. University of Bern. Thesis: Reconstruction of Old Chinese Phonology based on Computational Linguistic Analysis of the Shijing Rhymes. Supervised by Prof. Dr. George van Driem. |
2012 – 2016 | BA in Fundamental and Applied Linguistics, specialized in Computational Linguistics. National Research University Higher School of Economics, Moscow. Thesis: Complex Networks-based Approach to Russian Rhyme History Description: Linguostatistics and Database. Supervised by Prof. Dr. Boris Orekhov. |
Exchange studies
2015 – 2016 | Exchange program in Computational Linguistics (winter semester), University of Tübingen |
2014 | Exchange program in Linguistics (spring semester), University of Bern |
Summer schools
2019 | 9th Lisbon Machine Learning School, LxMLS, Instituto Superior Técnico, Lisbon |
2019 | Revisiting research training in linguistics: theory, logic, method, Petnica Science Centre, Valjevo |
2017 | Chinese Language Summer School, University of Wuhan |
2014 | Introduction to Contemporary Neurolinguistics, National Research University Higher School of Economics, Moscow |
2016 – 2018 | Master Grant of the University of Bern |
2015 – 2016 | Oxford Russia Fund Scholarship |
2014 – 2016 | Increased State Academic Scholarship for academic and research achievements (Moscow, Russia) |
2021 | Co-organizer at Scientifica 2021, University of Zurich |
2019 | Volunteer at Scientifica 2019, LiRi Information Event, University of Zurich |
Programming: Python, R, Java
Web development: Python (Django, Flask), HTML & CSS, JavaScript (AJAX), Elasticsearch
Databases: Neo4j, MySQL
Graphics software: Adobe Photoshop & Illustrator, Corel Painter & Draw
Mark-up: LaTeX
Russian (native); English, German (fluent); French (intermediate); Chinese, Bulgarian, Serbian (elementary)