Volk, Martin; Fischer, Dominic P; Scheurer, Patricia; Schwitter, Raphael; Ströbel, Phillip (2024). LLM-based Translation Across 500 Years. The Case for Early New High German. In: 20th Conference on Natural Language Processing (KONVENS 2024), Vienna, Austria, 10 September 2024 - 13 September 2024. Association for Computational Linguistics, 368-375.
Volk, Martin; Fischer, Dominic P; Fischer, Lukas; Scheurer, Patricia; Ströbel, Phillip (2024). LLM-based Machine Translation and Summarization for Latin. In: Third Workshop on Language Technologies for Historical and Ancient Languages -- LT4HALA (at LREC/COLING), Torino, 25 May 2024.
2023
Book Section
Hegele, Stefanie; Heinisch, Barbara; Popp, Antonia; Marheinecke, Katrin; Rios, Annette; Gromann, Dagmar; Volk, Martin; Rehm, Georg (2023). Language Report German. In: Rehm, Georg; Way, Andy. European Language Equality: A Strategic Agenda for Digital Language Equality. Cham: Springer International Publishing, 147-150.
Ströbel, Phillip; Scheurer, Patricia; Volk, Martin (2023). Lessons Learnt from Bullinger Digital. In: Open Up Digital Editions Conference 2024, Zurich, 24 January 2024 - 26 January 2024. Center Digital Editions & Edition Analytics (University Library Zurich) and Research and Infrastructure Support RISE (University of Basel), 75-76.
Hegele, Stefanie; Heinisch, Barbara; Popp, Antonia; Marheinecke, Katrin; Rios, Annette; Gromann, Dagmar; Volk, Martin; Rehm, Georg (2023). European Language Equality - Report on the German Language. Berlin, Germany: European Language Equality (ELE).
2022
Book Section
Volk, Martin; Graën, Johannes (2022). Binomials in Swedish corpora – ‘Ordpar 1965’ revisited. In: Volodina, Elena; Dannélls, Dana; Berdicevskis, Aleksandrs; Forsberg, Markus; Virk, Shafqat. Live and Learn : Festschrift in honor of Lars Borin. Göteborg: Department of Swedish, Multilingualism and Language Technology, University of Gothenburg, 139-144.
Scheurer, Patricia; Müller, Raphael; Schroffenegger, Bernard; Ströbel, Phillip Benjamin; Suter, Benjamin; Volk, Martin (2022). Ein Briefwechsel-Korpus des 16. Jahrhunderts in Frühneuhochdeutsch. In: Kupietz, Marc; Schmidt, Thomas. Neue Entwicklungen in der Korpuslandschaft der Germanistik. Tübingen: Narr Francke Attempto GmbH + Co. KG, 33-42.
Conference or Workshop item
Ströbel, Phillip Benjamin; Clematide, Simon; Hodel, Tobias; Volk, Martin (2022). Transformer-based HTR for Historical Documents. In: Workshop on Computational Methods in the Humanities 2022, Lausanne, 9 June 2022 - 10 June 2022.
Hauser, Renate; Vamvas, Jannis; Ebling, Sarah; Volk, Martin (2022). A Multilingual Simplified Language News Corpus. In: 2nd Workshop on Tools and Resources to Empower People with REAding DIfficulties (READI) within the 13th Language Resources and Evaluation Conference, Marseille, France, 24 June 2022. European Language Resources Association, 25-30.
Ströbel, Phillip Benjamin; Clematide, Simon; Volk, Martin; Schwitter, Raphael; Hodel, Tobias; Schoch, David (2022). Evaluation of HTR models without Ground Truth Material. In: LREC 2022, Marseille, 21 June 2022 - 23 June 2022, European Language Resources Association.
Fischer, Lukas; Scheurer, Patricia; Schwitter, Raphael; Volk, Martin (2022). Machine Translation of 16th Century Letters from Latin to German. In: Second Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA 2022), Marseille, 25 June 2022. LREC, 43-50.
2021
Book Section
Cheng, Yuang; Ding, Yue; Foucher, Sebastien; Pascual, Damián; Richter, Oliver; Volk, Martin; Wattenhofer, Roger (2021). WikiFlash: Generating Flashcards from Wikipedia Articles. In: Mantoro, T; et al. Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science. Cham: Springer, 138-149.
Graën, Johannes; Volk, Martin (2021). Binomial adverbs in Germanic and Romance Languages : A corpus-based study. In: Lavid-López, Julia; Maíz-Arévalo, Carmen; Zamorano-Mansilla, Juan Rafael. Corpora in Translation and Contrastive Research in the Digital Age : Recent advances and explorations. Amsterdam: John Benjamins, 326-342.
2020
Book Section
Säuberli, Andreas; Ebling, Sarah; Volk, Martin (2020). Benchmarking Data-driven Automatic Text Simplification for German. In: Gala, Nuria; Wilkens, Rodrigo. Proceedings of the 1st Workshop on Tools and Resources to Empower People with REAding DIfficulties (READI). Marseille: European Language Resources Association, 41-48.
Klenner, Manfred; Göhring, Anne; Amsler, Michael (2020). Harmonization Sometimes Harms. In: Proceedings of the 5th Swiss Text Analytics Conference (SwissText) & 16th Conference on Natural Language Processing (KONVENS), Winterthur, 23 June 2020 - 25 June 2020, swisstext-and-konvens-2020.
Kew, Tannon; Shaitarova, Anastassia; Meraner, Isabel; Clematide, Simon; Goldzycher, Janis; Volk, Martin (2019). Geotagging a diachronic corpus of alpine texts: comparing distinct approaches to toponym recognition. In: RANLP 2019, Workshop on Language technology for digital historical archives with a special focus on Central-, (South-)Eastern Europe, Middle East and North Africa, Varna, Bulgaria, 5 September 2019. RANLP, 11-18.
Läubli, Samuel; Sennrich, Rico; Volk, Martin (2018). Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October 2018 - 4 November 2018. Association for Computational Linguistics, 4791-4796.
Graën, Johannes; Bertamini, Mara; Volk, Martin (2018). Cutter – a Universal Multilingual Tokenizer. In: Swiss Text Analytics Conference, Winterthur, 12 June 2018 - 13 June 2018. CEUR-WS, 75-81.
Läubli, Samuel; Müller, Mathias; Horat, Beat; Volk, Martin (2018). mtrain: A Convenience Tool for Machine Translation. In: Proceedings of the 21st Annual Conference of the European Association for Machine Translation, Alacant, Spain, 28 May 2018 - 30 May 2018. Universitat d'Alacant, 357.
Graën, Johannes; Sandoz, Dominique; Volk, Martin (2017). Multilingwis2 – Explore Your Parallel Corpus. In: Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, Gothenburg, Sweden, 22 May 2017 - 24 May 2017. Linköping University Electronic Press, Linköpings universitet, 247-250.
2016
Book Section
Clematide, Simon; Graën, Johannes; Volk, Martin (2016). Multilingwis – A Multilingual Search Tool for Multi-Word Units in Multiparallel Corpora. In: Corpas Pastor, Gloria. Computerised and Corpus-based Approaches to Phraseology: Monolingual and Multilingual Perspectives/Fraseología computacional y basada en corpus: perspectivas monolingües y multilingües. Geneva: Tradulex, n/a.
Suter, Julia; Ebling, Sarah; Volk, Martin (2016). Rule-based Automatic Text Simplification for German. In: 13th Conference on Natural Language Processing (KONVENS 2016), Bochum, Germany, 19 September 2016 - 21 September 2016, s.n.
Clematide, Simon; Furrer, Lenz; Volk, Martin (2016). Crowdsourcing an OCR Gold Standard for a German and French Heritage Corpus. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, 23 May 2016 - 28 May 2016. European Language Resources Association (ELRA), 975-982.
Clematide, Simon (2015). Reflections and a Proposal for a Query and Reporting Language for Richly Annotated Multiparallel Corpora. In: Grigonyte, Gintare; Clematide, Simon; Utka, Andrius; Volk, Martin. Proceedings of the Workshop on Innovative Corpus Query and Visualization Tools at NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania. Linköping, Sweden: Linköping University Electronic Press, Linköpings universitet, 6-16.
Volk, Martin; Clematide, Simon (2014). Detecting Code-Switching in a Multilingual Alpine Heritage Corpus. In: Proceedings of the First Workshop on Computational Approaches to Code Switching, Doha, Qatar, 25 October 2014. Association for Computational Linguistics, 24-33.
Volk, Martin; Graën, Johannes; Callegaro, Elena (2014). Innovations in parallel corpus search tools. In: Ninth International Conference on Language Resources and Evaluation (LREC'14), Reykjavik, 26 May 2014 - 31 May 2014, European Language Resources Association (ELRA).
Aepli, Noëmi; Volk, Martin (2013). Reconstructing Complete Lemmas for Incomplete German Compounds. In: International Conference of the German Society for Computational Linguistics and Language Technology (GSCL), Darmstadt, 25 September 2013 - 27 September 2013, 1-13.
Plamada, Magdalena; Volk, Martin (2013). Mining for Domain-specific Parallel Text from Wikipedia. In: Proceedings of the Sixth Workshop on Building and Using Comparable Corpora, Sofia, Bulgaria, August 2013, 112-120.
Grigonyte, Gintare; Rinaldi, Fabio; Volk, Martin (2012). Term evolution: use of biomedical terminologies. In: AAAI-2012 Fall Symposium on Information Retrieval and Knowledge Discovery in Biomedical Text, Arlington, 3 November 2012 - 5 November 2012. AAAI, 79-80.
Grigonyte, Gintare; Rinaldi, Fabio; Volk, Martin (2012). Change of biomedical domain terminology over time. In: Human Language Technologies – The Baltic Perspective (Baltic HLT 2012), Tartu, 4 October 2012 - 5 October 2012. I O S Press, 74-81.
Fishel, Mark; Georgakopoulou, Yota; Penkale, Sergio; Petukhova, Volha; Rojc, Matej; Volk, Martin; Way, Andy (2012). From subtitles to parallel corpora. In: The 16th Annual Conference of the European Association for Machine Translation, Trento, Italy, 28 May 2012 - 30 May 2012. European Association for Machine Translation, 3-6.
Plamada, Magdalena; Volk, Martin (2012). Towards a Wikipedia-extracted alpine corpus. In: The Fifth Workshop on Building and Using Comparable Corpora, Istanbul, Turkey, 26 May 2012.
Volk, Martin; Furrer, Lenz; Sennrich, Rico (2011). Strategies for reducing and correcting OCR errors. In: Sporleder, Caroline; van den Bosch, Antal; Zervanou, Kalliopi. Language Technology for Cultural Heritage. Berlin: Springer, 3-22.
Hjelm, Hans; Volk, Martin (2011). Cross-language ontology learning. In: Wong, Wilson; Liu, Wei; Bennamoun, Mohammed. Ontology learning and knowledge discovery using the web: challenges and recent advances. Hershey, PA: IGI Global, 272-297.
Killer, M; Sennrich, R; Volk, Martin (2011). From multilingual web-archives to parallel treebanks in five minutes. In: Conference of the German Society for Computational Linguistics and Language Technology (GSCL) 2011, Hamburg, Germany, 28 September 2011 - 30 September 2011, 57-62.
Furrer, Lenz; Volk, Martin (2011). Reducing OCR errors in Gothic-script documents. In: 8th International Conference on Recent Advances in Natural Language Processing (RANLP 2011), Hisar, 16 September 2011, 97-103.
Edited by: Burchardt, Aljoscha; Egg, Markus; Eichler, Kathrin; Krenn, Brigitte; Lessmöllmann, Annette; Rehm, Georg; Stede, Manfred; Uszkoreit, Hans; Volk, Martin (2011). Languages in the European information society - German. Berlin: Meta-Net.
Volk, Martin; Marek, T; Sennrich, R (2010). Reducing OCR errors by combining two OCR systems. In: ECAI 2010 Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2010), Lisbon, Portugal, 16 August 2010, 61-65.
Volk, Martin; Bubenhofer, Noah; Althaus, Adrian; Bangerter, Maya; Furrer, Lenz; Ruef, Beni (2010). Challenges in building a multilingual alpine heritage corpus. In: Seventh International Conference on Language Resources and Evaluation (LREC), Malta, 19 May 2010 - 21 May 2010.
Piotrowski, Michael; Läubli, Samuel; Volk, Martin (2010). Towards mapping of alpine route descriptions. In: GIR'10: 6th Workshop on Geographic Information Retrieval, Zurich, Switzerland, 18 February 2010 - 19 February 2010, 15-16.
Klenner, M (2009). Nominal anaphora. Can we tame the beasts? In: Clematide, S; Klenner, M; Volk, Martin. Searching Answers: Festschrift in Honour of Michael Hess on the Occasion of His 60th Birthday. Münster: Monsenstein und Vannerdat, 77-84.
Mahlow, C; Piotrowski, M (2009). A Target-Driven Evaluation of Morphological Components for German. In: Clematide, S; Klenner, M; Volk, Martin. Searching Answers: Festschrift in Honour of Michael Hess on the Occasion of His 60th Birthday. Münster: MV-Wissenschaft, 85-99.
Bünzli, A (2009). Natural language processing in law - change we need. In: Clematide, S; Klenner, M; Volk, Martin. Searching Answers: Festschrift in Honour of Michael Hess on the Occasion of His 60th Birthday. Münster, Germany: Monsenstein und Vannerdat, 11-19.
Jekat, S; Volk, Martin (2009). Maschinelle und computergestützte Übersetzung. In: Carstensen, K U. Computerlinguistik und Sprachtechnologie. Eine Einführung. Heidelberg: Spektrum, 642-658.
Clematide, S (2009). A morpho-syntactic generation service for German glossary entries. In: Clematide, S; Klenner, M; Volk, Martin. Searching Answers: Festschrift in Honour of Michael Hess on the Occasion of His 60th Birthday. Münster, Germany: Monsenstein und Vannerdat, 33-43.
Höfler, Stefan (2009). Modelling relevance-driven language evolution. In: Clematide, S; Klenner, M; Volk, Martin. Searching Answers: Festschrift in Honour of Michael Hess on the Occasion of His 60th Birthday. Münster, Germany: Monsenstein und Vannerdat, 49-56.
Sennrich, Rico; Schneider, Gerold; Volk, Martin; Warin, Martin (2009). A New Hybrid Dependency Parser for German. In: Chiarcos, Christian; de Castilho, Richard Eckart; Stede, Manfred. Von der Form zur Bedeutung: Texte automatisch verarbeiten / From Form to Meaning: Processing Texts Automatically. Proceedings of the Biennial GSCL Conference 2009. Tübingen: Narr, 115-124.
Fuchs, N E; Kaljurand, K; Kuhn, T (2009). ACE can be described by itself. In: Clematide, S; Klenner, M; Volk, Martin. Searching Answers: Festschrift in Honour of Michael Hess on the Occasion of His 60th Birthday. Münster, Germany: Monsenstein und Vannerdat, 45-48.
Rios, A; Göhring, A; Volk, Martin (2009). A Quechua-Spanish parallel treebank. In: 7th Conference on Treebanks and Linguistic Theories, Groningen, 2009.
Volk, Martin; Marek, T; Samuelsson, Y (2008). Human judgements in parallel treebank alignment. In: COLING Workshop on Human Judgements in Computational Linguistics, Manchester, UK, 23 August 2008.
2007
Conference or Workshop item
Lundborg, J; Marek, T; Mettler, M; Volk, Martin (2007). Using the Stockholm TreeAligner. In: 6th Workshop on Treebanks and Linguistic Theories, Bergen, 2007.
Volk, Martin; Samuelsson, Y (2007). Frame-Semantic Annotation on a Parallel Treebank. In: Nodalida Workshop on Building Frame Semantics Resources for Scandinavian and Baltic Languages, Tartu, 2007.
Nivre, J; De Smet, K; Volk, Martin (2005). Treebanks: a whitepaper. In: Holmboe, H. Nordisk Sprogteknologi. Nordic Language Technology. Årbog for Nordisk Sprogteknologisk Forskningsprogram 2000-2004. Copenhagen: Museum Tusculanums Forlag, ?-?.
Volk, Martin; Gustafson-Capková, S; Hagstrand, D; Uibo, H (2005). Teaching treebanking. In: Holmboe, H. Nordisk Sprogteknologi. Nordic Language Technology. Årbog for Nordisk Sprogteknologisk Forskningsprogram 2000-2004. Copenhagen: Museum Tusculanums Forlag, 143-159.
Conference or Workshop item
Samuelsson, Y; Volk, Martin (2005). Presentation and representation of parallel treebanks. In: Treebanking for Discourse and Speech. Proc. of the NODALIDA 2005 Special Session on Treebanks for Spoken Language and Discourse, Joensuu, Finland, 20 May 2005 - 21 May 2005, 147-159.
Volk, Martin (2002). Using the web as corpus for linguistic research. In: Pajusalu, R; Hennoste, T. Tähendusepüüdja. Catcher of the Meaning. A Festschrift for Professor Haldur Õim. Tartu: University of Tartu, 1-10.
Mehl, S; Volk, Martin (1999). Aspects of the translation of English subordinate clauses into German. In: Problems and Potential of English-to-German MT systems. Workshop at the 8th International Conference on Theoretical and Methodological Issues in Machine Translation, Chester, 1999.
Volk, Martin (1998). Markup of a Test Suite with SGML. In: Nerbonne, J. Linguistic Databases. Stanford: Center for the Study of Language and Information, 59-76.
Volk, Martin (1997). Probing the lexicon in evaluating commercial MT systems. In: Proceedings of the eighth conference on European chapter of the Association for Computational Linguistics. Madrid: European Chapter Meeting of the ACL, 112-119.
Conference or Workshop item
Volk, Martin; Richarz, D (1997). Experiences with the GTU grammar development environment. In: Workshop on Computational Environments For Grammar Development And Linguistic Engineering at the ACL/EACL Joint Conference, Madrid, Spain, 11 July 1997 - 12 July 1997, 107-113.
Volk, Martin (1996). Parsing with ID/LP and PS rules. In: Natural Language Processing and Speech Technology. Results of the 3rd KONVENS Conference (Bielefeld), Berlin, 1996, 342-353.
Jung, M; Richarz, D; Volk, Martin (1994). GTU - Eine Grammatik-Testumgebung. In: KONVENS-94, Vienna, Austria, 28 September 1994 - 30 September 1994, 427-430.