Sara Stymne
Senior Lecturer/Associate Professor in Computational Linguistics at the Department of Linguistics and Philology
- Telephone: +46 18 471 10 88
- E-mail: Sara.Stymne@lingfil.uu.se
- Visiting address: Engelska parken, Thunbergsvägen 3H
- Postal address: Box 635, 751 26 UPPSALA
- Available: by appointment via e-mail
Short presentation
I am a Docent in Computational Linguistics. I have worked at Uppsala University since 2012 and as a senior lecturer since 2017. My main research interests are cross-lingual NLP and digital humanities. I am interested in how computational linguistics can be used to address research questions in other fields, including language history, literary analysis, and political science. My earlier work focused on machine translation.
Biography
Web page: https://www2.lingfil.uu.se/cl/sara/

Publications
All publications
Articles in journal
Universals of Linguistic Idiosyncrasy in Multilingual Computational Linguistics
p. 22-70, 2023
PARSEME Meets Universal Dependencies: Getting on the Same Page in Representing Multiword Expressions
Part of Northern European Journal of Language Technology (NEJLT), 2023
What Should/Do/Can LSTMs Learn When Parsing Auxiliary Verb Constructions?
Part of Computational linguistics - Association for Computational Linguistics (Print), p. 763-784, 2020
What Should/Do/Can LSTMs Learn When Parsing Auxiliary Verb Constructions?
Part of CoRR, 2019
Språklig rytm i skönlitterär prosa. En fallstudie i Karin Boyes Kallocain
Part of Samlaren, p. 128-161, 2018
Generation of Compound Words in Statistical Machine Translation into Compounding Languages
Part of Computational linguistics - Association for Computational Linguistics (Print), p. 1067-1108, 2013
Conference papers
Investigating the Role of Prosody in Disambiguating Implicit Discourse Relations in Egyptian Arabic
p. 926-930, 2024
Part of Proceedings of the 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2024), p. 253-263, 2024
Relation between Cross-Genre and Cross-Topic Transfer in Dependency Parsing
Part of Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), p. 13879-13884, 2024
Som om …: Stil och struktur hos komparativa subjunktionsfraser med konditional bisats
Part of Svenskans beskrivning 38, p. 306-323, 2024
Part of Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023), p. 126-144, 2023
What Causes Unemployment?: Unsupervised Causality Mining from Swedish Governmental Reports
Part of Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023), p. 25-29, 2023
UD-MULTIGENRE: a UD-Based Dataset Enriched with Instance-Level Genre Annotations
Part of Proceedings of the 3rd Workshop on Multi-lingual Representation Learning (MRL), p. 253-267, 2023
Part of Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023), p. 24-35, 2023
Parser Evaluation for Analyzing Swedish 19th–20th Century Literature
Part of Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), p. 335-346, 2023
Multilingual Automatic Speech Recognition for Scandinavian Languages
Part of Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), p. 460-466, 2023
Uppsala University at SemEval-2022 Task 1: Can Foreign Entries Enhance an English Reverse Dictionary?
Part of Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), p. 88-93, 2022
Cause and Effect in Governmental Reports: Two Data Sets for Causality Detection in Swedish
Part of Proceedings of the First Workshop on Natural Language Processing for Political Sciences (PoliticalNLP), p. 46-55, 2022
Exploring Cross-Lingual Transfer to Counteract Data Scarcity for Causality Detection
Part of WWW '22, p. 501-508, 2022
SLäNDa Version 2.0: Improved and Extended Annotation of Narrative and Dialogue in Swedish Literature
Part of Proceedings of the 13th International Conference on Language Resources and Evaluation (LREC 2022), p. 5324-5333, 2022
Uppsala NLP at SemEval-2021 Task 2: Multilingual Language Models for Fine-tuning and Feature Extraction in Word-in-Context Disambiguation
Part of Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), p. 150-156, 2021
Whit’s the Richt Pairt o Speech: PoS tagging for Scots
Part of Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), p. 39-48, 2021
Investigation of Transfer Languages for Parsing Latin: Italic Branch vs. Hellenic Branch
Part of Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), p. 315-320, 2021
A Mention-Based System for Revision Requirements Detection
Part of Proceedings of the 1st Workshop on Understanding Implicit and Underspecified Language, p. 58-63, 2021
Part of Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, p. 107-118, 2020
Cross-Lingual Domain Adaptation for Dependency Parsing
Part of Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories (TLT), p. 62-69, 2020
Evaluating Word Embeddings for Indonesian–English Code-Mixed Text Based on Synthetic Data
Part of Proceedings of the 4th Workshop on Computational Approaches to Code Switching, p. 26-35, 2020
IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation
Part of Proceedings of the First International Workshop on Natural Language Processing Beyond Text, p. 41-50, 2020
SLäNDa: An Annotated Corpus of Narrative and Dialogue in Swedish Literary Fiction
Part of Proceedings of the 12th Language Resources and Evaluation Conference, p. 826-834, 2020
Discourse-Related Language Contrasts in English-Croatian Human and Machine Translation
Part of Proceedings of the Third Conference on Machine Translation: Research Papers, p. 36-48, 2018
82 Treebanks, 34 Models: Universal Dependency Parsing with Multi-Treebank Models
Part of Proceedings of the CoNLL 2018 Shared Task, p. 113-123, 2018
Parser Training with Heterogeneous Treebanks
Part of Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), p. 619-625, 2018
Part of Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 2711-2720, 2018
From raw text to Universal Dependencies: look, no tags!
Part of Proceedings of the CoNLL 2017 Shared Task, p. 207-217, 2017
Arc-Hybrid Non-Projective Dependency Parsing with a Static-Dynamic Oracle
Part of IWPT 2017 15th International Conference on Parsing Technologies, p. 99-104, 2017
Learning with learner corpora: Using the TLE for native language identification
Part of Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition, p. 1-7, 2017
The Effect of Translationese on Tuning for Statistical Machine Translation
Part of Proceedings of the 21st Nordic Conference on Computational Linguistics, p. 241-246, 2017
Annotating Errors in Student Texts: First Experiences and Experiments
Part of Proceedings of Joint 6th NLP4CALL and 2nd NLP4LA Nodalida workshop, p. 47-60, 2017
Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction
Part of Proceedings of the Third Workshop on Discourse in Machine Translation, 2017
A BiLSTM-based System for Cross-lingual Pronoun Prediction
2017
Plausibility Testing for Lexical Resources
Part of Proceedings of CLEF 2017, p. 132-137, 2017
Part of Proceedings of the 15th Treebanks and Linguistic Theories Workshop (TLT), p. 99-110, 2017
Using Word Alignments to Determine the Compositionality of Swedish Compound Nouns
2016
Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction
Part of Proceedings of the First Conference on Machine Translation, p. 525-542, 2016
Part of Proceedings of the First Conference on Machine Translation, p. 391-398, 2016
The UU Submission to the Machine Translation Quality Estimation Task
Part of Proceedings of the First Conference on Machine Translation, p. 825-830, 2016
Feature Exploration for Cross-Lingual Pronoun Prediction
Part of Proceedings of the First Conference on Machine Translation, p. 609-615, 2016
The Effect of Translationese on SMT Tuning
2016
Part of Proceedings of the Second Workshop on Discourse in Machine Translation (DiscoMT), p. 1-16, 2015
Estimating Word Alignment Quality for SMT Reordering Tasks
Part of Proceedings of the Ninth Workshop on Statistical Machine Translation, p. 275-286, 2014
Anaphora Models and Reordering for Phrase-Based SMT
Part of Proceedings of the Ninth Workshop on Statistical Machine Translation, p. 122-129, 2014
Feature Weight Optimization for Discourse-Level SMT
Part of Proceedings of the Workshop on Discourse in Machine Translation (DiscoMT), p. 60-69, 2013
Tunable Distortion Limits and Corpus Cleaning for SMT
Part of Proceedings of the Eighth Workshop on Statistical Machine Translation, p. 225-231, 2013
Statistical Machine Translation with Readability Constraints
Part of Proceedings of the 19th Nordic Conference on Computational Linguistics (NODALIDA 2013), p. 375-386, 2013
Part of Proceedings of the 24th Scandinavian Conference of Linguistics, p. 332-344, 2013
Docent: A Document-Level Decoder for Phrase-Based Statistical Machine Translation
Part of Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, p. 193-198, 2013
On the Interplay between Readability, Summarization, and MTranslatability
Part of Proceedings of the Fourth Swedish Language Technology Conference (SLTC 2012), p. 70-71, 2012
Conference proceedings (editor)
2025