Beáta Megyesi
Professor in Computational Linguistics (leave of absence) at Department of Linguistics and Philology
- Telephone:
- +46 18 471 78 60
- E-mail:
- Beata.Megyesi@lingfil.uu.se
- Visiting address:
- Engelska parken
Thunbergsvägen 3H
- Postal address:
- Box 635
751 26 UPPSALA
- Leave of absence:
- 2023-08-01 - 2025-07-31
- ORCID:
- 0000-0002-4838-6518
Short presentation
I am a professor of computational linguistics, currently on leave from Uppsala University for a professorship at Stockholm University.
My main research areas are natural language processing and digital philology. I conduct research on historical cryptology, developing methods to automatically crack historical ciphers. I also develop tools for analyzing historical and modern texts in various genres to enable large, quantitative studies in the humanities and social sciences.
Keywords
- digital humanities
- historical cryptology
- natural language processing
Biography
Education
- Professor of Computational Linguistics, Department of Linguistics and Philology, Uppsala University, 2021
- Associate Professor in Computational Linguistics, Department of Linguistics and Philology, Uppsala University, 2013
- PhD in Speech Communication, Department of Speech, Music and Hearing, KTH, 2002
- B.A. in Computational Linguistics, Department of Linguistics, Stockholm University, 2000
Appointments
Present:
- Vice chair and member of the Linguistics review panel at the Swedish Research Council, 2021-2023
- Member of the nominating committee of the Northern European Association for Language Technology – NEALT, 2022-2025
- Vice-chair and member of the board of the Center for Digital Humanities, Uppsala University, 2021-2023
Past:
- President of the Northern European Association for Language Technology – NEALT, 2020-2021
- Head of Department of Linguistics and Philology, 2009-2018
- Director of the English Park Campus, Uppsala University, 2017-2018
- Vice-president of the Northern European Association for Language Technology – NEALT, 2018-2019
- Member of the board at the Dept. of Linguistics and Philology, 2007-2009, 2010-2012, 2012-2015, 2016-2018
- Member of the board of the faculty of languages, Uppsala University, 2008-2011, 2011-2014, 2019-2020
- Director of studies at the Department of Linguistics and Philology, 2007-2009
- Program coordinator for the Language Technology Program, Uppsala University, 2004-2007
- Member of the board at the Department of Speech, Music and Hearing, 2003-2004
Teaching
Basic level courses
- Languages, computers, and text processing (in Swedish)
- Advisor for Language Technology Project, 7.5 ECTS
- BA thesis supervision
Advanced level courses
- Research and Development, 15 ECTS
- Digital Philology, 5/7.5 ECTS
- Thesis work in language technology, 30 ECTS
- Advisor for Language Technology Project, 7.5 ECTS
- Master thesis supervision
PhD education
- Co-supervisor of Eva Petterson and Mojgan Seraji
Other things I like: my twins, traveling, Amnesty International, workouts such as skiing, piloxing and pump, books, cello, chocolate, margaritas and cosmos, ladies of jazz, The Bridges of Madison County, and of course my dearest best friends (girls, you know who you are!), and my (often empty) not-to-do list...
Things I don't like: greed, injustice, and ruling techniques
Research
Research interests
- Historical Cryptology
- Digital Philology focusing on the automatic analysis of historical texts and student writings
- PoS tagging, morphological analysis, chunking, shallow parsing for different types of languages
- Parallel corpora and treebanks
- Text categorization
Projects
- DECRYPT: Decryption of historical manuscripts (PI, Vetenskapsrådet: 2018-2024)
- DECODE: Automatic decoding of historical manuscripts (PI, Vetenskapsrådet: 2015-2017)
- SweLL - L2 infrastructure: Research Infrastructure for Swedish as a second language (RJ, 2017-2019)
- SWE-CLARIN - SWEGRAM: Automatic annotation and analysis of Swedish texts (Swedish Research Council, 2014-2018, 2019-2022)
- Swedish treebank
- Grammar extraction
- Basic Language Resource Kit for Swedish

Publications
Selection of publications
What Was Encoded in Historical Cipher Keys in the Early Modern Era?
Part of Proceedings of the 5th International Conference on Historical Cryptology. HistoCrypt 2022, 2022
Lost in Transcription of Graphic Signs in Ciphers
Part of Proceedings of the 5th International Conference on Historical Cryptology. HistoCrypt 2022, p. 153-158, 2022
The DECODE Database of Historical Ciphers and Keys: Version 2
Part of Proceedings of the 5th International Conference on Historical Cryptology. HistoCrypt 2022, p. 111-114, 2022
Proceedings of the 5th International Conference on Historical Cryptology
2022
Identifying Cleartext in Historical Ciphers
Part of Proceedings of the Workshop on Language Technologies for Historical and Ancient Languages. LT4HALA 2022, 2022
Unsupervised Alphabet Matching in Historical Encrypted Manuscript Images
Part of Proceedings of the 4th International Conference on Historical Cryptology HistoCrypt 2021, 2021
Transcription of Historical Ciphers and Keys: Guidelines, version 2.0
2021
Deciphering Papal Ciphers from the 16th to the 18th Century
Part of Cryptologia, p. 479-540, 2021
Part of Proceedings of the 28th International Conference on Computational Linguistics. COLING 2020, p. 357-369, 2020
Proceedings of the 3rd International Conference on Historical Cryptology
2020
Transcription of Historical Ciphers and Keys
Part of Proceedings of the 3rd International Conference on Historical Cryptology, p. 106-115, 2020
A Web-based Interactive Transcription Tool for Encrypted Manuscripts
Part of Proceedings of the 3rd International Conference on Historical Cryptology HistoCrypt 2020, 2020
Decryption of historical manuscripts: the DECRYPT project
Part of Cryptologia, p. 545-559, 2020
Pseudonymization of Language Learner Data
Part of Workshop om pseudonymisering av textdata, 2019
The SweLL Language Learner Corpus: From Design to Annotation
Part of Northern European Journal of Language Technology (NEJLT), p. 67-104, 2019
Matching Keys and Encrypted Manuscripts
Part of Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa '19), 2019
Proceedings of the Workshop on NLP and Pseudonymisation
2019
The DECODE Database: Collection of Historical Ciphers and Keys
Part of Proceedings of the 2nd International Conference on Historical Cryptology, p. 69-78, 2019
Towards a Generic Unsupervised Method for Transcription of Encoded Manuscripts
Part of Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, 2019
SWEGRAM: Annotering och analys av svenska texter
2019
Part of Proceedings of the 7th NLP4CALL, 2018
Proceedings of the 1st International Conference on Historical Cryptology: HistoCrypt 2018
2018
The HistCorp Collection of Historical Corpora and Resources
Part of DHN 2018, p. 306-320, 2018
Annotation of learner corpora: first SweLL insights
Part of Abstracts of SLTC 2018, p. 86-89, 2018
Annotating Errors in Student Texts: First Experiences and Experiments
Part of Proceedings of Joint 6th NLP4CALL and 2nd NLP4LA Nodalida workshop, p. 47-60, 2017
SWEGRAM: A Web-Based Tool for Automatic Annotation and Analysis of Swedish Texts
Part of Proceedings of the 21st Nordic Conference on Computational Linguistics, Nodalida 2017, p. 132-141, 2017
Transcription of Encoded Manuscripts with Image Processing Techniques
Part of Proceedings of Digital Humanities 2017, 2017
A Friend in Need?: Research agenda for electronic Second Language infrastructure
Part of Proceedings of SLTC 2016, 2016
The Uppsala Corpus of Student Writings: Corpus Creation, Annotation, and Analysis
Part of LREC 2016, p. 3192-3199, 2016
Proceedings of the 20th Nordic Conference of Computational Linguistics
ACL Anthology, 2015
A Multilingual Evaluation of Three Spelling Normalization Methods for Historical Text
Part of Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), p. 32-41, 2014
Professional language in Swedish clinical text: Linguistic characterization and comparative studies
Part of Nordic Journal of Linguistics, p. 297-323, 2014
The Secrets of the Copiale Cipher
Part of Research into Freemasonry and Fraternalism, p. 314-324, 2011
Part of Proceedings of the NODALIDA 2009 workshop Nordic Perspectives on the CLARIN Infrastructure of Language Resources, p. 1-5, 2009
Part of Multilingualism, 2009
Cultivating a Swedish Treebank
Part of Resourceful Language Technology, p. 111-120, Acta Universitatis Upsaliensis, 2008
Language Resources and Tools for Swedish: A Survey
Part of Proceedings of the Sixth International Language Resources and Evaluation (LREC 2008), 2008
Single Malt or Blended? A Study in Multilingual Parser Optimization
Part of Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, p. 933-939, 2007
General-Purpose Text Categorization Applied to the Medical Domain.
2007
The Swedish-Turkish Parallel Corpus and Tools for its Creation
Part of Proceedings of NoDaLida 2007, 2007
A Study on Automatically Extracted Keywords in Text Categorization
Part of Proceedings of International Conference of Association for Computational Linguistics, 2006
Exploring the Prosody-Syntax Interface in Conversations
Part of Proceedings of the 15th International Congress of Phonetic Sciences, 2003
Part of Proceedings of Fonetik 2002, 2002
Recent publications
Keys with nomenclatures in the early modern Europe
Part of Cryptologia, p. 97-139, 2024
What is the Code for the Code? Historical Cryptology Terminology
Part of Proceedings of the 6th International Conference on Historical Cryptology HistoCrypt 2023, 2023
Towards Data-effective Educational Question Generation with Prompt-based Learning
Part of Proceedings of 2023 Computing Conference, 2023
2023
Historical Language Models in Cryptanalysis: Case Studies on English and German
Part of Proceedings of the 6th International Conference on Historical Cryptology HistoCrypt 2023, 2023
All publications
Articles in journal
Keys with nomenclatures in the early modern Europe
Part of Cryptologia, p. 97-139, 2024
Few shots are all you need: A progressive learning approach for low resource handwritten text recognition
Part of Pattern Recognition Letters, p. 43-49, 2022
Deciphering Papal Ciphers from the 16th to the 18th Century
Part of Cryptologia, p. 479-540, 2021
Decryption of historical manuscripts: the DECRYPT project
Part of Cryptologia, p. 545-559, 2020
The SweLL Language Learner Corpus: From Design to Annotation
Part of Northern European Journal of Language Technology (NEJLT), p. 67-104, 2019
Parallel corpora and Universal Dependencies for Turkic
Part of Turkic languages, p. 259-273, 2015
Professional language in Swedish clinical text: Linguistic characterization and comparative studies
Part of Nordic Journal of Linguistics, p. 297-323, 2014
Bootstrapping a Persian Dependency Treebank
Part of Linguistic Issues in Language Technology, 2012
The Secrets of the Copiale Cipher
Part of Research into Freemasonry and Fraternalism, p. 314-324, 2011
Shallow Parsing with PoS Taggers and Linguistic Features.
Part of Journal of Machine Learning Research: Special Issue on Shallow Parsing, p. 639-668, 2002
Chapters in book
Cultivating a Swedish Treebank
Part of Resourceful Language Technology. A Festschrift in Honor of Anna Sågvall Hein, p. 111-120, Acta Universitatis Upsaliensis, 2008
Supporting Research Environment for Less Explored Languages: A Case Study of Swedish and Turkish
Part of Resourceful Language Technology, p. 96-110, Uppsala universitet, 2008
Collections (editor)
Proceedings of the 20th Nordic Conference of Computational Linguistics
ACL Anthology, 2015
Resourceful Language Technology: Festschrift in Honor of Anna Sågvall Hein
Acta Universitatis Upsaliensis, 2008
Conference papers
What is the Code for the Code? Historical Cryptology Terminology
Part of Proceedings of the 6th International Conference on Historical Cryptology HistoCrypt 2023, 2023
Towards Data-effective Educational Question Generation with Prompt-based Learning
Part of Proceedings of 2023 Computing Conference, 2023
Historical Language Models in Cryptanalysis: Case Studies on English and German
Part of Proceedings of the 6th International Conference on Historical Cryptology HistoCrypt 2023, 2023
What Was Encoded in Historical Cipher Keys in the Early Modern Era?
Part of Proceedings of the 5th International Conference on Historical Cryptology. HistoCrypt 2022, 2022
Lost in Transcription of Graphic Signs in Ciphers
Part of Proceedings of the 5th International Conference on Historical Cryptology. HistoCrypt 2022, p. 153-158, 2022
The DECODE Database of Historical Ciphers and Keys: Version 2
Part of Proceedings of the 5th International Conference on Historical Cryptology. HistoCrypt 2022, p. 111-114, 2022
Identifying Cleartext in Historical Ciphers
Part of Proceedings of the Workshop on Language Technologies for Historical and Ancient Languages. LT4HALA 2022, 2022
Key Design in the Early Modern Era in Europe
Part of Proceedings of the 4th International Conference on Historical Cryptology (HistoCrypt 2021), 2021
Revealing Secrets from the Past: Studying Historical Ciphers.
2021
Unsupervised Alphabet Matching in Historical Encrypted Manuscript Images
Part of Proceedings of the 4th International Conference on Historical Cryptology HistoCrypt 2021, 2021
Part of Proceedings of the 28th International Conference on Computational Linguistics. COLING 2020, p. 357-369, 2020
Transcription of Historical Ciphers and Keys
Part of Proceedings of the 3rd International Conference on Historical Cryptology, p. 106-115, 2020
A Web-based Interactive Transcription Tool for Encrypted Manuscripts
Part of Proceedings of the 3rd International Conference on Historical Cryptology HistoCrypt 2020, 2020
Automatic Key Structure Extraction
Part of Proceedings of the 3rd International Conference on Historical Cryptology, p. 146-152, 2020
Pseudonymization of Language Learner Data
Part of Workshop om pseudonymisering av textdata, 2019
Matching Keys and Encrypted Manuscripts
Part of Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa '19), 2019
The DECODE Database: Collection of Historical Ciphers and Keys
Part of Proceedings of the 2nd International Conference on Historical Cryptology, p. 69-78, 2019
Towards a Generic Unsupervised Method for Transcription of Encoded Manuscripts
Part of Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, 2019
Part of Proceedings of the 7th NLP4CALL, 2018
The HistCorp Collection of Historical Corpora and Resources
Part of DHN 2018, p. 306-320, 2018
Annotation of learner corpora: first SweLL insights
Part of Abstracts of SLTC 2018, p. 86-89, 2018
Annotating Errors in Student Texts: First Experiences and Experiments
Part of Proceedings of Joint 6th NLP4CALL and 2nd NLP4LA Nodalida workshop, p. 47-60, 2017
SWEGRAM: A Web-Based Tool for Automatic Annotation and Analysis of Swedish Texts
Part of Proceedings of the 21st Nordic Conference on Computational Linguistics, Nodalida 2017, p. 132-141, 2017
Transcription of Encoded Manuscripts with Image Processing Techniques
Part of Proceedings of Digital Humanities 2017, 2017
Swe-Clarin: Language Resources and Technology for Digital Humanities
Part of Digital Humanities 2016, p. 29-51, 2016
A Friend in Need?: Research agenda for electronic Second Language infrastructure
Part of Proceedings of SLTC 2016, 2016
The Uppsala Corpus of Student Writings: Corpus Creation, Annotation, and Analysis
Part of LREC 2016, p. 3192-3199, 2016
Ranking Relevant Verb Phrases Extracted from Historical Text
Part of Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, 2015
A Multilingual Evaluation of Three Spelling Normalization Methods for Historical Text
Part of Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), p. 32-41, 2014
Verb Phrase Extraction in a Historical Context
2014
Automatic Morphosyntactic Analysis of Clinical Text
2014
A Multilingual Evaluation of Three Spelling Normalization Methods for Historical Text.
Part of Workshop on Language Technology for Cultural Heritage, Social Sciences and Humanities, LaTeCH 2014, 2014
EACL - Expansion of Abbreviations in CLinical text
Part of Workshop on Predicting and Improving Text Readability for Target Reader Populations, PITR 2014, 2014
Part of Proceedings of the 19th Nordic Conference on Computational Linguistics, 2013
An SMT Approach to Automatic Annotation of Historical Texts
Part of Workshop on Computational Historical Linguistics, Nodalida 2013, 2013
A Basic Language Resource Kit for Persian
Part of Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12), p. 2245-2252, 2012
Rule-Based Normalisation of Historical Text – a Diachronic Study
Part of Empirical Methods in Natural Language Processing, p. 333-341, 2012
Parsing the Past - Identification of Verb Constructions in Historical Text
Part of Language Technology for Cultural Heritage, Social Sciences, and Humanities, 2012
Dependency Parsers for Persian
Part of Proceedings of 10th Workshop on Asian Language Resources, COLING 2012, 24th International Conference on Computational Linguistics, Mumbai, India, 2012
2011
Using Parallel Corpora in Data-Driven Teaching of Turkish in Sweden.
p. 1686-1689, 2010
The English-Swedish-Turkish Parallel Treebank
Part of Proceedings of Language Resources and Evaluation (LREC 2010), 2010
Part of Proceedings of the NODALIDA 2009 workshop Nordic Perspectives on the CLARIN Infrastructure of Language Resources, p. 1-5, 2009
The Open Source Tagger HunPoS for Swedish.
Part of Proceedings of the 17th Nordic Conference on Computational Linguistics (NODALIDA), 2009
Part of Multilingualism, 2009
Swedish-Turkish Parallel Treebank
Part of Proceedings of the Sixth International Language Resources and Evaluation (LREC'08), 2008
Language Resources and Tools for Swedish: A Survey
Part of Proceedings of the Sixth International Language Resources and Evaluation (LREC 2008), 2008
Single Malt or Blended? A Study in Multilingual Parser Optimization
Part of Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, p. 933-939, 2007
Single Malt or Blended? A Study in Multilingual Parser Optimization.
Part of Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, 2007
Bootstrapping a Swedish Treebank Using Cross-Corpus Harmonization and Annotation Projection.
Part of Proceedings of Treebanks and Linguistic Theories, 2007
The Swedish-Turkish Parallel Corpus and Tools for its Creation
Part of Proceedings of NoDaLida 2007, 2007
Bootstrapping a Swedish Treebank Using Cross-Corpus Harmonization and Annotation Projection
Part of Proceedings of the 6th International Workshop on Treebanks and Linguistic Theories, p. 97-102, 2007
A Study on Automatically Extracted Keywords in Text Categorization
Part of Proceedings of International Conference of Association for Computational Linguistics, 2006
Building a Swedish-Turkish Parallel Corpus
Part of Proceedings of Language Resources and Evaluation Conference, 2006
Using Linguistic Data for Genre Classification
Part of Proceedings of SAIS-SSLS, 2005
The Acoustic and Morpho-Syntactic Context of Prosodic Boundaries in Dialogs.
Part of Proceedings of Fonetik 2003, 2003
Exploring the Prosody-Syntax Interface in Conversations
Part of Proceedings of the 15th International Congress of Phonetic Sciences, 2003
Part of Proceedings of Fonetik 2002, 2002
Part of Proceedings of ICSLP'2002 - 7th International Conference on Spoken Language Processing, 2002
Data-Driven Methods for Building a Swedish Treebank.
Part of Swedish Treebank Symposium, 2002
Silence and Discourse Context in Read Speech and Dialogues in Swedish
Part of Proceedings of the Speech Prosody 2002 conference, p. 363-366, 2002
Comparing Data-Driven Learning Algorithms for PoS Tagging of Swedish
Part of Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2001), 2001
Pausing in Dialogues and Read Speech: Speaker's Production and Listeners Interpretation
Part of Proceedings of the Workshop on Prosody in Speech Recognition and Understanding, 2001
A Comparative Study of Pauses in Dialogues and Read Speech.
Part of Proceedings of Eurospeech 2001, p. 931-935, 2001
Data-Driven Methods for PoS tagging and Chunking of Swedish
Part of Proceedings of the Nordic Conference on Computational Linguistics, Nodalida 2001, 2001
Phrasal Parsing by Using Data-Driven PoS Taggers
Part of Proceedings of the Conference on Recent Advances in Natural Language Processing, p. 166-173, 2001
Ensemble of Classifiers for Noise Detection in PoS Tagged Corpora
Part of Proceedings of the Third International Workshop on TEXT, SPEECH and DIALOGUE, p. 27-32, 2000
Towards a Finite-State Parser for Swedish
Part of Proceedings of NoDaLiDa 99, p. 115-123, 2000
Improving Brill's PoS Tagger for an Agglutinative Language
Part of Proceedings of the Joint Sigdat Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, p. 275-284, 1999
Brill's PoS Tagger with Extended Lexical Templates for Hungarian
Part of Proceedings of the Workshop (W01) on Machine Learning in Human Language Technology, p. 22-28, 1999
Conference proceedings (editor)
2023
Proceedings of the 5th International Conference on Historical Cryptology
2022
Proceedings of the 3rd International Conference on Historical Cryptology
2020
Proceedings of the Workshop on NLP and Pseudonymisation
2019
Proceedings of the 1st International Conference on Historical Cryptology: HistoCrypt 2018
2018
Reports
SweLL transcription guidelines, L2 essays
2021
SweLL Pseudonymization Guidelines
2021
Transcription of Historical Ciphers and Keys: Guidelines, version 2.0
2021
Transcription of Historical Ciphers and Keys: Guidelines
2020
SWEGRAM: Annotering och analys av svenska texter
2019
Survey on Swedish Language Resources
2008
The Open Source Tagger HunPoS for Swedish
2008
Supporting Research Environment for Swedish and Turkish
2008
Converting SUC2.0 to XCES with stand-off annotation
2007
Changing the tokenization in Talbanken to SUC2.0
2007
General-Purpose Text Categorization Applied to the Medical Domain.
2007