Multilingual Language Models: Studies of Pre-Training Approaches and Hallucination Detection

  • Date: 28 March 2025, 13:15–14:30
  • Location: English Park
  • Type: Seminar
  • Lecturer: Evangelia Gogoulou
  • Organiser: Ahmed Ruby
  • Contact person: Ahmed Ruby

Abstract: The performance of large language models varies significantly across languages, highlighting the importance of cross-lingual transfer for improving low-resource language capabilities. This PhD thesis investigates how language interactions during pre-training affect model performance across different training schemes, architectures, and evaluation criteria. Through experiments on multilingual joint pre-training and incremental language pre-training, we analyze forward and backward transfer effects and identify key influencing factors, such as language similarity and contamination. We also evaluate multilingual models on hallucination detection tasks, revealing the impact of model-specific factors such as size and instruction tuning. These findings deepen the understanding of cross-lingual transfer, guiding the development of multilingual models with improved learning capacity. In addition, our work provides resources and methods for evaluating hallucinations in machine-generated text. A comprehensive summary of the thesis is available at: https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-356567
