Multilingual Language Models: Studies of Pre-Training Approaches and Hallucination Detection

  • Date: 28 March 2025, 13:15–14:30
  • Location: Engelska parken, Room: 9-3042 and https://uu-se.zoom.us/j/63206988606
  • Type: Seminar
  • Lecturer: Evangelia Gogoulou
  • Organiser: Ahmed Ruby
  • Contact person: Ahmed Ruby

Abstract: The performance of large language models varies significantly across languages, highlighting the importance of cross-lingual transfer for improving low-resource language capabilities. This PhD thesis investigates how language interactions during pre-training affect model performance across different training schemes, architectures, and evaluation criteria. Through experiments on multilingual joint pre-training and incremental language pre-training, we analyze forward and backward transfer effects and identify key influencing factors, such as language similarity and contamination. We also evaluate multilingual models on hallucination detection tasks, revealing the impact of model-specific factors such as size and instruction tuning. These findings enhance the understanding of cross-lingual transfer and can guide the development of multilingual models with improved learning capacity. In addition, our work provides resources and methods for evaluating hallucinations in machine-generated text. The comprehensive summary of my thesis is available at: https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-356567
