Fine-Grained Language Relatedness for Zero-Shot Silesian-English Translation

Authors SIGNORONI Edoardo

Year of publication 2023
Type Article in Proceedings
Conference RASLAN 2023 Recent Advances in Slavonic Natural Language Processing
MU Faculty or unit Faculty of Informatics

Citation
web https://nlp.fi.muni.cz/raslan/raslan23.pdf#page=153
Keywords machine translation;large language models;English;Silesian;evaluation;zero-shot
Description When parallel corpora are not available to train or fine-tune Machine Translation (MT) systems, one solution is to use data from a related language and operate in a zero-shot setting. We explore the behaviour and performance of two pre-trained Large Language Models (LLMs) on zero-shot Silesian-English translation by fine-tuning them on languages increasingly closely related to Silesian. Our experiment shows that using data from related languages generally improves zero-shot translation performance for this language pair, but the optimal fine-grained choice within the Slavic language family is non-trivial and depends on the characteristics of the model.