Kavli Affiliate: Xiang Zhang
| First 5 Authors: Xiang Zhang, Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
| Summary:
Large Language Models (LLMs) have demonstrated exceptional natural language understanding abilities and have excelled in a variety of natural language processing (NLP) tasks in recent years. Despite the fact that most LLMs are trained predominantly on English, multiple studies have demonstrated their comparable performance in many other languages. However, fundamental questions persist regarding how LLMs acquire their multilingual abilities and how performance varies across different languages. These inquiries are crucial for the study of LLMs, since users and researchers often come from diverse language backgrounds, potentially influencing their utilization and interpretation of LLMs’ results. In this work, we propose a systematic way of quantifying the performance disparities of LLMs under multilingual settings. We investigate the phenomenon of cross-language generalization in LLMs, wherein limited multilingual training data nevertheless leads to advanced multilingual capabilities. To accomplish this, we employ a novel back-translation-based prompting method. The results show that GPT exhibits highly translation-like behaviour in multilingual settings. A minimal illustrative sketch of such a prompting setup follows this entry.
| Search Query: ArXiv Query: search_query=au:"Xiang Zhang"&id_list=&start=0&max_results=3
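
The summary names a back-translation-based prompting method but does not spell out its details. Below is a minimal sketch, under stated assumptions, of one plausible setup: the same question is answered directly in a target language and, separately, translated into English, answered, and translated back, so the two outputs can be compared for translation-like behaviour. The `ask` callable, the prompt wording, and the example question are illustrative assumptions, not the authors' actual implementation.

```python
from typing import Callable

# Hypothetical sketch of a back-translation-based prompting probe.
# `ask` stands in for any LLM call (prompt in, text out); the prompts below
# are illustrative only, not the paper's exact wording.

def direct_answer(ask: Callable[[str], str], question: str, language: str) -> str:
    """Pose the question directly in the target language."""
    return ask(f"Answer the following question in {language}:\n{question}")

def back_translated_answer(ask: Callable[[str], str], question: str, language: str) -> str:
    """Route the same question through English: translate, answer, translate back."""
    english_q = ask(f"Translate this {language} text into English:\n{question}")
    english_a = ask(f"Answer the following question in English:\n{english_q}")
    return ask(f"Translate this English text into {language}:\n{english_a}")

if __name__ == "__main__":
    # Dummy backend so the sketch runs as-is; replace with a real LLM client.
    echo = lambda prompt: f"[model reply to: {prompt[:40]}...]"
    question = "Wie viele Planeten hat das Sonnensystem?"  # example question in German
    print(direct_answer(echo, question, "German"))
    print(back_translated_answer(echo, question, "German"))
    # Consistent agreement between the two outputs across many questions would
    # suggest translation-like behaviour in multilingual settings.
```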