An Efficient Multilingual Language Model Compression through Vocabulary Trimming

Kavli Affiliate: Yi Zhou

| First 5 Authors: Asahi Ushio, Yi Zhou, Jose Camacho-Collados

| Summary:

Multilingual language models (LMs) have become a powerful tool in NLP, especially for non-English languages. Nevertheless, the model parameters of multilingual LMs remain large due to the large embedding matrix needed to cover tokens across many languages. In contrast, monolingual LMs can be trained in a target language with a language-specific vocabulary only, but this requires a large budget and the availability of reliable corpora to obtain a high-quality LM from scratch. In this paper, we propose vocabulary trimming (VT), a method to reduce a multilingual LM vocabulary to a target language by deleting irrelevant tokens from its vocabulary. In theory, VT can compress any existing multilingual LM to build monolingual LMs in any language covered by the multilingual LM. In our experiments, we show that VT can retain the original performance of the multilingual LM while being smaller in size than the original model (in general, around 50% of the original vocabulary size is enough). The evaluation is performed over four NLP tasks (two generative and two classification tasks) with four widely used multilingual LMs in seven languages. Finally, we show that this methodology can keep the best of both the monolingual and multilingual worlds: it keeps a small size, as monolingual models do, without the need to specifically retrain them, and it even limits potentially harmful social biases.
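The core idea can be illustrated with a minimal sketch, assuming a Hugging Face multilingual masked LM such as XLM-R: tokenize a target-language corpus, keep only the token ids that actually occur, and slice the embedding matrix down to those rows. The corpus path and the details below are illustrative placeholders, not the authors' actual implementation.

```python
# Minimal sketch of vocabulary trimming (VT), assuming an XLM-R-style model.
# "corpus.txt" is a hypothetical file of monolingual target-language text.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "xlm-roberta-base"  # any multilingual LM covered by VT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# 1. Collect the token ids actually used by the target-language corpus,
#    always keeping the special tokens.
keep_ids = set(tokenizer.all_special_ids)
with open("corpus.txt", encoding="utf-8") as f:
    for line in f:
        keep_ids.update(tokenizer(line.strip())["input_ids"])
keep_ids = sorted(keep_ids)

# 2. Slice the input embedding matrix down to the kept rows.
old_embeddings = model.get_input_embeddings().weight.data
new_embeddings = torch.nn.Embedding(len(keep_ids), old_embeddings.size(1))
new_embeddings.weight.data = old_embeddings[keep_ids].clone()
model.set_input_embeddings(new_embeddings)
model.config.vocab_size = len(keep_ids)

# Note: a complete implementation would also remap the tokenizer's vocabulary
# and any tied output (LM head) embeddings so that token ids stay consistent.
```

Since the embedding matrix dominates the parameter count of multilingual LMs, dropping unused rows in this way shrinks the model without touching the Transformer body, which is why the original performance can be retained.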
