Kavli Affiliate: Yi Zhou
| First 5 Authors: Yi Zhou, Jose Camacho-Collados, Danushka Bollegala
| Summary:
Prior work has reported various types of social biases in pretrained Masked
Language Models (MLMs). However, multiple underlying factors are
associated with an MLM such as its model size, size of the training data,
training objectives, the domain from which pretraining data is sampled,
tokenization, and languages present in the pretrained corpora, to name a few.
It remains unclear which of these factors influence the social biases
learned by MLMs. To study the relationship between model factors and the
social biases learned by an MLM, as well as the downstream task performance of
the model, we conduct a comprehensive study over 39 pretrained MLMs covering
different model sizes, training objectives, tokenization methods, training data
domains, and languages. Our results shed light on important factors often
neglected in prior literature, such as tokenization or model objectives.
| Search Query: ArXiv Query: search_query=au:"Yi Zhou"&id_list=&start=0&max_results=3