Kavli Affiliate: Yi Zhou | First 5 Authors: Saeth Wannasuphoprasit, Yi Zhou, Danushka Bollegala | Summary: Cosine similarity between two words, computed using their contextualised token embeddings obtained from masked language models (MLMs) such as BERT, has been shown to underestimate the actual similarity between those words (Zhou et al., 2022). This similarity underestimation problem is […]
Continue reading: Solving Cosine Similarity Underestimation between High Frequency Words by L2 Norm Discounting
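
For readers unfamiliar with the quantity the summary refers to, here is a minimal sketch of cosine similarity and of the L2 norm that appears in its denominator. The two toy vectors are made-up stand-ins for contextualised token embeddings, and the function names are illustrative. The sketch does not implement the paper's discounting method, which the truncated summary above does not describe.

```python
import math

def l2_norm(v):
    """Euclidean (L2) length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their L2 norms."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (l2_norm(u) * l2_norm(v))

# Toy 2-d vectors (assumed values, not real BERT embeddings).
emb_a = [3.0, 4.0]
emb_b = [4.0, 3.0]

sim = cosine_similarity(emb_a, emb_b)

# Rescaling one vector changes its L2 norm but not the cosine similarity.
sim_scaled = cosine_similarity([2 * x for x in emb_a], emb_b)
```

Standard cosine similarity is unaffected by rescaling either vector, as the last line illustrates, because each vector's norm is divided out.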