Identifiability Matters: Revealing the Hidden Recoverable Condition in Unbiased Learning to Rank

Kavli Affiliate: Zhuo Li

| First 5 Authors: Mouxiang Chen, Chenghao Liu, Zemin Liu, Zhuo Li, Jianling Sun

| Summary:

Unbiased Learning to Rank (ULTR) aims to train unbiased ranking models from
biased click logs by explicitly modeling a generation process for user
behavior and fitting click data based on the examination hypothesis. Previous
research found empirically that the true latent relevance is mostly recoverable
through click fitting. However, we demonstrate that such recovery is not always
achievable, and that failure significantly reduces ranking performance. This
research investigates, from first principles, the conditions under which
relevance can be recovered from click data. We first characterize a ranking
model as identifiable if it can recover the true relevance up to a scaling
transformation, a criterion sufficient for the pairwise ranking objective.
We then investigate an equivalent condition for identifiability,
articulated as a graph connectivity test: the recovery of relevance is
feasible if and only if the identifiability graph (IG), derived from the
underlying structure of the dataset, is connected. A disconnected IG may
lead to degenerate cases and suboptimal ranking
performance. To tackle this challenge, we introduce two methods, namely node
intervention and node merging, designed to modify the dataset and restore the
connectivity of the IG. Empirical results on a simulated dataset and
two real-world LTR benchmark datasets not only validate our proposed theory but
also demonstrate the effectiveness of our methods in alleviating data bias when
the relevance model is unidentifiable.
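
The connectivity test described above can be illustrated with a minimal Python sketch. The abstract does not spell out how the IG is built, so the construction here is an assumption made for illustration only: each node is a bias factor (for example a display position), and two bias factors are linked when at least one document is observed under both. The function names and the `(doc_id, bias_factor)` log format are hypothetical and are not taken from the authors' code.

```python
from collections import defaultdict


def build_identifiability_graph(observations):
    """Build an adjacency map over bias factors.

    observations: iterable of (doc_id, bias_factor) pairs (hypothetical format).
    Assumption: two bias factors share an edge if some document appears under both.
    """
    factors_by_doc = defaultdict(set)
    for doc_id, factor in observations:
        factors_by_doc[doc_id].add(factor)

    graph = {}
    for factors in factors_by_doc.values():
        factors = list(factors)
        for factor in factors:
            graph.setdefault(factor, set())
        # Linking the first factor to all others gives the same connected
        # components as linking every pair, with fewer edges.
        first = factors[0]
        for other in factors[1:]:
            graph[first].add(other)
            graph[other].add(first)
    return graph


def connected_components(graph):
    """Return the connected components of the graph as a list of sets."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def is_identifiable(observations):
    """Under the assumed construction, report whether the IG is connected."""
    graph = build_identifiability_graph(observations)
    return len(connected_components(graph)) <= 1


# Toy logs: d1 is shown at positions 1 and 2, d2 at positions 2 and 3.
connected_logs = [("d1", 1), ("d1", 2), ("d2", 2), ("d2", 3)]
# Here position 3 never shares a document with positions 1 or 2.
disconnected_logs = [("d1", 1), ("d1", 2), ("d3", 3)]
```

In this toy setting, `is_identifiable(connected_logs)` holds while `is_identifiable(disconnected_logs)` does not, because position 3 forms its own component. A failed check is the situation that node intervention and node merging are meant to repair; those methods are not sketched here.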

| Search Query: ArXiv Query: search_query=au:"Zhuo Li"&id_list=&start=0&max_results=3