WheaCha: A Method for Explaining the Predictions of Models of Code

Kavli Affiliate: Ke Wang

| First 5 Authors: Yu Wang, Ke Wang, Linzhang Wang

| Summary:

Attribution methods have emerged as a popular approach to interpreting model
predictions based on the relevance of input features. Although a feature
importance ranking can provide insight into how a model arrives at a prediction
from a raw input, it does not give a clear-cut definition of the key features
the model uses for the prediction. In this paper, we present a new method, called
WheaCha, for explaining the predictions of code models. Although WheaCha
employs the same mechanism of tracing model predictions back to the input
features, it differs from all existing attribution methods in crucial ways.
Specifically, for any prediction of a learned code model, WheaCha divides the
input program into "wheat" (i.e., the defining features that cause the model to
predict the label it predicts) and "chaff" (everything else). We
realize WheaCha in a tool, HuoYan, and use it to explain four prominent code
models: code2vec, seq-GNN, GGNN, and CodeBERT. Results show that (1) HuoYan is
efficient, taking on average under twenty seconds to compute the wheat for an
input program in an end-to-end fashion (i.e., including model prediction time);
(2) the wheat that all models use to predict input programs is made of simple
syntactic or even lexical properties (i.e., identifier names); and (3) wheat
enables a novel approach to explaining the predictions of code models through
the lens of training data.
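
To make the wheat/chaff idea concrete, here is a toy sketch. It is not the HuoYan algorithm, which the summary does not describe. It assumes a greedy token-removal loop, and both `split_wheat_chaff` and `toy_model` are hypothetical names invented for illustration. The "model" is a stand-in that keys on a single identifier name, echoing finding (2).

```python
from typing import Callable, List, Sequence, Tuple


def split_wheat_chaff(
    tokens: Sequence[str],
    predict: Callable[[Sequence[str]], str],
) -> Tuple[List[str], List[str]]:
    """Greedily drop tokens whose removal leaves the predicted label unchanged.

    Assumption: this simple reduction loop only illustrates the wheat/chaff
    split; it is not the procedure implemented in HuoYan.
    """
    target = predict(tokens)
    kept = list(range(len(tokens)))  # indices currently treated as wheat
    changed = True
    while changed:
        changed = False
        for i in list(kept):
            trial = [j for j in kept if j != i]
            # Keep the removal only if the model still predicts the same label.
            if predict([tokens[j] for j in trial]) == target:
                kept = trial
                changed = True
    kept_set = set(kept)
    wheat = [tokens[j] for j in kept]
    chaff = [tokens[j] for j in range(len(tokens)) if j not in kept_set]
    return wheat, chaff


def toy_model(tokens: Sequence[str]) -> str:
    # Hypothetical stand-in for a code model: the label depends only on
    # whether a particular identifier name appears in the input.
    return "sort" if "sorted_list" in tokens else "unknown"


program = "def f ( xs ) : sorted_list = sorted ( xs ) ; return sorted_list".split()
wheat, chaff = split_wheat_chaff(program, toy_model)
print("wheat:", wheat)
print("chaff size:", len(chaff))
```

In this toy setting the surviving tokens alone are enough to preserve the model's label, which is the intuition behind calling them the defining features of the prediction.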

| Search Query: ArXiv Query: search_query=au:"Ke Wang"&id_list=&start=0&max_results=10
