Statistically valid explainable black-box machine learning: applications in sex classification across species using brain imaging

Kavli Affiliate: Joshua Vogelstein

Authors: Tingshan Liu, Jayanta Dey, Beiya Xu, Samuel S Alldritt, Karl-Heinz Nenning, Kyoungseob Byeon, Ting Xu and Joshua T. Vogelstein

Summary:

Sex classification using neuroimaging data has the potential to improve personalized diagnostics by revealing subtle structural brain differences that underlie sex-specific disease risks. Despite the promise of machine learning, traditional methods often fail to provide both high classification accuracy and interpretable, statistically validated feature importance scores for high-dimensional imaging data. This gap is particularly evident when conventional techniques such as random forests, LIME, and SHAP are applied, as they struggle to capture complex feature interactions and to manage noise in large datasets. We address this challenge by developing an integrated framework that combines Oblique Random Forests (ORFs) with a novel, permutation-based feature importance testing algorithm. ORFs extend traditional random forests by splitting on linear combinations of features rather than on single features, which yields oblique decision boundaries that capture the intricate interactions inherent in neuroimaging data. Our feature importance testing method, NEOFIT, quantifies the significance of each feature by generating null distributions and computing corrected p-values. We first validate our approach on simulated datasets, establishing its robustness and scalability under controlled conditions. We then apply our method to classify sex from both voxel-wise structural MRI and cortical thickness data in humans and macaques, facilitating direct cross-species comparisons. Our results demonstrate that the proposed framework not only enhances classification performance but also provides clear, interpretable insights into the neuroanatomical features that distinguish the sexes. These methodological advances pave the way for improved diagnostic tools and contribute to a deeper understanding of the evolutionary basis of sex differences in brain structure.
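
The general idea of permutation-based feature importance testing mentioned in the summary can be illustrated with a minimal sketch. This is not NEOFIT and it does not use an oblique random forest: as a stand-in, the importance score here is simply the absolute difference in class means for each feature, the null distribution is built by shuffling the labels, and the p-values are adjusted with the Holm procedure. The function names, the toy data generator, and all parameter values are assumptions made for illustration only.

```python
import random

def importance(X, y):
    """Stand-in importance score (assumption): |mean(class 1) - mean(class 0)| per feature."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        g0 = [row[j] for row, label in zip(X, y) if label == 0]
        g1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(sum(g1) / len(g1) - sum(g0) / len(g0)))
    return scores

def permutation_pvalues(X, y, n_perm=500, seed=0):
    """Build a label-shuffled null distribution and return one p-value per feature."""
    rng = random.Random(seed)
    observed = importance(X, y)
    exceed = [0] * len(observed)
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any association between features and labels
        null_scores = importance(X, y_perm)
        for j, (s_null, s_obs) in enumerate(zip(null_scores, observed)):
            if s_null >= s_obs:
                exceed[j] += 1
    # The add-one correction keeps every p-value strictly positive.
    return [(count + 1) / (n_perm + 1) for count in exceed]

def holm_correction(pvals):
    """Holm step-down adjustment for multiple comparisons."""
    m = len(pvals)
    order = sorted(range(m), key=lambda j: pvals[j])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, j in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[j])
        adjusted[j] = min(1.0, running_max)
    return adjusted

def make_toy_data(n=200, n_features=5, shift=1.5, seed=1):
    """Toy data (hypothetical): only feature 0 differs between the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += shift * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
p_raw = permutation_pvalues(X, y)
p_adj = holm_correction(p_raw)
```

In this toy setting only feature 0 carries signal, so its corrected p-value should be small while the remaining features should typically not reach significance. A real analysis along the lines of the summary would replace the stand-in score with an importance measure derived from the fitted classifier.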
