Solving the “Blind men and the elephant problem”: Additive deep learning of complex high dimensional models from partial faceted datasets

Kavli Affiliate: Denis Wirtz

| Authors: Yufei Wu, Pei-Hsun Wu, Allison Chambliss, Denis Wirtz and Sean Sun

| Summary:

Biological systems are complex networks involving tens of thousands of interacting molecular components, and measurable biological functions are emergent properties of these networks. Many quantitative studies in biology attempt to connect biological function with molecular components and genes, developing mechanistic understanding in the process. However, it is challenging to quantify the contribution of all components to a biological function simultaneously, especially at the single-cell level. Instead, typical experiments measure only a subset of the variables (a facet). This makes it difficult to obtain a complete and unbiased understanding of the network and of how its components cooperatively contribute to the biological function. In this paper, we explore a machine learning approach that combines different facets of data to obtain a complete picture of the biological system, based on conditional distributions from faceted data subsets. Both a polynomial regression approach and a neural network approach are developed and examined with two sets of concrete examples: a mechanical spring network system deforming under external forces, and a small (8-dimensional) biological network that includes the cellular senescence marker P53. In the latter example, single-cell data are collected to validate the machine learning approach. We find that the full system is successfully reconstructed from faceted data in both examples. We further discuss the additive property of the model, whereby its predictive accuracy increases with the number of simultaneously measured variables (the dimension of the subsets). Our model provides a systematic and novel approach to integrating different pieces of experimental information to reconstruct complex high-dimensional systems, arriving at an unbiased and holistic model of biological function.
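To make the idea of combining facets concrete, here is a minimal sketch in Python. It is not the authors' method: it assumes a toy linear model with three correlated, zero-mean inputs, where each hypothetical experiment records the output together with only two of the three inputs. The names `w_true`, `cov_x`, and `sample_facet` are illustrative assumptions. The sketch pools second moments estimated from each facet into one covariance matrix and solves the normal equations for the full set of coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy ground truth (not from the paper): y = w_true . x + noise,
# with zero-mean, correlated inputs.
w_true = np.array([1.5, -2.0, 0.5])
cov_x = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])


def sample_facet(n, observed):
    """Draw n fresh samples but return only the observed inputs and y."""
    x = rng.multivariate_normal(np.zeros(3), cov_x, size=n)
    y = x @ w_true + 0.1 * rng.standard_normal(n)
    return x[:, observed], y


# Each facet is a separate experiment that sees two of the three inputs.
facets = [[0, 1], [0, 2], [1, 2]]

# Accumulate second moments per entry, counting how many facets saw each one.
sum_xx, count_xx = np.zeros((3, 3)), np.zeros((3, 3))
sum_xy, count_xy = np.zeros(3), np.zeros(3)
for idx in facets:
    xs, y = sample_facet(100_000, idx)
    n = len(y)
    sum_xx[np.ix_(idx, idx)] += xs.T @ xs / n
    count_xx[np.ix_(idx, idx)] += 1
    sum_xy[idx] += xs.T @ y / n
    count_xy[idx] += 1

sigma_xx = sum_xx / count_xx  # pooled estimate of Cov(x, x)
sigma_xy = sum_xy / count_xy  # pooled estimate of Cov(x, y)

# Full-model coefficients from the normal equations.
w_hat = np.linalg.solve(sigma_xx, sigma_xy)

# For contrast: regressing y on a single facet absorbs the unmeasured
# input's effect into the measured ones, so its coefficients are biased.
xs01, y01 = sample_facet(100_000, [0, 1])
w_single_facet, *_ = np.linalg.lstsq(xs01, y01, rcond=None)

print("true coefficients:  ", w_true)
print("pooled from facets: ", w_hat)
print("single facet (0, 1):", w_single_facet)
```

In this linear toy case the pairwise facets together determine every second moment the full regression needs, which is why the pooled estimate can recover all three coefficients while any single facet cannot. Nonlinear systems such as those in the paper would need richer models, for example higher-order polynomial terms or a neural network.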
