Kavli Affiliate: Loyal Goff
| Authors: Shijie C. Zheng, Genevieve Stein-O’Brien, Leandros Boukas, Loyal A Goff and Kasper D Hansen
| Summary:
RNA velocity analysis of single cells promises to predict temporal dynamics from gene expression. Indeed, in many systems, it has been observed that RNA velocity produces a vector field which qualitatively reflects known features of the system. Despite this observation, the limitations of RNA velocity estimates are poorly understood. Using real data and simulations, we dissect the impact of different steps in the RNA velocity workflow on the estimated vector field. We find that the process of mapping RNA velocity estimates into a low-dimensional representation, such as those produced by UMAP, has a large impact on the result. The RNA velocity vector field is strongly dependent on the k-NN graph of the data. This dependence leads to large estimation errors when the k-NN graph is not a faithful representation of the true data structure, a property that cannot be verified for most real datasets. Finally, we establish that RNA velocity does not estimate expression speed at either the gene or the cellular level. We propose that RNA velocity is best considered a smoothed interpolation of the observed k-NN structure, as opposed to an extrapolation of future cellular states, and that the use of RNA velocity as a validation of latent space embedding structures is circular.
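
To illustrate why a projected velocity field depends on the k-NN graph, here is a minimal sketch of a neighbor-based projection step. It is not the authors' code. It is loosely modeled on the cosine-similarity and softmax weighting used by common RNA velocity tools, and the function names `knn_indices` and `project_velocity`, the `scale` parameter, and the toy data are all hypothetical. Each projected arrow is a weighted average of unit directions toward a cell's neighbors in the embedding, so in this sketch it can only point where the k-NN graph already has neighbors.

```python
import numpy as np


def knn_indices(X, k):
    """Brute-force Euclidean k-NN (self excluded). Illustrative helper only."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a cell is never its own neighbor
    return np.argsort(d, axis=1)[:, :k]


def project_velocity(X, V, E, nbrs, scale=10.0):
    """Project high-dimensional velocities V onto an embedding E.

    X    : (cells, genes) expression matrix
    V    : (cells, genes) velocity estimates
    E    : (cells, 2) low-dimensional embedding (e.g. UMAP coordinates)
    nbrs : (cells, k) neighbor indices defining the k-NN graph
    scale: softmax sharpness (assumed value, chosen for illustration)
    """
    out = np.zeros_like(E, dtype=float)
    for i in range(X.shape[0]):
        j = nbrs[i]
        dX = X[j] - X[i]  # displacements to neighbors in gene space
        # Cosine similarity between the velocity and each displacement.
        num = dX @ V[i]
        den = np.linalg.norm(dX, axis=1) * np.linalg.norm(V[i]) + 1e-12
        cos = num / den
        # Softmax turns similarities into transition weights over neighbors.
        w = np.exp(scale * cos)
        p = w / w.sum()
        # Unit directions to the same neighbors in the embedding.
        dE = E[j] - E[i]
        dE = dE / (np.linalg.norm(dE, axis=1, keepdims=True) + 1e-12)
        # Weighted mean direction, minus the unweighted mean as a baseline.
        out[i] = (p[:, None] * dE).sum(axis=0) - dE.mean(axis=0)
    return out


# Toy example: three cells on a line; cell 0 has velocity along +x.
X = np.array([[0.0, 0.0], [1.0, 0.0], [-1.0, 0.0]])
V = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
E = X.copy()  # the embedding equals the data in this toy case
nbrs = knn_indices(X, k=2)
arrows = project_velocity(X, V, E, nbrs)
```

Changing `nbrs` while holding `X`, `V`, and `E` fixed changes the arrows, which is the dependence on the k-NN graph described above.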