Kavli Affiliate: Loyal Goff
| Authors: Shijie C. Zheng, Genevieve Stein-O’Brien, Leandros Boukas, Loyal A Goff and Kasper D Hansen
| Summary:
RNA velocity analysis of single cells promises to predict temporal dynamics from gene expression. Indeed, in many systems, it has been observed that RNA velocity produces a vector field that qualitatively reflects known features of the system. Despite this observation, the limitations of RNA velocity estimates are poorly understood. Using real data and simulations, we dissect the impact of different steps in the RNA velocity workflow on the estimated vector field. We find that the process of mapping RNA velocity estimates into a low-dimensional representation, such as those produced by UMAP, has a large impact on the result. The RNA velocity vector field strongly depends on the k-NN graph of the data. This dependence leads to significant estimator errors when the k-NN graph is not a faithful representation of the true data structure, a feature that cannot be known for most real datasets. Finally, we establish that RNA velocity estimates expression speed neither at the gene nor cellular level. We propose that RNA velocity is best considered a smoothed interpolation of the observed k-NN structure, as opposed to an extrapolation of future cellular states, and that the use of RNA velocity as a validation of latent space embedding structures is circular.