Structured Joint Decomposition (SJD) identifies conserved molecular dynamics across collections of biologically related multi-omics data matrices

Kavli Affiliate: Brian Caffo

| Authors: Huan Chen, Jinrui Liu, Shreyash Sonthalia, Genevieve Stein-O'Brien, Luo Xiao, Brian Caffo and Carlo Colantuoni

| Summary:

Motivation: Exploratory tools are needed to learn from the unprecedented volume of high-dimensional multi-omic data currently being produced. We have developed an R package, SJD, which identifies components of variation that are shared across multiple matrices. The approach focuses specifically on variation across the samples/cells within each dataset while incorporating biologist-defined hierarchical structure among input experiments that can span in vivo and in vitro systems, multi-omic data modalities, and species.

Results: SJD enables the definition of molecular variation that is conserved across systems, variation that is shared within subsets of studies, and elements unique to individual matrices. We have included functions to simplify the construction and visualization of highly complex in silico experiments involving many diverse multi-omic matrices from multiple species. Here we apply SJD to decompose four RNA-seq experiments focused on neurogenesis in the neocortex. The public datasets used in this analysis and the conserved transcriptomic dynamics in mammalian neurogenesis that we define here can be viewed and explored together at https://nemoanalytics.org/p?l=ChenEtAlSJD2022&g=DCX.

Availability and Implementation: The SJD package can be found at https://chuansite.github.io/SJD.

Contact: hzchenhuan{at}gmail.com; ccolant1{at}jhmi.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Competing Interest Statement: The authors have declared no competing interest.
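
To make the idea of "components of variation shared across multiple matrices" concrete, here is a minimal Python sketch. It is not the SJD package (which is written in R) and does not reproduce its algorithms or API. It assumes a generic concatenated-SVD approach: matrices that share the same genes (rows) are centered across their own samples, scaled, placed side by side, and decomposed to obtain one set of gene loadings common to all datasets. The function name `joint_components`, the scaling choice, and the simulated data are all illustrative assumptions.

```python
import numpy as np

def joint_components(matrices, n_components=2):
    """Illustrative joint decomposition (an assumption, not SJD's method).

    Each matrix is genes x samples, and all matrices share the same genes.
    Returns shared gene loadings and per-dataset sample scores.
    """
    # Center each gene across the samples of its own dataset, so the
    # decomposition reflects within-dataset variation only.
    centered = [m - m.mean(axis=1, keepdims=True) for m in matrices]
    # Scale each dataset to unit Frobenius norm so that no single
    # dataset dominates the shared components.
    scaled = [c / np.linalg.norm(c) for c in centered]
    # Concatenate along the sample axis and take a thin SVD.
    stacked = np.hstack(scaled)
    u, _, _ = np.linalg.svd(stacked, full_matrices=False)
    loadings = u[:, :n_components]                # genes x components
    scores = [loadings.T @ c for c in centered]   # components x samples
    return loadings, scores

# Simulated data: three datasets driven by one shared gene signature.
rng = np.random.default_rng(0)
n_genes = 50
shared = rng.normal(size=(n_genes, 1))
datasets = [
    shared @ rng.normal(size=(1, n)) + 0.1 * rng.normal(size=(n_genes, n))
    for n in (20, 30, 25)
]
loadings, scores = joint_components(datasets, n_components=1)
```

In this toy setting the single recovered loading vector should align closely (up to sign) with the simulated shared signature, and each entry of `scores` gives that component's value across the samples of one dataset. Identifying variation shared only within subsets of studies, or unique to one matrix, requires the structured approach described in the abstract and is not covered by this sketch.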
