Kavli Affiliate: Li Zhao
| Authors: Chuwen Zhang, Yong He, Jieni Wang, Tengkai Chen, Federico Baltar, Minjie Hu, Jing Liao, Xi Xiao, Zhao-Rong Li and Xiyang Dong
| Summary:
Phosphorus is essential for life and critically influences marine productivity. Despite geochemical evidence of active phosphorus cycling in deep-sea cold seep ecosystems, the microbial processes involved remain poorly understood. Traditional sequence-based searches often fail to detect proteins with remote homology. To address this, we developed a protein language model (PLM) named LucaPCycle that integrates sequence and structural information. Applied to non-redundant gene and genome catalogs from global cold seeps, this PLM-based approach identified 4,606 new phosphorus-cycling protein families, substantially enhancing our understanding of their diversity, ecology, and function. Among previously unannotated sequences, we discovered three novel alkaline phosphatase families (ALP1, ALP2 and ALP3) that feature unique domain organizations and preserved enzymatic capabilities. These results highlight the previously overlooked ecological importance of phosphorus cycling within cold seeps, corroborated by data from porewater geochemistry, metatranscriptomics, and metabolomics. We identified a previously unrecognized diversity of archaea that contribute to organic phosphorus mineralization and inorganic phosphorus solubilization through various mechanisms, including ecologically significant groups such as Asgardarchaeota, anaerobic methanotrophic archaea (ANME), and Thermoproteota. Additionally, viruses can enhance phosphorus utilization by their hosts (e.g., ANME) through the PhoR-PhoB regulatory system and the PhnCDE transporter, indirectly influencing methane dynamics. Overall, our PLM-based functional predictions can access previously "hidden" sequence space for microbial phosphorus cycling and can be applied to various other ecosystems.