PhaGO: Protein function annotation for bacteriophages by integrating the genomic context

Kavli Affiliate: Cheng Peng

| First 5 Authors: Jiaojiao Guan, Yongxin Ji, Cheng Peng, Wei Zou, Xubo Tang

| Summary:

Bacteriophages are viruses that target bacteria and play a crucial role in
microbial ecology. Phage proteins are key to understanding phage biology,
including viral infection, replication, and evolution. Although a large number of
new phages have been identified via metagenomic sequencing, many of them have
limited protein function annotation. Accurate function annotation of phage
proteins presents several challenges, including their inherent diversity and
the scarcity of annotated ones. Existing tools have yet to fully leverage the
unique properties of phages in annotating protein functions. In this work, we
propose PhaGO, a new protein function annotation tool for phages that leverages
the modular structure of phage genomes. By combining embeddings from the
latest protein foundation models with a Transformer that captures contextual
information between proteins in phage genomes, PhaGO outperforms state-of-the-art
methods in annotating diverged proteins and proteins with uncommon functions,
improving by 6.78% and 13.05%, respectively. PhaGO can annotate proteins lacking
homology search results, which is critical for characterizing the rapidly
accumulating phage genomes. We demonstrate the utility of PhaGO by identifying
688 potential holins in phages, which exhibit high structural conservation with
known holins. The results show the potential of PhaGO to extend our
understanding of newly discovered phages.
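
To illustrate what "capturing contextual information between proteins" can mean, the following is a minimal sketch of single-head scaled dot-product self-attention applied to per-protein embeddings listed in genome order. It is not PhaGO's implementation: the function names, the toy 3-dimensional embeddings, and the use of identity query/key/value projections are assumptions made for illustration. A real Transformer would add learned projections, multiple heads, positional information, and a classification head, none of which are shown here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def contextualize(embeddings):
    """Mix each protein embedding with those of the other proteins in the genome.

    Hypothetical helper: single-head self-attention with identity projections,
    so each output row is a weighted average of all input rows.
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    output = []
    for query in embeddings:
        # Similarity of this protein to every protein in the same genome.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in embeddings
        ]
        weights = softmax(scores)
        # Weighted average of all embeddings, one coordinate at a time.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, embeddings))
            for i in range(dim)
        ]
        output.append(mixed)
    return output


# Toy genome (assumed values): two similar neighbouring proteins and one
# dissimilar protein.
genome = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
context = contextualize(genome)
```

In this toy example, each output vector is a convex combination of the genome's embeddings, so a protein's representation shifts toward the proteins it resembles most. That is the basic mechanism by which neighbouring genes can inform the representation of a protein that has no close homolog.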

| Search Query: ArXiv Query: search_query=au:"Cheng Peng"&id_list=&start=0&max_results=3
