Telomere-to-telomere sheep genome assembly reveals new variants associated with wool fineness trait

Kavli Affiliate: Wei Min

| Authors: Ling-Yun Luo, Hui Wu, Li-Ming Zhao, Ya-Hui Zhang, Jia-Hui Huang, Qiu-Yue Liu, Hai-Tao Wang, Dong-Xin Mo, He-Hua EEr, Lian-Quan Zhang, Hai-Liang Chen, Shan-Gang Jia, Wei-Min Wang and Meng-Hua Li

| Summary:

Despite ongoing efforts to improve sheep reference genome assemblies, many gaps and incomplete regions remain, and these cause recurring failures and errors in sheep genomic studies. Here, we report a complete, gap-free telomere-to-telomere (T2T) genome of a ram (T2T-sheep1.0) with a size of 2.85 Gb, including all autosomes and chromosomes X and Y. It adds 220.05 Mb of previously unresolved regions (PURs) and 754 new genes to the most recent reference assembly, ARS-UI_Ramb_v3.0, and contains four types of repeat units (SatI, SatII, SatIII, and CenY) in the centromeric regions. T2T-sheep1.0 exhibits a base accuracy of >99.999%, corrects several structural errors in previous reference assemblies, and improves structural variant (SV) detection in repetitive sequences. We identified 192,265 SVs, including 16,885 new SVs in the PURs, from the PacBio long-read sequences of 18 globally representative sheep. Applied to whole-genome short-read sequences of 810 wild and domestic sheep representing 158 global populations and seven wild species, T2T-sheep1.0 as the reference genome improved population genetic analysis based on ∼133.31 million SNPs and 1,265,266 SVs, including 2,664,979 novel SNPs and 196,471 novel SVs. T2T-sheep1.0 also improves tests for selection by detecting several novel genes and variants, including those associated with domestication (e.g., ABCC4) and selection for the wool fineness trait (e.g., FOXQ1) in tandemly duplicated regions.
