Volume 41 Issue 4
Jul.  2024

WEI Ya’nan, GONG Minggui, BAI Na, SU Jiajie, JIANG Xia. Analysis of codon preference in chloroplast genome of Dendrocalamus farinosus[J]. Journal of Zhejiang A&F University, 2024, 41(4): 696-705. doi: 10.11833/j.issn.2095-0756.20230498

Analysis of codon preference in chloroplast genome of Dendrocalamus farinosus

doi: 10.11833/j.issn.2095-0756.20230498
  • Received Date: 2023-10-08
  • Accepted Date: 2024-03-11
  • Rev Recd Date: 2024-03-06
  • Available Online: 2024-07-12
  • Publish Date: 2024-07-12
  • Abstract:   Objective  This study explores codon usage patterns in the chloroplast genome of Dendrocalamus farinosus, analyzes the main factors shaping its codon usage bias, and identifies the optimal codons, so as to provide a reference for chloroplast genomics research in Bambusoideae.   Method  Based on GenBank accession number MZ681865.156, 85 chloroplast coding sequences of D. farinosus were downloaded from the National Center for Biotechnology Information (NCBI) database. CodonW, CUSP and R were used to analyze the effective number of codons (ENC), codon adaptation index (CAI) and relative synonymous codon usage (RSCU). Correspondence analysis of RSCU was performed, and codons were ranked by ENC and RSCU values.   Result  The average GC content of codons in the chloroplast genome was 39.48%, with GC1 (47.69%) > GC2 (39.70%) > GC3 (31.05%), and codons preferentially ended in A/U. Most ENC values were above 35 and the CAI was 0.167, indicating weak codon usage bias. Neutrality plot, ENC-plot and PR2-plot analyses showed that natural selection was the main factor affecting codon usage bias in the chloroplast genome of D. farinosus. A total of 18 codons, including GCA, GCU, UUC and GGU, were identified as optimal codons for the chloroplast genome of D. farinosus.   Conclusion  Natural selection is the main factor contributing to codon usage bias in the chloroplast genome of D. farinosus, and 18 optimal codons, such as GCU, GAU and GGU, were identified. [Ch, 5 fig. 5 tab. 30 ref.]
  • References
    [1] ZHOU Tao, YANG Lin, SHU Junxia, et al. Analysis of codon bias in the chloroplast genome of three Michelia species [J]. Journal of West China Forestry Science, 2022, 51(3): 91-100.
    [2] ANDARGIE M, ZHU Congyi. Genome-wide analysis of codon usage in sesame (Sesamum indicum L.) [J/OL]. Heliyon, 2022, 8(1): e08687[2023-09-08]. doi: 10.1016/j.heliyon.2021.e08687.
    [3] MENG Yi, LI Jing, DU Shaobing, et al. Analysis of chloroplast genome characteristics and codon preference of 17 species of Rhamnaceae [J/OL]. Molecular Plant Breeding, 2023-08-22[2023-09-08]. https://link.cnki.net/urlid/46.1068.S.20230822.1011.002.
    [4] XIN Yaxuan, LI Ruozhu, LI Xin, et al. Analysis on codon usage bias of chloroplast genome in Mangifera indica [J]. Journal of Central South University of Forestry & Technology, 2021, 41(9): 148-156, 165.
    [5] PARVATHY S T, UDAYASURIYAN V, BHADANA V. Codon usage bias [J]. Molecular Biology Reports, 2022, 49(1): 539-565.
    [6] CHAKRABORTY S, YENGKHOM S, UDDIN A. Analysis of codon usage bias of chloroplast genes in Oryza species [J/OL]. Planta, 2020, 252(4): 67[2023-09-08]. doi: 10.1007/s00425-020-03470-7.
    [7] LI Changle, ZHOU Ling, NIE Jiangbo, et al. Codon usage bias and genetic diversity in chloroplast genomes of Elaeagnus species (Myrtiflorae: Elaeagnaceae) [J]. Physiology and Molecular Biology of Plants, 2023, 29(2): 239-251.
    [8] WU Peng, XIAO Wenqi, LUO Yingyong, et al. Comprehensive analysis of codon bias in 13 Ganoderma mitochondrial genomes [J/OL]. Frontiers in Microbiology, 2023, 14: 1170790[2023-09-08]. doi: 10.3389/fmicb.2023.1170790.
    [9] GENG Xiaoshan, HUANG Ning, ZHU Yuling, et al. Codon usage bias analysis of the chloroplast genome of cassava [J]. South African Journal of Botany, 2022, 151: 970-975.
    [10] WANG Shenchang, HU Shanglian, CAO Ying, et al. High-throughput RNA-seq and analysis on differential expressed gene from Dendrocalamus farinosus [J]. Acta Agriculturae Boreali-Sinica, 2016, 31(3): 65-71.
    [11] JIANG Xia, ZHOU Hua, YAN Yuying, et al. Culm structure and above-ground biomass allocation of Dendrocalamus farinosus and Bambusa rigida [J]. Guizhou Forestry Science and Technology, 2023, 51(1): 39-43.
    [12] JIANG Yue, DENG Fei, WANG Hualin, et al. An extensive analysis on the global codon usage pattern of baculoviruses [J]. Archives of Virology, 2008, 153(12): 2273-2282.
    [13] YUAN Xiaolong, WANG Yi, ZHANG Jinfeng. Characterization of codon usage in Cipadessa cinerascens chloroplast genome [J]. Journal of Forest and Environment, 2020, 40(2): 195-202.
    [14] MAZUMDER G A, UDDIN A, CHAKRABORTY S. Analysis of codon usage bias in mitochondrial CO gene among platyhelminthes [J/OL]. Molecular & Biochemical Parasitology, 2021, 245: 111410[2023-09-08]. doi: 10.1016/j.molbiopara.2021.111410.
    [15] QIN Zheng, ZHENG Yongjie, GUI Lijing, et al. Codon usage bias analysis of chloroplast genome of camphora tree (Cinnamomum camphora) [J]. Guihaia, 2018, 38(10): 1346-1355.
    [16] BEGUM N S, CHAKRABORTY S. Influencing elements of codon usage bias in Birnaviridae and its evolutionary analysis [J/OL]. Virus Research, 2022, 310: 198672[2023-09-08]. doi: 10.1016/j.virusres.2021.198672.
    [17] TANG Xiaofen, CHEN Li, MA Yutao. Review and prospect of the principle and methods quantifying codon usage bias [J]. Genomics and Applied Biology, 2013, 32(5): 660-666.
    [18] LU Qifeng, LUO Wenhua. Analysis of codon usage bias in chloroplast genome of Begonia guangxiensis [J/OL]. Molecular Plant Breeding, 2023-09-05[2023-09-08]. https://link.cnki.net/urlid/46.1068.S.20230905.0920.002.
    [19] YUAN Xiaolong, LI Yunqin, ZHANG Jinfeng, et al. Codon usage bias analysis of chloroplast genome in Vitellaria paradoxa [J]. Molecular Plant Breeding, 2020, 18(17): 5658-5664.
    [20] SHEN Lianwen, TIAN Jinhong, WANG Yuchang, et al. Analysis of codon usage bias (CUB) in the chloroplast genomes of 2 Yulania species [J]. Journal of Southwest Forestry University (Natural Sciences), 2023, 43(2): 44-53.
    [21] WANG Pengliang, WU Shuangcheng, YANG Liping, et al. Analysis of codon bias of chloroplast genome in Eucalyptus grandis [J]. Guihaia, 2019, 39(12): 1583-1592.
    [22] HU Xiaoyan, XU Yanqiu, HAN Youzhi, et al. Codon usage bias analysis of the chloroplast genome of Ziziphus jujuba var. spinosa [J]. Journal of Forest and Environment, 2019, 39(6): 621-628.
    [23] LI Jiangfei, WANG Yu, YAN Tingyu, et al. Analysis on codon usage bias of Keteleeria evelyniana chloroplast genome [J]. Journal of Central South University of Forestry & Technology, 2022, 42(4): 30-39.
    [24] LIU Xingyue, HE Zhongjian, QIU Yimin. Analysis of codon bias in the chloroplast genome of four Rosaceae fruit trees [J]. Molecular Plant Breeding, 2022, 20(16): 5299-5308.
    [25] ZHOU Meng, LONG Wei, LI Xia. Analysis of synonymous codon usage in chloroplast genome of Populus alba [J]. Journal of Forestry Research, 2008, 19(4): 293-297.
    [26] WANG Zhanjun, DING Liang, CAI Qianwen, et al. Comparison of codon preference patterns and variation sources in Manihot esculenta Crantz genomes [J]. Chinese Journal of Applied and Environmental Biology, 2021, 27(4): 1013-1021.
    [27] LI Jiangfei, LI Yaqi, TANG Junrong, et al. Comparison of codon preference patterns in the chloroplast genome of Pinus densata [J]. Journal of Biology, 2023, 40(1): 52-59.
    [28] LI Jiangping, QIN Zheng, GUO Chunce, et al. Codon bias in the chloroplast genome of Gelidocalamus tessellatus [J]. Journal of Bamboo Research, 2019, 38(2): 79-87.
    [29] HUANG Xiaoyu, XU Zai’en, GUO Xiaoqin. Synonymous codon bias of Phyllostachys edulis [J]. Journal of Zhejiang A&F University, 2017, 34(1): 120-128.
    [30] MA Xingliang, ZHU Qinlong, CHEN Yuanling, et al. CRISPR/Cas9 platforms for genome editing in plants: developments and applications [J]. Molecular Plant, 2016, 9(7): 961-974.

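  • Codons are the bridge through which an organism's genetic information is recognized and transmitted, linking DNA to protein, and they play a crucial role in heredity and variation[1]. Different codons that encode the same amino acid are called synonymous codons. Owing to mutation and natural selection, certain synonymous codons are used at higher frequency during protein translation, a phenomenon known as codon usage bias[2-3]. The biological functions of a species are closely linked to its codon usage bias, which affects not only the rate of protein synthesis and translation of coding genes[4] but also protein structure, the degree of folding, and mRNA synthesis[5]. Studies have shown that the same species, or closely related species, share similar codon usage patterns[6]; analyzing codon usage bias can therefore be used to gauge gene expression levels across species and, in turn, to explore their phylogenetic relationships[7]. Research on codon usage bias helps clarify the patterns of gene expression during species evolution[8] and provides a reference for improving target genes through genetic engineering[9].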

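    Dendrocalamus farinosus (subfamily Bambusoideae, genus Dendrocalamus), also called Dayezhu (大叶竹) and Wahuizhu (瓦灰竹), is an important economic bamboo species in southwestern China[10]. It grows fast, adapts well, and gives high returns from its shoots, making it an excellent dual-purpose species for both shoots and culms; together with Bambusa rigida it is a high-quality raw material for bamboo weaving and for pulp and paper making[11]. Codon usage bias in the chloroplast genome of D. farinosus has rarely been reported. To better exploit the potential economic value of D. farinosus, this study takes its chloroplast genome sequence as the research object, analyzes its codon usage patterns, and summarizes the codon usage bias of its expressed genes, with the aim of identifying the main factors affecting codon usage bias in the chloroplast genome and screening the optimal codons, thereby providing a theoretical basis for subsequent chloroplast genetic engineering in D. farinosus.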

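    • The chloroplast genome sequence of D. farinosus was retrieved from the National Center for Biotechnology Information (NCBI) database using GenBank accession number MZ681865.156; it contains 85 coding sequences (CDS). Duplicated sequences and sequences shorter than 300 bp distort the estimation of codon usage indices[12]. The CDS were therefore screened: duplicated sequences and sequences shorter than 300 bp were removed, and only sequences beginning with the start codon ATG and ending with the stop codon TAG, TGA or TAA were retained, leaving 51 CDS as the sample for subsequent analysis.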
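      The screening step above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the function name `filter_cds`, the `(name, sequence)` input format, and the extra check that the length is a multiple of 3 are assumptions made for this example.

```python
# Hypothetical helper: screen CDS records by the criteria described in the text.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def filter_cds(records, min_len=300):
    """Return the (name, sequence) pairs that pass the screening rules.

    A sequence is kept when it is at least min_len nt long, starts with ATG,
    ends with a stop codon, and is not an exact duplicate of an earlier one.
    The multiple-of-3 length check is an added assumption.
    """
    seen = set()
    kept = []
    for name, seq in records:
        seq = seq.upper()
        if len(seq) < min_len or len(seq) % 3 != 0:
            continue  # too short, or not a whole number of codons
        if not seq.startswith("ATG") or seq[-3:] not in STOP_CODONS:
            continue  # missing start or stop codon
        if seq in seen:
            continue  # exact duplicate of a sequence already kept
        seen.add(seq)
        kept.append((name, seq))
    return kept
```

      In practice the records would come from parsing the GenBank or FASTA file of the chloroplast genome; that parsing step is left out here.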

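    • CodonW 1.4.2 (http://sourceforge.net/projects/codonw) and EMBOSS (http://imed.med.ucm.es/EMBOSS/) were used to calculate the effective number of codons (ENC), codon adaptation index (CAI), codon bias index (CBI), frequency of optimal codons (FOP), and the content of A, T, C and G at the third codon position (A3, T3, C3 and G3, respectively). ENC was used to judge the strength of codon usage bias: ENC > 35 indicates relatively weak bias; otherwise the bias is strong[13]. CUSP was used to obtain the proportion of guanine (G) and cytosine (C) in codons (GC content) and the average GC content (GCall), and SPSS 25.0 was used for correlation analysis between ENC and the GC content at each codon position of D. farinosus.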
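      As an illustration of the position-wise GC content described above, the sketch below computes GC1, GC2, GC3 and their mean for one CDS. It is a simplified stand-in for what CUSP reports, under the assumption that all codons (including start and stop codons) are counted; the function name is hypothetical.

```python
def gc_by_position(cds):
    """Return (GC1, GC2, GC3, GC_mean) in percent for one coding sequence.

    Assumption: every complete codon is counted; trailing bases that do not
    form a full codon are ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    n = len(codons)
    # Percentage of G or C at codon positions 1, 2 and 3.
    gc = [sum(codon[pos] in "GC" for codon in codons) / n * 100 for pos in range(3)]
    return gc[0], gc[1], gc[2], sum(gc) / 3
```

      Applied to each of the 51 retained CDS, this yields per-gene values of the kind listed in Table 2.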

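    • Relative synonymous codon usage (RSCU), the ratio of a codon's observed frequency to its expected frequency under equal use of all synonymous codons[14], was analyzed with CodonW 1.4.2. RSCU > 1 means the codon is used more often than expected among its synonyms and is termed a high-frequency codon; RSCU = 1 means the codon shows no bias; RSCU < 1 means the codon is used less often than expected[15].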
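      The RSCU definition above can be made concrete with a short sketch. It assumes the standard genetic code and pools codon counts over all input sequences; it illustrates the formula only and is not CodonW's implementation.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds_list):
    """Return {codon: RSCU}, pooling codon counts over all sequences.

    RSCU = observed count * (number of synonymous codons) / (family total).
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    result = {}
    for aa in set(AMINO):
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the data
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result
```

      For example, with the Phe counts reported in Table 4 (UUU 644, UUC 351), the formula gives 644 × 2 / 995 ≈ 1.29 for UUU and 351 × 2 / 995 ≈ 0.71 for UUC, matching the tabulated RSCU values.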

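    • Neutrality plot analysis examines the key factors shaping codon usage bias. A two-dimensional scatter plot was drawn with GC3 on the X axis and GC12 (the mean of GC1 and GC2 for each gene) on the Y axis, and the correlation between GC3 and GC12 was analyzed (GC1, GC2 and GC3 denote the GC content at the first, second and third codon positions, respectively). A regression coefficient close to 1 means that GC3 and GC12 are significantly correlated and base composition does not differ among positions, indicating that mutation is the main determinant of codon usage bias; a regression coefficient close to 0 indicates that natural selection is the main factor.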
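      The regression behind the neutrality plot can be sketched with ordinary least squares. The function below is a plain-Python illustration (names are hypothetical) that returns the slope, intercept and R2 of GC12 regressed on GC3.

```python
def neutrality_fit(gc3, gc12):
    """Least-squares fit of GC12 (y) on GC3 (x).

    Returns (slope, intercept, r_squared). A slope near 1 points to mutation
    pressure, a slope near 0 to natural selection, as described in the text.
    """
    n = len(gc3)
    if n < 2 or n != len(gc12):
        raise ValueError("need two equally long lists with at least 2 points")
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    syy = sum((y - mean_y) ** 2 for y in gc12)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

      The inputs would be the per-gene GC3 values and the per-gene means of GC1 and GC2.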

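    • The ENC-plot shows the extent to which codon usage bias is shaped by mutation versus natural selection. ENC-plot analysis was carried out in Python 3.7: a scatter plot was constructed with GC3 on the horizontal axis and ENC on the vertical axis, and the standard (expected) ENC curve was drawn. Genes lying on or near the standard curve indicate that mutation is the main determinant of codon usage bias, whereas genes lying far from the curve indicate that the bias is mainly determined by natural selection.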
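      The text does not state the formula used for the standard curve. A common choice, assumed here, is Wright's expected ENC under mutation pressure alone, ENC = 2 + s + 29 / (s² + (1 − s)²), where s is the GC3 fraction. The sketch below evaluates it and the relative deviation of an observed value; function names are illustrative.

```python
def expected_enc(gc3_fraction):
    """Expected ENC for a given GC3 fraction (0..1), assuming Wright's curve."""
    s = gc3_fraction
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)

def enc_deviation(observed_enc, gc3_fraction):
    """Relative gap (expected - observed) / expected; positive = below the curve."""
    exp = expected_enc(gc3_fraction)
    return (exp - observed_enc) / exp
```

      Plotting `expected_enc` over s from 0 to 1 gives the reference curve; genes with a large positive `enc_deviation` fall well below it.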

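    • PR2-plot analysis describes the base composition at the third codon position of each gene. The proportions of the four bases A, T, C and G at the third position were calculated, and a PR2 scatter plot was drawn with G3/(G3+C3) on the X axis and A3/(A3+T3) on the Y axis. The center point corresponds to A = T and C = G; codons in this region show no usage bias[16].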
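      The PR2 coordinates of a gene follow directly from its third-position base counts. The sketch below takes the third base of every codon, as the text describes; note that some studies restrict PR2 to four-fold degenerate codons, which this illustrative function does not do.

```python
def pr2_coordinates(cds):
    """Return (G3/(G3+C3), A3/(A3+T3)) for one coding sequence."""
    # Bases at indices 2, 5, 8, ... are the third positions of complete codons.
    third = cds.upper().replace("U", "T")[2::3]
    a, t = third.count("A"), third.count("T")
    g, c = third.count("G"), third.count("C")
    if g + c == 0 or a + t == 0:
        raise ValueError("third positions lack G/C or A/T; ratio undefined")
    return g / (g + c), a / (a + t)
```

      A gene at (0.5, 0.5) uses A and T, and G and C, equally at the third position; displacement from that point indicates a skew.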

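    • The 51 genes were sorted by ENC in ascending order, and the 10% of genes at each end were used to build the high- and low-expression gene sets. CodonW was used to calculate the RSCU of codons in the two sets and the difference between them (ΔRSCU); codons that were both high-frequency (RSCU > 1) and high-expression (ΔRSCU ≥ 0.08) were defined as optimal codons[17].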
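      The two-condition screen above can be written as a small filter. In this sketch the three dictionaries map codons to RSCU values for the whole genome, the high-expression set and the low-expression set; the function name and input layout are assumptions for illustration.

```python
def optimal_codons(genome_rscu, high_rscu, low_rscu, delta_cutoff=0.08):
    """Return codons with genome RSCU > 1 and (high - low) RSCU >= cutoff."""
    selected = []
    for codon, genome_value in genome_rscu.items():
        delta = high_rscu.get(codon, 0.0) - low_rscu.get(codon, 0.0)
        if genome_value > 1.0 and delta >= delta_cutoff:
            selected.append(codon)
    return sorted(selected)

# A few Ala, Asp and Gly values taken from Table 5 as sample input.
genome = {"GCA": 1.20, "GCC": 0.60, "GAC": 0.46, "GGA": 1.58}
high = {"GCA": 1.20, "GCC": 0.4571, "GAC": 0.50, "GGA": 1.2537}
low = {"GCA": 0.7347, "GCC": 0.6531, "GAC": 0.4118, "GGA": 1.8182}
print(optimal_codons(genome, high, low))  # ['GCA']
```

      GAC passes the ΔRSCU threshold (0.0882) but fails the RSCU > 1 condition, which is why both criteria are needed.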

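    • Genomic codon usage frequencies of heterologous expression hosts and representative plant taxa were downloaded from the Codon Usage Database (http://www.kazusa.or.jp/codon/), including Dendrocalamus sinicus (巨龙竹), D. pulverulentus (粉麻竹), D. barbatus (小叶龙竹), Bambusa rigida, Escherichia coli, Nicotiana tabacum, Arabidopsis thaliana and Saccharomyces cerevisiae. For each codon, the ratio of its usage frequency in D. farinosus to that in the other species was calculated. A ratio ≥ 2.0 or ≤ 0.5 indicates that the species differs markedly from D. farinosus in its preference for that synonymous codon; a ratio between these limits indicates that the two species have similar preferences for the codon.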
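      The ratio rule above amounts to a per-codon comparison of two frequency tables. The sketch below uses made-up frequencies (per thousand codons) purely for illustration; the function name and the numbers are hypothetical, not values from the Codon Usage Database.

```python
def divergent_codons(freq_target, freq_other, high=2.0, low=0.5):
    """Return codons whose frequency ratio target/other is >= high or <= low."""
    flagged = []
    for codon, f_target in freq_target.items():
        f_other = freq_other.get(codon, 0.0)
        if f_other <= 0:
            continue  # ratio undefined when the other species never uses the codon
        ratio = f_target / f_other
        if ratio >= high or ratio <= low:
            flagged.append(codon)
    return sorted(flagged)

# Hypothetical frequencies per thousand codons.
target = {"GCU": 30.0, "CUG": 5.0, "AAA": 40.0}
other = {"GCU": 20.0, "CUG": 50.0, "AAA": 10.0}
print(divergent_codons(target, other))  # ['AAA', 'CUG']
```

      The number of flagged codons per host is the quantity compared across species in the Results.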

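    • The chloroplast genes were classified by function as shown in Table 1. Using CodonW, correspondence analysis was applied to the RSCU values of each gene in the sample; the results were distributed in a 59-dimensional vector space, and the correspondence among the indices was analyzed.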

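      Gene category              Gene group                               Gene names
      Photosynthesis genes       Photosystem I                            psaA, psaB
                                 Photosystem II                           psbA, psbC, psbD, psbB
                                 Cytochrome b/f complex                   petA, petB, petD
                                 ATP synthase                             atpA, atpB, atpE, atpF, atpI
                                 NADH oxidoreductase                      ndhA, ndhB, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
                                 Rubisco large subunit                    rbcL
      Genetic system genes       RNA polymerase subunits                  rpoA, rpoB, rpoC1, rpoC2
                                 Ribosomal protein small subunit          rps2, rps3, rps4, rps7, rps8, rps11, rps12, rps14, rps18
                                 Ribosomal protein large subunit          rpl2, rpl14, rpl16, rpl20, rpl22
      Other genes                Maturase K                               matK
                                 Membrane protein                         cemA
                                 Cytochrome synthesis                     ccsA
                                 Caseinolytic protease                    clpP
      Genes of unknown function  Hypothetical chloroplast reading frames  ycf2, ycf3, infA

      Table 1.  Structural analysis of the chloroplast genome of D. farinosus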

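    • Base composition of the chloroplast CDS of D. farinosus was analyzed. The proportions of T, A, C and G at the third position of synonymous codons (T3s, A3s, C3s and G3s) were 45.28%, 42.07%, 18.13% and 17.96%, respectively. T3s and A3s were far higher than G3s and C3s, showing that codons in the chloroplast genome of D. farinosus end mainly in A/U. The ENC was 50.40, the CAI was 0.166, and the GC content at the third position of synonymous codons (GC3s) was 28.1%, indicating weak codon usage bias in the chloroplast genome.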

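      The average GC content of codons in the chloroplast genome of D. farinosus was 39.48%, with GC1 (47.69%) > GC2 (39.70%) > GC3 (31.05%). ENC ranged from 39.04 to 61.00, with a mean of 49.51, and GC content was not evenly distributed across codon positions (Table 2). Correlation analysis between ENC and the GC content at the three codon positions (Table 3) showed that ENC was significantly correlated with GC3 but not with GC1 or GC2, indicating that GC3 contributed more than GC1 and GC2 to the formation of codon usage bias.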

      Gene    GC/%    GC1/%   GC2/%   GC3/%   ENC     CAI     FOP
      rps12   41.87   52.00   47.20   26.40   44.85   0.140   0.341
      psbA    42.56   49.72   42.94   35.03   41.33   0.313   0.532
      matK    34.44   40.82   32.42   30.08   49.49   0.166   0.329
      psbD    44.44   53.39   43.50   36.44   48.99   0.242   0.456
      psbC    44.66   53.59   44.73   35.63   48.91   0.183   0.386
      rpoB    39.19   49.81   38.01   29.74   49.69   0.153   0.353
      rpoC1   39.87   49.93   38.07   31.63   52.77   0.156   0.347
      rpoC2   38.95   49.01   36.64   31.18   52.29   0.154   0.333
      rps2    38.40   40.51   40.93   33.76   52.55   0.168   0.338
      atpI    38.84   47.58   36.29   32.66   50.55   0.163   0.353
      atpF    38.27   47.62   35.45   31.75   53.17   0.147   0.353
      atpA    42.06   56.01   39.96   30.12   49.96   0.182   0.385
      rps14   39.42   39.42   46.15   32.69   41.73   0.135   0.384
      psaB    41.81   48.71   43.13   33.61   49.34   0.172   0.350
      psaA    43.68   51.80   43.28   35.95   52.07   0.198   0.373
      ycf3    39.69   47.40   38.15   33.53   55.45   0.156   0.343
      rps4    37.13   47.52   37.13   26.73   49.59   0.169   0.386
      ndhJ    39.38   49.38   36.88   31.88   51.48   0.176   0.356
      ndhK    38.60   41.70   43.72   30.36   51.91   0.159   0.329
      ndhC    39.67   50.41   36.36   32.33   48.75   0.177   0.345
      atpE    42.51   52.17   39.13   36.23   59.51   0.167   0.405
      atpB    42.62   53.91   41.68   32.26   47.43   0.192   0.381
      rbcL    44.14   57.11   43.93   31.38   50.19   0.271   0.454
      ycf4    41.22   48.39   39.78   35.48   47.14   0.162   0.385
      cemA    33.62   41.99   27.71   31.17   55.91   0.176   0.342
      petA    40.29   53.58   35.20   32.09   51.12   0.155   0.331
      rps18   33.53   34.50   39.77   26.32   39.04   0.147   0.333
      rpl20   36.11   38.33   40.83   29.17   50.97   0.112   0.298
      clpP    43.01   52.53   38.25   38.25   52.37   0.175   0.337
      psbB    44.01   54.42   45.97   31.63   50.73   0.190   0.380
      petB    41.06   48.93   41.20   33.05   47.31   0.191   0.333
      petD    40.37   50.93   39.13   31.06   49.46   0.161   0.305
      rpoA    37.06   46.18   35.59   29.41   49.94   0.151   0.311
      rps11   43.52   50.69   56.25   23.61   44.33   0.174   0.396
      infA    40.35   43.86   35.96   41.23   61.00   0.181   0.409
      rps8    36.50   41.61   41.61   26.28   46.62   0.122   0.374
      rpl14   38.71   54.84   37.10   24.19   51.90   0.181   0.392
      rpl16   44.76   52.14   53.57   28.57   39.41   0.115   0.354
      rps3    33.47   43.75   31.67   25.00   48.03   0.193   0.402
      rpl22   37.56   41.33   36.67   34.67   47.48   0.188   0.415
      rpl2    44.56   51.77   48.58   33.33   53.33   0.143   0.361
      ndhB    38.16   42.07   39.33   33.07   46.71   0.156   0.348
      rps7    39.49   49.68   45.22   23.57   48.31   0.164   0.373
      ndhF    34.19   37.84   38.92   25.81   46.19   0.144   0.321
      ccsA    33.64   33.74   41.10   26.07   45.60   0.152   0.307
      ndhD    36.19   40.72   36.93   30.94   48.98   0.133   0.314
      ndhE    33.33   41.18   32.35   26.47   59.06   0.144   0.316
      ndhG    34.46   44.07   32.77   26.55   45.77   0.125   0.250
      ndhI    34.99   37.57   38.67   28.73   52.09   0.171   0.345
      ndhA    33.98   42.42   36.36   23.14   44.35   0.140   0.321
      ndhH    37.82   50.76   34.77   27.92   49.95   0.155   0.322

      Table 2.  Statistics of codon related parameters of various genes in the chloroplast genome of D. farinosus

      Parameter  GC1      GC2       GC3      ENC      CAI      CBI      FOP      GC3s     GC
      GC1        1
      GC2        0.300*   1
      GC3        0.265    −0.009    1
      ENC        0.142    −0.425**  0.389**  1
      CAI        0.409**  0.076     0.370**  0.012    1
      CBI        0.438**  0.272     0.322*   −0.092   0.774**  1
      FOP        0.402**  0.312*    0.341*   −0.064   0.797**  0.965**  1
      GC3s       0.271    −0.029    0.946**  0.445**  0.330*   0.330*   0.370**  1
      GC         0.814**  0.673**   0.525**  0.010    0.407**  0.512**  0.518**  0.499**  1
        Note: * significant correlation (P<0.05); ** highly significant correlation (P<0.01).

      Table 3.  Correlation analysis of various gene parameters in the chloroplast genome of D. farinosus

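    • The chloroplast genome of D. farinosus contains 18110 codons in total (Table 4), encoding 20 amino acids, with 12 to 705 occurrences per codon; the codon UGA occurs 12 times, and the most abundant codon is GAA, encoding glutamic acid, with 705 occurrences. RSCU analysis of the protein-coding sequences showed that leucine (Leu) and arginine (Arg) are the more abundant amino acids, each encoded by six codons (as is serine): Leu by UUA, UUG, CUU, CUC, CUA and CUG, and Arg by AGA, AGG, CGU, CGC, CGA and CGG. Methionine (Met) and tryptophan (Trp) are each encoded by a single codon, AUG and UGG, respectively, and the remaining amino acids are encoded by two to four codons.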

      Amino acid  Codon  Number  RSCU   Amino acid  Codon  Number  RSCU   Amino acid  Codon  Number  RSCU   Amino acid  Codon  Number  RSCU
      Phe         UUU*   644     1.29   Tyr         UAU*   532     1.59   Ser         UCU*   343     1.58   Cys         UGU*   151     1.53
      Phe         UUC    351     0.71   Tyr         UAC    137     0.41   Ser         UCC*   260     1.19   Cys         UGC    47      0.47
      Leu         UUA*   634     1.94   TER         UAA*   28      1.56   Ser         UCA*   222     1.02   Arg         AGA*   322     1.75
      Leu         UUG*   362     1.11   TER         UAG    14      0.78   Ser         UCG    119     0.55   Arg         AGG    119     0.64
      Leu         CUU*   420     1.29   TER         UGA    12      0.67   Ser         AGU*   273     1.25   Arg         CGU*   261     1.41
      Leu         CUC    138     0.42   Trp         UGG*   328     1.00   Ser         AGC    89      0.41   Arg         CGC    95      0.51
      Leu         CUA    295     0.90   Gln         CAA*   477     1.53   Thr         ACU*   403     1.68   Arg         CGA*   234     1.27
      Leu         CUG    107     0.33   Gln         CAG    148     0.47   Thr         ACC    181     0.75   Arg         CGG    76      0.41
      Ile         AUU*   740     1.48   Glu         GAA*   705     1.46   Thr         ACA*   259     1.08   Gly         GGU*   421     1.24
      Ile         AUC    295     0.59   Glu         GAG    263     0.54   Thr         ACG    116     0.48   Gly         GGC    145     0.43
      Ile         AUA    461     0.92   Lys         AAA*   647     1.44   Ala         GCU*   493     1.73   Gly         GGA*   538     1.58
      Met         AUG*   416     1.00   Lys         AAG    253     0.56   Ala         GCC    172     0.60   Gly         GGG    259     0.76
      Val         GUU*   382     1.47   Asp         GAU*   522     1.54   Ala         GCA*   343     1.20   Pro         CCU*   286     1.48
      Val         GUC    126     0.49   Asp         GAC    155     0.46   Ala         GCG    135     0.47   Pro         CCC*   196     1.01
      Val         GUA*   390     1.50   His         CAU*   311     1.47   Asn         AAU*   528     1.48   Pro         CCA*   209     1.08
      Val         GUG    139     0.54   His         CAC    112     0.53   Asn         AAC    187     0.52   Pro         CCG    84      0.43
        Note: * marks codons flagged as high-frequency (RSCU ≥ 1.00; AUG and UGG have RSCU = 1.00). TER, termination codon.

      Table 4.  RSCU of protein coding region in the chloroplast of D. farinosus

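      A total of 34 codons in the chloroplast genome of D. farinosus had RSCU ≥ 1 (UUU, UUA, UUG, CUU, AUU, AUG, GUU, GUA, UCU, UCC, UCA, AGU, ACU, ACA, GCU, GCA, AAU, UAU, UAA, UGG, CAA, GAA, AAA, GAU, CAU, UGU, AGA, CGU, CGA, GGU, GGA, CCU, CCC and CCA). Excluding AUG and UGG, whose RSCU is exactly 1.00 because Met and Trp have no synonymous codons, 32 high-frequency codons remain, of which 13, 16, 2 and 1 end in A, U, C and G, respectively, showing that codons preferentially end in A and U. The three codons with the highest RSCU were UUA (1.94), AGA (1.75) and GCU (1.73).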

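    • Neutrality plot analysis quantifies the relationship between natural selection and mutation pressure and clarifies the relationship among the three codon positions. GC3 (horizontal axis) ranged from 23.14% to 41.23% and GC12 (vertical axis) from 39.04% to 61.00% (Fig. 1). The Pearson correlation coefficient was 0.17, a positive correlation; the regression coefficient of the fitted line was 0.1868 and the coefficient of determination (R2) was small, 0.0282. The correlation between GC12 and GC3 was not significant, indicating that codon usage bias in the chloroplast genome was influenced mainly by natural selection.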

      Figure 1.  Analysis of neutrality plot

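    • Fig. 2 shows that the genes were not tightly clustered: a few lay near the standard curve and a handful lay above it. All genes had ENC > 35 and deviated from the expected ENC, indicating that codon usage bias in D. farinosus is weak and is shaped by both natural selection and mutation. Because most genes fell below the standard curve, codon usage bias in the D. farinosus genome was mainly affected by natural selection.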

      Figure 2.  Analysis of ENC-plot

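    • Fig. 3 shows that the genes were unevenly distributed among the four quadrants of the plot, with most falling in the region where A3/(A3+T3) < 0.5 and G3/(G3+C3) > 0.5. Third-position base usage therefore followed T > A and G > C: the third codon position in the chloroplast genome of D. farinosus is biased, which also indicates that codon usage bias is mainly affected by natural selection.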

      Figure 3.  Analysis of PR2-plot

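    • Genes were sorted by ENC in ascending order; the first 10% were taken as high-expression genes (rps18, rpl16, psbA, rps14 and rps11) and the last 10% as low-expression genes (ycf3, cemA, ndhE, atpE and infA). The RSCU and ΔRSCU values (Table 5) showed that the chloroplast genome of D. farinosus has 32 high-frequency codons and 25 high-expression codons, including GCA and GCU. Eighteen codons were finally identified as optimal codons: UAA, GCA, GCU, UUC, GGU, AAA, CUU, UUA, CCA, CCU, CAA, AGA, CGU, AGU, UCC, ACU, GUA and GUU. Of these, 16 end in A/U and 2 end in C.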

      Amino acid  Codon    Genome RSCU  High-expression RSCU  Low-expression RSCU  ΔRSCU     Optimal
      Ter         UAA***   1.5600       1.8000                1.2000               0.6000    yes
      Ter         UAG      0.7800       0.6000                1.2000               −0.6000
      Ter         UGA      0.6700       0.6000                0.6000               0
      Ala         GCA**    1.2000       1.2000                0.7347               0.4653    yes
      Ala         GCC      0.6000       0.4571                0.6531               −0.1960
      Ala         GCG      0.4700       0.2286                0.7347               −0.5061
      Ala         GCU*     1.7300       2.1143                1.8776               0.2367    yes
      Cys         UGC**    0.4700       0.4000                0                    0.4000
      Cys         UGU      1.5300       1.6000                2.0000               −0.4000
      Asp         GAC*     0.4600       0.5000                0.4118               0.0882
      Asp         GAU      1.5400       1.5000                1.5882               −0.0882
      Glu         GAA      1.4600       1.1892                1.5789               −0.3897
      Glu         GAG**    0.5400       0.8108                0.4211               0.3897
      Phe         UUC**    1.2900       1.0417                0.6500               0.3917    yes
      Phe         UUU      0.7100       0.9583                1.3500               −0.3917
      Gly         GGA      1.5800       1.2537                1.8182               −0.5645
      Gly         GGC      0.4300       0.4179                0.4848               −0.0669
      Gly         GGG      0.7600       0.1194                0.3636               −0.2442
      Gly         GGU***   1.2400       2.2090                1.3333               0.8757    yes
      His         CAC**    0.5300       0.9412                0.5714               0.3698
      His         CAU      1.4700       1.0588                1.4286               −0.3698
      Ile         AUA      0.9200       0.8507                0.9494               −0.0987
      Ile         AUC*     0.5900       0.6269                0.5316               0.0953
      Ile         AUU      1.4800       1.5224                1.5190               0.0034
      Lys         AAA**    1.4400       1.4717                1.1556               0.3161    yes
      Lys         AAG      0.5600       0.5283                0.8444               −0.3161
      Leu         CUA      0.9000       0.8333                1.2955               −0.4622
      Leu         CUC      0.4200       0                     0.5455               −0.5455
      Leu         CUG      0.3300       0.2500                0.4773               −0.2273
      Leu         CUU*     1.2900       1.3333                1.2273               0.1060    yes
      Leu         UUA***   1.9400       2.1667                1.0227               1.1440    yes
      Leu         UUG      1.1100       1.4167                1.4318               −0.0151
      Met         AUG      1.0000       1.0000                1.0000               0
      Asn         AAC*     0.5200       0.8936                0.6250               0.2686
      Asn         AAU      1.4800       1.1064                1.3750               −0.2686
      Pro         CCA**    1.0800       0.8000                0.5000               0.3000    yes
      Pro         CCC      1.0100       0.8000                1.1667               −0.3667
      Pro         CCG      0.4300       0.4444                1.0000               −0.5556
      Pro         CCU***   1.4800       1.9556                1.3333               0.6223    yes
      Gln         CAA*     1.5300       1.5000                1.3684               0.1316    yes
      Gln         CAG      0.4700       0.5000                0.6316               −0.1316
      Arg         AGA*     1.7500       1.7234                1.5349               0.1885    yes
      Arg         AGG      0.6400       0.3191                0.8372               −0.5181
      Arg         CGA      1.2700       1.2766                1.3953               −0.1187
      Arg         CGC      0.5100       0.3191                0.8372               −0.5181
      Arg         CGG      0.4100       0.3191                0.2791               0.0400
      Arg         CGU***   1.4100       2.0426                1.1163               0.9263    yes
      Ser         AGC      0.4100       0.3846                0.4706               −0.0860
      Ser         AGU**    1.2500       1.8462                1.4118               0.4344    yes
      Ser         UCA      1.0200       0.6154                1.0588               −0.4434
      Ser         UCC***   1.1900       1.7692                0.9412               0.8280    yes
      Ser         UCG      0.5500       0.1538                0.7059               −0.5521
      Ser         UCU      1.5800       1.2308                1.4118               −0.1810
      Thr         ACA      1.0800       1.1818                1.1765               0.0053
      Thr         ACC      0.5000       0.8182                1.0588               −0.2406
      Thr         ACG      0.4800       0.3636                0.5882               −0.2246
      Thr         ACU**    1.6800       1.6364                1.1765               0.4599    yes
      Val         GUA***   1.5000       1.7674                1.2571               0.5103    yes
      Val         GUC      0.4900       0                     1.0286               −1.0286
      Val         GUG      0.5400       0.3721                0.3429               0.0292
      Val         GUU**    1.4700       1.8605                1.3714               0.4891    yes
      Trp         UGG      1.0000       1.0000                1.0000               0
      Tyr         UAC**    0.4100       0.5217                0.1667               0.3550
      Tyr         UAU      1.5900       1.4783                1.8333               −0.3550
        Note: High-frequency codons are those with genome RSCU > 1.00; *. ΔRSCU≥0.08; **. ΔRSCU≥0.3; ***. ΔRSCU≥0.5; "yes" in the last column marks the optimal codons.

      Table 5.  RSCU analysis and optimal codon analysis of amino acids in chloroplast genome of D. farinosus

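    • The codon usage frequencies of the D. farinosus genome were compared with those of D. sinicus, D. pulverulentus, D. barbatus, Bambusa rigida, Escherichia coli, Nicotiana tabacum, Arabidopsis thaliana and Saccharomyces cerevisiae (Fig. 4). The codon usage frequency ratios between D. farinosus and D. sinicus, D. pulverulentus, D. barbatus and B. rigida all fell between 0.5 and 2.0, indicating similar codon usage bias and suggesting that related Dendrocalamus plants of the Gramineae have similar chloroplast codon usage bias. For E. coli, N. tabacum, A. thaliana and S. cerevisiae, 28, 15, 15 and 14 codons, respectively, had ratios ≥ 2.0 or ≤ 0.5, indicating that D. farinosus differs to some extent from these species in synonymous codon preference.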

      Figure 4.  Comparison of codon preference between D. farinosus and other species

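    • The 51 chloroplast genes of D. farinosus were grouped by function into four categories (photosynthesis genes, genetic system genes, other genes and genes of unknown function), and each gene was placed in a 59-dimensional vector space on the basis of its RSCU values. Correspondence analysis (Fig. 5) showed that the first four axes accounted for 18.3%, 16.8%, 15.6% and 15.4% of the variation, respectively, 66.1% in total, and all four axes affected codon usage to different degrees. The first axis had the largest value, indicating that it had the greatest influence on codon usage bias in the chloroplast genome. Further correlation analysis between the first axis and CAI, CBI, FOP, ENC and GC3s showed that gene coordinates on the first axis were highly significantly correlated with CAI (r=−0.001 7, P<0.01), CBI (r=0.099 0, P<0.01), FOP (r=0.083 0, P<0.01), ENC (r=0.112 0, P<0.01) and GC3s (r=−0.145 0, P<0.01); CAI and GC3s were negatively correlated with the first axis. Codon usage bias is therefore not governed by a single factor: both natural selection and mutation may affect codon usage bias in the D. farinosus genome[18].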

      Figure 5.  Correspondence Analysis on RSCU of D. farinosus

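    • This study analyzed codon usage bias in the chloroplast genome of D. farinosus using 51 screened CDS. The analysis showed that GC1 > GC2 > GC3, that GC content is unevenly distributed across the three codon positions, and that codons preferentially end in A or U. The mean ENC of the chloroplast genome was 49.51, indicating weak codon usage bias. This is similar to the chloroplast codon usage bias of plants such as Vitellaria paradoxa[19] and Magnolia soulangeana[20].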

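      Neutrality plot, ENC-plot, PR2-plot and correspondence analyses were performed on the chloroplast codons of D. farinosus. In the neutrality plot, the regression coefficient was 0.412 8, indicating that codon usage bias is influenced more by natural selection. In the ENC-plot, most genes lay far from the standard curve, and observed ENC differed from expected ENC, indicating that the codon usage bias of these genes is mainly affected by natural selection. In the PR2-plot, most genes lay in the lower right of the plot (T > A, G > C), indicating that codon usage is influenced more by natural selection. Taken together, natural selection is the main cause of codon usage bias in the chloroplast genome of D. farinosus. This result is broadly consistent with studies of chloroplast codon usage bias in Eucalyptus grandis[21], Cipadessa cinerascens, Ziziphus jujuba var. spinosa[22] and Keteleeria evelyniana[23]; in contrast, studies of four Rosaceae fruit trees[24] and Populus alba[25] found mutation to be the main factor. Codon usage bias is thus shaped by natural selection or by mutation, depending on the species. Correspondence analysis based on RSCU showed that factors other than mutation and natural selection also contribute to variation in codon usage in D. farinosus; photosynthesis genes and genetic system genes were relatively clustered, and codon usage bias was similar across gene categories. This conclusion agrees with results for Manihot esculenta[26] and Pinus densata[27]. The comparison of codon usage frequencies showed that D. farinosus has codon usage bias similar to that of Dendrocalamus plants of the Gramineae. When choosing a heterologous expression system, Saccharomyces cerevisiae, which differs least in codon usage bias, is a suitable choice; when Escherichia coli, Nicotiana tabacum or Arabidopsis thaliana is used as the heterologous host, codons should be optimized according to the host's codon usage bias so that the gene is better expressed.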

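      Optimal codon analysis showed that the chloroplast genome of D. farinosus has 18 optimal codons, such as GCU, GAU and GGU, most of which end in A or U. This agrees with the optimal codon analyses of the chloroplast genomes of Gelidocalamus tessellatus[28] and Phyllostachys edulis[29], which may reflect the relative conservation of chloroplast genome evolution among closely related but distinct species[21]. With the preferred codons of D. farinosus identified, target genes can be further codon-optimized to improve bamboo shoot yield and papermaking fiber content. The codons of the Cas9 gene used for genome editing with the new-generation precision editing tool CRISPR/Cas9 can likewise be optimized for D. farinosus to raise its expression level in this species[30].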

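    • This study carried out a bioinformatic analysis of the chloroplast genome of D. farinosus based on its CDS and identified 18 optimal codons, such as GCU, GAU and GGU. The results show that natural selection is the main factor affecting codon usage bias in D. farinosus. They provide a reference for the subsequent development of the superior resources of D. farinosus through genetic engineering at the molecular level.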
