Abstract

Chlamydiae are obligate intracellular bacterial pathogens that cause ocular and sexually transmitted diseases and are associated with cardiovascular diseases. Analysis of codon usage may improve our understanding of the evolution and pathogenesis of Chlamydia and allow target genes to be reengineered to improve their expression for gene therapy. Here, we analyzed the codon usage of C. muridarum, C. trachomatis (here indicating biovars trachoma and LGV), C. pneumoniae, and C. psittaci using the codon usage database and the CUSP (Create a codon usage table) program of EMBOSS (The European Molecular Biology Open Software Suite). The results show that the four genomes have similar codon usage patterns, with a strong bias towards codons with A or T at the third codon position. Compared with Homo sapiens, each of the four chlamydial species has seven or eight discordant preferred codons. The ENC-plot (ENC: effective number of codons used in a gene) reveals that the genetic heterogeneity in Chlamydia is constrained by G+C content, while translational selection and gene length exert relatively weaker influences. Moreover, mutational pressure appears to be the major determinant of codon usage variation among the chlamydial genes. In addition, we compared the codon preferences of C. trachomatis with those of E. coli, yeast, adenovirus, and Homo sapiens. There are 23 codons showing distinct usage differences between C. trachomatis and E. coli, 24 between C. trachomatis and adenovirus, and 21 between C. trachomatis and Homo sapiens, but only six between C. trachomatis and yeast. Therefore, the yeast system may be more suitable for the expression of chlamydial genes. Finally, we compared the codon preferences of C. trachomatis with those of six eukaryotes, eight prokaryotes, and 23 viruses. There is a strong positive correlation between the differences in coding GC content and the variations in codon bias (r=0.905, P<0.001).
We conclude that the variation in codon bias between C. trachomatis and other organisms depends much less on phylogenetic lineage than on the extent of the disparity in GC content.
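
As a rough illustration of the quantities discussed above, the following minimal sketch tabulates codon counts for a single coding sequence and computes the fraction of codons ending in G or C (GC3). It is not the CUSP implementation and does not reproduce the study's results. The function names `codon_counts` and `gc3_fraction` and the toy sequence are assumptions introduced here for illustration only.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count codons in an in-frame coding sequence.

    Assumption: the sequence starts in frame; a trailing partial codon is ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def gc3_fraction(counts: Counter) -> float:
    """Return the fraction of codons whose third base is G or C."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no complete codons")
    gc3 = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc3 / total


# Hypothetical toy sequence (not a real chlamydial gene): seven codons.
toy = "ATGAAAGCTTTAGAAGGCTAA"
counts = codon_counts(toy)
print(dict(counts))
print(f"GC3 = {gc3_fraction(counts):.3f}")
```

A low GC3 value for a gene corresponds to the bias towards A- or T-ending codons described in the abstract. Comparing such values across genomes is one simple way to relate differences in GC content to differences in codon preference.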