Global features of sequences of bacterial chromosomes, plasmids and phages revealed by analysis of oligonucleotide usage patterns
Oligonucleotide frequencies have been shown to be conserved signatures of bacterial genomes; however, the underlying constraints have not yet been resolved in detail. In this paper we analyzed oligonucleotide usage (OU) biases in a comprehensive collection of 155 completely sequenced bacterial chromosomes, 316 plasmids and 104 phages. Two global features were analyzed: pattern skew (PS) and the variance of OU deviations normalized by the mononucleotide content of the sequence (OUV). OUV reflects the strength of OU biases and taxonomic signals. PS denotes the asymmetry of OU between the direct and reverse DNA strands. A trend towards minimal PS was observed for almost all complete sequences of bacterial chromosomes and plasmids; however, PS was substantially higher in separate genomic loci and in several types of plasmids and phages characterized by long stretches of non-coding DNA and/or asymmetric gene distribution on the two DNA strands. Five of the 155 bacterial chromosomes have anomalously high PS; of these, the chromosomes of Xylella fastidiosa 9a5c and Prochlorococcus marinus MIT9313 exhibit extreme PS values, suggesting an intermediate unstable state of these two genomes. Strand symmetry, as indicated by minimal PS, is a universally conserved feature of complete bacterial genomes that results from the mutual compensation of local OU biases on both replichores, whereas OUV is more of a taxon-specific feature. Local events such as inversions or the incorporation of genome islands are balanced by global changes in genome organization to minimize PS, which may represent one of the leading evolutionary forces driving bacterial genome diversification.
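To make the two global features concrete, the following Python sketch computes simplified stand-ins for them. It is illustrative only and does not reproduce the exact formulas used in the study, which the abstract does not state. It assumes that an OU deviation is the relative difference between the observed word count and the count expected from mononucleotide content, that the OUV stand-in is the variance of those deviations, and that the PS stand-in is the mean absolute difference between the deviation of each word and that of its reverse complement. The function names, the default word length `k=4`, and the restriction to uppercase A, C, G, T are all assumptions made for this example.

```python
from collections import Counter
from itertools import product
from statistics import pvariance

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def word_counts(seq, k=4):
    """Count every overlapping k-mer; words absent from seq get a count of 0."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return {w: counts.get(w, 0) for w in words}


def ou_deviations(seq, k=4):
    """Relative deviation of each word from its mononucleotide expectation.

    Assumption for this sketch: deviation = (observed - expected) / expected.
    Words whose expectation is zero are assigned a deviation of 0.0.
    """
    n_windows = len(seq) - k + 1
    base_freq = {b: seq.count(b) / len(seq) for b in "ACGT"}
    deviations = {}
    for word, observed in word_counts(seq, k).items():
        expected = n_windows
        for base in word:
            expected *= base_freq[base]
        deviations[word] = (observed - expected) / expected if expected > 0 else 0.0
    return deviations


def ou_variance(seq, k=4):
    """Simplified OUV stand-in: population variance of the OU deviations."""
    return pvariance(list(ou_deviations(seq, k).values()))


def pattern_skew(seq, k=4):
    """Simplified PS stand-in: mean |dev(word) - dev(reverse complement of word)|.

    The usage of a word on the reverse strand equals the usage of its
    reverse complement on the direct strand, so one pass is enough.
    """
    dev = ou_deviations(seq, k)
    diffs = [abs(dev[w] - dev[reverse_complement(w)]) for w in dev]
    return sum(diffs) / len(diffs)
```

Under these assumptions, a sequence concatenated with its own reverse complement has a skew of essentially zero, since every word is matched by its reverse complement, while a strand-biased sequence such as `"AAAC" * 20` has a clearly positive skew. This mirrors, in a toy setting, the compensation of local biases across the two replichores described above.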