Sample sequencing of a selected region of the genome of Erwinia carotovora subsp. atroseptica reveals candidate phytopathogenicity genes and allows comparison with Escherichia coli
Genome sequencing is making a profound impact on microbiology. Currently, however, only one plant pathogen genome sequence is publicly available and no genome-sequencing project has been initiated for any species of the genus Erwinia, which includes several important plant pathogens. This paper describes a targeted sample sequencing approach to study the genome of Erwinia carotovora subsp. atroseptica (Eca), a major soft-rot pathogen of potato. A large-insert (approx. 115 kb) DNA library of Eca was constructed using a bacterial artificial chromosome (BAC) vector. Hybridization and end-sequence data revealed two overlapping BAC clones that span an entire hrp gene cluster. Random subcloning and one-fold sequence coverage (>200 kb) across these BACs identified 25 (89%) of 28 hrp genes predicted from the orthologous hrp cluster of Erwinia amylovora. Regions flanking the hrp cluster contained orthologues of known or putative pathogenicity operons from other Erwinia species, including dspEF (E. amylovora) and hecAB and pecSM (E. chrysanthemi); sequences similar to genes from the plant pathogen Xylella fastidiosa, including haemagglutinin-like genes; and sequences similar to genes involved in rhizobacterium-plant interactions. Approximately 10% of the sequences showed strongest nucleotide similarities to genes in the closely related model bacterium and animal pathogen Escherichia coli. However, the positions of some of these genes were different in the two genomes. Approximately 30% of sequences showed no significant similarity to any database entries. A physical map was made across the genomic region spanning the hrp cluster by hybridization to the BAC library and to digested BAC clones, and by PCR between sequence contigs.
A multiple genome coverage BAC library and one-fold sample sequencing are an effective combination for extracting useful information from important regions of the Eca genome, providing a wealth of candidate novel pathogenicity genes for functional analyses. Other genomic regions could be similarly targeted.
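As a rough illustration of what one-fold sample sequencing can be expected to recover, the sketch below applies the standard Poisson (Lander-Waterman style) approximation, under which the expected fraction of bases hit by at least one read at c-fold coverage is 1 - exp(-c). This model is an assumption introduced here for illustration and is not stated in the paper; the function name is hypothetical. The sketch also recomputes the reported hrp gene recovery (25 of 28) from the figures given above.

```python
import math


def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases covered by at least one read.

    Assumes reads land uniformly at random (Poisson approximation);
    this is an illustrative model, not the authors' stated method.
    """
    return 1.0 - math.exp(-coverage)


# One-fold coverage, as used across the two BAC clones.
one_fold = expected_fraction_covered(1.0)

# Reported recovery of hrp genes: 25 of the 28 predicted.
hrp_recovery_percent = round(100 * 25 / 28)

print(f"Expected base coverage at one-fold: {one_fold:.1%}")
print(f"hrp genes identified: {hrp_recovery_percent}%")
```

Under this approximation, gene-level recovery can exceed base-level coverage because a single read overlapping part of a gene may be enough to identify it by similarity search.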