Applicability of high-throughput genome sequencing to analyse a Salmonella enterica serovar Typhimurium DT170 outbreak

Sophie Octavia1, Qinning Wang2, Sandeep Kaur1, Vitali Sintchenko2, Gwendolyn L. Gilbert3 and Ruiting Lan1.

1 University of New South Wales, Australia.
2 Westmead Hospital, NSW, Australia.
3 University of Sydney, Australia.

Introduction. Salmonella enterica serovar Typhimurium is one of the most common foodborne pathogens in humans and farm animals in Australia. The frequency of phage type DT170 (phage types are defined by patterns of susceptibility to lysis by a set of bacteriophages) has increased steadily over recent years, and DT170 became the most frequent phage type in 2004. In December 2006, an outbreak of S. Typhimurium DT170 occurred in a residential catering college in Sydney, Australia. Next-generation sequencing was used to analyse the isolates obtained from the outbreak.

Methods. The outbreak-associated isolates sequenced comprised 16 from faecal samples of affected students and three from the food source, chocolate mousse. All isolates had previously been identified by multilocus variable-number tandem-repeat analysis (MLVA) as MLVA type 3-11-7-12-523. Three additional non-outbreak isolates of the same phage type were also included, giving a multiplex of 22 genomes sequenced on a MiSeq platform (2x250 bp). The reads were mapped with the Burrows-Wheeler Aligner (BWA) against the whole genome sequence of S. Typhimurium strain LT2 as a reference. Samtools was used to identify base changes, including single nucleotide polymorphisms (SNPs) and small indels. De novo assembly was also performed using VelvetOptimiser, and the assemblies were compared to LT2 using progressiveMauve. Only SNPs identified by both Samtools and progressiveMauve were included in the final list of SNPs. Prophages were identified using PHAST.
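
The step of retaining only SNPs found by both the mapping-based and the assembly-based approach can be illustrated with a minimal Python sketch. It assumes the calls from each approach have already been parsed into (position, reference base, alternative base) records; the `Snp` type, the `consensus_snps` function and the example coordinates are hypothetical and are not taken from the study.

```python
from typing import Iterable, List, NamedTuple


class Snp(NamedTuple):
    """One SNP call relative to the reference (hypothetical record type)."""
    position: int  # 1-based coordinate on the reference genome
    ref: str       # reference base
    alt: str       # base observed in the isolate


def consensus_snps(mapping_calls: Iterable[Snp],
                   assembly_calls: Iterable[Snp]) -> List[Snp]:
    """Return the SNPs reported identically by both approaches, sorted by position."""
    return sorted(set(mapping_calls) & set(assembly_calls))


# Toy calls for a single isolate (illustrative values only, not study data).
mapping_calls = [Snp(1200, "A", "G"), Snp(5400, "C", "T"), Snp(9100, "G", "A")]
assembly_calls = [Snp(1200, "A", "G"), Snp(9100, "G", "A"), Snp(12000, "T", "C")]

final_snps = consensus_snps(mapping_calls, assembly_calls)
for snp in final_snps:
    print(snp.position, snp.ref, snp.alt)
```

Taking the intersection trades sensitivity for specificity: a call made by only one approach is discarded, which is a conservative choice when isolates are expected to differ by only a handful of SNPs.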

Results. The average coverage of the sequenced isolates was 30-fold. Although the majority of the outbreak isolates were identical to each other according to the sequencing data, some isolates differed by 1 to 2 SNPs. Surprisingly, three of the human isolates differed from the others by 4, 5 and 12 SNPs, respectively. Sequence clustering analyses showed that, as expected, the outbreak isolates formed a cluster separate from the non-outbreak isolates, distinguished by four SNPs. None of the identified SNPs were located in virulence genes. All isolates contained the virulence plasmid pSLT, and the plasmids were collinear. Five prophages were present in all isolates: Gifsy-1, Gifsy-2, ST64B, Fels-1 and a novel prophage with high similarity to a P22 prophage, SPN9CC.
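
Pairwise SNP differences of the kind reported above can be computed as in the following minimal Python sketch. It assumes each isolate is represented by a string of its bases at the final SNP positions, in the same order for every isolate; the isolate names and profiles are invented toy values, and `snp_distance` and `pairwise_distances` are hypothetical helper names.

```python
from itertools import combinations
from typing import Dict, Tuple


def snp_distance(profile_a: str, profile_b: str) -> int:
    """Count the SNP positions at which two isolates carry different bases."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same SNP positions")
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def pairwise_distances(profiles: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Return the SNP distance for every pair of isolates, keyed by sorted name pair."""
    return {
        (name_a, name_b): snp_distance(profiles[name_a], profiles[name_b])
        for name_a, name_b in combinations(sorted(profiles), 2)
    }


# Toy SNP profiles (illustrative only, not study data).
profiles = {
    "outbreak_1": "AGCTA",
    "outbreak_2": "AGCTA",
    "outbreak_3": "AGCTG",
    "non_outbreak_1": "GATCA",
}

distances = pairwise_distances(profiles)
for pair, distance in distances.items():
    print(pair, distance)
```

A distance matrix of this form is one possible input to a clustering step; the sketch does not reproduce the clustering method used in the study.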

Conclusion. SNP analysis of the genome data showed that outbreak isolates may vary and that multiple sources might be responsible for the outbreak. Genome sequencing is more discriminatory than phage typing and MLVA. It can be used in combination with epidemiological data for improved characterisation of outbreaks.