Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection.

Title: Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection.
Publication Type: Journal Article
Year of Publication: 2015
Authors: Ewing, AD, Houlahan, KE, Hu, Y, Ellrott, K, Caloian, C, Yamaguchi, TN, J Bare, C, P'ng, C, Waggott, D, Sabelnykova, VY, Kellen, MR, Norman, TC, Haussler, D, Friend, SH, Stolovitzky, G, Margolin, AA, Stuart, JM, Boutros, PC
Corporate Authors: participants, ICGC-TCGADREAMSoma
Journal: Nat Methods
Volume: 12
Issue: 7
Pagination: 623-30
Date Published: 2015 Jul
ISSN: 1548-7105
Keywords: Algorithms; Benchmarking; Crowdsourcing; Genome; Humans; Neoplasms; Polymorphism, Single Nucleotide
Abstract

The detection of somatic mutations from cancer genome sequences is key to understanding the genetic basis of disease progression, patient survival and response to therapy. Benchmarking is needed for tool assessment and improvement but is complicated by a lack of gold standards, by extensive resource requirements and by difficulties in sharing personal genomic information. To resolve these issues, we launched the ICGC-TCGA DREAM Somatic Mutation Calling Challenge, a crowdsourced benchmark of somatic mutation detection algorithms. Here we report the BAMSurgeon tool for simulating cancer genomes and the results of 248 analyses of three in silico tumors created with it. Different algorithms exhibit characteristic error profiles, and, intriguingly, false positives show a trinucleotide profile very similar to one found in human tumors. Although the three simulated tumors differ in sequence contamination (deviation from normal cell sequence) and in subclonality, an ensemble of pipelines outperforms the best individual pipeline in all cases. BAMSurgeon is available at https://github.com/adamewing/bamsurgeon/.

DOI: 10.1038/nmeth.3407
Alternate Journal: Nat. Methods
PubMed ID: 25984700
PubMed Central ID: PMC4856034
Grant List:
P30 CA016672 / CA / NCI NIH HHS / United States
U24-CA143858 / CA / NCI NIH HHS / United States
R01-CA180778 / CA / NCI NIH HHS / United States
U01-CA176303 / CA / NCI NIH HHS / United States
/ / Canadian Institutes of Health Research / Canada
U24 CA143858 / CA / NCI NIH HHS / United States
R01 CA183793 / CA / NCI NIH HHS / United States
U01 CA176303 / CA / NCI NIH HHS / United States
R01 CA180778 / CA / NCI NIH HHS / United States
U54 HG007990 / HG / NHGRI NIH HHS / United States