AGR Seminar

Comparing and combining mutation callers

Professor Terry Speed

Host venue


Somatic mutation calling based on DNA from matched tumor-normal patient samples is one of the key tasks carried out by many cancer genome projects. One such large-scale project is The Cancer Genome Atlas (TCGA), which now routinely compiles catalogs of somatic mutations from hundreds of paired tumor-normal DNA exome-sequence datasets. Several mutation callers are publicly available and more are likely to appear. Nonetheless, mutation calling remains challenging, and there is unlikely to be one established caller that systematically outperforms all others. Evaluating the mutation callers, or understanding the sources of discrepancies between them, is not straightforward: for most tumor studies, validation data based on independent whole-exome DNA sequencing is not available, and there is only partial validation data for a selected (ascertained) subset of sites.

We have analyzed several sets of mutation-calling data from TCGA benchmark studies, together with their partial validation data. To assess the performance of multiple callers, we introduce approaches that use external sequence data to varying degrees: independent DNA-seq pairs, RNA-seq for tumor samples only, the original exome-seq pairs only, or none of these. Utilizing multiple callers can be a powerful way to construct a list of final calls for one’s research. Using a set of mutations from multiple callers that are impartially validated, we present a statistical approach for building a combined caller, which can then be applied to combine calls in a wider dataset generated using a similar protocol. The approach allows us to build a combined caller across the full range of stringency levels, which outperforms all of the individual callers.

This is joint work with Su Yeon Kim and Laurent Jacob.

If you want to participate in this event:

1. Book your nearest Access Grid room and ensure technical support is available throughout the seminar. Please notify the technical support people that connection time is 2.00pm AEST, for a 3.00pm AEST start of the presentation; and
2. Contact Maaike Wienk, with a cc to Michael Shaw at AMSI, one week in advance at the latest.


If you would like to attend this seminar in our Access Grid room, please book the room, referring to this scnews item. Please liaise with the host institution to make any necessary arrangements, and then send an email to let the CSOs know of any special requirements for the seminar.

