Tuesday, June 7, 2016

tHapMix: simulating tumour samples through haplotype mixtures

Sergii Ivakhno, Camilla Colombo, Stephen Tanner, Philip Tedder, Stefano Berri, Anthony J. Cox
 

Abstract

Motivation: Large-scale rearrangements and copy number changes, combined with different modes of clonal evolution, create extensive somatic genome diversity, making it difficult to develop versatile and scalable variant calling tools and to create well-calibrated benchmarks.

Results: We developed a new simulation framework, tHapMix, that enables the creation of tumour samples with different ploidy, purity and polyclonality features. It scales easily to the simulation of hundreds of somatic genomes, while re-use of real read data preserves the noise and biases present in sequencing platforms. We further demonstrate the utility of tHapMix by creating a simulated set of 140 somatic genomes and showing how it can be used in training and testing of somatic copy number variant calling tools.

Availability and implementation: tHapMix is distributed under an open source license and can be downloaded from https://github.com/Illumina/tHapMix.
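To make the purity and polyclonality terms concrete, here is a minimal sketch of the arithmetic behind a mixed tumour sample. It is not taken from tHapMix and does not use its API; the function name `expected_copy_number`, its parameters, and the assumption of a diploid normal contaminant are all illustrative. It computes the average copy number a sequencer would see for one genomic segment when normal cells are mixed with several tumour clones.

```python
import math


def expected_copy_number(purity, clone_fractions, clone_copy_numbers, normal_cn=2):
    """Average copy number of one segment in a mixed sample (illustrative only).

    purity: fraction of cells in the sample that are tumour cells (0..1).
    clone_fractions: share of each clone among the tumour cells; must sum to 1.
    clone_copy_numbers: copy number of the segment in each clone.
    normal_cn: copy number in contaminating normal cells (assumed diploid).
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie between 0 and 1")
    if len(clone_fractions) != len(clone_copy_numbers):
        raise ValueError("need one copy number per clone")
    if not math.isclose(sum(clone_fractions), 1.0, abs_tol=1e-9):
        raise ValueError("clone fractions must sum to 1")

    # Weighted average copy number across the tumour clones.
    tumour_cn = sum(f * c for f, c in zip(clone_fractions, clone_copy_numbers))
    # Blend the tumour average with the normal contribution according to purity.
    return (1.0 - purity) * normal_cn + purity * tumour_cn


if __name__ == "__main__":
    # Hypothetical sample: 80% tumour cells, two clones (75% / 25%),
    # one carrying a gain (3 copies) and one a loss (1 copy).
    print(expected_copy_number(0.8, [0.75, 0.25], [3, 1]))
```

In this toy case the two clones average to 2.5 copies, and the 20% normal contamination pulls the observed value towards 2. This dampening of the signal is one reason why copy number callers need benchmarks that span a range of purities and clonal structures.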