tHapMix: simulating tumour samples through haplotype mixtures
Sergii Ivakhno, Camilla Colombo, Stephen Tanner, Philip Tedder, Stefano Berri, Anthony J. Cox
Motivation: Large-scale rearrangements and copy number changes, combined with different modes of clonal evolution, create extensive somatic genome diversity, making it difficult to develop versatile and scalable variant calling tools and to create well-calibrated benchmarks.
Results: We developed a new simulation framework, tHapMix, that enables the creation of tumour samples with different ploidy, purity and polyclonality features. It scales easily to the simulation of hundreds of somatic genomes, while re-use of real read data preserves the noise and biases present in sequencing platforms. We further demonstrate the utility of tHapMix by creating a simulated set of 140 somatic genomes and showing how it can be used in training and testing of somatic copy number variant calling tools.
Availability and implementation: tHapMix is distributed under an open source license and can be downloaded from https://github.com/Illumina/tHapMix.
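To illustrate how purity and polyclonality interact with copy number when a tumour sample is simulated as a mixture, the following minimal Python sketch computes the average copy number of a genomic segment and the expected share of reads contributed by normal cells and by each tumour clone. It is not taken from tHapMix: the function names, the parameters and the assumption of a diploid normal component are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def mixture_copy_number(
    purity: float,
    clone_fractions: List[float],
    clone_copy_numbers: List[int],
    normal_copy_number: int = 2,  # assumption: normal cells are diploid
) -> float:
    """Average copy number of one segment in a normal/tumour mixture.

    purity             -- fraction of cells in the sample that are tumour cells
    clone_fractions    -- fraction of tumour cells belonging to each clone
    clone_copy_numbers -- copy number of the segment in each clone
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie in [0, 1]")
    if len(clone_fractions) != len(clone_copy_numbers):
        raise ValueError("one copy number is required per clone")
    if abs(sum(clone_fractions) - 1.0) > 1e-9:
        raise ValueError("clone fractions must sum to 1")

    # Copy number averaged over tumour cells only.
    tumour_cn = sum(f * c for f, c in zip(clone_fractions, clone_copy_numbers))
    # Weight the normal and tumour components by their share of cells.
    return (1.0 - purity) * normal_copy_number + purity * tumour_cn


def read_fractions(
    purity: float,
    clone_fractions: List[float],
    clone_copy_numbers: List[int],
    normal_copy_number: int = 2,
) -> Tuple[float, List[float]]:
    """Expected share of reads over the segment from normal cells and each clone."""
    total = mixture_copy_number(
        purity, clone_fractions, clone_copy_numbers, normal_copy_number
    )
    if total == 0:
        raise ValueError("segment is absent from every component of the mixture")
    normal_share = (1.0 - purity) * normal_copy_number / total
    clone_shares = [
        purity * f * c / total for f, c in zip(clone_fractions, clone_copy_numbers)
    ]
    return normal_share, clone_shares


if __name__ == "__main__":
    # Hypothetical sample: 50% purity, two equally sized clones with 4 and 2 copies.
    print(mixture_copy_number(0.5, [0.5, 0.5], [4, 2]))
    print(read_fractions(0.5, [0.5, 0.5], [4, 2]))
```

In this sketch, lowering the purity pulls the average copy number of every segment towards the diploid value of 2, which is one reason low-purity samples are harder for copy number callers.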