Development of an integrated DNA and RNA variant calling pipeline
Raquel Manzano, Oscar M Rueda, Gad Getz, and Carlos Caldas
Accurate identification of somatic variants in tumor samples is crucial for comprehensive genomic analysis. Here, we present a novel pipeline that integrates DNA and RNA variant calling using Nextflow and nf-core best practices. The pipeline combines established DNA variant calling methods with tailored strategies to address RNA-specific challenges, including coverage fluctuations, RNA editing, duplicate reads, and alternative splicing. Using The Cancer Genome Atlas (TCGA) cohort, we demonstrate the pipeline’s effectiveness in accurately identifying mutational patterns, including indels, multi-nucleotide polymorphisms, and RNA-specific signatures such as RNA editing. Our integrated approach provides a standardized, scalable, and reproducible resource for comprehensive genomic profiling of tumors, supporting in-depth molecular characterization and deeper insight into the molecular mechanisms underlying tumorigenesis.