#20 nf-core/sarek: A workflow for germline, tumor-only, and somatic analysis of NGS data
High-throughput, efficient, and reproducible pipelines are needed to ensure homogeneous data processing across different compute infrastructures with affordable resource usage. We present nf-core/sarek 3.0, a workflow to explore single-nucleotide variants, structural variation, microsatellite instability, and copy-number alterations in germline, tumor-only, and tumor-normal paired samples.
We reduced compute resource requirements and shortened turnaround times, which minimizes costs on commercial clouds and facilitates the integration of publicly hosted data from repositories with in-house patient cohorts.
Other improvements include modularization of processes, which facilitates maintainability and customization, and a broader repertoire of available tools.
We re-processed 54 whole-genome-sequenced tumor-normal pairs from the TCGA-LIHC cohort, as well as on-site data, including 100 cholangiocarcinoma and 20 colorectal carcinoma panels, to investigate the relationship between genomic variation and drug responsiveness.