Using the nf-core/sarek and nf-core/methylseq pipelines to analyse large clinical trial datasets
Eleanor Karp-Tatham, Patrick Maclean, Alexander J. Mentzer, and Julian C. Knight
Functional genomic data are increasingly collected as part of clinical datasets to understand the genetic and epigenetic determinants of disease. Nextflow’s scalable architecture is well suited to processing such datasets efficiently and reproducibly. Community pipelines allow researchers at different institutions, and with different levels of computational expertise, to make the most of their datasets. Here we will highlight the use of the nf-core/sarek and nf-core/methylseq pipelines to analyse large infectious disease clinical trial datasets, including COVID-19 vaccine genetics and sepsis immuno-genomic data. We will demonstrate how we have implemented nf-core/methylseq to help test associations of methylation patterns with differential vaccine response and with immune recovery after infection. Furthermore, we will provide an example of setting up the nf-core/sarek pipeline, with germline and somatic variant calling and annotation, to help identify genetic variants that affect vaccination response.
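As a rough illustration of what setting up nf-core/sarek for germline calling can involve, the sketch below writes a FASTQ samplesheet and assembles a launch command. It is a minimal sketch under stated assumptions, not the configuration used for the datasets described here: the participant ID, file paths, `docker` profile, reference genome, and choice of tools are all hypothetical, and the helper functions `build_samplesheet` and `build_command` are invented for this example. A somatic run would additionally need tumour rows (status 1) paired with normal rows and a somatic caller such as `mutect2`.

```python
import csv
import io

# Column names follow the nf-core/sarek samplesheet layout for FASTQ input.
SAREK_COLUMNS = ["patient", "sex", "status", "sample", "lane", "fastq_1", "fastq_2"]


def build_samplesheet(rows):
    """Return samplesheet CSV text for a list of row dictionaries."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=SAREK_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


def build_command(samplesheet_path, outdir, tools):
    """Assemble the nextflow argument list; nothing is executed here."""
    return [
        "nextflow", "run", "nf-core/sarek",
        "-profile", "docker",          # assumption: Docker is available
        "--input", samplesheet_path,
        "--outdir", outdir,
        "--genome", "GATK.GRCh38",     # assumption: GRCh38 reference
        "--tools", ",".join(tools),
    ]


# Hypothetical participant and file paths, for illustration only.
example_rows = [
    {
        "patient": "P001",
        "sex": "XX",
        "status": 0,                   # 0 = normal (germline) sample
        "sample": "P001_blood",
        "lane": "L001",
        "fastq_1": "/data/P001_R1.fastq.gz",
        "fastq_2": "/data/P001_R2.fastq.gz",
    },
]

samplesheet_text = build_samplesheet(example_rows)
# haplotypecaller for germline calling, vep for annotation (assumed choices).
command = build_command("samplesheet.csv", "results", ["haplotypecaller", "vep"])

print(samplesheet_text)
print(" ".join(command))
```

Keeping the samplesheet generation in a script makes it straightforward to regenerate the input for hundreds of trial participants from a sample manifest, and the printed command can be reviewed before it is submitted to a cluster or cloud executor.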