Oct 31, 15:30 - 16:30

Reproducible analysis of metatranscriptomics either through nf-core/metatdenovo or nf-core/magmap

Over the last decade, the study of microbial communities from various environments through RNA sequencing has increased significantly. Metatranscriptomics offers insights into metabolic processes within microbial communities, providing a snapshot of gene expression under in situ environmental conditions. To support biologists in this endeavor and to promote reproducibility and standardization in data analysis, we developed two complementary pipelines with the help of the nf-core community: nf-core/metatdenovo and nf-core/magmap. These pipelines, designed to be user-friendly and reproducible, aim to investigate the activity of microbial communities for which different levels of genomic knowledge are available.
The nf-core/metatdenovo pipeline applies a de novo assembly approach to annotate metatranscriptomic data. It constructs transcriptomes directly from RNA sequencing reads without requiring a reference genome, then quantifies active genes and assigns both taxonomy and functional annotation. This makes nf-core/metatdenovo particularly advantageous for studying environments where genomic resources are scarce or incomplete, such as the deep sea or soils, where many organisms remain uncultured and uncharacterized.
The nf-core/magmap pipeline is instead applicable to communities for which genomes are available (e.g. gut microbiomes or surface water), either in the form of metagenome-assembled genomes (MAGs) or reference genomes. The pipeline identifies relevant reference genomes with a k-mer-based approach using Sourmash, then maps reads to the identified genomes to quantify expressed genes. By developing these pipelines, we aimed to provide biologists with robust, flexible tools that cater to different research needs and environmental contexts.
To demonstrate the utility and performance of these pipelines, we will show a comparative analysis using a dataset derived from a complex microbial community. Our results show the distinct advantages of the nf-core/metatdenovo and nf-core/magmap pipelines in different ecological contexts.
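As a rough illustration of the k-mer-based reference selection mentioned above, the following toy Python sketch scores each candidate genome by the fraction of its k-mers that also occur in the reads and keeps the genomes above a threshold. It is not the Sourmash algorithm or the nf-core/magmap implementation: the function names, the tiny k, the threshold and the example sequences are all assumptions made for illustration, exact k-mer sets stand in for the compact sketches a real tool would use, and reverse complements are ignored.

```python
def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def select_references(reads, genomes, k=5, threshold=0.5):
    """Keep genomes whose k-mers are well represented in the reads.

    Hypothetical helper for illustration only. The score is the fraction
    of a genome's k-mers that also appear in at least one read.
    """
    read_kmers = set()
    for read in reads:
        read_kmers |= kmers(read, k)

    selected = []
    for name, sequence in genomes.items():
        genome_kmers = kmers(sequence, k)
        if not genome_kmers:
            continue  # sequence shorter than k, nothing to compare
        score = len(genome_kmers & read_kmers) / len(genome_kmers)
        if score >= threshold:
            selected.append((name, score))

    # Best-supported genomes first
    return sorted(selected, key=lambda item: item[1], reverse=True)


# Made-up example data: both reads derive from genome_A, none from genome_B
genomes = {
    "genome_A": "ACGTACGTAGCTAGCTAGGATC",
    "genome_B": "TTTTTTTTTTGGGGGGGGGGCC",
}
reads = ["ACGTACGTAGCT", "AGCTAGCTAGGATC"]

result = select_references(reads, genomes)
print(result)
```

In this made-up example only genome_A is retained, since every one of its 5-mers appears in the reads while none of genome_B's do. Reads would then be mapped only to the retained genomes.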

Speaker

Co-authors

Danilo Di Leo, Emelie Nilsson, Jarone Pinhassi and Daniel Lundin