May 23, 11:45 - 12:00

SpaFlow: A Nextflow pipeline for single cell phenotyping in spatial omics

The rapidly growing field of spatial biology allows unprecedented insight into the arrangement and interactions of cells as they occur in our bodies. Understanding single-cell-level spatial relationships can help us answer important clinical and biological questions. How do individual immune cells infiltrate and fight tumors in cancer patients? How does the arrangement of neurons affect our brains’ functioning as we age? Single-cell multiplexed protein imaging captures protein abundances at the cellular level and is one type of spatial omics data (high-resolution data measurements for spatial biology studies). Multiplexing allows measurement of multiple proteins on the same tissue section, facilitating detailed insight into cell-cell interactions and functional states in healthy or diseased tissue. Cell phenotyping is a vital step in the analysis of these data because downstream spatial analysis depends on a complete understanding of the cell types present in the tissue. Many techniques for cell phenotyping have been developed, including unsupervised clustering, linear modeling, and supervised classification. Supervised techniques, while accurate, can be very time-consuming for subject matter experts, requiring hours of work to annotate a representative sample of cells. Unsupervised clustering allows researchers to quickly evaluate the presence of cell types based on expected protein signatures. Implementing these methods efficiently can accelerate the phenotyping process by providing a jumping-off point for further iterations. We have developed SpaFlow, a Nextflow pipeline that runs quality control checks and three unsupervised cell phenotyping methods. Implementing the workflow as a Nextflow pipeline allows us to greatly reduce time and effort compared with setting up custom R and Python scripts, leading to a shorter turnaround time and more cost-effective analysis for investigators.
To date, we have processed batches of up to 300 images and 850,000 cells in under 30 minutes, saving hours and in some cases days of manual processing. This talk will describe our Nextflow pipeline and the value it brings to our work, as well as our experience learning and developing Nextflow as first-time users.

Speaker

Brenna Novotny

Bioinformatician at Mayo Clinic

Co-authors

Chen Wang, Raymond Moore