Analyzing, sharing, and visualizing scientific data with the Cirro data platform
Sam Minot
Scientific research relies on sophisticated instrumentation (genome sequencers, mass spectrometers, flow cytometers, high-throughput microscopes, etc.) that produces large volumes of highly detailed information. An effective research group must be able to store, analyze, and comprehend these datasets in the face of ever-changing technologies, shifting groups of collaborators, and the myriad complexities of high-performance computing. To address the challenges facing researchers at top-tier institutions, the Data Core Shared Resource at the Fred Hutchinson Cancer Center (Seattle, USA) developed Cirro, a cloud-based platform for data management, analysis, and visualization. By distilling the wide array of available cloud computing services down to the set of core activities needed by researchers, Cirro presents a streamlined environment for conducting biomedical research without having to use the command line.
The core functionality provided by Cirro includes: cost-effective data storage with automated savings for infrequently accessed files; self-service data sharing with internal and external collaborators; straightforward computational analysis using pre-defined Nextflow or WDL workflows; freeform statistical analysis inside Jupyter notebooks; and powerful data visualization with user-configurable charts.
Dr. Minot will demonstrate the functionality of Cirro, explore the complexities of using cloud computing effectively in the research setting, explain how Nextflow has been instrumental to its development, and discuss how scientists at any institution can take advantage of this resource for their own research.
Sam Minot
Associate Director, Data Science Applications at Fred Hutchinson Cancer Center