Applying and deploying AlphaFold at scale to decode the human gut microbiome proteome
A long-standing problem in structural and computational biology is the accurate prediction of a protein's three-dimensional structure from its amino acid sequence. A recent breakthrough in this area was achieved by AlphaFold, a deep learning model that predicts protein structures with unprecedented accuracy.
Following the public release of AlphaFold, several projects were initiated to ease access to the structures predicted by this method, in particular the DeepMind-EBI joint initiative to release predicted protein structures for key organisms, and the development of ColabFold.
Despite these improvements, several challenges remain in deploying and running this method on thousands of proteins at once. Here we present the development of a Nextflow pipeline to run AlphaFold at scale in a cloud computing environment, and briefly discuss its application to a collection of human gut microbiome proteins to explore their functional potential and host interactions.