Machine learning methods to predict binding of SARS-CoV-2
Abstract
The pandemic caused by the outbreak of COVID-19 still looms over the world. Although vaccine development and distribution are ongoing, the spread of the pandemic seems unstoppable because some viral mutations make the virus more lethal or faster to transmit. The SARS-CoV-2 virus enters human cells through the binding of its spike protein to the human cell-surface receptor angiotensin-converting enzyme 2 (ACE2) (Fig. 1A). Because of its role in the early steps of host cell invasion, the receptor binding domain (RBD) of the spike protein represents the major determinant of cross-species transmission and evolution. Accordingly, some of the new, more contagious variants that are spreading in the UK, South Africa, and Brazil carry crucial mutations in the RBD. Full-atom molecular dynamics simulations are a suitable tool for investigating the molecular mechanism underlying RBD-ACE2 binding and for understanding why a mutation results in high or low binding affinity. However, for a system the size of the RBD-ACE2 complex, exhaustively scanning the effects of single or multiple mutations on binding at the atomistic level would be computationally expensive and time consuming.