Iago Pereiro Pereiro, Julien Aubert, et al.
Biomicrofluidics
Enzyme catalysts are an integral part of green chemistry strategies towards more sustainable and resource-efficient chemical synthesis. However, the retrosynthesis of given targets with biocatalysed reactions remains a significant challenge: the substrate specificity, the potential to catalyse unreported substrates, and the specific stereo- and regioselectivity properties are domain-specific knowledge factors that hinder the adoption of biocatalysis in day-to-day laboratory work. Here, we use the molecular transformer architecture to capture the latent knowledge about enzymatic activity from a large data set of publicly available enzymatic data, extending forward reaction and retrosynthetic pathway prediction to the domain of biocatalysis. We introduce a class token based on the EC classification scheme that allows the model to capture catalysis patterns among different enzymes belonging to the same hierarchical families. The forward prediction model achieves a top-5 accuracy of 62.7%, while the single-step retrosynthetic model shows a top-1 round-trip accuracy of 39.6%. The enzymatic data and the trained models are available through the RXN for Chemistry network (https://rxn.res.ibm.com and https://github.com/rxn4chemistry).
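The class-token idea and the top-k metric mentioned in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `[EC:...]` token spelling, the `|` separator, and the helper names `ec_class_token`, `tag_reaction`, and `top_k_accuracy` are hypothetical and are not taken from the published models or the RXN for Chemistry code. The sketch only shows how truncating an EC number to a chosen hierarchy level lets enzymes of the same family share one token, and how a top-k accuracy is counted.

```python
from typing import Sequence


def ec_class_token(ec_number: str, level: int = 3) -> str:
    """Truncate an EC number such as '1.1.1.1' to its first `level` fields.

    The '[EC:...]' spelling is a hypothetical token format for illustration.
    """
    if not 1 <= level <= 4:
        raise ValueError("level must be between 1 and 4")
    fields = ec_number.strip().split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four EC fields, got {ec_number!r}")
    return "[EC:" + ".".join(fields[:level]) + "]"


def tag_reaction(substrates: str, products: str, ec_number: str, level: int = 3) -> str:
    """Build a reaction string with the class token placed after the substrates.

    The 'substrates|token>>products' layout is an assumption, not a documented format.
    """
    return f"{substrates}|{ec_class_token(ec_number, level)}>>{products}"


def top_k_accuracy(
    predictions: Sequence[Sequence[str]], targets: Sequence[str], k: int
) -> float:
    """Fraction of targets found among the first k ranked predictions."""
    hits = sum(target in ranked[:k] for ranked, target in zip(predictions, targets))
    return hits / len(targets)


if __name__ == "__main__":
    # Ethanol to acetaldehyde, tagged at the sub-subclass level (cofactor omitted).
    print(tag_reaction("CCO", "CC=O", "1.1.1.1"))
```

Choosing a lower `level` groups more enzymes under one token at the cost of specificity; which level works best is an empirical question that this sketch does not address.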
Francesco Tacchino, Alessandro Chiesa, et al.
ACS Spring 2022
Helgi I. Ingolfsson, Chris Neale, et al.
PNAS
Martin Zimmermann, Patrick Hunziker, et al.
Biomedical Microdevices