University of Illinois Urbana-Champaign, Illinois, United States
Abstract: Directed evolution mimics natural evolution in the lab, optimizing biomolecules for specific functions. Traditionally, it involves laborious, time-consuming rounds of variant library construction and functional screening. Recently, machine learning (ML) has demonstrated remarkable capabilities in evaluating variants in silico and designing efficient directed evolution strategies. Still, constructing and screening variant libraries is resource-intensive. Laboratory automation provides a means for reproducible, rapid execution of complex experiments and high-throughput data acquisition, and its integration with machine learning is a promising direction for protein evolution.
In this work, we developed a continuous Design-Build-Test-Learn (DBTL) cycle for directed protein evolution by combining machine learning with laboratory automation on our biofoundry, iBioFAB. The automated continuous directed evolution is initiated with a zero-shot library of 180 mutants designed by EVmutation. To achieve a continuous cycle and eliminate the need for construct sequencing in each round, we evaluated multiple site-directed mutagenesis (SDM) protocols for their efficiency and suitability for automation. An optimized PCR and HiFi ligation-based method consistently yielded a high success rate (>95%) when we selected one colony per mutant. This method uses mutation-containing primers in a mutagenesis PCR performed on ORF cDNA. The optimized SDM protocol is readily amenable to laboratory automation and significantly speeds up the directed evolution workflow. Next, we designed and optimized automated workflows for all subsequent steps, including transformation, plating, colony picking, protein expression, cell lysis, and functional assay. These automation scripts are integrated on the iBioFAB biofoundry, where a robotic arm transfers plates among instruments, enabling full automation of ML-enabled protein evolution. The data from functional screening are then used to train a machine learning model under the low-N framework and to predict the next round of the variant library, which is then constructed using the same automated workflow. By combining machine learning and robotic automation, several rounds of screening can be carried out iteratively and much faster than is possible through manual DBTL efforts. We used this continuous automated workflow to engineer phytase and halide methyltransferase enzymes as test cases.
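The Learn-and-Design half of the cycle described above can be illustrated with a minimal sketch. It is not the pipeline used in this work: a one-hot encoding with closed-form ridge regression stands in for the low-N model, the candidate pool is restricted to single mutants of the best variant found so far, and `assay` is a hypothetical callable that stands in for the automated build-and-test steps. All function names and parameters are assumptions made for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq):
    """Flatten a protein sequence into a one-hot feature vector."""
    x = np.zeros((len(seq), len(AMINO_ACIDS)))
    for i, aa in enumerate(seq):
        x[i, AMINO_ACIDS.index(aa)] = 1.0
    return x.ravel()


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; returns (weights, intercept)."""
    X_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - X_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - X_mean @ w


def single_mutants(parent):
    """Enumerate every single amino-acid substitution of the parent."""
    return [
        parent[:i] + aa + parent[i + 1:]
        for i in range(len(parent))
        for aa in AMINO_ACIDS
        if aa != parent[i]
    ]


def propose_next_round(parent, screened, batch_size):
    """Learn from screened variants, then rank unscreened candidates."""
    seqs = list(screened)
    X = np.array([one_hot(s) for s in seqs])
    y = np.array([screened[s] for s in seqs])
    w, b = fit_ridge(X, y)
    candidates = [s for s in single_mutants(parent) if s not in screened]
    scores = np.array([one_hot(s) @ w + b for s in candidates])
    best = np.argsort(-scores)[:batch_size]
    return [candidates[i] for i in best]


def run_campaign(initial_library, assay, rounds, batch_size):
    """Iterate the cycle; `assay` is a placeholder for build and test."""
    screened = {s: assay(s) for s in initial_library}
    for _ in range(rounds):
        parent = max(screened, key=screened.get)  # best variant so far
        batch = propose_next_round(parent, screened, batch_size)
        screened.update({s: assay(s) for s in batch})
    return screened
```

In this sketch each round adds `batch_size` newly measured variants to the training set, so the surrogate model is refit on all data gathered so far before the next library is proposed.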
Self-driving laboratories have transformative potential to automate and accelerate the scientific discovery process. However, automating protein engineering campaigns presents unique challenges due to the diverse experimental needs and vast functional landscape of proteins. Here, we designed a pipeline for continuous ML-guided directed evolution on the iBioFAB biofoundry by employing an efficient SDM protocol and a fluorescence-based functional assay. The seamless integration of machine learning and automation will significantly advance and expand the possibilities of directed evolution.