---
license: mit
tags:
- biology
- code
- diffusion
- deep learning
- scientific AI
- proteins
- machine learning
- generative AI
---

# ProteinMechanicsDiffusionDesign

## End-to-end de novo protein generation based on nonlinear mechanical unfolding responses using a language diffusion model

B. Ni, D.L. Kaplan, and M.J. Buehler, in revision

![plot](./documents/Figure1.png)

### Summary

Through evolution, nature has produced a set of remarkable protein materials, including elastins, silks, keratins and collagens, with superior mechanical performance that play crucial roles in mechanobiology. However, going beyond natural designs to discover proteins that meet specified mechanical properties remains challenging. Here we report a generative model that predicts protein designs to meet complex nonlinear mechanical property-design objectives. Our model leverages deep knowledge of protein sequences from a pre-trained protein language model and maps mechanical unfolding responses to create novel proteins. Via full-atom molecular simulations for direct validation, we demonstrate that the designed proteins are novel and fulfill the targeted mechanical properties, including unfolding energy and mechanical strength, as well as the detailed unfolding force-separation curves. Our model offers rapid pathways to explore the enormous mechanobiological protein sequence space unconstrained by biological synthesis, using mechanical features as the target to enable the discovery of protein materials with superior mechanical properties.

### Installation and use

<a target="_blank" href="https://colab.research.google.com/github/lamm-mit/ProteinMechanicsDiffusionDesign/blob/main/notebook_for_colab/pLDM_inference_standalone_colab.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

To install and run, click the "Open In Colab" badge and run all cells.

Note that the first run may take some time because it downloads and installs the required packages and the trained model.

### Model file structure

The pretrained model is hosted in this repository and consists of the following files.

> 1. data_pack.pickle: pickle file with the key properties of the training dataset

> 2. model_pack.pickle: pickle file with the key properties of the model

> 3. ForTest_LE_128_From_F1_f5.pk: pickle file with the key properties of the de novo test dataset

> 4. trainer_save-model_pLDM.pt: file with the pretrained model weights

> 5. mkdssp: executable file to perform secondary structure classification using DSSP
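
To see what the pickle files listed above contain before running the full notebook, a small inspection helper can be useful. The following is a minimal sketch, not part of the released code: the function name `summarize_pickle` is hypothetical, and it assumes the files sit in the current working directory and that each one stores a dictionary-like object (the actual layout is not documented here). If the pickles reference custom classes, the repository code may need to be importable for loading to succeed. Because unpickling can execute arbitrary code, only load files obtained from a trusted source.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and return a {key: type name} summary.

    Assumption: the stored object is a dict. For any other object,
    only the type of the top-level object is reported.
    """
    with Path(path).open("rb") as handle:
        obj = pickle.load(handle)
    if isinstance(obj, dict):
        return {str(key): type(value).__name__ for key, value in obj.items()}
    return {"<top-level object>": type(obj).__name__}


if __name__ == "__main__":
    # File names are taken from the list above; their location is an assumption.
    for name in ("data_pack.pickle", "model_pack.pickle", "ForTest_LE_128_From_F1_f5.pk"):
        if Path(name).exists():
            print(name, summarize_pickle(name))
        else:
            print(name, "not found in the current directory")
```

The helper only reports key names and value types, so it stays cheap even for large files and makes no claim about what the fields mean.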

### Code repository

The code base is maintained through [this GitHub repository](https://github.com/lamm-mit/ProteinMechanicsDiffusionDesign).