---
language:
- en
metrics:
- accuracy
- roc_auc
- precision
- recall
tags:
- biology
- chemistry
- therapeutic science
- drug design
- drug development
- therapeutics
library_name: tdc
license: bsd-2-clause
---
## Dataset description
As a membrane separating circulating blood from brain extracellular fluid, the blood-brain barrier (BBB) is the protective layer that blocks most foreign drugs. The ability of a drug to penetrate the barrier and reach its site of action is therefore a crucial challenge in developing drugs for the central nervous system.
## Task description
Binary classification. Given a drug SMILES string, predict whether the drug can penetrate the blood-brain barrier.
## Dataset statistics
Total: 1,975; Train_val: 1,580; Test: 395
## Pre-requisites
Install the following packages:
```bash
pip install PyTDC
pip install DeepPurpose
pip install git+https://github.com/bp-kelley/descriptastorus
pip install dgl torch torchvision
```
You can also refer to the Colab notebook [here](https://colab.research.google.com/drive/1CL92SOCBS-eYDL99w8tjSNIG_ySXzMrG?usp=sharing).
## Dataset split
Random split into 70% training, 10% validation, and 20% testing.
To load the dataset in TDC, type:
```python
from tdc.single_pred import ADME
data = ADME(name = 'BBB_Martins')
```
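For reference, below is a minimal sketch (not part of the original card) of retrieving the 70/10/20 random split described above via TDC's `get_split`; the `seed` value is an assumption and should be set to match your experiment.
```python
# Sketch: retrieve the 70/10/20 random split described above.
# The seed value (42) is an assumption; change it to match your setup.
from tdc.single_pred import ADME

data = ADME(name = 'BBB_Martins')
split = data.get_split(method = 'random', seed = 42, frac = [0.7, 0.1, 0.2])
train_df, valid_df, test_df = split['train'], split['valid'], split['test']
# each split is a pandas DataFrame with 'Drug_ID', 'Drug' (SMILES), and 'Y' (label) columns
print(len(train_df), len(valid_df), len(test_df))
```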
## Model description
AttentiveFP is a Graph Attention Network-based molecular representation learning method. The model was tuned over 100 runs using the Ax platform.
To load the pre-trained model and run a prediction, type:
```python
from tdc import tdc_hf_interface
tdc_hf = tdc_hf_interface("BBB_Martins-AttentiveFP")
# load deeppurpose model from this repo
dp_model = tdc_hf.load_deeppurpose('./data')
# run inference on a list of SMILES strings
tdc_hf.predict_deeppurpose(dp_model, ['YOUR SMILES STRING'])
```
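As an illustration, here is a minimal sketch (not from the original card) that combines the two snippets above to score the held-out test split. It assumes the pre-requisite packages are installed, uses an assumed split seed, and scores one SMILES at a time with the single-SMILES call shown above.
```python
# Sketch: score the TDC test split with the pre-trained AttentiveFP model.
# Assumes the pre-requisite packages are installed; the split seed (42) is an assumption.
from tdc.single_pred import ADME
from tdc import tdc_hf_interface

test_df = ADME(name = 'BBB_Martins').get_split(method = 'random', seed = 42,
                                               frac = [0.7, 0.1, 0.2])['test']

tdc_hf = tdc_hf_interface("BBB_Martins-AttentiveFP")
dp_model = tdc_hf.load_deeppurpose('./data')

# score one molecule at a time using the single-SMILES call documented above;
# passing the full list in one call may also be supported
preds = [tdc_hf.predict_deeppurpose(dp_model, [smi]) for smi in test_df['Drug'].tolist()]
```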
## References
* Dataset entry in Therapeutics Data Commons, https://tdcommons.ai/single_pred_tasks/adme/#bbb-blood-brain-barrier-martins-et-al
* Martins, Ines Filipa, et al. “A Bayesian approach to in silico blood-brain barrier penetration modeling.” Journal of Chemical Information and Modeling 52.6 (2012): 1686-1697.