## Model Overview
AgroNT is a DNA language model trained on the genomes of primarily edible plants. More specifically, AgroNT uses the transformer architecture with self-attention and a masked language modeling
objective to leverage widely available genotype data from 48 different plant species. AgroNT contains 1 billion parameters and has a context window of 1000 tokens. AgroNT uses a non-overlapping
6-mer tokenizer to convert genomic nucleotide sequences to tokens. As a result, the 1000 tokens correspond to approximately 6000 base pairs.
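
To make the token-to-base-pair arithmetic concrete, here is a minimal sketch of non-overlapping 6-mer tokenization in plain Python. It is not the model's actual tokenizer: the function name `kmer_tokenize` is hypothetical, and emitting leftover bases as single-nucleotide tokens is an assumption made for illustration.

```python
K = 6              # k-mer size stated above
MAX_TOKENS = 1000  # context window stated above


def kmer_tokenize(sequence: str, k: int = K) -> list[str]:
    """Split a nucleotide sequence into non-overlapping k-mers.

    Hypothetical helper for illustration only. Bases left over after the
    last full k-mer are emitted one per token (an assumption, not confirmed
    behaviour of the released tokenizer).
    """
    sequence = sequence.upper()
    full = len(sequence) - len(sequence) % k  # length covered by whole k-mers
    tokens = [sequence[i:i + k] for i in range(0, full, k)]
    tokens.extend(sequence[full:])  # leftover bases as single-character tokens
    return tokens


# 14 bases -> two full 6-mers plus two single-base tokens
tokens = kmer_tokenize("ACGTACGTACGTAC")

# A full context window of 6-mers covers K * MAX_TOKENS bases
max_bases = K * MAX_TOKENS
```

Because the k-mers do not overlap, each token consumes six bases, which is where the figure of roughly 6000 base pairs per 1000-token window comes from.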


## Using the Model from HF
```python
# Will be updated once the model is public.
```