ESM-2 Pre-finetuned on CAFA-5 for Protein Function Prediction

This model has been pre-finetuned for four epochs on the CAFA-5 protein function prediction task. It is meant to be finetuned further in a second stage of training using Low-Rank Adaptation (LoRA). The training script for both the pre-finetuning and the second-stage LoRA finetuning is available here. This notebook allows you to pre-finetune the base model and then use a LoRA for the second stage of training. Note that the second stage is a harder curriculum for the model: it uses class weights so that the model better captures the hierarchical (weighted) structure of the gene ontology (GO) terms, which serve as the labels for the multilabel sequence classification task of predicting a protein's functions.
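To make the role of class weights in the second stage more concrete, here is a minimal sketch of a class-weighted binary cross-entropy for a multilabel task. It is an illustration only and is not taken from the training script: the function name `weighted_bce`, the example logits, and the weight values are all hypothetical, and the weighting scheme actually used for the GO terms is not specified here. The sketch uses only the Python standard library; in a PyTorch training loop a similar effect is usually obtained by passing per-label weights to a loss such as `torch.nn.BCEWithLogitsLoss`.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def weighted_bce(logits, targets, class_weights):
    """Mean class-weighted binary cross-entropy over the labels of one example.

    logits        -- raw model outputs, one per label (GO term)
    targets       -- 0/1 ground-truth indicators, one per label
    class_weights -- non-negative weights, one per label (hypothetical values)
    """
    if not (len(logits) == len(targets) == len(class_weights)):
        raise ValueError("logits, targets and class_weights must have equal length")
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y, w in zip(logits, targets, class_weights):
        p = sigmoid(z)
        total += -w * (y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(logits)


# Hypothetical example: one protein, three GO terms.
logits = [2.0, -1.0, 0.5]
targets = [1, 0, 1]

uniform = weighted_bce(logits, targets, [1.0, 1.0, 1.0])
# Up-weighting the third label makes its error count three times as much.
weighted = weighted_bce(logits, targets, [1.0, 1.0, 3.0])

print(f"uniform weights: {uniform:.4f}")
print(f"third label x3:  {weighted:.4f}")
```

With larger weights on selected GO terms, mistakes on those labels contribute more to the loss, which is one way a weighted second stage can present a harder objective than the unweighted pre-finetuning.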

Dataset used to train AmelieSchreiber/esm2_t6_8M_finetuned_cafa5_v2