Are there two identical embedding tensors, even though embeddings are shared?
#15 opened about 16 hours ago
by
graefics
About MMLU evaluation
1
#12 opened 30 days ago
by
ldwang
Intermediate checkpoints for research purposes
#11 opened about 1 month ago
by
maveriq
ONNX generation
1
#9 opened about 2 months ago
by
davesoma
Adding Evaluation Results
#8 opened about 2 months ago
by
leaderboard-pr-bot
Google Colab for Guide to Use the Model
#7 opened 2 months ago
by
mesjavacca
Model code for training from scratch
1
#6 opened 2 months ago
by
Chrisneverdie
Multilingual Support
1
#5 opened 2 months ago
by
Mistsink
Trapezoidal scheduler with cooldown phase
3
#4 opened 2 months ago
by
maveriq