Fix typo
README.md CHANGED
@@ -24,7 +24,7 @@ This model is released under the [NVIDIA Open Model License Agreement](https://d
 
 ## Model Architecture
 
-Llama-3.1-Minitron-4B-Width-Base uses a model embedding size of
+Llama-3.1-Minitron-4B-Width-Base uses a model embedding size of 3072, 32 attention heads, MLP intermediate dimension of 9216, with 32 layers in total. Additionally, it uses Grouped-Query Attention (GQA) and Rotary Position Embeddings (RoPE).
 
 **Architecture Type:** Transformer Decoder (Auto-Regressive Language Model)
 
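For reference, the hyperparameters named in the added line map onto a Hugging Face `transformers` `LlamaConfig` roughly as sketched below. This is an illustrative sketch, not the model's published config: the number of key-value heads used for GQA is not stated in the diff, so the value shown is an assumption.

```python
# Illustrative sketch only: maps the architecture details from the edited
# README line onto a Hugging Face transformers LlamaConfig.
from transformers import LlamaConfig

config = LlamaConfig(
    hidden_size=3072,          # "model embedding size of 3072"
    num_attention_heads=32,    # "32 attention heads"
    intermediate_size=9216,    # "MLP intermediate dimension of 9216"
    num_hidden_layers=32,      # "32 layers in total"
    num_key_value_heads=8,     # Grouped-Query Attention; head count assumed, not given in the diff
    # Rotary Position Embeddings (RoPE) are the default positional scheme
    # for LlamaConfig, so no extra argument is needed here.
)
print(config)
```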