srvm committed on
Commit e28186c (1 parent: 4de9c33)
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -24,7 +24,7 @@ This model is released under the [NVIDIA Open Model License Agreement](https://d
 
  ## Model Architecture
 
- Llama-3.1-Minitron-4B-Width-Base uses a model embedding size of 4096, 32 attention heads, MLP intermediate dimension of 14336, with 32 layers in total. Additionally, it uses Grouped-Query Attention (GQA) and Rotary Position Embeddings (RoPE).
+ Llama-3.1-Minitron-4B-Width-Base uses a model embedding size of 3072, 32 attention heads, MLP intermediate dimension of 9216, with 32 layers in total. Additionally, it uses Grouped-Query Attention (GQA) and Rotary Position Embeddings (RoPE).
 
  **Architecture Type:** Transformer Decoder (Auto-Regressive Language Model)
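
For readers who want to confirm the corrected numbers against the published checkpoint, here is a minimal sketch using the Hugging Face `transformers` library. It is not part of this commit; the repository id `nvidia/Llama-3.1-Minitron-4B-Width-Base` and the use of the standard Llama config field names are assumptions.

```python
# Minimal sketch (not part of this commit): check the width-pruned dimensions
# fixed by e28186c against the checkpoint's config. The repository id is an
# assumption; adjust it if the model is hosted under a different name.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("nvidia/Llama-3.1-Minitron-4B-Width-Base")

assert config.hidden_size == 3072          # model embedding size
assert config.intermediate_size == 9216    # MLP intermediate dimension
assert config.num_hidden_layers == 32      # transformer layers
assert config.num_attention_heads == 32    # attention heads
# Grouped-Query Attention: fewer key/value heads than query heads.
assert config.num_key_value_heads < config.num_attention_heads
print("Config matches the corrected README values.")
```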