Hafez Mousavi
hafezmg48
AI & ML interests
LLMs, Transformers
Organizations
None yet
hafezmg48's activity
Does Qwen use RMSNorm or LayerNorm?
1
#21 opened 4 months ago by hafezmg48
Why 72B model has different vocab size comparing with other models?
6
#1 opened 10 months ago by Mikasaka
Intermediate_size is doubled in config.json
1
#3 opened 8 months ago by hafezmg48