phi-1_5 / configuration_mixformer_sequential.py

Commit History

Adds support for MQA/GQA and attention mask during training.
de35f90

gugarosa committed on

Support for `attention_mask` in forward pass.
3128bb6

gugarosa committed on

Upload MixFormerSequentialForCausalLM
1698206

suriyagunasekar committed on