loiccabannes committed
Commit 0188bf6 • Parent(s): 542ceb6
Update README.md

README.md CHANGED
@@ -12,16 +12,7 @@ pipeline_tag: question-answering
 **MambaSan-instruct is the first chat Japanese language model based on a state-space model architecture (Mamba), not a transformer.**
 
 The model is based on Albert Gu's and Tri Dao's work *Mamba: Linear-Time Sequence Modeling with Selective State Spaces* ([paper](https://arxiv.org/pdf/2312.00752.pdf)) as well as their [model implementation](https://github.com/state-spaces/mamba).
-This work was also inspired by havenhq's mamba-chat implementation in English
-bibtex
-@misc{haven2023mambachat,
-  title = {Mamba-Chat},
-  author = {Justus Mattern and Konstantin Hohr},
-  year = {2023},
-  howpublished = {GitHub},
-  url = {https://github.com/havenhq/mamba-chat}
-}
-This repository provides training / fine-tuning code for the model based on some modifications of the Huggingface Trainer class.
+This work was also inspired by havenhq's mamba-chat implementation in English.
 
 Mamba-Chat is based on MambaSan-130m and was fine-tuned on 31.7k samples of the [SkelterLabsInc/JaQuAD](https://huggingface.co/datasets/SkelterLabsInc/JaQuAD) dataset. To learn more, you can:
 