karasu-7B / README.md
ptrdvn's picture
Update README.md
0a8d4b3
|
raw
history blame
800 Bytes
metadata
license: apache-2.0

image/png image/png

Base checkpoint

augmxnt/shisa-7b-v1

  • Mistral-7B base
  • Pre-trained on 8B of MADLAD-Ja
  • Finetuned on Japanese instructions
  • Highest scoring 7B model on conversation benchmark (JA MT-Bench)

Training datasets (total ~7B)

  • Aozora Bunko
  • Japanese Law Precedent Dataset
  • Japanese Wikipedia
  • .lg.jp, .go.jp, .ac.jp domain webscrapes from CulturaX (Any documents with same first 25 characters were de-duplicated)
  • English Ultrachat200K-gen (So that it doesn't forget English and chatting ability learned in the base checkpoint)