acrastt committed on
Commit 1006a94
1 Parent(s): fe47943

Update README.md


Thanks for adding the README! This is a great model! However, I do see some misinformation and misspellings in the README, so I (hopefully helpfully) changed it to correct them. This model is neither the best 3B model overall nor the best 3B model on MMLU (5-shot). Looking forward to your reply.

Files changed (1)
  1. README.md +11 -9
README.md CHANGED
@@ -13,9 +13,9 @@ license: apache-2.0
  ---
  # Model Card

- **The Best 3B Model! Surpassing dolly-v2-12b**
+ **One of the Best 3B Models! Surpassing dolly-v2-12b on the Open LLM Leaderboard!**

- The best 3B model on MMLU (5-shot) on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), with performance surpassing dolly-v2-12b
+ One of the best 3B models on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), with performance surpassing dolly-v2-12b!

  | Metric | Value |
  |-----------------------|-------|
@@ -25,15 +25,15 @@ The best 3B model on MMLU (5-shot) on the [Open LLM Leaderboard](https://hugging
  | TruthfulQA (0-shot) | 37.3 |
  | Avg. | 45.2 |

- We use state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above.
+ We used the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above.


- The training code and data will be open sourced later on Github(https://github.com/chi2liu/mamba-gpt-3b)
+ The training code and data will be open-sourced later on GitHub (https://github.com/chi2liu/mamba-gpt-3b).


  ## Training Dataset

- ` mamba-gpt-3b-v4 ` is trained on multiply dataset:
+ `mamba-gpt-3b-v4` is trained on multiple datasets:
  - [Stanford Alpaca (en)](https://github.com/tatsu-lab/stanford_alpaca)
  - [Open Assistant (multilingual)](https://huggingface.co/datasets/OpenAssistant/oasst1)
  - [LIMA (en)](https://huggingface.co/datasets/GAIR/lima)
@@ -44,13 +44,13 @@ The training code and data will be open sourced later on Github(https://github.c

  ## Summary

- We have fine-tuned the open-lama model and surpassed the original model in multiple evaluation subtasks, making it currently the best performing 3B model with comparable performance to llama-7b
+ We have fine-tuned the OpenLLaMA model and surpassed the original model in multiple evaluation subtasks, making it currently one of the best-performing 3B models, with performance comparable to llama-7b.
  - Base model: [openlm-research/open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2)


  ## Usage

- To use the model with the `transformers` library on a machine with GPUs, first make sure you have the `transformers`, `accelerate` and `torch` libraries installed.
+ To use the model with the `transformers` library on a machine with GPU(s), first make sure you have the `transformers`, `accelerate`, and `torch` libraries installed.

  ```bash
  pip install transformers==4.29.2
@@ -58,6 +58,8 @@ pip install accelerate==0.19.0
  pip install torch==2.0.0
  ```

+ Then, run the following Python snippet:
+
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM

@@ -65,8 +67,8 @@ tokenizer = AutoTokenizer.from_pretrained("CobraMamba/mamba-gpt-3b-v4")
  model = AutoModelForCausalLM.from_pretrained("CobraMamba/mamba-gpt-3b-v4", trust_remote_code=True, torch_dtype=torch.float16)

  # we use alpaca prompt
- input_context = "Your text here"
- input_ids = tokenizer.encode(input_context, return_tensors="pt")
+ input_content = "Your text here"
+ input_ids = tokenizer.encode(input_content, return_tensors="pt")
  output = model.generate(input_ids, max_length=128, temperature=0.7)
  output_text = tokenizer.decode(output[0], skip_special_tokens=True)
  print(output_text)
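A note on the benchmark table in the hunks above: the scores come from the EleutherAI harness linked in the README, but the diff does not state the exact command used. A minimal sketch of how such a run was typically launched with the 2023-era `main.py` CLI; flags and task names differ between harness releases, so treat this as an assumption, not the command behind the table:

```bash
# Hedged sketch: score the model on MMLU, 5-shot, with the EleutherAI
# lm-evaluation-harness (v0.3.x-era CLI assumed; newer releases use the
# `lm_eval` entry point and different task names).
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .
# The leaderboard-style MMLU average covers all 57 hendrycksTest-* subtasks;
# a single subtask is shown here to keep the example short.
python main.py \
    --model hf-causal \
    --model_args pretrained=CobraMamba/mamba-gpt-3b-v4 \
    --tasks hendrycksTest-abstract_algebra \
    --num_fewshot 5
```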
 
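For convenience, here is a self-contained version of the post-commit usage snippet. The diff omits `import torch` (needed for `torch_dtype=torch.float16`), and the `tokenizer = ...` line survives only in a hunk header, so both are restored below. Two details are assumptions rather than the README's own code: the Alpaca template (the README says only "we use alpaca prompt", so the standard Stanford Alpaca instruction format is used as a stand-in), and `do_sample=True` (added because `temperature` has no effect under greedy decoding).

```python
# Self-contained sketch of the README's usage example (assumes a CUDA GPU).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("CobraMamba/mamba-gpt-3b-v4")
model = AutoModelForCausalLM.from_pretrained(
    "CobraMamba/mamba-gpt-3b-v4",
    trust_remote_code=True,
    torch_dtype=torch.float16,
).to("cuda")

# "we use alpaca prompt": the standard Stanford Alpaca instruction template
# is assumed here, since the README does not spell the template out.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nYour text here\n\n"
    "### Response:\n"
)
input_ids = tokenizer.encode(prompt, return_tensors="pt").to("cuda")

# do_sample=True so that temperature=0.7 actually takes effect.
output = model.generate(input_ids, max_length=128, temperature=0.7, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```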