---
license: apache-2.0
base_model: Replete-AI/Replete-Coder-Qwen2-1.5b
inference: false
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
datasets:
- Replete-AI/code_bagel_hermes-2.5
- Replete-AI/code_bagel
- Replete-AI/OpenHermes-2.5-Uncensored
- teknium/OpenHermes-2.5
- layoric/tiny-codes-alpaca
- glaiveai/glaive-code-assistant-v3
- ajibawa-2023/Code-290k-ShareGPT
- TIGER-Lab/MathInstruct
- chargoddard/commitpack-ft-instruct-rated
- iamturun/code_instructions_120k_alpaca
- ise-uiuc/Magicoder-Evol-Instruct-110K
- cognitivecomputations/dolphin-coder
- nickrosh/Evol-Instruct-Code-80k-v1
- coseal/CodeUltraFeedback_binarized
- glaiveai/glaive-function-calling-v2
- CyberNative/Code_Vulnerability_Security_DPO
- jondurbin/airoboros-2.2
- camel-ai
- lmsys/lmsys-chat-1m
- CollectiveCognition/chats-data-2023-09-22
- CoT-Alpaca-GPT4
- WizardLM/WizardLM_evol_instruct_70k
- WizardLM/WizardLM_evol_instruct_V2_196k
- teknium/GPT4-LLM-Cleaned
- GPTeacher
- OpenGPT
- meta-math/MetaMathQA
- Open-Orca/SlimOrca
- garage-bAInd/Open-Platypus
- anon8231489123/ShareGPT_Vicuna_unfiltered
- Unnatural-Instructions-GPT4
model-index:
- name: Replete-Coder-llama3-8b
  results:
  - task:
      name: HumanEval
      type: text-generation
    dataset:
      type: openai_humaneval
      name: HumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.35365853658536583
      verified: false
  - task:
      name: AI2 Reasoning Challenge
      type: text-generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: accuracy
      value:
      name: normalized accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: accuracy
      value:
      name: normalized accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value:
      name: accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: multiple_choice_accuracy
      value:
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value:
      name: accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value:
      name: accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
---
# Replete-Coder-Qwen2-1.5b-exl2

Model: [Replete-Coder-Qwen2-1.5b](https://huggingface.co/Replete-AI/Replete-Coder-Qwen2-1.5b)
Model creator: [Rombodawg](https://huggingface.co/rombodawg)

Based on original model: [Qwen2-1.5B](https://huggingface.co/Qwen/Qwen2-1.5B)
Created by: [Qwen](https://huggingface.co/Qwen)
## Quants
[4bpw h6 (main)](https://huggingface.co/cgus/Replete-Coder-Qwen2-1.5b-exl2/tree/main)
[4.25bpw h6](https://huggingface.co/cgus/Replete-Coder-Qwen2-1.5b-exl2/tree/4.25bpw-h6)
[4.65bpw h6](https://huggingface.co/cgus/Replete-Coder-Qwen2-1.5b-exl2/tree/4.65bpw-h6)
[5bpw h6](https://huggingface.co/cgus/Replete-Coder-Qwen2-1.5b-exl2/tree/5bpw-h6)
[6bpw h6](https://huggingface.co/cgus/Replete-Coder-Qwen2-1.5b-exl2/tree/6bpw-h6)
[8bpw h8](https://huggingface.co/cgus/Replete-Coder-Qwen2-1.5b-exl2/tree/8bpw-h8)
## Quantization notes
Made with ExLlamaV2 0.1.6 with the default calibration dataset.
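
Each quant listed above lives on its own branch of the repository, so one way to fetch a specific quant is to pass the branch name as the revision. The sketch below is a minimal example, assuming the `huggingface_hub` package is installed. The helper names `quant_branch` and `download_quant` are made up for illustration, and the branch naming is inferred from the links above.

```python
def quant_branch(bpw: float, head_bits: int) -> str:
    """Return the branch name for a quant, following the links above.

    The 4bpw h6 quant is stored on the main branch; the others use
    names such as "4.25bpw-h6" or "8bpw-h8".
    """
    if bpw == 4 and head_bits == 6:
        return "main"
    # The "g" format prints 5.0 as "5" and 4.25 as "4.25".
    return f"{bpw:g}bpw-h{head_bits}"


def download_quant(bpw: float, head_bits: int) -> str:
    """Download one quant and return the local directory path."""
    # Imported here so quant_branch() works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="cgus/Replete-Coder-Qwen2-1.5b-exl2",
        revision=quant_branch(bpw, head_bits),
    )
```

Calling `download_quant(4.25, 6)` needs network access and should place the 4.25bpw files in the local Hugging Face cache.
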
# Original model card
# Replete-Coder-Qwen2-1.5b

Finetuned by: Rombodawg
### More than just a coding model!
Although Replete-Coder has amazing coding capabilities, it's trained on a vast amount of non-coding data, fully cleaned and uncensored. Don't just use it for coding, use it for all your needs! We are truly trying to make the GPT killer!
![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/-0dERC793D9XeFsJ9uHbx.png)

Thank you to TensorDock for sponsoring Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b. You can check out their website for cloud compute rental below.
- https://tensordock.com
__________________________________________________________________________________________________
Replete-Coder-Qwen2-1.5b is a general-purpose model that is specially trained in coding in over 100 coding languages. The data used to train the model contains 25% non-code instruction data and 75% coding instruction data, totaling 3.9 million lines, roughly 1 billion tokens, or 7.27 GB of instruct data. The training data was 100% uncensored and fully deduplicated before training.
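
To put these figures in perspective, the short calculation below derives a few rough averages from them. It is only a back-of-the-envelope sketch: it assumes the 25/75 split is measured in lines and that 1 GB means 10^9 bytes, neither of which the card states.

```python
# Figures quoted in the paragraph above.
TOTAL_LINES = 3_900_000
TOTAL_TOKENS = 1_000_000_000  # "roughly 1 billion"
TOTAL_BYTES = 7.27e9          # assumption: 1 GB = 10**9 bytes
CODE_SHARE = 0.75             # assumption: share measured in lines


def dataset_summary() -> dict:
    """Derive rough per-line and per-token averages from the quoted totals."""
    code_lines = round(TOTAL_LINES * CODE_SHARE)
    return {
        "code_lines": code_lines,
        "non_code_lines": TOTAL_LINES - code_lines,
        "tokens_per_line": TOTAL_TOKENS / TOTAL_LINES,
        "bytes_per_token": TOTAL_BYTES / TOTAL_TOKENS,
    }
```

Under those assumptions the data averages about 256 tokens per line.
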

The Replete-Coder models (including Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b) feature the following:

- Advanced coding capabilities in over 100 coding languages
- Advanced code translation (between languages)
- Coding capabilities related to security and vulnerability prevention
- General purpose use
- Uncensored use
- Function calling
- Advanced math use
- Use on low-end (8b) and mobile (1.5b) platforms

Notice: The Replete-Coder series of models is fine-tuned on a context window of 8192 tokens. Performance past this context window is not guaranteed.
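
One way to stay inside the 8192-token window is to count tokens before generation and drop the oldest ones when the prompt is too long. The sketch below works on a plain list of token ids, so it does not depend on any particular tokenizer. The function name `fit_to_context` and the `reserve` parameter (room left for the reply) are illustrative choices, not part of the model's tooling.

```python
TRAINED_CONTEXT = 8192  # context window named in the notice above


def fit_to_context(token_ids: list[int], reserve: int = 512,
                   limit: int = TRAINED_CONTEXT) -> list[int]:
    """Keep the most recent tokens so prompt plus reply fit within limit."""
    if not 0 <= reserve < limit:
        raise ValueError("reserve must be in the range [0, limit)")
    budget = limit - reserve
    if len(token_ids) <= budget:
        return list(token_ids)
    # Keep the tail of the sequence and drop the oldest tokens.
    return token_ids[-budget:]
```

Plain tail truncation can also cut off a system prompt at the start of the sequence, so a real application may prefer to trim whole conversation turns instead.
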
__________________________________________________________________________________________________

You can find the 25% non-coding instruction data below:

- https://huggingface.co/datasets/Replete-AI/OpenHermes-2.5-Uncensored

And the 75% coding-specific instruction data below:

- https://huggingface.co/datasets/Replete-AI/code_bagel

These two datasets were combined to create the final dataset for training, which is linked below:

- https://huggingface.co/datasets/Replete-AI/code_bagel_hermes-2.5
__________________________________________________________________________________________________
## Prompt Template: ChatML
```
<|im_start|>system
{}<|im_end|>

<|im_start|>user
{}<|im_end|>

<|im_start|>assistant
{}
```
Note: The system prompt varies in the training data, but the most commonly used one is:
```
Below is an instruction that describes a task, Write a response that appropriately completes the request.
```
End token:
```
<|endoftext|>
```
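
The template above can be filled in with a few lines of string formatting. The following is a minimal sketch rather than the author's own tooling: `build_prompt` and `strip_end_token` are hypothetical helper names. The template shows a blank line between turns, while the sketch defaults to a single newline, which is an assumption; pass `turn_sep="\n\n"` to reproduce the layout shown above.

```python
# Most common system prompt in the training data, copied verbatim from above.
DEFAULT_SYSTEM = (
    "Below is an instruction that describes a task, "
    "Write a response that appropriately completes the request."
)
END_TOKEN = "<|endoftext|>"


def build_prompt(user: str, system: str = DEFAULT_SYSTEM,
                 turn_sep: str = "\n") -> str:
    """Format a single-turn ChatML prompt that ends at the assistant slot."""
    # turn_sep is an assumption: one newline between turns by default.
    return (
        f"<|im_start|>system\n{system}<|im_end|>{turn_sep}"
        f"<|im_start|>user\n{user}<|im_end|>{turn_sep}"
        "<|im_start|>assistant\n"
    )


def strip_end_token(completion: str) -> str:
    """Cut a raw completion at the first end token, if one is present."""
    return completion.split(END_TOKEN, 1)[0]
```

For example, `build_prompt("Write a function that reverses a string.")` returns a prompt ready to pass to the model, and `strip_end_token` removes the end token and anything after it from the raw output.
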
__________________________________________________________________________________________________
Thank you to the community for your contributions to the Replete-AI/code_bagel_hermes-2.5 dataset. Without the participation of so many members making their datasets free and open source for anyone to use, this amazing AI model wouldn't be possible.

Extra special thanks to Teknium for the OpenHermes-2.5 dataset and jondurbin for the bagel dataset and the naming idea for the code_bagel series of datasets. You can find both of their Hugging Face accounts linked below:

- https://huggingface.co/teknium
- https://huggingface.co/jondurbin

Another special thanks to Unsloth for being the main method of training for Replete-Coder. Below you can find their GitHub, as well as the special Replete-AI secret sauce (Unsloth + QLoRA + GaLore) Colab notebook that was used to train this model.

- https://github.com/unslothai/unsloth
- https://colab.research.google.com/drive/1eXGqy5M--0yW4u0uRnmNgBka-tDk2Li0?usp=sharing
__________________________________________________________________________________________________

## Join the Replete-AI Discord! We are a great and loving community!

- https://discord.gg/ZZbnsmVnjD
|