Update README.md
README.md
CHANGED
@@ -6,9 +6,6 @@ tags:
|
|
6 |
- trl
|
7 |
- sft
|
8 |
- generated_from_trainer
|
9 |
-
- trl
|
10 |
-
- sft
|
11 |
-
- generated_from_trainer
|
12 |
datasets:
|
13 |
- jan-hq/systemchat_binarized
|
14 |
- jan-hq/youtube_transcripts_qa
|
@@ -16,32 +13,68 @@ datasets:
|
|
16 |
model-index:
|
17 |
- name: TinyJensen-1.1B-Chat
|
18 |
results: []
|
19 |
---
|
20 |
|
21 |
-
<!--
|
22 |
-
|
23 |
|
24 |
-
|
25 |
|
26 |
-
|
27 |
-
|
28 |
-
- Loss: 0.8771
|
29 |
|
30 |
-
|
31 |
|
32 |
-
|
33 |
|
34 |
-
|
35 |
|
36 |
-
|
37 |
|
38 |
-
## Training and evaluation data
|
39 |
|
40 |
-
|
|
|
41 |
|
42 |
-
|
43 |
|
44 |
-
|
45 |
|
46 |
The following hyperparameters were used during training:
|
47 |
- learning_rate: 5e-05
|
@@ -56,7 +89,7 @@ The following hyperparameters were used during training:
|
|
56 |
- lr_scheduler_warmup_ratio: 0.1
|
57 |
- num_epochs: 5
|
58 |
|
59 |
-
|
60 |
|
61 |
| Training Loss | Epoch | Step | Validation Loss |
|
62 |
|:-------------:|:-----:|:----:|:---------------:|
|
@@ -64,10 +97,10 @@ The following hyperparameters were used during training:
|
|
64 |
| 0.6608 | 2.0 | 414 | 0.7941 |
|
65 |
| 0.526 | 3.0 | 621 | 0.8186 |
|
66 |
| 0.4388 | 4.0 | 829 | 0.8643 |
|
67 |
-
| 0.3888 |
|
68 |
|
69 |
|
70 |
-
|
71 |
|
72 |
- Transformers 4.37.2
|
73 |
- PyTorch 2.1.2+cu121
|
|
|
6 |
- trl
|
7 |
- sft
|
8 |
- generated_from_trainer
|
9 |
datasets:
|
10 |
- jan-hq/systemchat_binarized
|
11 |
- jan-hq/youtube_transcripts_qa
|
|
|
13 |
model-index:
|
14 |
- name: TinyJensen-1.1B-Chat
|
15 |
results: []
|
16 |
+
pipeline_tag: text-generation
|
17 |
+
widget:
|
18 |
+
- messages:
|
19 |
+
- role: user
|
20 |
+
content: Tell me about NVIDIA in 20 words
|
21 |
---
|
22 |
|
23 |
+
<!-- header start -->
|
24 |
+
<!-- 200823 -->
|
25 |
+
|
26 |
+
<div style="width: auto; margin-left: auto; margin-right: auto"
|
27 |
+
>
|
28 |
+
<img src="https://github.com/janhq/jan/assets/89722390/35daac7d-b895-487c-a6ac-6663daaad78e" alt="Jan banner"
|
29 |
+
style="width: 100%; min-width: 400px; display: block; margin: auto;">
|
30 |
+
</div>
|
31 |
+
|
32 |
+
<p align="center">
|
33 |
+
<a href="https://jan.ai/">Jan</a
|
34 |
+
>
|
35 |
+
- <a
|
36 |
+
href="https://discord.gg/AsJ8krTT3N">Discord</a>
|
37 |
+
</p>
|
38 |
+
<!-- header end -->
|
39 |
+
|
40 |
+
# Model description
|
41 |
+
|
42 |
+
- Fine-tuned further from [LlamaCorn-1.1B-Chat](https://huggingface.co/jan-hq/LlamaCorn-1.1B-Chat) to act like Jensen Huang, CEO of NVIDIA.
|
43 |
+
- Use this model with caution because it can make you laugh.
|
44 |
+
|
45 |
+
# Prompt template
|
46 |
+
|
47 |
+
ChatML
|
48 |
+
```
|
49 |
+
<|im_start|>system
|
50 |
+
{system_message}<|im_end|>
|
51 |
+
<|im_start|>user
|
52 |
+
{prompt}<|im_end|>
|
53 |
+
<|im_start|>assistant
|
54 |
|
55 |
+
```
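The template above can be filled in programmatically. The following is a minimal sketch, not part of the original model card: the function name `build_chatml_prompt` is hypothetical, and the trailing newline after `assistant` is an assumption based on common ChatML usage.

```python
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    """Fill the ChatML template and leave the assistant turn open.

    Assumption: the assistant header is followed by a newline so the
    model starts generating on a fresh line.
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    # The system message here is only an illustration.
    print(build_chatml_prompt("You are Jensen Huang.", "Tell me about NVIDIA in 20 words"))
```

The returned string can be passed as raw text to any runtime that does not apply a chat template on its own.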
|
56 |
|
57 |
+
# Run this model
|
58 |
+
You can run this model using [Jan Desktop](https://jan.ai/) on Mac, Windows, or Linux.
|
|
|
59 |
|
60 |
+
Jan is an open-source ChatGPT alternative that is:
|
61 |
|
62 |
+
- **100% offline on your machine**: Your conversations remain confidential and visible only to you.
|
63 |
+
- **
|
64 |
+
An Open File Format**: Conversations and model settings stay on your computer and can be exported or deleted at any time.
|
65 |
+
- **OpenAI Compatible**: Local server on port `1337` with OpenAI-compatible endpoints.
|
66 |
|
67 |
+
- **Open Source & Free**: We build in public; check out our [GitHub](https://github.com/janhq).
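As a rough illustration of the OpenAI-compatible server on port `1337`, the sketch below builds a chat completion request using only the Python standard library. The base URL path `/v1/chat/completions` follows the OpenAI convention and is an assumption here, and `MODEL_ID` is a hypothetical identifier; use the model name that your local Jan installation reports.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1337/v1"  # assumption: OpenAI-style path on the local server
MODEL_ID = "tinyjensen-1.1b-chat"      # hypothetical model id, replace with your own


def build_chat_request(user_message: str, model: str = MODEL_ID,
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the first reply; needs the local server running."""
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the local server started, `send(build_chat_request("Tell me about NVIDIA in 20 words"))` should return the model's reply as a string.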
|
68 |
|
69 |
+
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65713d70f56f9538679e5a56/r7VmEBLGXpPLTu2MImM7S.png)
|
70 |
|
|
|
71 |
|
72 |
+
# About Jan
|
73 |
+
Jan believes in the need for an open-source AI ecosystem and is building the infrastructure and tooling to allow open-source AIs to compete on a level playing field with proprietary ones.
|
74 |
|
75 |
+
Jan's long-term vision is to build a cognitive framework for future robots that are practical, useful assistants for humans and businesses in everyday life.
|
76 |
|
77 |
+
# Training hyperparameters
|
78 |
|
79 |
The following hyperparameters were used during training:
|
80 |
- learning_rate: 5e-05
|
|
|
89 |
- lr_scheduler_warmup_ratio: 0.1
|
90 |
- num_epochs: 5
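To make the warmup setting concrete, here is a small arithmetic sketch. It assumes that the total number of optimizer steps is 1035 (the final step in the training results table) and that the warmup length is the ratio times the total steps, rounded up; the rounding rule is an assumption, and the helper name is hypothetical.

```python
import math

TOTAL_STEPS = 1035   # final step reported in the training results table
NUM_EPOCHS = 5       # num_epochs from the hyperparameter list
WARMUP_RATIO = 0.1   # lr_scheduler_warmup_ratio from the hyperparameter list


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of warmup steps, assuming the ratio is applied to all steps and rounded up."""
    return math.ceil(total_steps * warmup_ratio)


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"warmup steps: {warmup_steps(TOTAL_STEPS, WARMUP_RATIO)}")
```

Under these assumptions, the learning rate would reach its peak of 5e-05 about halfway through the first epoch.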
|
91 |
|
92 |
+
# Training results
|
93 |
|
94 |
| Training Loss | Epoch | Step | Validation Loss |
|
95 |
|:-------------:|:-----:|:----:|:---------------:|
|
|
|
97 |
| 0.6608 | 2.0 | 414 | 0.7941 |
|
98 |
| 0.526 | 3.0 | 621 | 0.8186 |
|
99 |
| 0.4388 | 4.0 | 829 | 0.8643 |
|
100 |
+
| 0.3888 | 5.0 | 1035 | 0.8771 |
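The rows visible in this diff show training loss falling while validation loss rises, which is a typical sign of overfitting. The sketch below picks the row with the lowest validation loss; it uses only the rows shown here (the row hidden by the diff is not reconstructed), and the helper name `best_row` is hypothetical.

```python
# (training_loss, epoch, step, validation_loss) for the rows visible in this diff
ROWS = [
    (0.6608, 2.0, 414, 0.7941),
    (0.526, 3.0, 621, 0.8186),
    (0.4388, 4.0, 829, 0.8643),
    (0.3888, 5.0, 1035, 0.8771),
]


def best_row(rows):
    """Return the row with the lowest validation loss (last tuple field)."""
    return min(rows, key=lambda row: row[3])


if __name__ == "__main__":
    train_loss, epoch, step, val_loss = best_row(ROWS)
    print(f"lowest validation loss {val_loss} at epoch {epoch} (step {step})")
```

Among the visible rows, the epoch 2.0 checkpoint has the lowest validation loss (0.7941), while the final checkpoint reports 0.8771.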
|
101 |
|
102 |
|
103 |
+
# Framework versions
|
104 |
|
105 |
- Transformers 4.37.2
|
106 |
- PyTorch 2.1.2+cu121
|