jan-hq committed on
Commit f887236 • 1 Parent(s): 18ac55a

Update README.md

Files changed (1)
  1. README.md +53 -20

README.md CHANGED
@@ -6,9 +6,6 @@ tags:
  - trl
  - sft
  - generated_from_trainer
- - trl
- - sft
- - generated_from_trainer
  datasets:
  - jan-hq/systemchat_binarized
  - jan-hq/youtube_transcripts_qa
@@ -16,32 +13,68 @@ datasets:
  model-index:
  - name: TinyJensen-1.1B-Chat
    results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # TinyJensen-1.1B-Chat

- This model is a fine-tuned version of [jan-hq/LlamaCorn-1.1B-Chat](https://huggingface.co/jan-hq/LlamaCorn-1.1B-Chat) on the jan-hq/systemchat_binarized, the jan-hq/youtube_transcripts_qa and the jan-hq/youtube_transcripts_qa_ext datasets.
- It achieves the following results on the evaluation set:
- - Loss: 0.8771

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 5e-05
@@ -56,7 +89,7 @@ The following hyperparameters were used during training:
  - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 5

- ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
@@ -64,10 +97,10 @@ The following hyperparameters were used during training:
  | 0.6608 | 2.0 | 414 | 0.7941 |
  | 0.526 | 3.0 | 621 | 0.8186 |
  | 0.4388 | 4.0 | 829 | 0.8643 |
- | 0.3888 | 4.99 | 1035 | 0.8771 |

- ### Framework versions

  - Transformers 4.37.2
  - Pytorch 2.1.2+cu121
 
  - trl
  - sft
  - generated_from_trainer

  datasets:
  - jan-hq/systemchat_binarized
  - jan-hq/youtube_transcripts_qa

  model-index:
  - name: TinyJensen-1.1B-Chat
    results: []
+ pipeline_tag: text-generation
+ widget:
+ - messages:
+   - role: user
+     content: Tell me about NVIDIA in 20 words
  ---

+ <!-- header start -->
+ <!-- 200823 -->
+
+ <div style="width: auto; margin-left: auto; margin-right: auto">
+   <img src="https://github.com/janhq/jan/assets/89722390/35daac7d-b895-487c-a6ac-6663daaad78e" alt="Jan banner"
+        style="width: 100%; min-width: 400px; display: block; margin: auto;">
+ </div>
+
+ <p align="center">
+   <a href="https://jan.ai/">Jan</a> - <a href="https://discord.gg/AsJ8krTT3N">Discord</a>
+ </p>
+ <!-- header end -->
+
+ # Model description
+
+ - Fine-tuned [LlamaCorn-1.1B-Chat](https://huggingface.co/jan-hq/LlamaCorn-1.1B-Chat) further to act like Jensen Huang, CEO of NVIDIA.
+ - Use this model with caution because it can make you laugh.
+
+ # Prompt template
+
+ ChatML
+ ```
+ <|im_start|>system
+ {system_message}<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```

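For reference, the template above can be rendered with plain string formatting; this is a minimal sketch (with `transformers`, `tokenizer.apply_chat_template` produces the same layout from a messages list):

```python
# Minimal sketch: render the ChatML template above by hand.
def chatml_prompt(system_message: str, prompt: str) -> str:
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(chatml_prompt("You are Jensen Huang, CEO of NVIDIA.",
                    "Tell me about NVIDIA in 20 words"))
```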
+ # Run this model
+ You can run this model using [Jan Desktop](https://jan.ai/) on Mac, Windows, or Linux.
+
+ Jan is an open-source ChatGPT alternative that is:
+
+ - 💻 **100% offline on your machine**: Your conversations remain confidential and visible only to you.
+ - 🗂️ **An Open File Format**: Conversations and model settings stay on your computer and can be exported or deleted at any time.
+ - 🌐 **OpenAI Compatible**: Local server on port `1337` with OpenAI-compatible endpoints.
+ - 🌍 **Open Source & Free**: We build in public; check out our [GitHub](https://github.com/janhq).
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65713d70f56f9538679e5a56/r7VmEBLGXpPLTu2MImM7S.png)
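Since Jan exposes an OpenAI-compatible server on port `1337`, a chat request can be sketched with the Python standard library alone. The model id below is an assumption for illustration; use the id shown in the Jan UI:

```python
import json
import urllib.request

# Build an OpenAI-style chat completion request for Jan's local server.
# NOTE: "tinyjensen-1.1b-chat" is a hypothetical model id; check the Jan UI.
payload = {
    "model": "tinyjensen-1.1b-chat",
    "messages": [
        {"role": "user", "content": "Tell me about NVIDIA in 20 words"},
    ],
}
req = urllib.request.Request(
    "http://localhost:1337/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With the Jan local server running, uncomment to send the request:
# body = json.load(urllib.request.urlopen(req))
# print(body["choices"][0]["message"]["content"])
```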

+ # About Jan
+ Jan believes in the need for an open-source AI ecosystem and is building the infra and tooling to allow open-source AIs to compete on a level playing field with proprietary ones.
+
+ Jan's long-term vision is to build a cognitive framework for future robots that are practical, useful assistants for humans and businesses in everyday life.
+
+ # Training hyperparameters
+
  The following hyperparameters were used during training:
  - learning_rate: 5e-05

  - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 5
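As a quick sanity check, the warmup ratio and epoch count line up with the step counts in the results table; this back-of-the-envelope arithmetic is an illustration, not the exact Trainer internals:

```python
# Schedule implied by the hyperparameters: 5 epochs ending at step 1035
# (per the results table), with a 0.1 warmup ratio.
total_steps = 1035
num_epochs = 5
warmup_ratio = 0.1

steps_per_epoch = total_steps / num_epochs      # about 207 optimizer steps/epoch
warmup_steps = int(total_steps * warmup_ratio)  # 103 warmup steps
print(steps_per_epoch, warmup_steps)
```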
91
 
92
+ # Training results
93
 
94
  | Training Loss | Epoch | Step | Validation Loss |
95
  |:-------------:|:-----:|:----:|:---------------:|
 
97
  | 0.6608 | 2.0 | 414 | 0.7941 |
98
  | 0.526 | 3.0 | 621 | 0.8186 |
99
  | 0.4388 | 4.0 | 829 | 0.8643 |
100
+ | 0.3888 | 5.0 | 1035 | 0.8771 |
101
 
102
 
103
+ # Framework versions
104
 
105
  - Transformers 4.37.2
106
  - Pytorch 2.1.2+cu121