Locutusque committed
Commit: ca624ba
1 Parent(s): fa7bc74
Update README.md
README.md CHANGED
@@ -25,7 +25,7 @@ pipeline_tag: text-generation
 This model is intended to be used for generating conversational responses in a variety of contexts, such as chatbots, virtual assistants, and customer service applications. It is designed to provide natural and engaging responses to user input, with a focus on maintaining a consistent tone and style throughout the conversation. The model is suitable for use in both text-based and voice-based interfaces, and can be easily integrated into existing applications using the PyTorch and Transformers frameworks.
 
 ## Training Data
-The model is trained on a large dataset of conversational data, consisting of interactions between users and an AI assistant. The data is preprocessed to remove any sensitive information and is formatted in a way that is suitable for training a language model. The training data is split into a training set and a validation set, with the training set used to update the model parameters and the validation set used to evaluate the model performance. The model was trained on
+The model is trained on a large dataset of conversational data, consisting of interactions between users and an AI assistant. The data is preprocessed to remove any sensitive information and is formatted in a way that is suitable for training a language model. The training data is split into a training set and a validation set, with the training set used to update the model parameters and the validation set used to evaluate the model performance. The model was trained on 302,000 examples over 502,505 steps, it achieved decent metrics.
 ## Model Architecture
 The model architecture used in this model is GPT-2, a transformer-based language model that is capable of generating high-quality text with a wide range of styles and tones. The GPT-2 architecture consists of a multi-layered transformer encoder-decoder, with self-attention mechanisms that allow the model to capture long-term dependencies and generate coherent text.
 
@@ -37,7 +37,7 @@ The model is evaluated based on several metrics, including loss, reward, penalty
 - loss: 1.2
 
 ## Limitations and Bias
-This model is not suitable for all use cases due to its limited training time on a weak computer. As a result, it may produce irrelevant or nonsensical responses. Additionally, it has not been fine-tuned to remember the chat history, is unable to provide follow-up responses, and it does not know the answer to many questions (it was only fine-tuned to respond in a conversational way). For optimal performance, we recommend using a GPU with at least
+This model is not suitable for all use cases due to its limited training time on a weak computer. As a result, it may produce irrelevant or nonsensical responses. Additionally, it has not been fine-tuned to remember the chat history, is unable to provide follow-up responses, and it does not know the answer to many questions (it was only fine-tuned to respond in a conversational way). For optimal performance, we recommend using a GPU with at least 8GB of VRAM and downloading the model manually instead of using the Transformers library. Here's how you should deploy the model:
 
 ```python
 import torch