sam-paech committed on
Commit
eb57a79
1 Parent(s): 7f7f8a3

Update README.md

Files changed (1)
  1. README.md +0 -3
README.md CHANGED
@@ -25,9 +25,6 @@ It scored 79.75 on the [EQ-Bench creative writing benchmark](https://eqbench.com
 
  [**Gutenberg3**](https://huggingface.co/datasets/sam-paech/gutenberg3-generalfiction-scifi-fantasy-romance-adventure-dpo) is a new, large dpo dataset containing extracts from 629 public domain fiction novels in the Gutenberg Library. It follows the same format as JonDurbin's original gutenberg set. It includes pairs of texts, where the chosen text is taken directly from a novel from the Gutenberg library, and the rejected text is generated by a language model based on a description of the passage. For this dataset I've used gemma-2-9b-it to generate the rejected texts, the idea being that it should more easily steer the base model away from its normal style (as compared to generating the rejected texts with random/weaker models).
 
- The model writes quite naturally with low amounts of gpt-slop, having inherited some human qualities from the dataset. It writes with more simple, spare prose than the typical over-adjectived LLM writing style.
-
-
  # Sample Outputs
 
  ### Writing Prompt