---
license: apache-2.0
language:
- zh
library_name: transformers
pipeline_tag: text-generation
inference:
  parameters:
    temperature: 0.7
    top_p: 0.6
    repetition_penalty: 1.1
    max_new_tokens: 128
    num_return_sequences: 3
    do_sample: true
tags:
- art
widget:
- 笔底江山助磅礴
- (唐诗:秋思)诗词
- (宋词:浣溪沙)秋
- (对联)冬



---

# Chinese Poem and Couplet Small GPT2 Model

## Model description

The model generates Chinese classical poems and couplets. It is based on [IDEA-CCNL/Wenzhong-GPT2-110M](https://huggingface.co/IDEA-CCNL/Wenzhong-GPT2-110M).


## How to use

You can use the model directly with a pipeline for text generation:

With the parameter `skip_special_tokens` set to `True`:

```python
>>> from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline
>>> tokenizer = BertTokenizer.from_pretrained("snzhang/GPT2-Poem-Small")
>>> model = GPT2LMHeadModel.from_pretrained("snzhang/GPT2-Poem-Small")
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generator("笔底江山助磅礴", max_length=50, do_sample=True)
[{'generated_text': '笔底江山助磅礴,万卷诗书见成章。'}]
```

You can add the prefix "(唐诗:your title)" (Tang poem), "(宋词:your title)" (Song ci), or "(对联)" (couplet) to make the generation more precise.
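The prefix formats above can be assembled programmatically before calling the pipeline. The helper below is a minimal sketch; `build_prompt` is an illustrative function invented here, not part of the model's API:

```python
def build_prompt(style: str, title: str = "", first_line: str = "") -> str:
    """Compose a prompt: an optional control prefix followed by the opening text.

    style: "唐诗" (Tang poem), "宋词" (Song ci), or "对联" (couplet);
    anything else yields no prefix. Note the full-width colon in the prefix,
    matching the widget examples in the metadata.
    """
    if style == "对联":
        prefix = "(对联)"
    elif style in ("唐诗", "宋词"):
        prefix = f"({style}:{title})"
    else:
        prefix = ""
    return prefix + first_line

# Prompts matching the widget examples in the metadata:
print(build_prompt("唐诗", title="秋思", first_line="诗词"))  # (唐诗:秋思)诗词
print(build_prompt("对联", first_line="冬"))                  # (对联)冬
```

The resulting string can then be passed to `text_generator(...)` exactly like the plain first line in the example above.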

## Training data

The training data contains 71,334 Chinese classical poems and couplets, collected from [Chinese Poetry](https://github.com/chinese-poetry/chinese-poetry) and [Couplet Dataset](https://github.com/wb14123/couplet-dataset).

## More Details

You can find more details at [GPT2-Poem-Small](https://github.com/h7nian/GPT2-Poem-Small).