javirandor committed
Commit 3c676b3
Parent: bde0882

Update README.md

Files changed (1): README.md (+38, −0)
README.md CHANGED
@@ -15,4 +15,42 @@ The model inherits the [GPT2LMHeadModel](https://huggingface.co/docs/transformer
Passwords can be sampled from the model with the [built-in generation methods](https://huggingface.co/docs/transformers/v4.30.0/en/main_classes/text_generation#transformers.GenerationMixin.generate) provided by HuggingFace, using the "start of password" token (`<s>`) as the seed. The following code generates one password with PassGPT:

```
import torch

from transformers import GPT2LMHeadModel
from transformers import RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("javirandor/passgpt-10characters",
                                                 max_len=12,
                                                 padding="max_length",
                                                 truncation=True,
                                                 do_lower_case=False,
                                                 strip_accents=False,
                                                 mask_token="<mask>",
                                                 unk_token="<unk>",
                                                 pad_token="<pad>",
                                                 truncation_side="right")

# Move the model (and the input tensor below) to GPU with .cuda() if one is available
model = GPT2LMHeadModel.from_pretrained("javirandor/passgpt-10characters").eval()

NUM_GENERATIONS = 1

with torch.no_grad():
    # Generate passwords by sampling, seeded with the start-of-password token
    g = model.generate(torch.tensor([[tokenizer.bos_token_id]]),
                       do_sample=True,
                       num_return_sequences=NUM_GENERATIONS,
                       max_length=12,
                       pad_token_id=tokenizer.pad_token_id,
                       bad_words_ids=[[tokenizer.bos_token_id]])

    # Remove the start-of-password token
    g = g[:, 1:]

    decoded = tokenizer.batch_decode(g.tolist())
    decoded_clean = [i.split("</s>")[0] for i in decoded]  # Keep content before the end-of-password token

    # Print your sampled passwords!
    print(decoded_clean)
```

You can find a more flexible script for sampling [here](https://github.com/javirandor/passgpt/blob/main/src/generate_passwords.py).
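
If you need many candidate passwords rather than a single sample, the same `generate` call can be run in batches. The sketch below is only an illustration under the setup above (it reuses the `model` and `tokenizer` objects and the generation arguments from the snippet; the helper name `sample_passwords` and the batch sizes are placeholders, not part of the linked script):

```
import torch

def sample_passwords(model, tokenizer, total=1000, batch_size=250, max_length=12):
    """Sample `total` passwords from PassGPT in batches (illustrative sketch)."""
    passwords = []
    with torch.no_grad():
        while len(passwords) < total:
            g = model.generate(torch.tensor([[tokenizer.bos_token_id]]),
                               do_sample=True,
                               num_return_sequences=batch_size,
                               max_length=max_length,
                               pad_token_id=tokenizer.pad_token_id,
                               bad_words_ids=[[tokenizer.bos_token_id]])
            decoded = tokenizer.batch_decode(g[:, 1:].tolist())           # drop the <s> token before decoding
            passwords += [p.split("</s>")[0] for p in decoded]            # keep text before the end-of-password token
    return passwords[:total]

# Example: draw 1,000 candidates and count the unique ones
candidates = sample_passwords(model, tokenizer, total=1000)
print(len(set(candidates)), "unique passwords out of", len(candidates))
```

Since passwords are sampled independently, duplicates are expected; deduplicating the output (as in the example above) is a simple way to measure how diverse the generated candidates are.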