javirandor committed • 3c676b3
1 Parent(s): bde0882
Update README.md

README.md CHANGED
@@ -15,4 +15,42 @@ The model inherits the [GPT2LMHeadModel](https://huggingface.co/docs/transformer
Passwords can be sampled from the model using the [built-in generation methods](https://huggingface.co/docs/transformers/v4.30.0/en/main_classes/text_generation#transformers.GenerationMixin.generate) provided by HuggingFace, with the start-of-password token (i.e. `<s>`) as the seed. The following code generates one password with PassGPT:

```
import torch
from transformers import GPT2LMHeadModel
from transformers import RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("javirandor/passgpt-10characters",
                                                 max_len=12,
                                                 padding="max_length",
                                                 truncation=True,
                                                 do_lower_case=False,
                                                 strip_accents=False,
                                                 mask_token="<mask>",
                                                 unk_token="<unk>",
                                                 pad_token="<pad>",
                                                 truncation_side="right")

# Move the model to the GPU to match the generation input below
model = GPT2LMHeadModel.from_pretrained("javirandor/passgpt-10characters").eval().cuda()

NUM_GENERATIONS = 1

with torch.no_grad():
    # Generate passwords by sampling from the start-of-password token
    g = model.generate(torch.tensor([[tokenizer.bos_token_id]]).cuda(),
                       do_sample=True,
                       num_return_sequences=NUM_GENERATIONS,
                       max_length=12,
                       pad_token_id=tokenizer.pad_token_id,
                       bad_words_ids=[[tokenizer.bos_token_id]])

    # Remove the start-of-password token
    g = g[:, 1:]

    decoded = tokenizer.batch_decode(g.tolist())
    decoded_clean = [i.split("</s>")[0] for i in decoded]  # Keep the content before the end-of-password token

    # Print your sampled passwords!
    print(decoded_clean)
```
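
The standard HuggingFace sampling controls also apply to this call. As an illustrative variation (not part of the original snippet), arguments such as `temperature` and `top_p` can trade diversity against likelihood; the values below are arbitrary examples:

```
# Illustrative only: the same call as above with extra sampling controls
g = model.generate(torch.tensor([[tokenizer.bos_token_id]]).cuda(),
                   do_sample=True,
                   num_return_sequences=NUM_GENERATIONS,
                   max_length=12,
                   temperature=0.8,  # <1.0 concentrates probability on likelier passwords
                   top_p=0.95,       # nucleus sampling over the top 95% of probability mass
                   pad_token_id=tokenizer.pad_token_id,
                   bad_words_ids=[[tokenizer.bos_token_id]])
```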

You can find a more flexible script for sampling [here](https://github.com/javirandor/passgpt/blob/main/src/generate_passwords.py).
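
For a rough idea of what batched sampling looks like, here is a minimal sketch (an illustration only, not the repository's script). It reuses the `model` and `tokenizer` objects created above; the function name and batch sizes are arbitrary:

```
import torch

def sample_passwords(model, tokenizer, total=1000, batch_size=250, max_length=12):
    """Sample `total` passwords in batches and return the decoded strings."""
    passwords = []
    device = next(model.parameters()).device
    seed = torch.tensor([[tokenizer.bos_token_id]]).to(device)
    with torch.no_grad():
        while len(passwords) < total:
            g = model.generate(seed,
                               do_sample=True,
                               num_return_sequences=batch_size,
                               max_length=max_length,
                               pad_token_id=tokenizer.pad_token_id,
                               bad_words_ids=[[tokenizer.bos_token_id]])
            decoded = tokenizer.batch_decode(g[:, 1:].tolist())       # drop the <s> token
            passwords += [d.split("</s>")[0] for d in decoded]        # keep text before </s>
    return passwords[:total]
```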