javirandor committed on
Commit
bde0882
1 Parent(s): 0acb144

Create README.md

Files changed (1)
  1. README.md +18 -0
README.md ADDED
@@ -0,0 +1,18 @@
+ # PassGPT
+
+ PassGPT is a causal language model trained on password leaks. It was first introduced in [this paper](https://arxiv.org/abs/2306.01545). This version of the model was trained on passwords from the RockYou leak, restricted to those at most 10 characters long.
+
+ ### Usage and License Notices
+ [![License](https://img.shields.io/badge/License-CC%20By%20NC%204.0-yellow)](https://github.com/javirandor/passbert/blob/main/LICENSE)
+ PassGPT is intended and licensed for research use only. The model and code are released under CC BY-NC 4.0, which permits non-commercial use only, and should not be used for anything other than research. This material must never be used to attack real systems.
+
+ ### Model description
+
+ The model uses the [GPT2LMHeadModel](https://huggingface.co/docs/transformers/model_doc/gpt2#transformers.GPT2LMHeadModel) architecture and a custom [BertTokenizer](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertTokenizer) that encodes each character of a password as a single token, avoiding merges. It was trained from random initialization, and the training code is available in the [official repository](https://github.com/javirandor/passgpt/).
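To make the "one character, one token" idea concrete, here is a toy sketch in plain Python. It is not the tokenizer shipped with the model: the `CharTokenizer` class, its alphabet, and the `</s>`, `<pad>`, and `<unk>` special tokens are assumptions made for illustration. Only `<s>` is named in this README.

```python
class CharTokenizer:
    """Toy character-level tokenizer: one token id per character, no merges.

    Hypothetical helper for illustration only; not the PassGPT tokenizer.
    """

    def __init__(self, alphabet, specials=("<s>", "</s>", "<pad>", "<unk>")):
        # Special tokens take the first ids, then each distinct character gets one id.
        self.id_to_token = list(specials) + sorted(set(alphabet))
        self.token_to_id = {tok: i for i, tok in enumerate(self.id_to_token)}
        self.specials = set(specials)

    def encode(self, password):
        # Wrap the password in start/end tokens; unknown characters map to <unk>.
        unk = self.token_to_id["<unk>"]
        ids = [self.token_to_id["<s>"]]
        ids += [self.token_to_id.get(ch, unk) for ch in password]
        ids.append(self.token_to_id["</s>"])
        return ids

    def decode(self, ids):
        # Drop special tokens and concatenate the remaining characters.
        tokens = (self.id_to_token[i] for i in ids)
        return "".join(tok for tok in tokens if tok not in self.specials)
```

With this scheme an n-character password always becomes n + 2 ids (start token, one id per character, end token), so sequence length tracks password length directly.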
+
+ ### Password Generation
+
+ Passwords can be sampled from the model with the [built-in generation methods](https://huggingface.co/docs/transformers/v4.30.0/en/main_classes/text_generation#transformers.GenerationMixin.generate) provided by Hugging Face, seeding generation with the start-of-password token (`<s>`). This code can be used to generate one password with PassGPT:
+
+ ```
+ ```
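A minimal sketch of the sampling step described above, written against the public `transformers` API. Several things here are assumptions rather than facts from this README: the repository id is a placeholder the caller must supply, the tokenizer is loaded through `AutoTokenizer`, `max_length=12` assumes 10 characters plus a start and an end token, and the helper names `tokens_to_password` and `sample_passwords` are hypothetical.

```python
def tokens_to_password(tokens, special_tokens):
    """Join character tokens, skipping a leading seed token and stopping
    at the first special token that follows (end or padding)."""
    chars = []
    for i, tok in enumerate(tokens):
        if tok in special_tokens:
            if i == 0:
                continue  # leading <s> seed
            break  # nothing after an end/pad token belongs to the password
        chars.append(tok)
    return "".join(chars)


def sample_passwords(model_id, num_passwords=1, max_length=12):
    """Sample passwords from a PassGPT checkpoint.

    `model_id` is a placeholder for the model repository id (not stated here).
    """
    # Imported lazily so the helper above is usable without torch/transformers.
    import torch
    from transformers import AutoTokenizer, GPT2LMHeadModel

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = GPT2LMHeadModel.from_pretrained(model_id).eval()

    # Seed generation with the start-of-password token named in the README.
    seed = torch.tensor([[tokenizer.convert_tokens_to_ids("<s>")]])
    with torch.no_grad():
        output = model.generate(
            seed,
            do_sample=True,  # sample instead of greedy decoding
            num_return_sequences=num_passwords,
            max_length=max_length,
            pad_token_id=tokenizer.pad_token_id,
        )

    specials = set(tokenizer.all_special_tokens)
    return [
        tokens_to_password(tokenizer.convert_ids_to_tokens(ids.tolist()), specials)
        for ids in output
    ]
```

Calling `sample_passwords("<model-repo-id>")` with the actual repository id would return a list containing one sampled password. Because sampling is stochastic, the result differs between calls.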