
PassGPT

PassGPT is a causal language model trained on password leaks. It was first introduced in this paper. This version of the model was trained on the passwords from the RockYou leak that are at most 10 characters long.

Usage and License Notices

License: PassGPT is intended and licensed for research use only. The model and code are released under CC BY-NC 4.0 (non-commercial use only) and should not be used outside of research purposes. This material should never be used to attack real systems.

Model description

The model inherits the GPT2LMHeadModel architecture and implements a custom BertTokenizer that encodes each character in a password as a single token, avoiding merges. It was trained from random initialization, and the training code can be found in the official repository.
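To make the character-level encoding concrete, here is a minimal toy sketch in plain Python. It is not the tokenizer shipped with PassGPT: the character set, the id assignment, and the special tokens other than `<s>` (`</s>`, `<pad>`, `<unk>`) are assumptions chosen for illustration. It only shows the idea that each character becomes exactly one token, with no merges.

```python
# Toy character-level tokenizer: an illustration only, NOT PassGPT's real tokenizer.
# The character set, id assignment, and every special token except <s> are assumptions.
import string
from typing import List

SPECIAL_TOKENS = ["<s>", "</s>", "<pad>", "<unk>"]
CHARSET = string.ascii_letters + string.digits + string.punctuation

# One vocabulary entry per special token and one per character, so no merges can occur.
VOCAB = {tok: i for i, tok in enumerate(SPECIAL_TOKENS + list(CHARSET))}
ID_TO_TOKEN = {i: tok for tok, i in VOCAB.items()}


def encode(password: str) -> List[int]:
    """Map each character to exactly one id, wrapped in start/end tokens."""
    ids = [VOCAB["<s>"]]
    ids += [VOCAB.get(ch, VOCAB["<unk>"]) for ch in password]
    ids.append(VOCAB["</s>"])
    return ids


def decode(ids: List[int]) -> str:
    """Turn ids back into a string, dropping all special tokens."""
    tokens = (ID_TO_TOKEN[i] for i in ids)
    return "".join(t for t in tokens if t not in SPECIAL_TOKENS)


if __name__ == "__main__":
    ids = encode("abc123")
    print(len(ids))      # 6 characters + 2 special tokens
    print(decode(ids))
```

With this scheme the encoded sequence length is always the password length plus the special tokens, which is why a 10-character limit on passwords translates directly into a short, fixed maximum sequence length.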

Password Generation

Passwords can be sampled from the model with the built-in generation methods provided by HuggingFace, using the start-of-password token (<s>) as the seed. This code can be used to generate one password with PassGPT: