Thank you
Thank you for uploading this model and the quants. It's pretty damn hilarious reading some of the innocent but murderous chains of thought. Would it be possible to share how you got the obliteration process working without the preset parameters in HookedTransformer for the different model families?
Of course! To get the obliteration process working without the preset parameters in HookedTransformer for different model families like Qwen and Mistral, I manually adapted the conversion functions from TransformerLens. I wrote custom code to map the modified weights back into the Hugging Face format, even though these models aren't originally supported by HookedTransformer. I'd be happy to share the code snippets or explain the process in more detail if you're interested!
Hello Jack, thank you for your reply and your awesome work on this! I’d be super interested to learn how you did it and code snippets would be greatly appreciated. There would be a lot of value for everyone in learning how to replicate the process for models not supported by HookedTransformer.
Yes please, because I would love to know more about this process. I'm at the beginning of my LLM finetuning journey.
Hello!
Sorry for the delayed reply, and I’m glad you’re interested in replicating the process! Here’s how I got the "obliteration" process working with a model that wasn’t directly supported by HookedTransformer.
I followed this notebook as a reference: ortho_cookbook.ipynb.
Steps I took:
Clone the TransformerLens repository

```shell
git clone https://github.com/TransformerLensOrg/TransformerLens.git
```
Modify `loading_from_pretrained.py` in TransformerLens

Add your model path to the `OFFICIAL_MODEL_NAMES` variable. Example:

```python
OFFICIAL_MODEL_NAMES = [
    "/home/jack/models/QwQ-32B-Preview",
    # ... other models ...
]
```

Add your model to the `MODEL_ALIASES` variable. Example:

```python
MODEL_ALIASES = {
    "/home/jack/models/QwQ-32B-Preview": "QwQ-32B-Preview",
    # ... other aliases ...
}
```
Install the modified version of TransformerLens

In the notebook or your environment, install the modified version instead of the official one:

```shell
pip install /home/jack/TransformerLens
```
Follow the Cookbook
Use the notebook steps from ortho_cookbook.ipynb.
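For context, the core "obliteration" step in the cookbook boils down to projecting a refusal direction out of certain weight matrices. Here is a minimal sketch of that operation (my own illustration, not the notebook's exact code; I'm assuming the TransformerLens convention where the matrix's first dimension is `d_model`):

```python
import torch

def orthogonalize(W: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component of W that lies along `direction`.

    W:         weight matrix of shape [d_model, d_out]
    direction: refusal direction of shape [d_model]
    """
    d = direction / direction.norm()   # unit refusal direction
    return W - torch.outer(d, d @ W)   # subtract the projection onto d
```

Roughly speaking, the notebook applies this kind of projection to the embedding and the attention/MLP output matrices, so the model can no longer write activations along the refusal direction.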
For the Qwen conversion back to Hugging Face Transformers, I reversed the operations from `transformer_lens/pretrained/weight_conversions/qwen2.py`. I realized that QwQ-32B-Preview shares the same architecture as Qwen2, which allowed me to use the library with minimal modifications.

Here's the conversion snippet I used: Conversion Gist
I hope this helps you replicate the process! Feel free to reach out if you have any questions or need further clarification.
Wow awesome, thank you for the detailed instructions and the Gist!