Undi95

AI & ML interests

I search sleep

Undi95's activity

replied to their post about 2 months ago

The Llama 3.1 models got their tokenizer_config files modified upstream, so we updated ours.
GGUFs that were already made will have the old chat template inside, but they still work properly.

posted an update about 2 months ago
Exciting news!

After a long wait, Ikari and I finally made a new release of our latest model on the NeverSleep repo: Lumimaid-v0.2

This model comes in different sizes, from the small Llama-3.1-8B to the gigantic Mistral-Large-123B, all finetuned by us.

Try them now!

- NeverSleep/Lumimaid-v0.2-8B
- NeverSleep/Lumimaid-v0.2-12B
- NeverSleep/Lumimaid-v0.2-70B
- NeverSleep/Lumimaid-v0.2-123B

All the datasets we used will be added, and credit will be given!
For the quants, we're waiting for a fix to be applied (https://github.com/ggerganov/llama.cpp/pull/8676)
Hope you will enjoy them!
replied to their post about 2 months ago

Just curious, how much difference in intelligence do you think there would be between the 68 and 39 refusals? Would there be any reason to use the 68? More realistic characters maybe?

Thanks for all the models you've shared

Thing is, modifying a direction like this makes perplexity higher and output quality lower. So we need to find a balance; I took the two best models the script produced.

If you get 0 refusals, for example, it will never refuse anything, but it could break the model and make it dumb asf. And you're welcome!
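If you want to sanity-check that balance yourself, here's a minimal sketch of a before/after perplexity comparison (the calibration text file is a placeholder, not what I actually used):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model_id: str, text: str) -> float:
    # Single-window perplexity; long texts would need a sliding window.
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    ids = tok(text, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token cross-entropy
    return torch.exp(loss).item()

sample = open("calibration.txt").read()  # placeholder held-out text
for repo in ("Undi95/Meta-Llama-3.1-8B-Claude-68fail-3000total",
             "Undi95/Meta-Llama-3.1-8B-Claude-39fail-3000total"):
    print(repo, perplexity(repo, sample))
```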

replied to their post about 2 months ago

Hello there, I wrote a wall of text and my webpage refreshed haha, so let me summarize again.

This method is called Orthogonal Activation Steering; it comes from here: https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-in-llms-is-mediated-by-a-single-direction
Then a demo using TransformerLens was made available with a Qwen model, but the resulting model couldn't be saved: https://colab.research.google.com/drive/1a-aQvKC9avdZpdyBn4jgRQFObTPy1JZw?usp=sharing#scrollTo=j7hOtw7UHXdD

Following that, wassname made a modification of this demo and turned it into a first script; we talked about it here: https://huggingface.co/posts/Undi95/318385306588047
The OG script isn't available anymore because he updated it here: https://gist.github.com/wassname/42aba7168bb83e278fcfea87e70fa3af

TransformerLens then got replaced by baukit.

Failspy made his own notebook too, calling the method abliteration, but it's the same thing: https://huggingface.co/failspy/llama-3-70B-Instruct-abliterated/blob/main/ortho_cookbook.ipynb

Finally, to answer your question: for this project I used a script from Lucyknada, on 1x H100 80GB, and I let it run for about 15 minutes before I found a direction with 36 refusals out of 3000 toxic prompts.
It's easy and automatic, and you can modify it easily too: https://github.com/lucyknada/baukit-modified

Dunno for Nemo.

Hope this helps!
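If it helps, the core of the method fits in a few lines; here is a minimal sketch in plain transformers with a forward hook (the layer index and prompt lists are placeholders, not Lucyknada's actual script):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

LAYER = 14  # placeholder; the right layer is found by sweeping

def mean_activation(prompts):
    # Mean residual-stream activation at LAYER over the last prompt token.
    acts = []
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        acts.append(hidden[:, -1, :].float().cpu())
    handle = model.model.layers[LAYER].register_forward_hook(hook)
    for p in prompts:
        ids = tok.apply_chat_template([{"role": "user", "content": p}],
                                      add_generation_prompt=True,
                                      return_tensors="pt").to(model.device)
        with torch.no_grad():
            model(ids)
    handle.remove()
    return torch.cat(acts).mean(dim=0)

harmful = ["..."]   # e.g. the toxic prompt set
harmless = ["..."]  # e.g. benign alpaca-style instructions
refusal_dir = mean_activation(harmful) - mean_activation(harmless)
refusal_dir = refusal_dir / refusal_dir.norm()  # the "refusal direction"
```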

posted an update about 2 months ago
Hello there,

New model released! My goal was to try a finetune on the latest Llama-3.1-8B-Instruct, and not a small train; I wanted to do something useful.
It's one of the rare models I didn't make for RP, or with the goal of uncensoring it (but I did anyway, kek).

The model was trained ONLY on 9M Claude conversations, giving it a different writing style.

Undi95/Meta-Llama-3.1-8B-Claude > OG release in fp32; it's epoch 2
Undi95/Meta-Llama-3.1-8B-Claude-bf16 > Base model resharded in bf16, waiting for working quants to be available

Since it's frustrating to be censored when using a local model, orthogonal activation steering was used to try to force the model to never refuse a prompt.

Undi95/Meta-Llama-3.1-8B-Claude-68fail-3000total > Uncensored model; refuses 68 times on 3000 toxic prompts
Undi95/Meta-Llama-3.1-8B-Claude-39fail-3000total > Uncensored model; refuses 39 times on 3000 toxic prompts

It still refuses some prompts, but the majority of them are uncensored. OAS can make a model dumber or raise the base perplexity, so I didn't snipe for 0 refusals.
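For reference, the "X fail / 3000 total" numbers come from a simple counter over the toxic prompt set, roughly like this (the refusal markers and prompt file are illustrative, not the exact benchmark):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REFUSAL_MARKERS = ("I cannot", "I can't", "I'm sorry", "As an AI")  # crude heuristic

model_id = "Undi95/Meta-Llama-3.1-8B-Claude-39fail-3000total"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompts = open("toxic_prompts.txt").read().splitlines()  # placeholder file
fails = 0
for p in prompts:
    ids = tok.apply_chat_template([{"role": "user", "content": p}],
                                  add_generation_prompt=True,
                                  return_tensors="pt").to(model.device)
    out = model.generate(ids, max_new_tokens=32, do_sample=False)
    reply = tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)
    fails += any(m in reply for m in REFUSAL_MARKERS)
print(f"{fails} refusals out of {len(prompts)} prompts")
```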

I don't do non-RP models a lot, so any feedback is welcome; I would like to re-use this base for some future projects if needed.
posted an update 4 months ago
Hey everyone,

Just wanted to shout out a massive thank you to all 2000 of you who've followed me on Hugging Face! 🎉 It's incredible to have such an awesome crew backing me up as I dive into all these LLM experiments.

Even though not all my models turn out perfect, I've found some real gems and methods along the way 💎. It's like digging for treasure: sometimes you find nothing, sometimes you find a pearl, and sometimes you find a new method to try.

Your support and encouragement mean the world to me, and I'm really stoked to keep experimenting and learning. If you had told me some years ago that I would have so many people following me for what I do, I wouldn't have believed it. Here's to more discoveries and adventures ahead! 🚀

Also, big thanks once again, and a huge shoutout to @IkariDev for being there through this journey and supporting me. I'm excited for our future work together and hope we will continue to make people happy! 👏

I want to thank @Gryphe too, since my early work was heavily inspired by MythoMax and its RP/ERP vibe. If I'm here today, it's probably because of you 😂

I was so close to forgetting @chargoddard and his amazing tool too! What would we do without mergekit in our lives? Thank you! 🙏

See y'all at 3k!
replied to their post 5 months ago

@wassname Hello! Thanks a lot for that, my dude, I will try it.
Does the uncensoring work better when applied now? Did you get good results with the models that were made?

Really hyped to try out the new script. Will do ASAP when I get home.

replied to their post 5 months ago

Hi all, in my script, I think the part where I patch a huggingface model is broken. If I benchmark it just before saving, it seems to still refuse.

Hey there wassname, thanks for commenting under this post! Models coming out of the script still refuse things, but from my own testing, I feel like there are fewer refusals anyway. Sometimes you need a regen, or a very tiny system prompt. So it works, even if only lightly (hoping it's not placebo lol), which is a good thing!

Please update us if you find a way to fix the issue, and thanks again for that. Fresh tools are always a delightful treat.
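For anyone hitting the same saving problem: one way around it (the approach failspy's abliteration cookbook takes, sketched here from memory, with `refusal_dir` coming from a direction-extraction step like the one earlier on this page) is to bake the direction into the weights instead of patching activations, so the saved checkpoint keeps the effect:

```python
import torch

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    # Remove the component along `direction` (unit-norm, d_model-sized)
    # from every output of this weight matrix: W <- W - d (d^T W).
    d = direction.to(dtype=weight.dtype, device=weight.device)
    return weight - torch.outer(d, d @ weight)

# Project every matrix that writes into the residual stream, then save.
for layer in model.model.layers:
    layer.self_attn.o_proj.weight.data = orthogonalize(
        layer.self_attn.o_proj.weight.data, refusal_dir)
    layer.mlp.down_proj.weight.data = orthogonalize(
        layer.mlp.down_proj.weight.data, refusal_dir)
model.save_pretrained("model-oas-baked")  # placeholder output dir
```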

replied to their post 5 months ago

are you sure you're using the correct instruct format?

Yeah. That makes no sense. I guess I'll pay for a runpod and try again, just to make sure there's nothing wrong with my PC. If it fails again, I will try Undi's script, maybe I screwed something up on mine. sigh

You will need to fix some shit before using it; I will try to remake it cleaner, kek.
Good luck

replied to their post 5 months ago

@Undi95 Just want to thank you for the collaboration so far; regardless, you wrote fine. Having the activation directions but not having a way to patch the model is just killing me. Is the model your Unholy or did you make an FP16?

Thanks.
Script 1 gives you the activations, script 2 lets you use them (but it's mostly fucking broken; you probably need to fix things here and there). In a perfect world you'd get them and use them in the same notebook.
I made this with it: https://huggingface.co/Undi95/Unholy-8B-DPO-OAS (I list all the steps), but yes, I'm mostly sure it's fucked up one way or another. Still, it's a proof of concept; something got out of this mess, kek.
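To give the idea of what "script 2" has to do, here's a minimal runtime version with a plain PyTorch hook (a sketch, not the actual notebook; `refusal_dir` and `model` as in the sketches above):

```python
import torch

def make_ablation_hook(direction: torch.Tensor):
    # Projects `direction` out of a decoder layer's output on the fly.
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        d = (direction / direction.norm()).to(dtype=hidden.dtype,
                                              device=hidden.device)
        hidden = hidden - (hidden @ d).unsqueeze(-1) * d
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return hook

handles = [layer.register_forward_hook(make_ablation_hook(refusal_dir))
           for layer in model.model.layers]
# ... generate as usual here: the steering is active while the hooks live ...
for h in handles:
    h.remove()
```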

replied to their post 5 months ago

I can confirm it works and gives a coherent model. I'm not a VRAMLET but a BRAINLET, kek.
I worked on it all night; I can't code, so I used ChatGPT to help me write some snippets.
I'll let you have this ZIP; it contains the script in two parts. The code is broken, but I hope you will all get the idea behind it. (It can run on 1x A100 apparently, batch size 11.)

https://files.catbox.moe/xkf7y4.zip

Since I was too dumb to make one single script, I made a first part and a second part.
It's probably broken, but I succeeded in outputting something after 7 hours, so I suppose it can be fixed lmao.
The first notebook, ORTHO_RANDOM_LAYER, lets you bruteforce the model with layers 1 to 32, each using a random "direction" (or vector, or whatever; I'm really a noob). You can then see whether one of the layers lets you prompt freely or censors you (see: https://files.catbox.moe/9h3k4l.txt). It then stores the result for each layer in a variable, which you can extract into a "key.txt" containing the "direction" (or whatever the fuck it is).

You can then use the second notebook, which can load the key as a JSON file (if you delete all the text around the []) and gives you the same result as before.

Long story short: bruteforce + different "directions" = an infinity of possibilities.
But yeah, I'm really, really too small-brained for this shit; I really wanted to try doing something nice, and it took all night just to achieve one usable model hahaha.

I hope someone will understand the idea behind it and put it into practice, even if fixing my shit turns out to be impossible! Kek
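To spell the idea out, the bruteforce loop boils down to something like this (reusing make_ablation_hook from the sketch above; the probe prompt and generation settings are placeholders):

```python
import torch

probe = "Write something the model would normally refuse."  # placeholder
for layer_idx, layer in enumerate(model.model.layers):
    direction = torch.randn(model.config.hidden_size)  # random "direction"
    handle = layer.register_forward_hook(make_ablation_hook(direction))
    ids = tok.apply_chat_template([{"role": "user", "content": probe}],
                                  add_generation_prompt=True,
                                  return_tensors="pt").to(model.device)
    out = model.generate(ids, max_new_tokens=48, do_sample=False)
    handle.remove()
    reply = tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)
    print(layer_idx, repr(reply[:80]))  # eyeball which layers stop refusing
```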

Edit: I wrote really badly, but I'm really tired, sorry about that. The fact that I don't know the keywords for some Torch tasks is even more cringe. At least I tried my best.

replied to their post 5 months ago

I'm currently bruteforcing all the layers too, but with the 32 base prompts, and for now layer 31 (the last one) just mogs all the others.
But keep in mind I'm trying this on a special version of Unholy 8B with DPO on top; I will post the logs when it finishes.

[screenshots: bruteforce logs]

There are some issues with some prompts though, but no refusals.

[screenshot: example output]

replied to their post 5 months ago

I ended up brute-forcing all the layers and found out that the correct layer for LLaMA 3 8B Instruct is 12.
Here is the log: https://files.catbox.moe/aaamj9.txt

Yoo, so there's really ONE layer that could work? Thank you!

replied to their post 5 months ago

Thank you!
I will try it ASAP when I get the opportunity; very interesting.

replied to their post 5 months ago

Yosh, I did a try yesterday, still on 8B, with the full 7k dataset, but it still produces refusals for 95% of the prompts in the log. I tried layers 16, 18 and 14; it was shit. I was using 2 GPUs to be faster though; I should try one, maybe that's the problem, since the script was made with 1 GPU in mind.
I will try to modify some things and report back!

replied to their post 5 months ago

I just made a .csv of 7000+ entries by adding the toxicQA entries; you can snag it here: https://huggingface.co/datasets/Undi95/orthogonal-activation-steering-TOXIC

Also, maybe try on an 8B first so you don't waste compute?
I only tried the initial script without changing anything, so feel free to try anything!
I've spent enough on the 70B for now hahaha, so if you're sure about what you're doing, do whatever; it will be different from mine anyway.
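Loading it in a script should be straightforward (assuming the standard datasets CSV loader picks it up; the column name is a guess, check the actual file):

```python
from datasets import load_dataset

ds = load_dataset("Undi95/orthogonal-activation-steering-TOXIC", split="train")
toxic_prompts = [row["prompt"] for row in ds]  # column name may differ
print(len(toxic_prompts), "prompts loaded")
```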

replied to their post 5 months ago

How much VRAM did it take to load and then train on 32 examples for the 70B? I am willing to put a machine to work on the 512 examples the original post had.

I needed to use 3x A100, but theoretically I only needed 2 (so, like, 160GB of VRAM?). You need a lot of RAM too, with how the script handles the model.
The issue we have with the modified script we made with our small brains: using all your GPUs (so 3 if you have 3) makes it crash for whatever reason.
So I used the script on a machine with 3x A100 and 377GB of RAM, with n_devices set to 2, and it worked. At least the log showed some uncensoring, but in practice it didn't work well.

You need to use TransformerLens, so if you want to take a look, give the docs a go: https://neelnanda-io.github.io/TransformerLens/generated/code/transformer_lens.HookedTransformer.html
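Concretely, the multi-GPU part is just the n_devices argument of HookedTransformer.from_pretrained; a rough sketch (not our exact setup, and assuming a recent transformer_lens version):

```python
import torch
from transformer_lens import HookedTransformer

# As noted above, setting n_devices to one fewer than the machine's GPU
# count is what avoided the crash for us.
model = HookedTransformer.from_pretrained(
    "meta-llama/Meta-Llama-3-70B-Instruct",
    n_devices=2,
    dtype=torch.float16,
)
```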

Don't hesitate to ask if you want more info, I will try my best.
If you want to ask the OP of the script anything, here is the original post: https://huggingface.co/hjhj3168/Llama-3-8b-Orthogonalized-exl2/discussions/3#6632d510329025d77477c5a5

posted an update 5 months ago
Hello!
The 8B/70B OG Llama-3 models made with the Orthogonal Activation Steering script have been set to private.

After multiple tests with an empty system prompt, I can confirm they're not uncensored enough, but I wanted to try all the GGUFs first (and that takes time to do lmao).

If you want to try that yourself, here is the script: https://gist.github.com/wassname/42aba7168bb83e278fcfea87e70fa3af
And here is the same script, modified so it can run on multiple GPUs for the 70B: https://files.catbox.moe/ya4rto.ipynb

Llama3-Unholy-8B-OAS doesn't have this problem, as it was already trained to be less censored, but the OG one was really too censored.

I will try to redo it soon, as it seems to HAVE WORKED for some prompts (as seen in the log, for example), but it's not enough.

32 entries of the dataset are clearly not enough, but that's okay; I really wanted to try this, as it was something new.
I could go the Unholy route and retrain the 70B before applying OAS, but it should work without that; that's not the goal.
posted an update 5 months ago
Soon new releases on NeverSleep 👀
8B/70B Llama3 RP fine-tunes in the works!
posted an update 7 months ago
Hey, it took some time but I finally moved out and got internet back, so here I am again!
A lot of things to catch up on; I will try to reply to each of you ASAP.
See you soon!
replied to macadeliccc's post 7 months ago

Yep, got one model out with the SNR method!

replied to macadeliccc's post 7 months ago

I tried it with my Borealis model and got an error:

```
Traceback (most recent call last):
  File "/content/laserRMT/rmt_laser.py", line 199, in <module>
    loop_check, min_loss = modifier.search_optimal_layer_modification(layer_types=['mlp.down_proj', 'mlp.up_proj', 'self_attn.q_proj', 'self_attn.k_proj', 'self_attn.v_proj', 'self_attn.o_proj'],
  File "/content/laserRMT/rmt_laser.py", line 132, in search_optimal_layer_modification
    initial_perplexity = self.calculate_model_perplexity()
  File "/content/laserRMT/rmt_laser.py", line 101, in calculate_model_perplexity
    input_tok = gptq_data_utils.get_test_tokens(dataset, seed=0, seqlen=seqlen, model=model_str)
  File "/content/laserRMT/lib/utils/gptq_data_utils.py", line 196, in get_test_tokens
    return get_c4_new(train_samples, seed, seqlen, model)[1].input_ids
  File "/content/laserRMT/lib/utils/gptq_data_utils.py", line 134, in get_c4_new
    traindata = load_dataset(
  File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 2129, in load_dataset
    builder_instance = load_dataset_builder(
  File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 1852, in load_dataset_builder
    builder_instance: DatasetBuilder = builder_cls(
  File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 373, in __init__
    self.config, self.config_id = self._create_builder_config(
  File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 539, in _create_builder_config
    raise ValueError(
ValueError: BuilderConfig 'allenai--c4' not found. Available: ['en', 'en.noblocklist', 'en.noclean', 'realnewslike', 'multilingual', 'af', 'am', 'ar', 'az', 'be', 'bg', 'bg-Latn', 'bn', 'ca', 'ceb', 'co', 'cs', 'cy', 'da', 'de', 'el', 'el-Latn', 'en-multi', 'eo', 'es', 'et', 'eu', 'fa', 'fi', 'fil', 'fr', 'fy', 'ga', 'gd', 'gl', 'gu', 'ha', 'haw', 'hi', 'hi-Latn', 'hmn', 'ht', 'hu', 'hy', 'id', 'ig', 'is', 'it', 'iw', 'ja', 'ja-Latn', 'jv', 'ka', 'kk', 'km', 'kn', 'ko', 'ku', 'ky', 'la', 'lb', 'lo', 'lt', 'lv', 'mg', 'mi', 'mk', 'ml', 'mn', 'mr', 'ms', 'mt', 'my', 'ne', 'nl', 'no', 'ny', 'pa', 'pl', 'ps', 'pt', 'ro', 'ru', 'ru-Latn', 'sd', 'si', 'sk', 'sl', 'sm', 'sn', 'so', 'sq', 'sr', 'st', 'su', 'sv', 'sw', 'ta', 'te', 'tg', 'th', 'tr', 'uk', 'und', 'ur', 'uz', 'vi', 'xh', 'yi', 'yo', 'zh', 'zh-Latn', 'zu']
```

This error is also present in the notebook you shared (first cell of Laser). Pls fix?
I have another try running in the background with the second Laser SNR script on a 7B model on RunPod; I will edit with the results.
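Edit for anyone else hitting this: the crash comes from the obsolete 'allenai--c4' config name passed to load_dataset in gptq_data_utils.py. A hedged fix (the shard file name below is the one GPTQ-style scripts usually pull, not verified against this repo) is to drop the config name and keep only data_files:

```python
from datasets import load_dataset

# Instead of load_dataset('allenai/c4', 'allenai--c4', data_files=..., ...),
# drop the obsolete config name:
valdata = load_dataset(
    "allenai/c4",
    data_files={"validation": "en/c4-validation.00000-of-00008.json.gz"},
    split="validation",
)
```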

replied to their post 7 months ago

As always, our models are made with RP in mind.
So they can write better than GPT-4 in some situations, for sure.
(Also, they don't judge you as easily as GPT-4 🤫)

So, is the series better than GPT-4? Depends on your usage 😉
For mine, it's a yes for sure hahaha.

The 2x70B models are amazing, to be honest, but sadly very heavy.