wassname

AI & ML interests

None yet

Organizations

None yet

wassname's activity

New activity in nicoboss/Llama-3.2-1B-Instruct-Uncensored-Lora about 2 months ago

remove hard coded base model

#1 opened about 2 months ago by wassname
New activity in huihui-ai/Llama-3.2-3B-Instruct-abliterated about 2 months ago

requesting 1B version

#1 opened 2 months ago by Hasaranga85
Reacted to grimjim's post with 👍 3 months ago
I've observed that the layers targeted in various abliteration notebooks (e.g., https://colab.research.google.com/drive/1VYm3hOcvCpbGiqKZb141gJwjdmmCcVpR?usp=sharing) appear to be chosen arbitrarily, likely the result of brute-force exploration. This doesn't need to be the case.

Taking a cue from the paper "The Unreasonable Ineffectiveness of the Deeper Layers" (https://arxiv.org/abs/2403.17887) and PruneMe (https://github.com/arcee-ai/PruneMe), it seems reasonable to target the deeper layers identified as most redundant by measured similarity across layers, since intervening there should do less damage to the model and reduce the need for subsequent fine-tuning. Intuitively, one should expect the resulting intervention layers to be deep but not final. The main uncertainty is whether those redundant layers actually encode refusals, which is almost certainly model-dependent. This approach only requires the redundancy to be computed once per model; the result can then serve as a starting point for choosing the layer range to restrict the intervention to.
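A minimal sketch of how that redundancy measurement could look, loosely following PruneMe's idea of comparing the residual stream before and after a block of layers; the model name, prompts, and block size n below are illustrative assumptions, not part of grimjim's post or the PruneMe code:

```python
# Sketch: score how redundant each block of n consecutive layers is by comparing
# the hidden state entering the block with the hidden state leaving it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.2-3B-Instruct"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

prompts = [
    "Explain how a transformer layer works.",
    "Write a short story about a lighthouse.",
]  # in practice, use a few hundred varied prompts

n = 4  # size of the layer block whose redundancy we score
sims = None

with torch.no_grad():
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        # hidden_states: tuple of (num_layers + 1) tensors of shape [batch, seq, hidden]
        hidden = model(**inputs, output_hidden_states=True).hidden_states
        # Take the last-token hidden state at every layer boundary.
        h = torch.stack([hs[0, -1] for hs in hidden])  # [num_layers + 1, hidden]
        # Similarity between the stream entering layer i and leaving layer i + n - 1.
        cos = torch.nn.functional.cosine_similarity(h[:-n], h[n:], dim=-1)
        sims = cos if sims is None else sims + cos

sims /= len(prompts)
start = int(sims.argmax())  # most redundant block covers layers [start, start + n)
print(f"Most redundant block of {n} layers starts at layer {start}")
print("Candidate layer range for abliteration:", list(range(start, start + n)))
```

Using cosine similarity over the last-token hidden state is itself an assumption; PruneMe uses an angular-distance variant of the same comparison. Either way, the highest-similarity block is only a starting range for the intervention, not a guarantee that refusal behavior is encoded there.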