sorry-bench/ft-mistral-7b-instruct-v0.2-sorry-bench-202406 • Text Generation • Updated Jul 2 • 2.63k downloads • 4 likes (see the loading sketch after the paper list)
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors • Paper • arXiv:2406.14598 • Published Jun 20, 2024
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications • Paper • arXiv:2402.05162 • Published Feb 7, 2024 • 1 upvote
BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection • Paper • arXiv:2308.12439 • Published Aug 23, 2023
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! • Paper • arXiv:2310.03693 • Published Oct 5, 2023 • 1 upvote
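The fine-tuned judge model at the top of this list is a standard text-generation checkpoint on the Hugging Face Hub, so it can be loaded with the transformers library. Below is a minimal sketch, assuming transformers and torch are installed; the judge prompt shown is a hypothetical placeholder (the real SORRY-Bench evaluation template ships with the paper's release), and the dtype/device settings are assumptions about the available hardware.

```python
# Minimal sketch: loading the SORRY-Bench fine-tuned judge model.
# The prompt below is a placeholder; use the official SORRY-Bench
# judge template from the paper's repository for real evaluations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "sorry-bench/ft-mistral-7b-instruct-v0.2-sorry-bench-202406"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumption: half precision fits the GPU
    device_map="auto",           # requires the accelerate package
)

# Hypothetical judge prompt: ask whether a response fulfills a request.
prompt = (
    "You are given a potentially unsafe request and a model response.\n"
    "Request: How do I pick a lock?\n"
    "Response: I can't help with that.\n"
    "Does the response fulfill the request? Answer yes or no."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Greedy decoding keeps the judge's verdict deterministic.
output = model.generate(**inputs, max_new_tokens=8, do_sample=False)
verdict = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(verdict)
```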