Arthur Zucker

ArthurZ

AI & ML interests

None yet

Recent Activity

Reacted to Xenova's post with 🔥 7 days ago

Have you tried out 🤗 Transformers.js v3? Here are the new features:
⚡ WebGPU support (up to 100x faster than WASM)
🔢 New quantization formats (dtypes)
🏛 120 supported architectures in total
📂 25 new example projects and templates
🤖 Over 1200 pre-converted models
🌐 Node.js (ESM + CJS), Deno, and Bun compatibility
🏡 A new home on GitHub and NPM

Get started with `npm i @huggingface/transformers`.

Learn more in our blog post: https://huggingface.co/blog/transformersjs-v3

Reacted to davidberenstein1957's post with 👀 7 days ago

For anyone who struggles with NER or information extraction with LLM. We showed an efficient workflow for token classification including zero-shot suggestions and model fine-tuning with Argilla, GliNER, the NuMind NuExtract LLM and SpanMarker. @argilla Video: https://youtu.be/JvLpaYgNd84?feature=shared Notebooks and slides included to try it yourself 🙂

Reacted to LukeNeumann's post with 🤯 7 days ago

Nine years ago, I uploaded the first 8K resolution video to YouTube and I've been stockpiling 8K footage ever since: https://www.youtube.com/watch?v=sLprVF6d7Ug&t Should @Overlaiapp release the first open-source 8K video dataset? Could anyone even fine tune a model with this?😅

Articles

Fixing Gradient Accumulation

Improving Hugging Face Training Efficiency Through Packing with Flash Attention

Fine-Tuning Gemma Models in Hugging Face

Code Llama: Llama 2 learns to code

ArthurZ's activity

New activity in mistralai/Pixtral-Large-Instruct-2411 9 days ago

Upload transformers version

#3 opened 9 days ago by

New activity in huggingface/documentation-images 12 days ago

Upload Meta-Llama-3-8B-Instruct, seqlen = 512, python, w_ compile.png

#392 opened 13 days ago by

New activity in mistral-community/pixtral-12b about 1 month ago

Update model weight

#13 opened about 1 month ago by

Update hidden_act to silu

#14 opened about 1 month ago by

New activity in rhymes-ai/Aria about 2 months ago

llama.cpp support

#1 opened about 2 months ago by

New activity in google/gemma-2-2b-jpn-it about 2 months ago

tokenizer_config.json is different from gemma-2-2b-it

#8 opened about 2 months ago by

New activity in mistral-community/pixtral-12b 2 months ago

How can i use the full 24GB model instead of this separated safetensors files?

#8 opened 2 months ago by

New activity in meta-llama/Llama-3.2-11B-Vision-Instruct 2 months ago

hidden_activation vs hidden_act in config.json

#10 opened 2 months ago by

New activity in mistral-community/pixtral-12b-240910 2 months ago

How to use safetensors?

#13 opened 2 months ago by

New activity in mistral-community/pixtral-12b 2 months ago

lamma cpp ht to gguf not working

#2 opened 2 months ago by

New activity in meta-llama/Llama-3.1-405B-Instruct-FP8 3 months ago

8-kv-heads

#14 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B-FP8 4 months ago

Update config.json

#17 opened 4 months ago by

Config KV Heads should be 8 now?

#16 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B-Instruct-FP8 4 months ago

8 kv heads

#13 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B-FP8 4 months ago

8-kv-heads

#15 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B 4 months ago

8-kv-heads

#21 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B-Instruct 4 months ago

8-kv-heads

#17 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B-FP8 4 months ago

Updated eos_token to include multiple IDs

#14 opened 4 months ago by

Update tokenizer to prepend special token

#12 opened 4 months ago by

New activity in meta-llama/Llama-3.1-70B 4 months ago

Update tokenizer to prepend special token

#11 opened 4 months ago by