Clem 🤗 PRO

clem

AI & ML interests

multi-modal, time-series, biology and chemistry

Recent Activity

liked a model about 1 hour ago
chaidiscovery/chai-1
liked a dataset about 1 hour ago
HuggingFaceTB/smoltalk

clem's activity

Reacted to andito's post with ❤️🔥 about 6 hours ago
Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models with similar GPU RAM usage and token throughput.

- SmolVLM generates tokens 7.5 to 16 times faster than Qwen2-VL! 🤯
- Other models at this size crash a laptop, but SmolVLM comfortably generates 17 tokens/sec on a MacBook! 🚀
- SmolVLM can be fine-tuned on Google Colab! Or process millions of documents with a consumer GPU!
- SmolVLM even outperforms larger models on video benchmarks, despite not being trained on videos!

Check out more!
Demo: HuggingFaceTB/SmolVLM
Blog: https://huggingface.co/blog/smolvlm
Model: HuggingFaceTB/SmolVLM-Instruct
Fine-tuning script: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
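
If you want to try it, here is a minimal sketch of running SmolVLM-Instruct with transformers; the image path and generation settings are placeholders, so check the model card for the exact usage:

```python
# Minimal sketch of running SmolVLM-Instruct with transformers.
# "photo.jpg" and max_new_tokens are placeholders; see the model card.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16)

image = Image.open("photo.jpg")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```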
replied to jsulz's post about 6 hours ago
Reacted to jsulz's post with 🔥👀 about 6 hours ago
Something I love about working at Hugging Face is the opportunity to design and work in public. Right now, we're redesigning the architecture that supports uploads and downloads on the Hub.

Datasets and models are growing fast, and so are the challenges of storing and transferring them efficiently. To keep up, we're introducing a new protocol for uploads and downloads, supported by a content-addressed store (CAS).

Here's what's coming:

📦 Smarter uploads: Chunk-level management enables advanced deduplication and compression, and cuts redundant transfers, speeding up uploads.
⚡ Efficient downloads: High throughput and low latency ensure fast access, even during high-demand model releases.
🔒 Enhanced security: Validate uploads before storage to block malicious or invalid data.

We analyzed 24 hours of global upload activity in October (88 countries, 130TB of data!) to design a system that scales with your needs.

The result? A proposed infrastructure with CAS nodes in us-east-1, eu-west-3, and ap-southeast-1.

🔗 Read the blog post for the full details: https://huggingface.co/blog/rearchitecting-uploads-and-downloads

🌟 Check out our interactive demo to explore the data yourself!
xet-team/cas-analysis

We'd love to hear your feedback - let us know if you have questions or want to see more.
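
As a toy illustration of the chunk-level deduplication idea (my own sketch, not the Hub's actual protocol), a content-addressed store keys each chunk by its hash, so a chunk that is already present never needs to be stored or transferred again:

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # toy fixed-size chunks; real CAS systems often use content-defined chunking

class ContentAddressedStore:
    def __init__(self):
        self.chunks = {}  # sha256 digest -> chunk bytes

    def put(self, data: bytes) -> list[str]:
        """Store a blob, returning its 'recipe': the ordered list of chunk digests."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Deduplication: identical chunks are stored (and uploaded) only once.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe: list[str]) -> bytes:
        """Reassemble a blob from its recipe."""
        return b"".join(self.chunks[d] for d in recipe)
```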
Reacted to davanstrien's post with ❤️ 1 day ago
First dataset for the new Hugging Face Bluesky community organisation: bluesky-community/one-million-bluesky-posts 🦋

📊 1M public posts from Bluesky's firehose API
🔍 Includes text, metadata, and language predictions
🔬 Perfect for experimenting with ML on Bluesky 🤗

Excited to see people build more open tools for a more open social media platform!
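
For a quick start, loading it with 🤗 datasets looks roughly like this (the split and column names are assumptions; check the dataset card for the exact schema):

```python
from datasets import load_dataset

# The "train" split and "text" column are assumptions; see the dataset card.
posts = load_dataset("bluesky-community/one-million-bluesky-posts", split="train")
print(posts[0]["text"])
```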
Reacted to merve's post with 👀🤗🚀🔥 1 day ago
Small yet mighty! 💫

We are releasing SmolVLM: a new 2B small vision language model made for on-device use, fine-tunable on a consumer GPU, and immensely memory efficient 🤠

We release three checkpoints under Apache 2.0: SmolVLM-Instruct, SmolVLM-Synthetic and SmolVLM-Base HuggingFaceTB/smolvlm-6740bd584b2dcbf51ecb1f39

Learn more from our blog here: huggingface.co/blog/smolvlm
This release comes with a demo, fine-tuning code, MLX integration and TRL integration for DPO 💝
Try the demo: HuggingFaceTB/SmolVLM
Fine-tuning Recipe: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
Also TRL integration for DPO 💗
Reacted to reach-vb's post with ❤️ 2 days ago
Massive week for open AI/ML:

Mistral Pixtral & Instruct Large - ~123B, 128K context, multilingual, JSON + function calling & open weights
mistralai/Pixtral-Large-Instruct-2411
mistralai/Mistral-Large-Instruct-2411

Allen AI Tülu 70B & 8B - competitive with Claude 3.5 Haiku, beats all major open models like Llama 3.1 70B, Qwen 2.5 and Nemotron
allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
allenai/tulu-3-datasets-673b8df14442393f7213f372

LLaVA-o1 - VLM capable of spontaneous, systematic reasoning, similar to GPT-o1; the 11B model outperforms Gemini 1.5 Pro, GPT-4o mini, and Llama 3.2 90B Vision
Xkev/Llama-3.2V-11B-cot

Black Forest Labs Flux.1 tools - four new state-of-the-art model checkpoints & 2 adapters for fill, depth, canny & redux, open weights
reach-vb/black-forest-labs-flux1-6743847bde9997dd26609817

Jina AI Jina CLIP v2 - general-purpose multilingual and multimodal (text & image) embedding model, 900M params, 512 x 512 resolution, Matryoshka representations (1024 down to 64)
jinaai/jina-clip-v2

Apple AIM v2 & Core ML MobileCLIP - large-scale vision encoders that outperform CLIP and SigLIP, plus Core ML-optimised MobileCLIP models
apple/aimv2-6720fe1558d94c7805f7688c
apple/coreml-mobileclip

A lot more was released, like OpenScholar ( OpenScholar/openscholar-v1-67376a89f6a80f448da411a6), smoltalk ( HuggingFaceTB/smoltalk), Hymba ( nvidia/hymba-673c35516c12c4b98b5e845f), the Open ASR Leaderboard ( hf-audio/open_asr_leaderboard) and much more.

Can't wait for next week! 🤗
posted an update 2 days ago
view post
Post
1749
I've been in Brazil for 10 days now 🇧🇷🇧🇷🇧🇷

I've been surprised by the gap between the massive number of people interested in AI (ChatGPT adoption is crazy here) and the relatively low number of real AI builders - aka people and companies building their own AI models, datasets and apps.

Lots of effort is needed across the world for everyone to participate in, control, and benefit from this foundational technology, starting with open-source & multilingual AI, more access to GPUs, and AI builder training for all!
Reacted to lin-tan's post with 🔥 5 days ago
Can language models replace developers? #RepoCod says "Not yet", because GPT-4o and other LLMs have <30% accuracy/pass@1 on real-world code generation tasks.
- Leaderboard: https://lt-asset.github.io/REPOCOD/
- Dataset: lt-asset/REPOCOD
@jiang719 @shanchao @Yiran-Hu1007
Compared to #SWEBench, RepoCod tasks are:
- General code generation tasks, while SWE-Bench tasks resolve pull requests from GitHub issues.
- Backed by 2.6x more tests per task (313.5 compared to SWE-Bench's 120.8).

Compared to #HumanEval, #MBPP, #CoderEval, and #ClassEval, RepoCod has 980 instances from 11 Python projects, with:
- Whole function generation
- Repository-level context
- Validation with test cases, and
- Real-world complex tasks: the longest average canonical solution length (331.6 tokens) and the highest average cyclomatic complexity (9.00)

Introducing #RepoCod-Lite 🐟 for faster evaluations: 200 of the toughest tasks from RepoCod with:
- 67 repository-level, 67 file-level, and 66 self-contained tasks
- Detailed problem descriptions (967 tokens) and long canonical solutions (918 tokens)
- GPT-4o and other LLMs have < 10% accuracy/pass@1 on RepoCod-Lite tasks.
- Dataset: lt-asset/REPOCOD_Lite

#LLM4code #LLM #CodeGeneration #Security
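
To poke at the benchmark yourself, a hedged sketch with 🤗 datasets (the split name is an assumption; inspect the dataset card for the real splits and fields):

```python
from datasets import load_dataset

# The "train" split is an assumption; check lt-asset/REPOCOD for actual splits.
repocod = load_dataset("lt-asset/REPOCOD", split="train")
print(repocod.column_names)  # inspect the real schema before relying on fields
print(repocod[0])
```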
  • 1 reply
ยท
Reacted to xiaozaa's post with 👍 5 days ago
Reacted to TuringsSolutions's post with 👀 5 days ago
I created something called 'Hyperbolic Embeddings'. I literally just embed the tokens into Hyperbolic Space instead of Euclidean space. At first, this did not get me the gains I was expecting. I was a sad panda. Then I thought about it, a Hyperbolic Embedding needs a Hyperbolic Optimizer. So, instead of Adam, I used Riemannian Adam (RAdam). "Ladies and Gentlemen, We Got 'Em!"
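
A minimal sketch of that pairing, assuming the geoopt library for the Poincaré-ball manifold and its Riemannian Adam (the distance loss below is a toy stand-in for whatever objective you actually train on):

```python
# Sketch: token embeddings on the Poincaré ball, optimized with Riemannian Adam.
# The loss is a toy illustration, not a real training objective.
import torch
import geoopt

vocab_size, dim = 1000, 32
ball = geoopt.PoincareBall(c=1.0)

# Map small random vectors onto the ball so the embeddings start on the manifold.
weights = ball.expmap0(torch.randn(vocab_size, dim) * 1e-2)
embeddings = geoopt.ManifoldParameter(weights, manifold=ball)

optimizer = geoopt.optim.RiemannianAdam([embeddings], lr=1e-3)

tokens_a = torch.randint(0, vocab_size, (64,))
tokens_b = torch.randint(0, vocab_size, (64,))
optimizer.zero_grad()
loss = ball.dist(embeddings[tokens_a], embeddings[tokens_b]).mean()
loss.backward()
optimizer.step()  # the Riemannian update keeps points on the manifold
```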
  • 27 replies
ยท
Reacted to openfree's post with ❤️🔥 5 days ago
🎉 Reached Hugging Face Trending Top 100 in Just One Day! Introducing Mouse-I

First, we want to thank everyone who helped Mouse-I reach the Hugging Face Spaces Trending Top 100! We're especially excited that a game called "Jewel Pop Game," created using Mouse-I, has reached the global top 160.
With this overwhelming response, we're thrilled to introduce Mouse-I, an AI-powered code generation and automatic deployment tool by Bidraft.

✨ What is Mouse-I?
Mouse-I is an innovative tool that automatically generates and deploys working web services within 60 seconds, simply based on your prompt input.

🚀 Key Features

One-Click Real-time Deployment: Complete from prompt to deployment in just 60 seconds
Real-time Preview: Instantly check your generated code results
40+ Templates: Ready-to-use templates including MBTI tests, investment management tools, Tetris games, and more
Real-time Editing: Instantly modify and apply generated code

⚡ How to Use
Create your own web service in just 3 steps:

Enter your prompt (15 seconds)
Code generation (40 seconds)
Deploy (5 seconds)

🌟 What Makes Us Special

Ultra-fast code generation powered by NVIDIA H100 GPUs
Advanced multi-LLM complex agent technology
All generated web apps available for free viewing and use in our marketplace

๐Ÿ” Current Status

Over 3,000 web apps generated, with 160+ successfully deployed
30x faster service completion compared to competing services

🎈 Join Our Beta Test
Try Mouse-I for free right now!
👉 Experience Mouse-I
🔮 Future Plans
We're planning to launch 'Mouse-II', specialized for backend system development, within this year. When used together with Mouse-I, it will enable complete automation of full-stack development.

We look forward to your feedback and suggestions about Mouse-I!
Thank you for your interest and support 🙏
#AI #CodeGeneration #WebDevelopment #HuggingFace #MouseI #Bidraft #AICodeAssistant
https://huggingface.co/spaces/VIDraft/mouse1

Reacted to cfahlgren1's post with ❤️ 5 days ago
observers 🔭 - automatically log all OpenAI-compatible requests to a dataset 💽

• supports any OpenAI-compatible endpoint 💪
• supports DuckDB, Hugging Face Datasets, and Argilla as stores

> pip install observers

No complex framework. Just a few lines of code to start sending your traces somewhere. Let us know what you think! @davidberenstein1957 and I will continue iterating!

Here's an example dataset that was logged to Hugging Face from Ollama: cfahlgren1/llama-3.1-awesome-chatgpt-prompts
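
As a rough, hand-rolled illustration of the idea (not observers' actual API), intercepting an OpenAI-compatible client and appending each exchange to a local trace file could look like this:

```python
# Hand-rolled sketch of the idea, NOT the observers API: wrap an OpenAI-compatible
# client and append every request/response pair to a JSONL trace file.
import json
from openai import OpenAI

client = OpenAI()  # any OpenAI-compatible endpoint works via base_url=...

def logged_chat(**kwargs):
    response = client.chat.completions.create(**kwargs)
    with open("traces.jsonl", "a") as f:
        f.write(json.dumps({"request": kwargs,
                            "response": response.model_dump()}) + "\n")
    return response

reply = logged_chat(model="gpt-4o-mini",
                    messages=[{"role": "user", "content": "Hello!"}])
```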
Reacted to merve's post with 👀 5 days ago
Apple released AIMv2 🍎 a family of state-of-the-art open-set vision encoders
apple/aimv2-6720fe1558d94c7805f7688c
> like CLIP, but adds a decoder and trains with an autoregressive objective 🤯
> 19 open models come in 300M, 600M, 1.2B, and 2.7B sizes with resolutions of 224, 336, and 448
> Load and use with 🤗 transformers
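
A hedged loading sketch: the checkpoint name is one of the released sizes, and trust_remote_code reflects how the checkpoints shipped at release, so verify both against the model card:

```python
# Sketch of extracting features with an AIMv2 encoder; the checkpoint name and
# trust_remote_code are assumptions based on the release, check the model card.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

ckpt = "apple/aimv2-large-patch14-224"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModel.from_pretrained(ckpt, trust_remote_code=True)

inputs = processor(images=Image.open("photo.jpg"), return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state  # patch-level image features
print(features.shape)
```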
Reacted to singhsidhukuldeep's post with ❤️ 5 days ago
Excited to share my analysis of the groundbreaking DCN-V2 paper from @Google , which introduces significant improvements to deep learning recommendation systems!

Key technical highlights:

>> Core Architecture
- Starts with an embedding layer that handles both sparse categorical and dense features
- Unique capability to handle variable embedding sizes from small to large vocabulary sizes
- Cross network creates explicit bounded-degree feature interactions
- Deep network complements with implicit feature interactions
- Two combination modes: stacked and parallel architectures

>> Key Technical Innovations
- Enhanced cross layers with full matrix-based feature interaction learning instead of vector-based
- Mixture of Low-Rank architecture with:
* Multiple expert networks learning in different subspaces
* Dynamic gating mechanism to adaptively combine experts
* Efficient time complexity when specific conditions are met
* Support for non-linear transformations in projected spaces

>> Production Optimizations
- Low-rank matrix approximation leveraging singular value decay patterns
- Mixture-of-Experts decomposition into smaller subspaces
- Efficient parameter allocation between cross and deep networks
- Automatic feature interaction learning for higher-order interactions in multi-layered networks
- Support for both homogeneous and heterogeneous polynomial patterns

>> Real-World Impact
- Successfully deployed across Google's recommendation systems
- Significant gains in both offline accuracy and online metrics
- Better performance-latency tradeoffs through low-rank approximations
- Proven effectiveness on large-scale data with billions of training examples

This represents a major leap forward in making deep learning recommendation systems more practical and efficient at scale.

Thoughts? Would love to hear your experiences implementing similar architectures in production!
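
The headline change, full matrix-based cross layers, is easy to sketch in PyTorch. Each layer computes x_{l+1} = x_0 * (W x_l + b) + x_l, where the elementwise product with x_0 builds explicit bounded-degree feature interactions and the residual preserves lower-order terms. This is my own minimal sketch of the idea; dimensions are illustrative:

```python
import torch
import torch.nn as nn

class CrossLayerV2(nn.Module):
    """DCN-V2 cross layer: x_{l+1} = x0 * (W @ x_l + b) + x_l, full-matrix W."""
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)  # full matrix, vs. DCN-v1's single vector

    def forward(self, x0: torch.Tensor, xl: torch.Tensor) -> torch.Tensor:
        # Elementwise product with x0 raises the interaction degree by one per layer;
        # the residual keeps the lower-order interactions around.
        return x0 * self.linear(xl) + xl

# Stacking a few cross layers over the (already embedded) input features:
dim, batch = 16, 8
x0 = torch.randn(batch, dim)
x = x0
for layer in [CrossLayerV2(dim) for _ in range(3)]:
    x = layer(x0, x)
print(x.shape)  # torch.Size([8, 16])
```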