Thomas Wolf PRO

thomwolf

AI & ML interests

NLP and open-source :-)

thomwolf's activity

reacted to merve's post with ❤️ 15 days ago
Lotus 🪷 is a new foundation model for monocular depth estimation ✨
Compared to previous diffusion-based MDE models, Lotus is modified for dense prediction tasks
The authors also released a model for normal prediction 🤗
Find everything in this collection: merve/lotus-6718fb957dc1c85a47ca1210
reacted to singhsidhukuldeep's post with ❤️ 15 days ago
If you have ~300+ GB of VRAM, you can run Mochi from @genmo

A SOTA model that dramatically closes the gap between closed and open video generation models.

Mochi 1 introduces a revolutionary architecture featuring joint reasoning over 44,520 video tokens with full 3D attention. The model implements extended learnable rotary positional embeddings (RoPE) in three dimensions, with network-learned mixing frequencies for the space and time axes.
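For intuition, here is a minimal numpy sketch of rotary embeddings extended to three axes: the head dimension is split into three groups, and each group is rotated by its own (time, height, width) position. The even three-way split and single frequency base are illustrative assumptions, not Genmo's actual parameterization:

```python
import numpy as np

def rope_1d(x, pos, base=10000.0):
    """Rotate consecutive dimension pairs of x by angles pos * freq."""
    d = x.shape[-1]
    freqs = base ** (-np.arange(0, d, 2) / d)   # one frequency per pair
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def rope_3d(x, t, h, w):
    """Split the head dimension into three equal groups and rotate each
    group by its own axis position (time, height, width)."""
    d = x.shape[-1]
    assert d % 6 == 0, "need three even-sized groups"
    g = d // 3
    parts = [rope_1d(x[..., i * g:(i + 1) * g], p)
             for i, p in enumerate((t, h, w))]
    return np.concatenate(parts, axis=-1)
```

Because each pair is a pure rotation, the embedding preserves vector norms, and position (0, 0, 0) is the identity.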

The model incorporates cutting-edge improvements, including:
- SwiGLU feedforward layers
- Query-key normalization for enhanced stability
- Sandwich normalization for controlled internal activations
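Two of these components are easy to sketch in isolation. Below is a minimal numpy illustration of a SwiGLU feedforward and query-key RMS normalization; the shapes and the RMS variant of the norm are assumptions for illustration, not Mochi's exact implementation:

```python
import numpy as np

def swiglu(x, w_gate, w_up, w_down):
    """SwiGLU feedforward: SiLU(x @ w_gate) gates (x @ w_up), then project down."""
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))   # SiLU(z) = z * sigmoid(z)
    return (silu * (x @ w_up)) @ w_down

def qk_norm(q, k, eps=1e-6):
    """Scale queries and keys to unit RMS before computing attention logits,
    bounding their magnitude and stabilizing training."""
    q = q / (np.sqrt(np.mean(q ** 2, axis=-1, keepdims=True)) + eps)
    k = k / (np.sqrt(np.mean(k ** 2, axis=-1, keepdims=True)) + eps)
    return q, k
```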

What is currently available?
The base model delivers impressive 480p video generation with exceptional motion quality and prompt adherence. Released under the Apache 2.0 license, it's freely available for both personal and commercial applications.

What's Coming?
Genmo has announced Mochi 1 HD, scheduled for release later this year, which will feature:
- Enhanced 720p resolution
- Improved motion fidelity
- Better handling of complex scene warping
reacted to fdaudens's post with ❤️ 15 days ago
posted an update 15 days ago
Parents in the 1990s: Teach the kids to code
Parents now: Teach the kids to fix the code when it starts walking around 🤖✨
reacted to singhsidhukuldeep's post with 🔥 about 2 months ago
Remember when @Google launched MediaPipe in an effort to create efficient on-device pipelines?

They've just unlocked the ability to run 7B+ parameter language models directly in your browser. This is a game-changer for on-device AI!

Yes, they are streaming 8.6 GB model files!

Currently, they have Gemma 2B/7B running, but imagine Dynamic LoRA, multimodal support, quantization, and never leaving Chrome!

This is a significant technical advancement, especially in Memory Optimization:

- Redesigned the model-loading code to work around WebAssembly's 4 GB memory limit.
- Implemented asynchronous loading of transformer stack layers (28 for Gemma 1.1 7B).
- Reduced peak WebAssembly memory usage to less than 1% of previous requirements.
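The memory win from loading layers one at a time can be illustrated with a toy accounting model (the sizes and MB bookkeeping are hypothetical; the real code streams WebAssembly buffers into WebGPU):

```python
def peak_memory_all_at_once(layer_sizes_mb):
    """Naive loading: every layer buffer is resident simultaneously."""
    return sum(layer_sizes_mb)

def peak_memory_streaming(layer_sizes_mb):
    """Streamed loading: fetch one layer, hand it off to the GPU, free the
    buffer, then fetch the next -- peak CPU-side usage is a single layer."""
    return max(layer_sizes_mb)

# Hypothetical: ~8.6 GB of weights spread over 28 transformer layers.
layers_mb = [8600 // 28 + 1] * 28
```

Under this accounting, peak usage drops from the whole model to one layer, which is how the loader stays well under WebAssembly's 4 GB limit.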

Cross-Platform Compatibility
- Compiled the C++ codebase to WebAssembly for broad browser support.
- Utilized the WebGPU API for native GPU acceleration in browsers.

Here's why this matters:

1. Privacy: No need to send data to remote servers.
2. Cost-Efficiency: Eliminates server expenses.
3. Offline Capabilities: Use powerful AI without an internet connection.

Blog: https://research.google/blog/unlocking-7b-language-models-in-your-browser-a-deep-dive-with-google-ai-edges-mediapipe/
reacted to alex-abb's post with 👍🔥 5 months ago
Hi everyone!
I'm Alex, I'm 16, and I've been doing an internship at Hugging Face for a little over a week. I've already learned a lot about using and prompting LLMs. With @victor as my tutor, I've just finished a Space that analyzes your feelings by prompting an LLM chat model. The aim is to extend it so that it can categorize Hugging Face posts.

alex-abb/LLM_Feeling_Analyzer
reacted to fdaudens's post with ❤️ 5 months ago
A nice improvement for Hugging Face on Sheets: You can now customize your prompt and select the model of your choice directly on the sheet.

Thanks to @louisbrulenaudet for the contribution. Really cool to see the community improving this tool!

Try it here: JournalistsonHF/huggingface-on-sheets
reacted to yunusserhat's post with 🚀 5 months ago
Hello everyone,

I am pleased to announce that I have founded the University of Glasgow organization on Hugging Face. If you are affiliated with the University of Glasgow, or have a relative who is, you can join through the relevant link.

https://huggingface.co/UniversityofGlasgow
reacted to KnutJaegersberg's post with 👍 5 months ago
reacted to frimelle's post with ❤️🤗 5 months ago
Wikimedia and Hugging Face seem naturally complementary: both are community-centred and value openness and consent. That's why I'd love to see more Wikipedia and other Wikimedia projects' datasets on Hugging Face, to advance machine learning with diverse, community-curated data! See my new article on the Hugging Face Hub for why and how to create more Wikimedia datasets on Hugging Face: https://huggingface.co/blog/frimelle/wikipedias-treasure-trove-ml-data
reacted to Salama1429's post with 😎❤️🚀🔥 5 months ago
📺 Introducing the YouTube-Commons Dataset 📺

🌐 Overview: The YouTube Commons Dataset is a comprehensive collection of 30 billion words from 15,112,121 original and automatically translated transcripts, drawn from 2,063,066 videos on YouTube.

🔗 License: All videos are shared under the CC-BY license, with the majority (71%) in English.

🤖 Applications: This dataset is ideal for training speech-to-text (ASR) and translation models.
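As a toy illustration of carving out an ASR training subset, one might filter transcript records by license and language (the field names here are assumptions, not the dataset's actual schema):

```python
def english_ccby(records):
    """Select transcripts usable for English ASR training: CC-BY licensed
    and tagged as English (field names are illustrative)."""
    return [r for r in records
            if r["license"] == "CC-BY" and r["language"] == "en"]

rows = [
    {"id": 1, "license": "CC-BY", "language": "en"},
    {"id": 2, "license": "CC-BY", "language": "fr"},
    {"id": 3, "license": "other", "language": "en"},
]
```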

📊 Utilization: The text can be used for model training and is republishable for reproducibility purposes.

🤝 Collaboration: This dataset is the result of a collaboration between the French state start-up LANGU:IA, the French Ministry of Culture, and DINUM. It will be expanded in the coming months.

🔗 Explore the dataset here: https://lnkd.in/d_paWKFE

#YouTubeCommons #AIResearch #MachineLearning #OpenData #ArtificialIntelligence #NLP #Dataset #TechCollaboration #Innovation #DigitalTransformation
posted an update 5 months ago
[New crazy blog post alert] We are releasing an extensive blog post on the science of creating high-quality web-scale datasets, detailing all the steps and learnings that went into our recent 15-trillion-token 🍷FineWeb release

Inspired by the distill.pub interactive-graphics papers, we set out to write the most extensive, enjoyable, and in-depth tech report we could, so prepare for a 45-min read with interactive graphics and all.

And that's not all: in this article we also introduce 📚FineWeb-Edu, a filtered subset of Common Crawl with 1.3T tokens containing only web pages with very high educational content. To our knowledge, FineWeb-Edu outperforms all openly released web-scale datasets by a significant margin on knowledge- and reasoning-intensive benchmarks like MMLU, ARC, and OpenBookQA
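Conceptually, building a subset like FineWeb-Edu is score-and-threshold filtering: a classifier rates each page's educational value and only high scorers are kept. Here is a toy sketch with a stand-in keyword scorer (the real pipeline uses a learned classifier, not keyword counting):

```python
EDU_WORDS = {"theorem", "lesson", "definition", "example", "exercise"}

def toy_score(text):
    """Stand-in classifier: fraction of 'teaching' vocabulary, scaled to 0-5."""
    words = text.lower().split()
    return 5.0 * sum(w in EDU_WORDS for w in words) / max(len(words), 1)

def filter_educational(pages, score_fn, threshold=3.0):
    """Keep only pages whose educational score passes the threshold."""
    return [p for p in pages if score_fn(p) >= threshold]
```

Swapping `toy_score` for a real classifier's score leaves the filtering logic unchanged.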

We also make a number of surprising observations on the "quality" of the internet itself, which may challenge some of the general assumptions about web data (not saying more, I'll let you draw your own conclusions ;))

HuggingFaceFW/blogpost-fineweb-v1
reacted to zolicsaki's post with 👀 6 months ago
SambaNova just released a revolutionary paper about how the SN40L AI chip can host many LLMs on a single node and run inference efficiently enough to enable a "composition of experts." These experts can be interconnected via a router, resulting in remarkable accuracy. This method allows you to take open-source expert models from Hugging Face and continuously build and integrate them into a composition of experts.
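The composition-of-experts idea is essentially dispatch: a router inspects the query and forwards it to one of several expert models. A minimal sketch with stand-in experts (the expert names and routing rule are hypothetical):

```python
def route(query, experts, router):
    """Composition of experts: the router names an expert; that expert answers."""
    return experts[router(query)](query)

# Stand-in experts and a trivial keyword router (both hypothetical).
experts = {
    "code": lambda q: "code-expert answer",
    "general": lambda q: "general-expert answer",
}
router = lambda q: "code" if "def " in q else "general"
```

In the SN40L setting, the `experts` would be full LLMs resident on one node, with the router choosing which to run per request.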

I am also super excited about the possibilities that SN40Ls unlock for LLM agent workflows and pipelined calls. With the release of GPT4o, it seems that monolithic LLMs are starting to reach a plateau, and I believe that the next wave of AI will be driven by pipelined LLM calls and agent workflows. Most pipelined LLM workflows are bottlenecked by prohibitively expensive compute and high latency, but the SN40L provides a one-stop-shop solution for this. We need to get the word out to the community that this hardware exists, because it will open up a realm of possibilities that developers working with Nvidia hardware did not know existed.

SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts (2405.07518)
reacted to dhruvabansal's post with 🤯👍 6 months ago
🚀 Introducing RefuelLLM-2 and RefuelLLM-2-small, the next version of our large language models purpose-built for data labeling, enrichment, and cleaning.

RefuelLLM-2 (83.82%) outperforms all state-of-the-art LLMs, including GPT-4-Turbo (80.88%), Claude-3-Opus (79.19%) and Gemini-1.5-Pro (74.59%), across a benchmark of ~30 data labeling tasks.

RefuelLLM-2-small (79.67%), aka Llama-3-Refueled, outperforms all comparable LLMs including Claude3-Sonnet (70.99%), Haiku (69.23%) and GPT-3.5-Turbo (68.13%).

📖 Open-sourcing the model weights: refuelai/Llama-3-Refueled
📝 Detailed blog post: https://www.refuel.ai/blog-posts/announcing-refuel-llm-2
🧪 Try out the model here: https://labs.refuel.ai/playground