davidberenstein1957 (David Berenstein)

Reacted to clem's post with 🚀 about 16 hours ago

I've been in Brazil for 10 days now 🇧🇷🇧🇷🇧🇷

I've been surprised by the gap between the massive number of people interested in AI (ChatGPT adoption is crazy here) and the relatively low number of real AI builders - aka people and companies building their own AI models, datasets and apps.

Lots of effort is needed across the world for everyone to participate in, control and benefit from this foundational technology, starting with open-source & multi-lingual AI, more access to GPUs & AI builder training for all!

Reacted to their post with 😎➕🤗🧠🚀🔥 1 day ago

Let’s make a generation of amazing image-generation models

The best image generation models are trained on human preference datasets, where annotators have selected the best image from a choice of two. Unfortunately, many of these datasets are closed source so the community cannot train open models on them. Let’s change that!

The community can contribute image preferences to an open-source dataset that could be used for building text-to-image models, like the FLUX or Stable Diffusion families. The dataset will be open source, so everyone can use it to train models that we can all use.

Blog: https://huggingface.co/blog/burtenshaw/image-preferences
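
To make the idea of pairwise image preferences concrete, here is a minimal sketch of how votes on "which of two images is better" could be aggregated into chosen/rejected records. The `Vote` class, its field names, and the majority-vote rule are all assumptions made for illustration; they are not the schema or aggregation method of the dataset described above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One annotator's choice between two candidate images (hypothetical schema)."""
    prompt: str
    image_a: str    # hypothetical file name or URL of the first candidate
    image_b: str    # hypothetical file name or URL of the second candidate
    preferred: str  # "a" or "b"


def to_chosen_rejected(votes: list[Vote]) -> list[dict[str, str]]:
    """Majority-vote each (prompt, image_a, image_b) pair into one record."""
    tallies: dict[tuple[str, str, str], Counter] = {}
    for vote in votes:
        key = (vote.prompt, vote.image_a, vote.image_b)
        tallies.setdefault(key, Counter())[vote.preferred] += 1

    records = []
    for (prompt, image_a, image_b), counts in tallies.items():
        if counts["a"] == counts["b"]:
            continue  # tie: no clear preference, so the pair is skipped
        if counts["a"] > counts["b"]:
            chosen, rejected = image_a, image_b
        else:
            chosen, rejected = image_b, image_a
        records.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return records


if __name__ == "__main__":
    votes = [
        Vote("a red bicycle", "img_001.png", "img_002.png", "a"),
        Vote("a red bicycle", "img_001.png", "img_002.png", "a"),
        Vote("a red bicycle", "img_001.png", "img_002.png", "b"),
    ]
    print(to_chosen_rejected(votes))
```

Skipping ties is one possible design choice; a real pipeline might instead collect more votes for contested pairs.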

Reacted to their post with 🤯🤗👀 6 days ago

Watch and learn!

Let's observe Qwen2.5-coder:0.5b on OpenAI HumanEval.

pip install observers

And start collecting your data on the Hugging Face Hub.
Dataset: davidberenstein1957/openai_records
Library: https://github.com/cfahlgren1/observers

Reacted to elliesleightholm's post with 🤗 7 days ago

I made a beginner's guide to Hugging Face Spaces 🤗 I hope it's useful to some of you :)

YouTube video: https://www.youtube.com/watch?v=xqdTFyRdtjQ

Blog: https://www.marqo.ai/blog/how-to-create-a-hugging-face-space

Reacted to jsulz's post with 🧠🔥❤️❤️ 7 days ago

When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub. Uploading and downloading large models and datasets takes precious time. That’s where our chunk-based approach comes in.

Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:

⏩ Only upload the chunks that changed.
🚀 Download just the updates, not the whole file.
🧠 We store your file as deduplicated chunks.

In our benchmarks, we found that using content-defined chunking (CDC) to store iterative model and dataset versions led to transfer speedups of ~2x, but this isn’t just a performance boost. It’s a rethinking of how we manage models and datasets on the Hub.
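
As a rough illustration of chunk-level versioning, the sketch below splits bytes into variable-sized chunks with a simple gear-style rolling hash and stores each chunk once under its SHA-256 digest. It is a toy under stated assumptions, not the Hub's implementation: the constants `MIN_SIZE`, `MAX_SIZE` and `MASK`, the `GEAR` table, and the `store`/`restore` helpers are all made up for this example, and the chunk sizes are far smaller than any production system would use.

```python
import hashlib
import random

# Illustrative parameters sized for tiny inputs (assumptions, not Hub values).
MIN_SIZE = 32      # never cut a chunk shorter than this many bytes
MAX_SIZE = 256     # always cut once a chunk reaches this many bytes
MASK = 0xFC000000  # cut when the top 6 hash bits are zero (~1 in 64 positions)

_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # one random value per byte value


def cdc_chunks(data: bytes) -> list[bytes]:
    """Split data where the rolling hash of recent bytes matches a pattern."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Older bytes shift out of the 32-bit hash, so it reflects recent content only.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def store(data: bytes, chunk_store: dict[str, bytes]) -> tuple[list[str], int]:
    """Add unseen chunks to chunk_store; return (manifest, bytes newly stored)."""
    manifest = []
    uploaded = 0
    for chunk in cdc_chunks(data):
        key = hashlib.sha256(chunk).hexdigest()
        if key not in chunk_store:
            chunk_store[key] = chunk
            uploaded += len(chunk)
        manifest.append(key)
    return manifest, uploaded


def restore(manifest: list[str], chunk_store: dict[str, bytes]) -> bytes:
    """Rebuild a file from its ordered list of chunk digests."""
    return b"".join(chunk_store[key] for key in manifest)


if __name__ == "__main__":
    data_rng = random.Random(1)
    v1 = bytes(data_rng.getrandbits(8) for _ in range(20_000))
    v2 = v1[:10_000] + b"a small edit" + v1[10_000:]  # insert bytes mid-file

    chunk_store: dict[str, bytes] = {}
    _, uploaded_v1 = store(v1, chunk_store)
    manifest_v2, uploaded_v2 = store(v2, chunk_store)
    print(f"v1 stored {uploaded_v1} bytes, v2 stored only {uploaded_v2} new bytes")
    assert restore(manifest_v2, chunk_store) == v2
```

Because cut points depend on the content near them rather than on byte offsets, an insertion in the middle of a file tends to change only the chunks around the edit, whereas fixed-size blocks would all shift after it.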

We're planning to bring our new storage backend to the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows?

https://huggingface.co/blog/from-files-to-chunks

Reacted to their post with 🤗❤️👀 7 days ago

🤗🔭 Introducing Observers: A Lightweight SDK for AI Observability 🔭🤗

Observers is an open-source Python SDK that provides comprehensive observability for AI applications. Our library makes it easy to:

- Track and record interactions with AI models
- Store observations in multiple backends
- Query and analyse your AI interactions with ease

https://huggingface.co/blog/davidberenstein1957/observers-a-lightweight-sdk-for-ai-observability
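
The three capabilities listed above (record, store, query) can be sketched with the standard library alone. This is a hand-rolled illustration of the pattern and does not use or reproduce the Observers API: `SQLiteStore`, `observe`, and `fake_model` are hypothetical names invented for this example.

```python
import sqlite3
import time
from typing import Callable


class SQLiteStore:
    """Hypothetical storage backend that keeps one row per model interaction."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS records "
            "(ts REAL, model TEXT, prompt TEXT, response TEXT)"
        )

    def add(self, model: str, prompt: str, response: str) -> None:
        self.conn.execute(
            "INSERT INTO records VALUES (?, ?, ?, ?)",
            (time.time(), model, prompt, response),
        )
        self.conn.commit()

    def query(self, sql: str, params: tuple = ()) -> list[tuple]:
        return self.conn.execute(sql, params).fetchall()


def observe(
    generate: Callable[[str], str], model: str, store: SQLiteStore
) -> Callable[[str], str]:
    """Wrap a text-generation function so every call is recorded in the store."""

    def wrapped(prompt: str) -> str:
        response = generate(prompt)
        store.add(model, prompt, response)
        return response

    return wrapped


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; it just upper-cases the prompt."""
    return prompt.upper()


if __name__ == "__main__":
    store = SQLiteStore()
    generate = observe(fake_model, model="fake-model", store=store)
    generate("hello")
    generate("write a sort function")
    print(store.query("SELECT model, prompt, response FROM records"))
```

Swapping `SQLiteStore` for another class with the same `add` and `query` methods is how this sketch would support more than one backend.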

Articles

Let’s make a generation of amazing image generation models

Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK

How to build a custom text classifier without days of human labeling

How to optimize your data labelling project with custom interfaces

To what extent are we responsible for our content and how to create safer Spaces?

Data Is Better Together: A Look Back and Forward
