Quentin Lhoest PRO

lhoestq

AI & ML interests

Maintainer of 🤗 Datasets: NLP, Multimodal data processing and sharing

Recent Activity

liked a dataset about 7 hours ago
Isotonic/agentinstruct-1Mv1-combined
liked a dataset about 7 hours ago
ylecun/mnist
updated a dataset about 10 hours ago
infinite-dataset-hub/BreastDiagnosisProgression


lhoestq's activity

Reacted to rwightman's post with 👍 4 days ago
I'm currently on a push to expand the scope of image-based datasets on the Hub. There's certainly a lot already, but for anyone who's looked closely, there's not a whole lot of standardization. I aim to fix that: datasets under the https://huggingface.co/timm and https://huggingface.co/pixparse orgs will serve as canonical examples for various task / modality combinations and be usable without fuss in libraries like timm, OpenCLIP, and hopefully more.

I just uploaded the first multi-label dataset that I'll support with timm scripts soon: timm/plant-pathology-2021

Next up: object detection & segmentation! I've got an annotation spec sorted out, a lot of datasets ready to rip, and yeah, that means timm support for object detection, and eventually segmentation, is finally under development :O