Hugging Face
PleIAs's Collections
Common Corpus
Toxic Commons
Finance Commons
Bad Data Toolbox
OpenCulture
Common Corpus
Updated 14 days ago
Largest multilingual pretraining data.
PleIAs/common_corpus • Updated 5 days ago • 397M • 50.3k • 165