1M public posts from Bluesky's firehose API. Includes text, metadata, and language predictions. Perfect for experimenting with ML for Bluesky.
Excited to see people build more open tools for a more open social media platform!
Hi HuggingFacers! I'm thrilled to introduce my latest project: SenTrEv (Sentence Transformers Evaluator), a Python package that offers simple, customizable evaluation of text retrieval accuracy and time performance for Sentence Transformers-compatible text embedders on PDF data!
Dataset highlights:
- 644,412 public domain images with comprehensive metadata from publicdomainpictures.net
- English language metadata including titles, descriptions, and keywords
- Each entry contains rich metadata including:
  - Unique image ID and full-size image URLs
  - Detailed titles and descriptions
  - Keyword/tag collections
  - Creator attribution
- Released to the public domain under Creative Commons Zero (CC0) license
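For anyone planning to work with these entries, here is a rough sketch of how one record could be modeled in Python. The class name, every field name, and the helper function are hypothetical and chosen only for illustration; the dataset's actual column names may well differ, so check its card before relying on any of them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageEntry:
    """One dataset record. All field names are assumptions, not the real schema."""
    image_id: str                 # unique image ID
    image_url: str                # full-size image URL
    title: str
    description: str
    keywords: List[str] = field(default_factory=list)  # keyword/tag collection
    creator: str = ""             # creator attribution
    license: str = "CC0"          # the post says all entries are CC0


def matches_keyword(entry: ImageEntry, keyword: str) -> bool:
    """Case-insensitive exact match of `keyword` against the entry's tags."""
    wanted = keyword.strip().lower()
    return any(tag.strip().lower() == wanted for tag in entry.keywords)


# Placeholder values only; this is not a real record from the dataset.
example = ImageEntry(
    image_id="0",
    image_url="https://example.com/full.jpg",
    title="Placeholder title",
    description="Placeholder description",
    keywords=["Sky", "clouds"],
    creator="Placeholder creator",
)
```

A helper like `matches_keyword(example, "sky")` is one simple way to filter entries by tag before downloading any images.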