Exciting updates include:
⚡ InferenceClient is now a drop-in replacement for OpenAI's chat completion!
✨ Support for response_format, adapter_id, truncate, and more in InferenceClient
💾 Serialization module with a save_torch_model helper that handles shared layers, sharding, naming conventions, and safe serialization. It is basically a condensed version of logic previously scattered across safetensors, transformers, and accelerate.
📁 Optimized HfFileSystem to avoid being rate-limited when browsing HuggingFaceFW/fineweb
🔨 HfApi & CLI improvements: prevent empty commits, create repos inside a resource group, webhooks API, more options in the Search API, etc.
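As a rough sketch of the drop-in usage mentioned above (the model name and prompt below are placeholders, not taken from the release notes; huggingface_hub must be installed for the actual call):

```python
# Sketch: InferenceClient used with an OpenAI-style chat payload.
# The model name is a placeholder chosen for illustration.

def build_messages(prompt: str) -> list[dict]:
    """Build an OpenAI-style chat message list for a single user prompt."""
    return [{"role": "user", "content": prompt}]

def chat(prompt: str) -> str:
    # Imported lazily so build_messages stays usable without the library.
    from huggingface_hub import InferenceClient

    client = InferenceClient("meta-llama/Meta-Llama-3-8B-Instruct")
    # Same request shape as OpenAI's chat completion API.
    response = client.chat_completion(
        messages=build_messages(prompt),
        max_tokens=100,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(build_messages("Hello!"))
```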
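And a minimal sketch of the serialization helper, assuming torch and huggingface_hub are installed; the tiny tied-weight model here is purely illustrative:

```python
# Sketch: saving a torch model with shared layers via save_torch_model.
# Imports are kept inside the function so this file loads without torch.
import tempfile

def save_demo() -> list[str]:
    import os
    import torch
    from huggingface_hub import save_torch_model

    class TiedModel(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = torch.nn.Linear(4, 4)
            self.decoder = self.encoder  # shared layer, deduplicated on save

    with tempfile.TemporaryDirectory() as tmp:
        # Writes safetensors files, sharding the state dict if it is large.
        save_torch_model(TiedModel(), tmp)
        return sorted(os.listdir(tmp))
```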
Check out the full release notes for more details:
Wauplin/huggingface_hub#7