If you use Google Kubernetes Engine to host your ML workloads, I think this series of videos is a great way to kickstart your journey of deploying LLMs in less than 10 minutes! Thank you @wietse-venema-demo!
I'd like to share a bit more here about the Deep Learning Containers (DLCs) we built with Google Cloud to transform the way you build AI with open models on this platform!
With pre-configured, optimized environments for PyTorch Training (GPU) and Inference (CPU/GPU), Text Generation Inference (GPU), and Text Embeddings Inference (CPU/GPU), the Hugging Face DLCs offer:
- Optimized performance on Google Cloud's infrastructure, with TGI, TEI, and PyTorch acceleration.
- Hassle-free environment setup, no more dependency issues.
- Seamless updates to the latest stable versions.
- Streamlined workflow, reducing dev and maintenance overheads.
- Robust security features of Google Cloud.
- Fine-tuned for optimal performance, integrated with GKE and Vertex AI.
- Community examples for easy experimentation and implementation.
- TPU support for PyTorch Training/Inference and Text Generation Inference is coming soon!