Added link and description about Optimum support for AMD GPUs
README.md
@@ -93,6 +93,10 @@ Here are a few of the more popular ones to get you started:

Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application.
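
For reference, the snippet that button generates typically follows the pattern below; `gpt2` is a placeholder for whichever model ID you select on the Hub.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "gpt2" is a placeholder model ID; the 'Use in Transformers' button
# shows the exact snippet for the model you are viewing.
model_id = "gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```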

## 5. Optimum Support

For a deeper dive into using Hugging Face libraries on AMD GPUs, check out the [Optimum](https://huggingface.co/docs/optimum/main/en/amd/amdgpu/overview) page, which covers Flash Attention 2, GPTQ quantization, and ONNX Runtime integration in detail.
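
As a rough sketch of what that looks like in practice, the snippet below loads a model with Flash Attention 2 enabled through Transformers. It assumes a ROCm build of PyTorch and the ROCm-compatible `flash-attn` package are installed, and the model ID is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder model ID; assumes a ROCm build of PyTorch and the
# ROCm-compatible flash-attn package are installed on the AMD GPU host.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    torch_dtype=torch.float16,               # FA2 requires fp16 or bf16
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```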
# Serving a model with TGI
Text Generation Inference (a.k.a. “TGI”) provides an end-to-end solution to deploy large language models for inference at scale.
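
Once a TGI server is up, any HTTP client can hit its endpoint. As a minimal sketch, the example below uses `huggingface_hub.InferenceClient` and assumes a TGI container is already serving a model at `localhost:8080`:

```python
from huggingface_hub import InferenceClient

# Assumes a TGI server is already running and listening at this address.
client = InferenceClient("http://localhost:8080")

response = client.text_generation(
    "Explain what ROCm is in one sentence.",
    max_new_tokens=64,
)
print(response)
```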