# Instructions to run the end-to-end demo

## Chapters

I. Installation of KServe & its dependencies
II. Setting up local MinIO S3 storage
III. Setting up your OpenShift AI workbench
IV. Train model and evaluate
V. Convert model to Caikit format and save to S3 storage
VI. Deploy model onto Caikit-TGIS Serving Runtime
VII. Model inference
## Prerequisites

- To support training and inference, your cluster needs a node with sufficient CPUs and memory and 4 GPUs. Instructions to add GPU support to RHOAI can be found here.
- You have cluster administrator permissions
- You have installed the OpenShift CLI (`oc`)
- You have installed the Red Hat OpenShift Service Mesh Operator
- You have installed the Red Hat OpenShift Serverless Operator
- You have installed the Red Hat OpenShift AI Operator and created a `DataScienceCluster` object
## Installation of KServe & its dependencies

Instructions adapted from Manually installing KServe.

Git clone this repository:

```shell
git clone https://github.com/trustyai-explainability/trustyai-detoxify-sft.git
```

Login to your OpenShift cluster as a cluster administrator:

```shell
oc login --token=<token>
```

Create the required namespace for Red Hat OpenShift Service Mesh:

```shell
oc create ns istio-system
```

Create a `ServiceMeshControlPlane` object:

```shell
oc apply -f manifests/kserve/smcp.yaml -n istio-system
```
Sanity check to verify creation of the service mesh instance:

```shell
oc get pods -n istio-system
```

Expected output:

```
NAME                                       READY   STATUS    RESTARTS   AGE
istio-egressgateway-7c46668687-fzsqj       1/1     Running   0          22h
istio-ingressgateway-77f94d8f85-fhsp9      1/1     Running   0          22h
istiod-data-science-smcp-cc8cfd9b8-2rkg4   1/1     Running   0          22h
```
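Alternatively, you can block until the control plane reports ready instead of eyeballing pod status. This sketch assumes the `ServiceMeshControlPlane` created by `manifests/kserve/smcp.yaml` is named `data-science-smcp`, as the `istiod-data-science-smcp-*` pod name above suggests:

```shell
# Wait (up to 5 minutes) for the ServiceMeshControlPlane to report Ready
oc wait smcp/data-science-smcp -n istio-system --for condition=Ready --timeout=300s
```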
Create the required namespace for a `KnativeServing` instance:

```shell
oc create ns knative-serving
```

Create a `ServiceMeshMember` object:

```shell
oc apply -f manifests/kserve/default-smm.yaml -n knative-serving
```

Create and define a `KnativeServing` object:

```shell
oc apply -f manifests/kserve/knativeserving-istio.yaml -n knative-serving
```

Sanity check to validate creation of the Knative Serving instance:

```shell
oc get pods -n knative-serving
```
Expected output:

```
NAME                                      READY   STATUS    RESTARTS   AGE
activator-7586f6f744-nvdlb                2/2     Running   0          22h
activator-7586f6f744-sd77w                2/2     Running   0          22h
autoscaler-764fdf5d45-p2v98               2/2     Running   0          22h
autoscaler-764fdf5d45-x7dc6               2/2     Running   0          22h
autoscaler-hpa-7c7c4cd96d-2lkzg           1/1     Running   0          22h
autoscaler-hpa-7c7c4cd96d-gks9j           1/1     Running   0          22h
controller-5fdfc9567c-6cj9d               1/1     Running   0          22h
controller-5fdfc9567c-bf5x7               1/1     Running   0          22h
domain-mapping-56ccd85968-2hjvp           1/1     Running   0          22h
domain-mapping-56ccd85968-lg6mw           1/1     Running   0          22h
domainmapping-webhook-769b88695c-gp2hk    1/1     Running   0          22h
domainmapping-webhook-769b88695c-npn8g    1/1     Running   0          22h
net-istio-controller-7dfc6f668c-jb4xk     1/1     Running   0          22h
net-istio-controller-7dfc6f668c-jxs5p     1/1     Running   0          22h
net-istio-webhook-66d8f75d6f-bgd5r        1/1     Running   0          22h
net-istio-webhook-66d8f75d6f-hld75       1/1     Running   0          22h
webhook-7d49878bc4-8xjbr                  1/1     Running   0          22h
webhook-7d49878bc4-s4xx4                  1/1     Running   0          22h
```
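As a scripted alternative to scanning the pod list, you can wait on the custom resource itself. This assumes the `KnativeServing` object defined in `manifests/kserve/knativeserving-istio.yaml` is named `knative-serving`:

```shell
# Wait for the KnativeServing instance to become Ready
oc wait knativeserving/knative-serving -n knative-serving --for condition=Ready --timeout=300s
```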
From the web console, install KServe by going to Operators -> Installed Operators and clicking on the Red Hat OpenShift AI Operator.

Click on the DSC Initialization tab and click on the default-dsci object.

Click on the YAML tab and in the `spec` section, change `serviceMesh.managementState` to `Unmanaged`:

```yaml
spec:
  serviceMesh:
    managementState: Unmanaged
```

Click Save.
Click on the Data Science Cluster tab and click on the default-dsc object.

Click on the YAML tab and in the `spec` section, change `components.kserve.managementState` and `components.kserve.serving.managementState` to `Managed`:

```yaml
spec:
  components:
    kserve:
      managementState: Managed
      serving:
        managementState: Managed
```

Click Save.
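If you prefer the CLI over the web console, the same two edits can be applied with `oc patch`. This is a sketch assuming the default object names `default-dsci` and `default-dsc` used above:

```shell
# Set serviceMesh.managementState to Unmanaged on the DSCInitialization
oc patch dscinitialization default-dsci --type merge \
  -p '{"spec":{"serviceMesh":{"managementState":"Unmanaged"}}}'

# Set the KServe component and its serving sub-component to Managed
oc patch datasciencecluster default-dsc --type merge \
  -p '{"spec":{"components":{"kserve":{"managementState":"Managed","serving":{"managementState":"Managed"}}}}}'
```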
## Setting up local MinIO S3 storage

Create a namespace for your project called "detoxify-sft":

```shell
oc create namespace detoxify-sft
```

Set up your local MinIO S3 storage in your newly created namespace:

```shell
oc apply -f manifests/minio/setup-s3.yaml -n detoxify-sft
```

Run the following sanity checks:

```shell
oc get pods -n detoxify-sft | grep "minio"
```
Expected output:

```
NAME                    READY   STATUS    RESTARTS   AGE
minio-7586f6f744-nvdl   1/1     Running   0          22h
```
```shell
oc get route -n detoxify-sft | grep "minio"
```

Expected output:

```
NAME        STATUS     LOCATION               SERVICE
minio-api   Accepted   https://minio-api...   minio-service
minio-ui    Accepted   https://minio-ui...    minio-service
```
Get the MinIO UI location URL and open it in a web browser:

```shell
oc get route minio-ui -n detoxify-sft
```

Login using the credentials in `manifests/minio/setup-s3.yaml`:

- user: `minio`
- password: `minio123`

Click on Create a Bucket, choose a name for your bucket, and click on Create Bucket.
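If you would rather create the bucket from a terminal than the UI, here is a sketch using the AWS CLI pointed at the MinIO API route, with the credentials from `manifests/minio/setup-s3.yaml`. The bucket name `llm-models` is only an example; any name works:

```shell
# Look up the MinIO API endpoint exposed by the route
MINIO_API=https://$(oc get route minio-api -n detoxify-sft -o jsonpath='{.spec.host}')

# Create a bucket against the MinIO S3-compatible API
AWS_ACCESS_KEY_ID=minio AWS_SECRET_ACCESS_KEY=minio123 \
  aws --endpoint-url "$MINIO_API" s3 mb s3://llm-models
```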
## Setting up your OpenShift AI workbench

Go to Red Hat OpenShift AI from the web console.

Click on Data Science Projects and then click on Create data science project.

Give your project a name and then click Create.

Click on the Workbenches tab and then create a workbench with a PyTorch notebook image, set the container size to Large, and select a single NVIDIA GPU. Click on Create Workbench.

Click on Add data connection to create a matching data connection for MinIO.

Fill out the required fields and then click on Add data connection.

Once your workbench status changes from Starting to Running, click on Open to open JupyterHub in a web browser.

In your JupyterHub environment, launch a terminal and clone this project:

```shell
git clone https://github.com/trustyai-explainability/trustyai-detoxify-sft.git
```

Go into the `notebooks` directory.
## Train model and evaluate

Open the `01-sft.ipynb` file and run each cell in the notebook.

Once the model is trained and uploaded to the Hugging Face Hub, open the `02-eval.ipynb` file and run each cell to compare the model trained on raw input-output pairs vs. the one trained on detoxified prompts.
## Convert model to Caikit format and save to S3 storage

Open the `03-save_convert_model.ipynb` file and run each cell in the notebook to convert the model to Caikit format and save it to a MinIO bucket.
## Deploy model onto Caikit-TGIS Serving Runtime

In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.

In the Single-model serving platform tile, click on Deploy model. Provide the following values:

- Model Name: `opt-350m-caikit`
- Serving Runtime: `Caikit-TGIS Serving Runtime`
- Model framework: `caikit`
- Existing data connection: `My Storage`
- Path: `models/opt-350m-caikit`

Click Deploy.
Increase the `initialDelaySeconds`:

```shell
oc patch template caikit-tgis-serving-template --type=merge \
  -p '{"spec":{"containers":[{"readinessProbe":{"initialDelaySeconds":300},"livenessProbe":{"initialDelaySeconds":300}}]}}'
```
Wait for the model Status to show a green checkmark
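You can also watch the rollout from the CLI instead of the dashboard. This sketch assumes the model was deployed into the `detoxify-sft` namespace; the `InferenceService` name matches the model name entered above:

```shell
# READY turns True once the predictor pod passes its probes
oc get inferenceservice opt-350m-caikit -n detoxify-sft -w
```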
## Model inference

Return to the JupyterHub environment to test out the deployed model.

Click on `03-inference_request.ipynb` and run each cell to make an inference request to the detoxified model.
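If you want to exercise the endpoint outside the notebook, here is a sketch using `grpcurl` against the Caikit-TGIS gRPC service. The `<inference-host>` placeholder is the host from the model's inference URL in the dashboard (not defined in this repo), and `opt-350m-caikit` is the model name deployed above:

```shell
# Hypothetical endpoint taken from the InferenceService status URL
ENDPOINT=<inference-host>:443

grpcurl -insecure \
  -d '{"text": "Write a short greeting."}' \
  -H "mm-model-id: opt-350m-caikit" \
  "$ENDPOINT" \
  caikit.runtime.Nlp.NlpService/TextGenerationTaskPredict
```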