Convert text_encoder to CoreML with coremltools
I am an iOS developer with only a basic understanding of PyTorch and machine learning. I tried to convert the Taiyi model to the CoreML format; my code is below. The export succeeds, but the text feature vectors inferred on the client side often fail to retrieve relevant images. Could you please help me check whether my conversion process is correct? I would really appreciate it.
!pip install coremltools==7.0b2
!pip install transformers
import torch
from transformers import BertForSequenceClassification, BertConfig, BertTokenizer
from transformers import CLIPProcessor, CLIPModel
import numpy as np
import coremltools as ct
text_tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/Taiyi-CLIP-Roberta-102M-Chinese")
text_encoder = BertForSequenceClassification.from_pretrained("IDEA-CCNL/Taiyi-CLIP-Roberta-102M-Chinese",return_dict=False)
example_input = text_tokenizer(["小狗"], return_tensors='pt', max_length=75, padding='max_length')["input_ids"]
traced_model = torch.jit.trace(text_encoder, example_input, strict=False)
text_encoder_model = ct.convert(
    traced_model,
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS16,
    inputs=[ct.TensorType(name="prompt",
                          shape=example_input.shape)],
    outputs=[ct.TensorType(name="embOutput", dtype=np.float32)],
)
text_encoder_model.save("TextEncoder_float32_taiyi.mlpackage")
Here is the conversion log:
#WARNING:coremltools:Tuple detected at graph output. This will be flattened in the converted model.
#Converting PyTorch Frontend ==> MIL Ops: 0%| | 0/639 [00:00<?, ? ops/s]
#WARNING:coremltools:Core ML embedding (gather) layer does not support any inputs besides the weights and indices. Those given will be ignored.
#Converting PyTorch Frontend ==> MIL Ops: 100%|██████████| 637/639 [00:00<00:00, 2247.71 ops/s]
#Running MIL frontend_pytorch pipeline: 100%|██████████| 5/5 [00:00<00:00, 120.25 passes/s]
#Running MIL default pipeline: 100%|██████████| 66/66 [00:21<00:00, 3.08 passes/s]
#Running MIL backend_mlprogram pipeline: 100%|██████████| 11/11 [00:00<00:00, 76.27 passes/s]
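One check that might help narrow this down is to compare the embedding produced by the original PyTorch text_encoder with the one produced by the converted model for the same prompt. The following is only a minimal sketch of that comparison: the helper names `cosine_similarity` and `embeddings_match`, the 0.999 threshold, and the two short vectors are hypothetical placeholders, not real model outputs. In practice the two vectors would presumably come from the traced PyTorch model's output and from the "embOutput" value of the converted model, flattened to plain lists.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embeddings_match(reference, converted, threshold=0.999):
    """Report whether two embeddings point in nearly the same direction.

    The threshold is an assumed value for illustration, not a recommendation.
    """
    return cosine_similarity(reference, converted) >= threshold


# Hypothetical placeholder vectors standing in for the real embeddings.
torch_embedding = [0.12, -0.48, 0.33, 0.91]
coreml_embedding = [0.12, -0.47, 0.33, 0.90]

print("cosine similarity:", cosine_similarity(torch_embedding, coreml_embedding))
print("match:", embeddings_match(torch_embedding, coreml_embedding))
```

If the similarity for identical prompts is clearly below 1, the mismatch would be in the conversion or in how the input is fed to the converted model, rather than in the retrieval step on the client.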