xinference @ git+https://github.com/aresnow1/inference.git@bugfix/model-lock
xoscar
chatglm-cpp
llama-cpp-python