ToolBench-ToolLLaMA-2-7b-GGML

Runtime error

feat: updated model to q5_1 as q8_0 is too slow.

5041f48 over 1 year ago

1.32 kB

	<!DOCTYPE html>
	<html>
	<head>
	<title>ToolBench-ToolLLaMA-2-7b-GGML (q5_1)</title>
	</head>
	<body>
	<h1>ToolBench-ToolLLaMA-2-7b-GGML (q5_1)</h1>
	<p>
	With the utilization of the
	<a href="https://github.com/abetlen/llama-cpp-python">llama-cpp-python</a>
	package, we are excited to introduce the GGML model hosted in the Hugging
	Face Docker Spaces, made accessible through an OpenAI-compatible API. This
	space includes comprehensive API documentation to facilitate seamless
	integration.
	</p>
	<ul>
	<li>
	The API endpoint:
	<a href="https://limcheekin-toolbench-toolllama-2-7b-ggml.hf.space/v1"
	>https://limcheekin-toolbench-toolllama-2-7b-ggml.hf.space/v1</a
	>
	</li>
	<li>
	The API doc:
	<a href="https://limcheekin-toolbench-toolllama-2-7b-ggml.hf.space/docs"
	>https://limcheekin-toolbench-toolllama-2-7b-ggml.hf.space/docs</a
	>
	</li>
	</ul>
	<p>
	If you find this resource valuable, your support in the form of starring
	the space would be greatly appreciated. Your engagement plays a vital role
	in furthering the application for a community GPU grant, ultimately
	enhancing the capabilities and accessibility of this space.
	</p>
	</body>
	</html>