Tanvir1337/BanglaLLama-3-8b-unolp-culturax-instruct-v0.0.1-GGUF

This repository contains GGUF quantizations of BanglaLLM/BanglaLLama-3-8b-unolp-culturax-instruct-v0.0.1, produced with llama.cpp, a high-performance inference engine for large language models.

System Prompt Format

To interact with the model, use the following prompt format:

{System}
### Prompt:
{User}
### Response:
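
The template above can be filled programmatically. The following is a minimal sketch, not part of the original card: the helper name `build_prompt` is hypothetical, and both the trailing newline and the choice to drop the system line when it is empty are assumptions.

```python
def build_prompt(user: str, system: str = "") -> str:
    """Fill the card's template: system text, '### Prompt:', user text, '### Response:'."""
    lines = []
    if system:
        # Assumption: an empty system message is omitted rather than left as a blank line.
        lines.append(system)
    lines.append("### Prompt:")
    lines.append(user)
    lines.append("### Response:")
    # Assumption: end with a newline so the model's reply starts on a fresh line;
    # the card does not say whether one is expected.
    return "\n".join(lines) + "\n"


prompt = build_prompt("Translate 'good morning' into Bangla.", "You are a helpful assistant.")
print(prompt)
```

If generations look malformed, the trailing newline is the first thing to try removing.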

Usage Instructions

If you're new to using GGUF files, refer to TheBloke's README for detailed instructions.
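
As a rough starting point, one common way to run a GGUF file is llama.cpp's command-line tool. The sketch below only assembles the command and does not run it. It assumes a recent llama.cpp build whose binary is named `llama-cli` (older builds call it `main`), and the file name `model.Q5_K_M.gguf` is a placeholder, not a file confirmed to exist in this repository.

```python
import shlex


def llama_cli_command(model_path, prompt, n_predict=256, n_gpu_layers=0):
    """Build an argv list for llama.cpp's CLI without executing it."""
    # -m: model file, -p: prompt text, -n: number of tokens to generate.
    argv = ["llama-cli", "-m", model_path, "-p", prompt, "-n", str(n_predict)]
    if n_gpu_layers > 0:
        # -ngl: number of layers to offload to the GPU (needs a GPU-enabled build).
        argv += ["-ngl", str(n_gpu_layers)]
    return argv


# Placeholder file name; substitute the file you actually downloaded.
cmd = llama_cli_command("model.Q5_K_M.gguf", "### Prompt:\nHello\n### Response:\n")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` once llama.cpp is installed.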

Quantization Options

[Figure: comparison of various quantization types; lower is better.]

For more information on quantization, see Artefact2's notes.

Choosing the Right Model File

To select the optimal model file, consider the following factors:

Memory constraints: Determine how much RAM and/or VRAM you have available.
Speed vs. quality: If you prioritize speed, choose a file that fits entirely within your GPU's VRAM. For maximum quality, consider a file that fits within the combined RAM and VRAM of your system; layers that do not fit on the GPU run on the CPU, which is slower.
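
The two factors above can be turned into a small selection rule. This is a sketch under stated assumptions: the file sizes in `EXAMPLE_FILES_GB` are hypothetical placeholders (read the real sizes from the repository's file listing), the helper `pick_file` is not part of any library, and a larger file is taken to mean higher quality.

```python
from typing import Dict, Optional

# Hypothetical sizes in GB, for illustration only.
EXAMPLE_FILES_GB = {"Q2_K": 3.2, "Q4_K_M": 4.9, "Q5_K_M": 5.7, "Q8_0": 8.5}


def pick_file(files_gb: Dict[str, float], vram_gb: float,
              ram_gb: float = 0.0, prefer_speed: bool = True) -> Optional[str]:
    """Return the largest file that fits the memory budget, or None if none fits."""
    # Speed: the whole file must fit in VRAM. Quality: RAM and VRAM are pooled.
    budget = vram_gb if prefer_speed else vram_gb + ram_gb
    fitting = {name: size for name, size in files_gb.items() if size <= budget}
    if not fitting:
        return None
    # Assumption: among fitting files, the largest is the highest quality.
    return max(fitting, key=fitting.get)


fast_choice = pick_file(EXAMPLE_FILES_GB, vram_gb=6.0)
quality_choice = pick_file(EXAMPLE_FILES_GB, vram_gb=6.0, ram_gb=16.0, prefer_speed=False)
print(fast_choice, quality_choice)
```

The sketch ignores the memory needed for the context (KV cache) and for other programs, so in practice it is safer to pass a budget somewhat below the physical total.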

Quantization formats:

K-quants (e.g., Q5_K_M): A good starting point, offering a balance between speed and quality.
I-quants (e.g., IQ3_M): Newer and more efficient, but may require a specific backend build (e.g., cuBLAS for NVIDIA or rocBLAS for AMD).

Hardware compatibility:

I-quants: Not compatible with Vulkan, a backend also used with AMD cards. If you have an AMD card, ensure you're using the rocBLAS build or a compatible inference engine.
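
The compatibility guidance above can be encoded as a lookup. This is a conservative sketch of the card's wording only: the function name and the lowercase backend labels are assumptions, and backends the card does not mention fall back to K-quants, which the card describes as a good starting point. The llama.cpp feature matrix remains the authority.

```python
from typing import List


def quant_families_for(backend: str) -> List[str]:
    """Quantization families suggested by the card's guidance for a given backend."""
    name = backend.strip().lower()
    if name in {"cublas", "rocblas"}:
        # The card names cuBLAS and rocBLAS as the builds suited to I-quants.
        return ["K-quants", "I-quants"]
    # Vulkan is listed as incompatible with I-quants; backends the card does not
    # cover are treated the same way here (an assumption, not a documented rule).
    return ["K-quants"]


print(quant_families_for("rocBLAS"))
print(quant_families_for("Vulkan"))
```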

For more information on the features and trade-offs of each quantization format, refer to the llama.cpp feature matrix.

GGUF

Model size: 8.03B params
Architecture: llama
Quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
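
The parameter count and bit widths listed above give a rough lower bound on file size: parameters times bits per weight, divided by 8 bits per byte. The sketch below is only that arithmetic. Real GGUF files are larger than the nominal figure, because quantization formats store scale metadata and K-quants mix precisions across tensors, so treat the results as estimates and check the actual file sizes in the repository.

```python
PARAMS = 8.03e9  # parameter count stated on the card


def nominal_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Nominal size in GB (10^9 bytes): params * bits / 8 bits per byte."""
    return n_params * bits_per_weight / 8 / 1e9


for bits in (2, 3, 4, 5, 6, 8):
    print(f"{bits}-bit: about {nominal_size_gb(PARAMS, bits):.1f} GB or more")
```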

Model tree for Tanvir1337/BanglaLLama-3-8b-unolp-culturax-instruct-v0.0.1-GGUF

Base model: meta-llama/Meta-Llama-3.1-8B
Finetuned: BanglaLLM/BanglaLLama-3-8b-unolp-culturax-instruct-v0.0.1
Quantized: this model

Datasets used to train Tanvir1337/BanglaLLama-3-8b-unolp-culturax-instruct-v0.0.1-GGUF