metadata

license: apache-2.0
datasets:
  - smcleod/golang-coder
  - smcleod/golang-programming-style-best-practices
  - ExAi/Code-Golang-QA-2k
  - google/code_x_glue_ct_code_to_text
  - semeru/code-text-go
language:
  - en
tags:
  - golang
  - code
  - go
  - programming
  - llama
  - text-generation-inference

Llama 3.1 8b Golang Coder v3

This model has been trained on Golang style guides, best practices and code examples. This should (hopefully) make it quite capable with Golang coding tasks.

LoRA

FP16
BF16

GGUF

Q8_0 (with f16 embeddings): https://huggingface.co/smcleod/llama-3-1-8b-smcleod-golang-coder-v3/blob/main/llama-3-1-8b-smcleod-golang-coder-v2.etf16-Q8_0.gguf

Ollama

https://ollama.com/sammcj/llama-3-1-8b-smcleod-golang-coder-v3
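
As a rough sketch of how the Ollama build could be queried from Go, the example below posts a prompt to a locally running Ollama server through its `/api/generate` endpoint. It assumes Ollama is listening on its default address (`http://localhost:11434`) and that the model has already been pulled under the name shown in the URL above, using the default tag. The prompt and the helper names (`buildRequestBody`, `parseResponseBody`, `generate`) are illustrative only and not part of this model's release.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// modelName is taken from the Ollama URL above; relying on the default tag is an assumption.
const modelName = "sammcj/llama-3-1-8b-smcleod-golang-coder-v3"

// generateRequest holds the subset of /api/generate request fields used in this sketch.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"`
}

// generateResponse holds the only response field this sketch reads.
type generateResponse struct {
	Response string `json:"response"`
}

// buildRequestBody encodes a non-streaming generate request as JSON.
func buildRequestBody(model, prompt string) ([]byte, error) {
	return json.Marshal(generateRequest{Model: model, Prompt: prompt, Stream: false})
}

// parseResponseBody extracts the generated text from a non-streaming reply.
func parseResponseBody(body []byte) (string, error) {
	var out generateResponse
	if err := json.Unmarshal(body, &out); err != nil {
		return "", fmt.Errorf("decode response: %w", err)
	}
	return out.Response, nil
}

// generate sends the prompt to the Ollama server at baseURL and returns the model's reply.
func generate(baseURL, model, prompt string) (string, error) {
	payload, err := buildRequestBody(model, prompt)
	if err != nil {
		return "", fmt.Errorf("encode request: %w", err)
	}

	client := &http.Client{Timeout: 2 * time.Minute}
	resp, err := client.Post(baseURL+"/api/generate", "application/json", bytes.NewReader(payload))
	if err != nil {
		return "", fmt.Errorf("call ollama: %w", err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", fmt.Errorf("read response: %w", err)
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("ollama returned %s: %s", resp.Status, body)
	}
	return parseResponseBody(body)
}

func main() {
	// Assumes a local Ollama server on the default port with the model already pulled.
	answer, err := generate("http://localhost:11434", modelName,
		"Write a Go function that reverses a slice of ints.")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(answer)
}
```

Setting `stream` to `false` keeps the sketch short by returning a single JSON object; a streaming client would instead decode one JSON object per line as tokens arrive.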

Training

I trained this model (based on Llama 3.1 8b) on a merged dataset I created, consisting of 50,627 rows, 13.3M input tokens and 2.2M output tokens. The training run itself used 1,020,719 input tokens and 445,810 output tokens from 45,565 items in the dataset.

The dataset combines several Golang- and programming-focused datasets, cleaned and merged, with my own synthetically generated dataset based on several open source Golang coding guides.