---
library_name: transformers
tags: []
---

# Model Card for Llama-3-instruction-constructionsafety-layertuning

The Llama-3-instruction-constructionsafety-layertuning model is a fine-tuned model based on beomi/Llama-3-KoEn-8B-Instruct-preview.

## Model Details

### Llama-3-instruction-constructionsafety-layertuning

The Llama-3-instruction-constructionsafety-layertuning model is a continued-pretrained model based on beomi/Llama-3-KoEn-8B-Instruct-preview.

Training was conducted on QA datasets and raw Construction Safety Guidelines documents provided by the Korea Occupational Safety and Health Agency (KOSHA).

Training used full-parameter tuning on 2 x A100 GPUs (80 GB each). Approximately 11,000 examples were used for training.
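As a rough illustration only, a full-parameter fine-tuning run of this kind is typically configured with the Hugging Face `Trainer`. The card does not publish hyperparameters, so every value below is an assumption, not the authors' actual configuration:

```python
# Hypothetical full-parameter fine-tuning configuration; all values are
# assumptions for illustration, not the settings used for this model.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama3-constructionsafety",   # hypothetical path
    num_train_epochs=3,                       # assumption: not stated in the card
    per_device_train_batch_size=4,            # assumption
    gradient_accumulation_steps=8,            # assumption
    learning_rate=2e-5,                       # assumption
    bf16=True,                                # A100s support bfloat16
    logging_steps=10,
    save_strategy="epoch",
)
```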

After fine-tuning all layers, layers 0, 30, and 31 were replaced with the corresponding parameters from the base model. This was done as a precaution against errors introduced by training on raw data.
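The layer-replacement step can be sketched as follows. This is a hypothetical reconstruction (the authors' actual script is not published), demonstrated on a tiny randomly initialised Llama so it runs anywhere; with the real checkpoints you would load both 8B models via `AutoModelForCausalLM` and pass `layer_indices=(0, 30, 31)`:

```python
import torch
from transformers import LlamaConfig, LlamaForCausalLM

def restore_layers(tuned, base, layer_indices):
    """Copy the given decoder layers from `base` into `tuned` in place."""
    with torch.no_grad():
        for idx in layer_indices:
            tuned.model.layers[idx].load_state_dict(
                base.model.layers[idx].state_dict()
            )

# Tiny stand-in models; replace with the real 8B checkpoints in practice.
cfg = LlamaConfig(
    vocab_size=128, hidden_size=32, intermediate_size=64,
    num_hidden_layers=4, num_attention_heads=4, num_key_value_heads=4,
)
base = LlamaForCausalLM(cfg)   # stands in for the base model
tuned = LlamaForCausalLM(cfg)  # stands in for the fully fine-tuned model

restore_layers(tuned, base, (0, 3))

# The restored layers now match the base model; untouched layers keep
# their own (here randomly initialised) weights.
assert torch.equal(
    next(base.model.layers[0].parameters()),
    next(tuned.model.layers[0].parameters()),
)
```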

## Simple Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_name = "juungwon/Llama-3-instruction-constructionsafety-layertuning"
access_token = "hf_..."  # your Hugging Face access token, if the repo requires one

tuned_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    token=access_token,
    torch_dtype="auto",
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_name, token=access_token)
tokenizer.pad_token = tokenizer.eos_token

# The model was already placed on devices (and given a dtype) by
# from_pretrained above, so it is passed to the pipeline as-is.
pipe = pipeline("text-generation", model=tuned_model, tokenizer=tokenizer)

# We use the tokenizer's chat template to format each message - see
# https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        # "As a kind construction-safety expert, answer the request as
        # 'thoroughly' and politely as possible. Reply in Korean."
        "content": "μΉœμ ˆν•œ κ±΄μ„€μ•ˆμ „μ „λ¬Έκ°€λ‘œμ„œ μƒλŒ€λ°©μ˜ μš”μ²­μ— μ΅œλŒ€ν•œ 'μžμ„Έν•˜κ³ ' μΉœμ ˆν•˜κ²Œ λ‹΅ν•˜μž. λͺ¨λ“  λŒ€λ‹΅μ€ ν•œκ΅­μ–΄(Korean)으둜 λŒ€λ‹΅ν•΄μ€˜.",
    },
    # "Please list about ten worker-safety measures for temporary
    # earth-retaining works."
    {"role": "user", "content": "흙막이 κ°€μ‹œμ„€ κ³΅μ‚¬μ‹œ μž‘μ—…μž μ•ˆμ „μ‘°μΉ˜μ— λŒ€ν•΄μ„œ 10가지 정도 μ•Œλ €μ£Όμ„Έμš”."},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=1024, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```

### Output

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

μΉœμ ˆν•œ κ±΄μ„€μ•ˆμ „μ „λ¬Έκ°€λ‘œμ„œ μƒλŒ€λ°©μ˜ μš”μ²­μ— μ΅œλŒ€ν•œ 'μžμ„Έν•˜κ³ ' μΉœμ ˆν•˜κ²Œ λ‹΅ν•˜μž. λͺ¨λ“  λŒ€λ‹΅μ€ ν•œκ΅­μ–΄(Korean)으둜 λŒ€λ‹΅ν•΄μ€˜.<|eot_id|><|start_header_id|>user<|end_header_id|>

흙막이 κ°€μ‹œμ„€ κ³΅μ‚¬μ‹œ μž‘μ—…μž μ•ˆμ „μ‘°μΉ˜μ— λŒ€ν•΄μ„œ 10가지 정도 μ•Œλ €μ£Όμ„Έμš”.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

1. 곡사 μ€‘μ—λŠ” ν˜„μž₯사무싀을 μ„€μΉ˜ν•˜μ—¬ 곡사관련 기둝, μž‘μ—…μ§€νœ˜, μž‘μ—…μž λŒ€κΈ°, μ•ˆμ „κ΅μœ‘, μ•ˆμ „μ κ²€ λ“±μ˜ 업무λ₯Ό μˆ˜ν–‰ν•  수 μžˆμ–΄μ•Ό ν•œλ‹€.
2. 곡사 μ „Β·ν›„μ—λŠ” μ£Όλ³€μ˜ μ§€λ°˜μΉ¨ν•˜, μ§€ν•˜μˆ˜μœ„, μ§€ν•˜ 맀섀물, μ§€ν‘œλ©΄μ˜ 이완, μ§€λ°˜μ˜ 이완, μ§€ν‘œλ©΄μ˜ κ· μ—΄, λ…Έλ©΄μ˜ 이상 유무, λ„λ‘œ μ‹œμ„€λ¬Ό λ“±μ˜ 이상 유무λ₯Ό ν™•μΈν•˜μ—¬μ•Ό ν•œλ‹€.
3. μ„€κ³„λ„μ„œ, μ‹œλ°©μ„œ, μ•ˆμ „λ³΄κ±΄κ·œμΉ™, μ•ˆμ „λ³΄κ±΄κ·œμΉ™ 및 κ΄€λ ¨λ²•κ·œ, μ•ˆμ „λ³΄κ±΄κ·œμΉ™κ³Ό κ΄€λ ¨λœ 지침, μ‚°μ—…μ•ˆμ „λ³΄κ±΄κΈ°μ€€μ— κ΄€ν•œ κ·œμΉ™μ„ κ²€ν† ν•˜μ—¬ μ•ˆμ „λŒ€μ±…μ„ μˆ˜λ¦½ν•˜μ—¬μ•Ό ν•œλ‹€.
4. 흙막이 κ°€μ‹œμ„€ 곡사 μ‹œμ—λŠ” μž‘μ—…μžμ˜ 좔락방지λ₯Ό μœ„ν•˜μ—¬ μ•ˆμ „λŒ€, μ•ˆμ „λͺ¨, μ•ˆμ „ν™” λ“± 개인보호ꡬλ₯Ό μ°©μš©ν•˜μ—¬μ•Ό ν•œλ‹€.
5. 흙막이 κ°€μ‹œμ„€ 곡사 μ‹œμ—λŠ” κ·Όκ³¨κ²©κ³„μ§ˆν™˜ μ˜ˆλ°©μ„ μœ„ν•˜μ—¬ μ μ ˆν•œ νœ΄μ‹μ‹œκ°„μ„ μ œκ³΅ν•˜μ—¬μ•Ό ν•œλ‹€.
6. 흙막이 κ°€μ‹œμ„€ 곡사 μ‹œμ—λŠ” μž‘μ—…μžμ˜ 건강관리λ₯Ό μœ„ν•˜μ—¬ μž‘μ—…ν™˜κ²½μ„ κ°œμ„ ν•˜κ³  μ μ ˆν•œ νœ΄μ‹κ³΅κ°„μ„ λ§ˆλ ¨ν•˜μ—¬μ•Ό ν•œλ‹€.
7. 흙막이 κ°€μ‹œμ„€ 곡사 μ‹œμ—λŠ” μž‘μ—…μžμ˜ μ‚°μ—…μž¬ν•΄ μ˜ˆλ°©μ„ μœ„ν•˜μ—¬ μ•ˆμ „κ΅μœ‘, μ•ˆμ „μ‹œμ„€, μ•ˆμ „μž₯λΉ„λ₯Ό λ§ˆλ ¨ν•˜μ—¬μ•Ό ν•œλ‹€.
8. 흙막이 κ°€μ‹œμ„€ 곡사 μ‹œμ—λŠ” μž‘μ—…μžμ˜ μ•ˆμ „μ„ μœ„ν•˜μ—¬ μ•ˆμ „μž‘μ—…κ³„νšμ„ μˆ˜λ¦½ν•˜μ—¬μ•Ό ν•œλ‹€.
9. 흙막이 κ°€μ‹œμ„€ 곡사 μ‹œμ—λŠ” μž‘μ—…μžμ˜ μ•ˆμ „μ„ μœ„ν•˜μ—¬ ν† μ§ˆ, μ§€ν•˜μˆ˜μœ„, ν† μΈ΅, 맀섀물, 인접ꡬ쑰물, μ§€ν•˜μˆ˜μœ„, μ§€ν‘œλ©΄μ˜ 이상 유무, λ„λ‘œ μ‹œμ„€λ¬Ό λ“±μ˜ 이상 유무λ₯Ό ν™•μΈν•˜μ—¬μ•Ό ν•œλ‹€.
10. 흙막이 κ°€μ‹œμ„€ 곡사 μ‹œμ—λŠ” μž‘μ—…μžμ˜ μ•ˆμ „μ„ μœ„ν•˜μ—¬ μž‘μ—…μž 1인당 1개의 μ•ˆμ „λͺ¨, μ•ˆμ „ν™”, μ•ˆμ „λŒ€ λ“± 개인보호ꡬλ₯Ό μ°©μš©ν•˜μ—¬μ•Ό ν•œλ‹€.
```

(Translation: the model lists ten worker-safety measures for temporary earth-retaining works, covering a site office for records, supervision, training, and inspections; checking ground settlement, groundwater level, buried utilities, and nearby road facilities before and after work; reviewing design documents and safety regulations to set countermeasures; personal protective equipment such as harnesses, helmets, and safety shoes; rest breaks to prevent musculoskeletal disorders; improved working conditions and rest areas; safety training, facilities, and equipment; and a written safety work plan.)

## Training Data

Training data will be provided upon request.

BibTeX:

```bibtex
@article{llama3cs-layertuning,
  title={Llama-3-instruction-constructionsafety-layertuning},
  author={L, Jungwon and A, Seungjun},
  year={2024},
  url={https://huggingface.co/juungwon/Llama-3-instruction-constructionsafety-layertuning}
}

@article{llama3koen,
  title={Llama-3-KoEn},
  author={L, Junbum},
  year={2024},
  url={https://huggingface.co/beomi/Llama-3-KoEn-8B}
}

@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```