---
library_name: peft
base_model: google/gemma-2b
language:
  - ko
  - en
tags:
  - translation
  - gemma
---

Model Card for brildev7/gemma-2b-translation-koen-sft-qlora

Model Details

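Model Description

A QLoRA adapter (PEFT) for google/gemma-2b, trained with supervised fine-tuning (SFT) to translate Korean into English.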

Training Details

Training Data

https://huggingface.co/datasets/traintogpb/aihub-koen-translation-integrated-tiny-100k

Inference Examples

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

model_id = "google/gemma-2b"
peft_model_id = "brildev7/gemma-2b-translation-koen-sft-qlora"

# Load the base model in 4-bit NF4 precision (requires bitsandbytes).
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    torch_dtype=torch.float32,  # dtype of the modules that are not quantized
    attn_implementation="sdpa",
)
# Attach the translation adapter to the quantized base model.
model = PeftModel.from_pretrained(model, peft_model_id)

tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
# Use the EOS token for padding.
tokenizer.pad_token_id = tokenizer.eos_token_id

# example
prompt_template = """다음 내용을 영어로 번역하세요.:
{}

번역:
"""
sentences = "위중설이 나돌던 윌리엄 영국 왕세자의 부인 케이트 미들턴 왕세자빈(42)이 결국 암 진단을 받았다. 로이터 통신에 따르면 왕세자빈은 22일(현지시간) 인스타그램 영상 메시지를 통해 지난 1월 복부 수술을 받은 뒤 실시한 후속 검사에서 암이 발견돼 현재 화학치료를 받고 있다고 밝혔다. 왕세자빈은 '의료진은 예방적 차원에서 화학치료를 권고했다'면서 '물론 이것은 큰 충격으로 다가왔지만 윌리엄과 저는 어린 가족들을 위해 이 문제를 해결하고자 최선을 다하고 있다'고 말했다. 그러면서 '현재 암으로 인해 영향을 받은 모든 사람들을 생각하고 있다'며 '믿음과 희망을 잃지 말아 달라. 여러분은 혼자가 아니다'라고 덧붙였다."
texts = prompt_template.format(sentences)
inputs = tokenizer(texts, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
- Prince William's wife Kate Middleton, 42, has been diagnosed with cancer after undergoing surgery for her abdominal pain, according to Reuters news agency. In an Instagram message on the 22nd (local time), Kate Middleton, the wife of Prince William, said that she was diagnosed with cancer after undergoing surgery for her abdominal pain in January and is currently undergoing chemical therapy. She said that the medical team recommended chemical therapy as a measure to prevent the spread of the disease, but that she and Prince William are trying to resolve the issue for their young family. She added that "The medical team recommended chemical therapy as a measure to prevent the spread of the disease.
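
For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs[0]` as above yields the prompt text as well as the translation. Below is a minimal sketch of keeping only the translation, assuming `inputs` and `outputs` come from the calls shown above. The helper name `decode_new_tokens` is hypothetical and not part of `transformers`.

```python
def decode_new_tokens(tokenizer, inputs, outputs):
    """Decode only the tokens generated after the prompt.

    Assumes `inputs` is the tokenizer output for a single prompt and
    `outputs` is the result of `model.generate(**inputs, ...)`.
    """
    # Number of tokens in the prompt (shape is [batch, sequence_length]).
    prompt_length = inputs["input_ids"].shape[1]
    # Drop the echoed prompt; keep what the model generated.
    new_token_ids = outputs[0][prompt_length:]
    return tokenizer.decode(new_token_ids, skip_special_tokens=True).strip()
```

With the objects from the example above, `print(decode_new_tokens(tokenizer, inputs, outputs))` would print only the text generated after the prompt.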

# example
prompt_template = """다음 내용을 영어로 번역하세요.:
{}

번역:
"""
sentences = "애플이 주력 시장 중에 하나인 중국에서 현지 스마트폰 제조사들에게 밀리며 위기감이 증폭된 가운데 중국 소비자 잡기에 나서고 있다. 팀 쿡 CEO(최고경영자)가 직접 중국을 방문해 투자를 약속하고, '아이폰' 등 자사 기기에 중국 바이두의 AI(인공지능) 모델을 탑재하는 방안도 검토하고 있다. 중국 본토서 아이폰 할인 공세에 이어 전방위적 투자를 늘리는 모양새다."
texts = prompt_template.format(sentences)
inputs = tokenizer(texts, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
- With Apple becoming a target in China, a major market, the company is taking a stance in a Chinese consumer magazine. CEO Tim Cook is visiting China and is planning to invest, and is also considering adding Chinese Big Data AI models on Apple's products such as 'iPhone'. It seems that China is making a wide-ranging investment following the iPhone discounting wave on the mainland.
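
Both examples repeat the same prompt template and the same tokenize, generate, and decode steps. The following is a sketch of how they might be wrapped for several passages, assuming `model` and `tokenizer` were loaded as shown earlier. `PROMPT_TEMPLATE`, `build_prompt`, and `translate_all` are illustrative names, not part of this repository or of any library.

```python
# Same Korean instruction as in the examples:
# "Translate the following into English." followed by the text and "Translation:".
PROMPT_TEMPLATE = """다음 내용을 영어로 번역하세요.:
{}

번역:
"""


def build_prompt(korean_text):
    """Insert the Korean source text into the translation prompt."""
    return PROMPT_TEMPLATE.format(korean_text)


def translate_all(model, tokenizer, passages, max_new_tokens=1024):
    """Run the example's tokenize/generate/decode steps for each passage."""
    results = []
    for passage in passages:
        inputs = tokenizer(build_prompt(passage), return_tensors="pt").to(model.device)
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
        # As in the examples, the decoded text includes the prompt itself.
        results.append(tokenizer.decode(outputs[0], skip_special_tokens=True))
    return results
```

Calling `translate_all(model, tokenizer, [first_passage, second_passage])` would return one decoded string per passage, in the same order as the input list.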