
Model Card for Korean Self-Introduction Feedback Model

This model provides feedback on Korean self-introductions (자기소개서). Built on the Gemma2 architecture and fine-tuned specifically for this task, it evaluates a self-introduction and generates constructive feedback to improve its quality.

Model Details

Model Description

This model is based on the Gemma2 architecture and has been fine-tuned on a dataset of Korean self-introductions. It aims to provide useful feedback for improving self-introductions by evaluating various aspects of the text and suggesting enhancements.

  • Developed by: MINJUN JUNG
  • Funded by [optional]: [More Information Needed]
  • Shared by [optional]: [More Information Needed]
  • Model type: Text Generation
  • Language(s) (NLP): Korean
  • License: [More Information Needed]
  • Finetuned from model [optional]: Gemma2

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

Direct Use

The model can be used directly to generate feedback on Korean self-introductions, helping users refine their personal statements.

Downstream Use [optional]

When integrated into applications or services that help users improve job applications or personal introductions, the model can enhance user-written content with feedback and suggestions, as sketched below.
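
As a minimal sketch of such an integration, the generation steps from the How to Get Started section below can be wrapped in a reusable helper. The get_feedback function and its signature are illustrative, not part of the released model:

def get_feedback(pipe, job: str, question: str, answer: str) -> str:
    """Generate feedback for one self-introduction answer.

    `pipe` is a transformers text-generation pipeline loaded with this
    model, as shown in "How to Get Started with the Model" below.
    """
    messages = [{
        "role": "user",
        "content": (
            "자기소개서 문항에 대해서 지원자가 작성한 자기소개서 답변을 "
            "지원 직무를 고려하여, 채용 담당자 관점에서 개선점을 피드백 해주세요.\n"
            f"지원 직무: {job}\n"
            f"자기소개서 문항: {question}\n"
            f"자기소개서 답변: {answer}"
        )
    }]
    prompt = pipe.tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    outputs = pipe(prompt, do_sample=True, temperature=0.2, top_k=50, top_p=0.95)
    # Strip the echoed prompt so only the newly generated feedback remains
    return outputs[0]["generated_text"][len(prompt):]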

Out-of-Scope Use

The model may not perform well on non-Korean text or texts that are not self-introductions. It is not designed for tasks outside the scope of personal statement improvement.

Bias, Risks, and Limitations

Recommendations

Users should be aware that the model's feedback is based on patterns learned from training data, which might not cover all possible self-introduction scenarios. It is recommended to use the feedback as a guide rather than an absolute measure.

How to Get Started with the Model

To get started, load the model with the Hugging Face transformers library (pip install torch transformers) and use it to generate feedback for a Korean self-introduction.


import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load the fine-tuned model and the tokenizer from the base instruction-tuned Gemma model
# (special tokens are requested at encode time, in the pipeline call below)
model = AutoModelForCausalLM.from_pretrained("BanAPP/gemma2-2b-kor-resume-feedback")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")

# Create a text generation pipeline using the fine-tuned model and tokenizer
pipe_finetuned = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=512)

# Inputs: the target job, the self-introduction question, and the applicant's answer.
# The values below are illustrative; replace them with real content.
job = "백엔드 개발자"  # e.g. "Backend Developer"
question = "지원 동기를 작성해 주세요."  # e.g. "Please describe your motivation for applying."
answer = ""  # the applicant's full answer text

# Construct the chat messages for the input prompt.
# The Korean instruction asks the model to give feedback, from a hiring
# manager's perspective, on how the applicant's answer could be improved
# given the target job (지원 직무 = target job, 자기소개서 문항 = question,
# 자기소개서 답변 = answer).
messages = [
    {
        "role": "user",
        "content": (
            "자기소개서 문항에 대해서 지원자가 작성한 자기소개서 답변을 지원 직무를 고려하여, 채용 담당자 관점에서 개선점을 피드백 해주세요.\n"
            f"지원 직무: {job}\n"
            f"자기소개서 문항: {question}\n"
            f"자기소개서 답변: {answer}"
        )
    }
]

# Format the messages with the tokenizer's chat template;
# add_generation_prompt=True appends the marker for the assistant's turn
prompt = pipe_finetuned.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate feedback by passing the formatted prompt to the pipeline
# Configure sampling parameters to control text generation
outputs = pipe_finetuned(
    prompt,
    do_sample=True,          # Enable sampling to generate diverse outputs
    temperature=0.2,         # Control randomness in text generation (lower value makes the output more focused)
    top_k=50,                # Limit the sampling pool to the top 50 tokens
    top_p=0.95,              # Use nucleus sampling to focus on the top 95% of probability mass
    add_special_tokens=True  # Include special tokens as per the model's requirements
)

# Print the generated feedback, excluding the input prompt from the output
print(outputs[0]["generated_text"][len(prompt):])
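
On a GPU, loading the weights in half precision can roughly halve memory use. A hedged variant of the loading step above, assuming the accelerate package is installed (required for device_map):

# Optional: load on GPU in bfloat16 to reduce memory (requires `accelerate`)
model = AutoModelForCausalLM.from_pretrained(
    "BanAPP/gemma2-2b-kor-resume-feedback",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)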