metadata

base_model:
  - Undi95/Llama-3-Unholy-8B
library_name: transformers
tags:
  - medical
license: other
datasets:
  - mlabonne/orpo-dpo-mix-40k
  - Open-Orca/SlimOrca-Dedup
  - jondurbin/airoboros-3.2
  - microsoft/orca-math-word-problems-200k
  - m-a-p/Code-Feedback
  - MaziyarPanahi/WizardLM_evol_instruct_V2_196k
  - ruslanmv/ai-medical-chatbot
model-index:
  - name: Ryeta-0
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 59.13
            name: normalized accuracy
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 82.9
            name: normalized accuracy
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 60.35
            name: accuracy
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 49.65
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 78.93
            name: accuracy
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 60.35
            name: accuracy
language:
  - en

Ryeta-0

Built upon the powerful LLaMa-3 architecture and fine-tuned on an extensive dataset of health information, this model leverages its vast medical knowledge to offer clear, comprehensive answers.

This model is generally better for accurate and informative responses, particularly for users seeking in-depth medical advice.

Benchmarks

Subject	Model Accuracy (%)
Clinical Knowledge	71.70
Medical Genetics	78.00
Human Aging	70.40
Human Sexuality	73.28
College Medicine	62.43
Anatomy	64.44
College Biology	72.22
High School Biology	77.10
Professional Medicine	63.97
Nutrition	73.86
Professional Psychology	68.95
Virology	54.22
High School Psychology	83.67
Average	70.33

Usage:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

class MedicalAssistant:
    def __init__(self, model_name="SpectreLynx/Ryeta-0", device="cuda"):
        self.device = device
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModelForCausalLM.from_pretrained(model_name).to(self.device)
        self.sys_message = ''' 
        You are an AI Medical Assistant trained on a vast dataset of health information. Please be thorough and
        provide an informative answer. If you don't know the answer to a specific medical inquiry, advise seeking professional help.
        '''

    def format_prompt(self, question):
        messages = [
            {"role": "system", "content": self.sys_message},
            {"role": "user", "content": question}
        ]
        prompt = self.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
        return prompt

    def generate_response(self, question, max_new_tokens=512):
        prompt = self.format_prompt(question)
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.device)
        with torch.no_grad():
            outputs = self.model.generate(**inputs, max_new_tokens=max_new_tokens, use_cache=True)
        answer = self.tokenizer.batch_decode(outputs, skip_special_tokens=True)[0].strip()
        return answer

if __name__ == "__main__":
    assistant = MedicalAssistant()
    question = '''
    Symptoms:
    Dizziness, headache, and nausea.

    What is the differential diagnosis?
    '''
    response = assistant.generate_response(question)
    print(response)