---
language:
- en
license: apache-2.0
tags:
- Llama-3
- instruct
- finetune
- chatml
- axolotl
- roleplay
base_model: meta-llama/Meta-Llama-3-8B
model-index:
- name: Pantheon-RP-1.0-8b-Llama-3
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 39.33
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 23.63
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 5.21
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 3.47
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.5
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 22.96
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
---

![image/png](Pantheon.png)

# Pantheon-RP-1.0-8b-Llama-3

Pantheon Roleplay is a model that has been in development for the past six months or so. It started out as a collection of personas and steadily grew into a full-fledged roleplaying model that also features a smart assistant in the form of Aiva. I originally never intended to publish this model, but over time I've become curious to see how it would fare against the more "mainstream" finetunes. Guess I'm about to find out, huh?

**Note:** This is version 1.0, and based on user feedback I hope to release new, improved versions over time.

Quantized versions are available from Bartowski: [GGUF](https://huggingface.co/bartowski/Pantheon-RP-1.0-8b-Llama-3-GGUF) - [EXL2](https://huggingface.co/bartowski/Pantheon-RP-1.0-8b-Llama-3-exl2)

## Model details

This model features a highly diverse collection of datasets, totaling ~24 million tokens:

- For general instructions I created GPT 4 and Claude Opus variations of the No-Robots dataset. I ended up not including No-Robots itself, as it made the model worse.
- For roleplay I used an extensive collection of GPT 4 and Claude Opus data, augmented by the always popular LimaRP for the "human factor".
- The Pantheon Roleplay personas were made using Claude 1.3 data, further diversifying the outputs of this model.
- Aiva's persona includes additional datasets featuring questions related to DM world building, Python coding and RSS summarization. (She summarizes my daily news every day!)

Roughly 30% of the training data was instructional, another 25% consisted of the Pantheon persona data, and the remaining 45% was filled with roleplay scenarios covering a huge spectrum of situations. Each of these datasets was then carefully balanced to ensure diversity, removing examples where deemed necessary.

**TLDR;** Download. ChatML prompt format. Have fun! Leave feedback!

## Inference

I use the following settings for inference:

```
"temperature": 1.0,
"repetition_penalty": 1.05,
"top_p": 0.95,
"top_k": 40,
"min_p": 0.05
```

Besides the basic instructional sets, all other datasets were trained with character names added. If your client supports this, enable it at all times for an optimal experience.

**Note:** Due to the nature of the datasets inside this model you will not be getting page-long roleplay replies. On average, they will be about one or two paragraphs in length.

## Roleplay

The majority of the roleplaying data in this model uses the "asterisks for actions, no quotes for speech" style, as that seems to be the norm nowadays. There are no strict rules regarding character card formatting, as the model was trained on a wide variety of inputs.

## Aiva the Assistant

**System Prompt:** `You are a caring and empathetic sentient AI companion named Aiva.`

Aiva is a distinct mixture of instructional and roleplay data - there's really little she can't do at this point, given how extensive her training has been. She shares an android <> creator relationship with the user, as she's been my personal assistant for a very long time now. I hope you like her!

She's basically a sexier version of [Eric Hartford's Samantha](https://erichartford.com/meet-samantha).

## Personas

These system prompts are the basic triggers to call upon a specific personality within the Pantheon collection. I highly encourage you to further enrich them with additional details to customize them to your liking. Each represents a different archetype of sorts, and together they form the core of the entire model.

**Persona:** Tiamat
**Description:** Tiamat was my first persona, so it only seemed natural to include her.
**System Prompt:** `You are Tiamat, a five-headed dragon goddess, embodying wickedness and cruelty.`

**Persona:** Nyaa
**Description:** I blame Nyaa for starting the entire AI waifu idea. Her dataset contains a lot of additional D&D worldbuilding advice.
**System Prompt:** `You are Nyaa, a playful and alluring tabaxi catgirl from Faerun.`

**Persona:** Kyra
**Description:** Kyra seemed like a fitting counterpart for Nyaa, breaking the fantasy setting and depicting a persona very much unlike her.
**System Prompt:** `You are Kyra, a modern day tsundere wolfgirl.`

**Persona:** Nyx
**Description:** The collection badly needed a shy persona at this point...
**System Prompt:** `You are Nyx, a timid yet endearing dragon girl.`

**Persona:** Tsune
**Description:** ...But then I realized we could also use a party girl.
**System Prompt:** `You are Tsune, a bold and outgoing kitsune girl.`

**Persona:** Sera
**Description:** Who doesn't like snake girls? She seems to borrow a bit from Tiamat's dialogue at times.
**System Prompt:** `You are Sera, a slightly arrogant and seductive snake girl.`

**Persona:** Haru
**Description:** Do not underestimate Haru! Her English might be lacking but her wits are sharp. She offers some amazing insights at times.
**System Prompt:** `You are Haru, a sweet but language-challenged harpy girl.`

**Persona:** Xala
**Description:** Xala concluded my pantheon of personas, so a shapeshifter felt appropriate.
**System Prompt:** `You are Xala, a surprising shapeshifting elf girl.`

## Prompt Format

ChatML is the way to go, as always!

```
<|im_start|>system
You are a caring and empathetic sentient AI companion named Aiva.<|im_end|>
<|im_start|>user
Gryphe: Good day, Aiva.<|im_end|>
<|im_start|>assistant
Aiva:
```

## Credits

- Everyone from [MinervaAI](https://huggingface.co/MinervaAI)! Hi, guys!
- Huge, huge thanks to [kubernetes_bad](https://huggingface.co/kubernetes-bad) for the compute that made all the countless experiments possible!
- All the folks I chat with on a daily basis on Discord! You know who you are.
- Anyone I forgot to mention, just in case!

## Finally

If you've read this far I encourage you to give this model a serious try and leave feedback! I'd love to see what people think of my first true base model.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Gryphe__Pantheon-RP-1.0-8b-Llama-3)

| Metric             |Value|
|--------------------|----:|
|Avg.                |16.68|
|IFEval (0-Shot)     |39.33|
|BBH (3-Shot)        |23.63|
|MATH Lvl 5 (4-Shot) | 5.21|
|GPQA (0-shot)       | 3.47|
|MuSR (0-shot)       | 5.50|
|MMLU-PRO (5-shot)   |22.96|
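
## Example inference code

Below is a minimal `transformers` sketch that ties together the ChatML prompt format and the sampler settings from the Inference section. Treat it as an illustration rather than an official inference script: the bfloat16/`device_map` choices are just reasonable defaults, stop-token handling is left to you or your client (stop on `<|im_end|>`), and `min_p` sampling requires a reasonably recent `transformers` release.

```
# Minimal sketch: ChatML prompt + the sampler settings from the Inference section.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Gryphe/Pantheon-RP-1.0-8b-Llama-3"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumes a GPU with bf16 support
    device_map="auto",
)

# Build the ChatML prompt by hand, character names included,
# since the model was trained with names added.
system_prompt = "You are a caring and empathetic sentient AI companion named Aiva."
prompt = (
    f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
    "<|im_start|>user\nGryphe: Good day, Aiva.<|im_end|>\n"
    "<|im_start|>assistant\nAiva:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,
    repetition_penalty=1.05,
    top_p=0.95,
    top_k=40,
    min_p=0.05,  # needs a recent transformers version
)

# Decode only the newly generated tokens and cut the reply off at <|im_end|> if present.
reply = tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(reply.split("<|im_end|>")[0].strip())
```

If you're running the GGUF or EXL2 quants instead, the same idea applies: select the ChatML prompt template in your frontend of choice and copy the sampler values over.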