---
license: other
license_name: jamba-open-model-license
license_link: https://www.ai21.com/licenses/jamba-open-model-license
language:
- en
- fr
- de
- nl
- es
- pt
- it
- ar
- he
pipeline_tag: text-generation
tags:
- mamba
- jamba
- moe
library_name: transformers
---

# Spellbound Jamba Mini: Creative output over long contexts

  <img src="https://i.imgur.com/IG4IKfV.png" width=500 height=500>

# Main Goals

### The main goals of the base model choice and post-training regime are

- Strong steerability
- Coherence over long context lengths
- Flexible writing styles
- Advanced formatting that allows identifying individual speakers


### There was also a secondary training objective: to teach the model to understand and produce directives in XML tags.

- `<${characterName}Description>`: A character definition written as a markdown list of details. For example:
    - Name: Character Name
    - Personality: Character Personality
    - Speaker ID: 32AN4R (see `<quote>` tag below)
    - ...
- `<writingInstructions>`: A block of markdown formatted instructions representing what should happen in the story.
- `<pastStory>`: A block containing the events that precede the story being written.
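
The input tags above can be assembled with plain string formatting. The sketch below is a minimal illustration, not the official prompt template: the helper names (`character_block`, `build_prompt`), the order of the blocks, the removal of spaces from the character name in the tag, and the sample character are all assumptions.

```python
import re


def character_block(name: str, details: dict[str, str]) -> str:
    """Render one <${characterName}Description> block as a markdown list."""
    # Assumption: the tag name is the character name with non-word characters removed.
    tag = re.sub(r"\W+", "", name) + "Description"
    lines = [f"- Name: {name}"]
    lines += [f"- {key}: {value}" for key, value in details.items()]
    body = "\n".join(lines)
    return f"<{tag}>\n{body}\n</{tag}>"


def build_prompt(
    characters: dict[str, dict[str, str]],
    writing_instructions: str,
    past_story: str = "",
) -> str:
    """Join character, past-story, and instruction blocks into one prompt."""
    # Assumption: characters first, then the past story, then the instructions.
    parts = [character_block(name, details) for name, details in characters.items()]
    if past_story:
        parts.append(f"<pastStory>\n{past_story}\n</pastStory>")
    parts.append(
        f"<writingInstructions>\n{writing_instructions}\n</writingInstructions>"
    )
    return "\n\n".join(parts)


# Hypothetical example character and story.
prompt = build_prompt(
    {"Mira Vale": {"Personality": "Wry, cautious", "Speaker ID": "32AN4R"}},
    "- Mira finds the hidden door.",
    past_story="Mira entered the library at dusk.",
)
print(prompt)
```

The resulting string can be passed as the user message to whatever chat template or inference stack is in use.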

### Output can optionally include the following tags:

- `<quote speaker="{speakerId}">`: When a character is defined with a speaker ID, the model wraps that character's speech in `<quote speaker="{speakerId}">` and `</quote>`. This helps the model keep speech in character, and it lets a frontend identify individual speakers for rendering and text-to-speech.
- `<action>`: Represents an action taken by a character
- `<sound>`: Represents a sound made in the story

**Instructing the model to produce these tags is optional**, but the model should produce its best output when the frontend in use can parse or ignore them.
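
A frontend could handle the output tags with a small regex-based parser. The sketch below is one possible approach, assuming the tags are well-formed and not nested; `parse_segments`, `strip_tags`, and the sample text are hypothetical and not part of any Spellbound tooling.

```python
import re

# Matches either a <quote speaker="..."> block or an <action>/<sound> block.
TAG_PATTERN = re.compile(
    r'<quote speaker="(?P<speaker>[^"]+)">(?P<quote>.*?)</quote>'
    r"|<(?P<tag>action|sound)>(?P<body>.*?)</(?P=tag)>",
    re.DOTALL,
)


def parse_segments(text: str) -> list[dict[str, str]]:
    """Split model output into narration, quote, action, and sound segments."""
    segments = []
    cursor = 0
    for match in TAG_PATTERN.finditer(text):
        # Untagged text between matches is treated as narration.
        narration = text[cursor:match.start()].strip()
        if narration:
            segments.append({"type": "narration", "text": narration})
        if match.group("speaker") is not None:
            segments.append({
                "type": "quote",
                "speaker": match.group("speaker"),
                "text": match.group("quote").strip(),
            })
        else:
            segments.append({
                "type": match.group("tag"),
                "text": match.group("body").strip(),
            })
        cursor = match.end()
    tail = text[cursor:].strip()
    if tail:
        segments.append({"type": "narration", "text": tail})
    return segments


def strip_tags(text: str) -> str:
    """Remove the tags but keep their contents, for frontends that ignore them."""
    return re.sub(r"</?(?:quote|action|sound)\b[^>]*>", "", text)


# Hypothetical model output.
sample = (
    'Mira pushed the door. <sound>Creak.</sound> '
    '<quote speaker="32AN4R">Who is there?</quote> '
    "<action>She raises her lantern.</action>"
)
segments = parse_segments(sample)
for segment in segments:
    print(segment)
```

Each `quote` segment carries its speaker ID, so it can be routed to a per-character voice or style; `strip_tags` covers the simpler case where the tags are discarded.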

# Post-training Details

## Post-training consists of 1 epoch of SFT LoRA training

- Trained on synthetic instructions for strong steerability
- Outputs rated by [tryspellbound.com](https://tryspellbound.com) beta users who opted in
- LoRA Rank: 8
- Batch Size: 2
- Learning Rate: 1e-5
  
# Model Creator

Made by [tryspellbound.com](https://tryspellbound.com).