---
license: other
license_name: jamba-open-model-license
license_link: https://www.ai21.com/licenses/jamba-open-model-license
language:
- en
- fr
- de
- nl
- es
- pt
- it
- ar
- he
pipeline_tag: text-generation
tags:
- mamba
- jamba
- moe
library_name: transformers
---

# Spellbound Jamba Mini: Creative output over long contexts

# Main Goals

### The main goals of the base model choice and post-training regime are:

- Strong steerability
- Coherence over long context lengths
- Flexible writing styles
- Advanced formatting that allows identifying individual speakers

### There was also a secondary training objective: to teach the model to understand and produce directives in XML tags.

- `<${characterName}Description>`: A definition of a character, written as a markdown list of details. For example:
  - Name: Character Name
  - Personality: Character Personality
  - Speaker ID: 32AN4R (see `` tag below)
  - ...
- ``: A block of markdown-formatted instructions representing what should happen in the story.
- ``: A block containing the events preceding the story being written.

### Output can optionally include the following tags:

- ``: When a character is defined with a speaker ID, the model will output the speech surrounded by `` and ``. The model learns to keep speech in character this way, and the tags make it possible to identify different speakers for rendering and text-to-speech purposes.
- ``: Represents an action taken by a character.
- ``: Represents a sound made in the story.

**Instructing the model to produce these tags is optional**, but the model should produce the best possible output when the frontend in use can parse or ignore them.

# Post-training Details

## Post-training consists of 1 epoch of SFT LoRA training

- Trained on synthetic instructions for strong steerability
- Outputs rated by [tryspellbound.com](https://tryspellbound.com) beta users who opted in
- LoRA Rank: 8
- Batch Size: 2
- Learning Rate: 1e-5

# Model Creator

Made by [tryspellbound.com](https://tryspellbound.com).
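
# Example: Building a Character Description Block

The `<${characterName}Description>` directive described under Main Goals can be assembled with a few lines of Python. This is a minimal sketch, not the official prompt builder: the helper name `character_description_block`, the rule for deriving the tag name from the character name (stripping non-alphanumeric characters), and the closing tag are all assumptions made for illustration. The character "Mira Vale" is hypothetical; the speaker ID reuses the example value from the list above.

```python
import re


def character_description_block(name: str, details: dict[str, str]) -> str:
    """Wrap a markdown list of character details in a <{Name}Description> tag.

    Hypothetical helper: the tag-name rule and the closing tag are assumptions,
    since the model card only shows the opening `<${characterName}Description>`.
    """
    # Assumed rule: drop anything that is not a letter or digit from the name.
    tag = re.sub(r"[^A-Za-z0-9]", "", name) + "Description"
    # The description body is a markdown list, one detail per bullet.
    lines = [f"- Name: {name}"]
    lines += [f"- {key}: {value}" for key, value in details.items()]
    body = "\n".join(lines)
    return f"<{tag}>\n{body}\n</{tag}>"


block = character_description_block(
    "Mira Vale",  # hypothetical character
    {"Personality": "Wry, cautious, loyal", "Speaker ID": "32AN4R"},
)
print(block)
```

The resulting string can be placed in the prompt ahead of the story instructions. Including a `Speaker ID` entry is what lets the model attribute speech to this character in its output.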