---
library_name: transformers
license: apache-2.0
language:
- en
tags:
- MemGPT
- function
- function calling
---

This is a test release of the DPO version of the [MemGPT](https://github.com/cpacker/MemGPT) language model. The model has been quantized to 8 bits using exllama.

# Model Description

This repository contains a MoE (Mixture of Experts) model built from [Mistral 7B Instruct](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), using 2 experts per token. This model is specifically designed for function calling in MemGPT and demonstrates performance comparable to GPT-4 when working with MemGPT.

# Key Features

* Function calling
* Dedicated to working with MemGPT
* Supports medium-length context, with sequences of up to 8,192 tokens

# Prompt Format

This model uses the **ChatML** prompt format:

```
<|im_start|>system
{system_instruction}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
{assistant_response}<|im_end|>
```

# Usage

This model is designed to run on multiple backends, such as [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui). Simply install your preferred backend and load this model. Then configure MemGPT with `memgpt configure` and chat with MemGPT via the `memgpt run` command. A minimal sketch of building a ChatML prompt with the `transformers` library is included at the end of this card.

# Model Details

* Developed by: @starsnatched
* Model type: This repo contains a language model based on the transformer decoder architecture.
* Language: English
* Contact: For any questions, concerns, or comments about this model, please contact me on Discord: @starsnatched.

# Training Infrastructure

* Hardware: The model in this repo was trained on 2x A100 80GB GPUs.

# Intended Use

The model is designed to be used as the base model for MemGPT agents.

# Limitations and Risks

The model may exhibit unreliable, unsafe, or biased behaviour. Please double-check the results this model produces.
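
# Example: Building a ChatML Prompt

The sketch below is a minimal, hedged example of constructing a prompt in the ChatML format documented above and generating a response with `transformers`. The repository id, system instruction, and user message are placeholders, and it assumes weights that can be loaded directly with `AutoModelForCausalLM`; for the exllama 8-bit quantization, a dedicated backend such as text-generation-webui may be required instead.

```python
# Minimal sketch: load the model and run one ChatML-formatted turn.
# MODEL_ID is a hypothetical placeholder -- substitute this repository's name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-username/your-memgpt-model"  # placeholder, not the real repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Assemble a prompt following the ChatML format described in "Prompt Format".
system_instruction = "You are MemGPT, an assistant with function-calling abilities."
user_message = "Remember that my favorite color is blue."

prompt = (
    f"<|im_start|>system\n{system_instruction}<|im_end|>\n"
    f"<|im_start|>user\n{user_message}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens (the assistant's reply).
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(reply)
```

In normal use, MemGPT builds these prompts for you; the snippet is only meant to illustrate how the ChatML template maps onto a raw `generate` call.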