---
license: cc-by-4.0
language:
- en
pipeline_tag: any-to-any
tags:
- multimodal
library_name: transformers
---

# Align-Anything Chameleon 7B Base

## Introduction

This repository hosts Align-Anything Chameleon 7B Base, a model that takes interleaved text and images as input and produces interleaved text and images as output. The model is based on [Chameleon](https://huggingface.co/facebook/chameleon-7b) and is trained with the [Align-Anything](https://github.com/PKU-Alignment/Align-Anything) framework to further unlock its image generation capability.

## Usage

To use this model, refer to the [Align-Anything](https://github.com/PKU-Alignment) repository, which covers training, inference, and evaluation:

```bash
git clone https://github.com/PKU-Alignment/align-anything.git
cd align-anything/projects/text_image_to_text_image
```

Then follow the instructions in the README.md file to set up the environment and run the scripts.

The official Transformers library does not currently support image output for Chameleon models (see [this PR](https://github.com/huggingface/transformers/pull/32013) for more details), so we rely on a fork of the repo. After installing Align-Anything and setting up the environment, install the stable branch of the fork by running:

```bash
pip install git+https://github.com/htlou/transformers.git@hantao_stable_cham
```

To generate images, follow the instructions in the [mmsg_chameleon](https://github.com/htlou/mmsg_chameleon) repo to run inference (pure text generation works directly with `Transformers`).

```bash
git clone https://github.com/htlou/mmsg_chameleon.git
cd mmsg_chameleon
```

Then set up the environment using:

```bash
pip install -e .
```

After setting up the environment, set the correct paths in `scripts/interleaved_gen.sh` and then run:

```bash
bash scripts/interleaved_gen.sh
```
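
Because image output depends on the forked Transformers build rather than a PyPI release, it can help to confirm which build the active environment actually picked up. The following is a minimal sketch, not part of Align-Anything or mmsg_chameleon: the helper name `describe_transformers_install` is hypothetical, and it assumes the fork was installed with `pip` from the git URL above, in which case pip records the source in a `direct_url.json` metadata file (PEP 610).

```python
import json
from importlib import metadata


def describe_transformers_install(dist_name: str = "transformers") -> dict:
    """Report the installed version and, if recorded, where it was installed from.

    Hypothetical helper for illustration; it only reads package metadata and
    does not import the library or load any model.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return {"installed": False, "version": None, "source_url": None, "revision": None}

    # pip writes direct_url.json (PEP 610) for installs from a VCS URL or a
    # local path; the file is absent for ordinary installs from PyPI.
    raw = dist.read_text("direct_url.json")
    info = json.loads(raw) if raw else {}
    vcs = info.get("vcs_info", {})
    return {
        "installed": True,
        "version": dist.version,
        "source_url": info.get("url"),
        "revision": vcs.get("requested_revision"),
    }


if __name__ == "__main__":
    # Print the report so the source URL and branch can be compared by eye
    # with the fork and branch named in the install command.
    print(describe_transformers_install())
```

If `source_url` is `None`, the package was most likely installed from PyPI, and the fork install step may need to be repeated in the active environment.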