SegGPT: Segmenting Everything In Context

Xinlong Wang^1*, Xiaosong Zhang^1*, Yue Cao^1*, Wen Wang², Chunhua Shen², Tiejun Huang^1,3

Enjoy the Demo and Code

We present SegGPT, a generalist model for segmenting everything in context. With only one single model, SegGPT can perform arbitrary segmentation tasks in images or videos via in-context inference, such as object instance, stuff, part, contour, and text. SegGPT is evaluated on a broad range of tasks, including few-shot semantic segmentation, video object segmentation, semantic segmentation, and panoptic segmentation. Our results show strong capabilities in segmenting in-domain and out-of-domain targets, either qualitatively or quantitatively.

[Paper] [Code] [Demo]

Model

A pre-trained SegGPT model is available at 🤗 HF link.

Citation

@article{SegGPT,
  title={SegGPT: Segmenting Everything In Context},
  author={Wang, Xinlong and Zhang, Xiaosong and Cao, Yue and Wang, Wen and Shen, Chunhua and Huang, Tiejun},
  journal={arXiv preprint arXiv:2304.03284},
  year={2023}
}

Contact

We are hiring at all levels at BAAI Vision Team, including full-time researchers, engineers and interns. If you are interested in working with us on foundation model, visual perception and multimodal learning, please contact Xinlong Wang (wangxinlong@baai.ac.cn) and Yue Cao (caoyue@baai.ac.cn).