Theia

The AI Institute

Theia is a vision foundation model for robot learning that distills multiple off-the-shelf vision foundation models trained on varied vision tasks. Theia’s rich visual representations encode diverse visual knowledge, enhancing downstream robot learning. It was introduced in the paper Theia: Distilling Diverse Vision Foundation Models for Robot Learning, which also includes experiments demonstrating that Theia outperforms its teacher models and prior robot learning models using less training data and smaller model sizes. Demo videos can be found on the project page.

Model Details

The theia-base-patch16-224-cdiv model, uses DeiT-Base as a backbone, and simulatenously distills CLIP, DINOv2, and ViT. For more information on usage, please visit the Theia repository.

Citation

If you use Theia in your research, please use the following BibTeX entry:

@article{shang2024theia,
  author    = {Shang, Jinghuan and Schmeckpeper, Karl and May, Brandon B. and Minniti, Maria Vittoria and Kelestemur, Tarik and Watkins, David and Herlant, Laura},
  title     = {Theia: Distilling Diverse Vision Foundation Models for Robot Learning},
  journal   = {arXiv},
  year      = {2024},
}

Usage

The pre-trained model weights and code released with Theia are available for use under The AI Institute License, reproduced in full below:

Copyright (c) 2024 Boston Dynamics AI Institute LLC

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the copyright notice included
with the software, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the copyright notice, this
list of conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.
3. Modified versions of the software must be conspicuously marked as such.
4. The software may only be used for non-commercial research purposes.
For profit enterprises may use the software, subject to this limitation.

THIS SOFTWARE IS PROVIDED BY THE AI INSTITUTE AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, NON-
INFRINGEMENT,TITLE, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE AI INSTITUTE OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, PUNITIVE OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, DAMAGES ARISING OUT OF CLAIMS OF
INTELLECTUAL PROPERTY RIGHTS INFRINGEMENT; PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Downloads last month
3,453
Safetensors
Model size
140M params
Tensor type
F32
·
Inference Examples
Inference API (serverless) does not yet support model repos that contain custom code.

Collection including theaiinstitute/theia-base-patch16-224-cdiv