Model Card for JAT

JAT (Jack of All Trades) is a multi-modal, multi-task transformer agent trained on Atari, BabyAI, MetaWorld, and MuJoCo tasks.

Model Details

Model Description

  • Developed by: The JAT Team
  • License: Apache 2.0

Model Sources

Training

The model was trained on the following tasks:
  • Alien
  • Amidar
  • Assault
  • Asterix
  • Asteroids
  • Atlantis
  • Bank Heist
  • Battle Zone
  • Beam Rider
  • Berzerk
  • Bowling
  • Boxing
  • Breakout
  • Centipede
  • Chopper Command
  • Crazy Climber
  • Defender
  • Demon Attack
  • Double Dunk
  • Enduro
  • Fishing Derby
  • Freeway
  • Frostbite
  • Gopher
  • Gravitar
  • H.E.R.O.
  • Ice Hockey
  • James Bond
  • Kangaroo
  • Krull
  • Kung-Fu Master
  • Montezuma's Revenge
  • Ms. Pacman
  • Name This Game
  • Phoenix
  • Pitfall
  • Pong
  • Private Eye
  • Q*Bert
  • River Raid
  • Road Runner
  • Robotank
  • Seaquest
  • Skiing
  • Solaris
  • Space Invaders
  • Star Gunner
  • Surround
  • Tennis
  • Time Pilot
  • Tutankham
  • Up and Down
  • Venture
  • Video Pinball
  • Wizard of Wor
  • Yars Revenge
  • Zaxxon
  • Action Obj Door
  • Blocked Unlock Pickup
  • Boss Level No Unlock
  • Boss Level
  • Find Obj S5
  • Go To Door
  • Go To Imp Unlock
  • Go To Local
  • Go To Obj Door
  • Go To Obj
  • Go To Red Ball Grey
  • Go To Red Ball No Dists
  • Go To Red Ball
  • Go To Red Blue Ball
  • Go To Seq
  • Go To
  • Key Corridor
  • Mini Boss Level
  • Move Two Across S8N9
  • One Room S8
  • Open Door
  • Open Doors Order N4
  • Open Red Door
  • Open Two Doors
  • Open
  • Pickup Above
  • Pickup Dist
  • Pickup Loc
  • Pickup
  • Put Next Local
  • Put Next S7N4
  • Synth Loc
  • Synth Seq
  • Synth
  • Unblock Pickup
  • Unlock Local
  • Unlock Pickup
  • Unlock To Unlock
  • Unlock
  • Assembly
  • Basketball
  • Bin Picking
  • Box Close
  • Button Press Topdown Wall
  • Button Press Topdown
  • Button Press Wall
  • Button Press
  • Coffee Button
  • Coffee Pull
  • Coffee Push
  • Dial Turn
  • Disassemble
  • Door Close
  • Door Lock
  • Door Open
  • Door Unlock
  • Drawer Close
  • Drawer Open
  • Faucet Close
  • Faucet Open
  • Hammer
  • Hand Insert
  • Handle Press Side
  • Handle Press
  • Handle Pull Side
  • Handle Pull
  • Lever Pull
  • Peg Insert Side
  • Peg Unplug Side
  • Pick Out Of Hole
  • Pick Place Wall
  • Pick Place
  • Plate Slide Back Side
  • Plate Slide Back
  • Plate Slide Side
  • Plate Slide
  • Push Back
  • Push Wall
  • Push
  • Reach Wall
  • Reach
  • Shelf Place
  • Soccer
  • Stick Pull
  • Stick Push
  • Sweep Into
  • Sweep
  • Window Close
  • Window Open
  • Ant
  • Inverted Double Pendulum
  • Half Cheetah
  • Hopper
  • Humanoid
  • Inverted Pendulum
  • Pusher
  • Reacher
  • Humanoid Standup
  • Swimmer
  • Walker 2d

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoModelForCausalLM

# Download the pretrained JAT checkpoint from the Hugging Face Hub and load it.
model = AutoModelForCausalLM.from_pretrained("jat-project/jat")

Citation

@article{gallouedec2024jack,
    title = {{Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent}},
    author = {Gallouédec, Quentin and Beeching, Edward and Romac, Clément and Dellandréa, Emmanuel},
    journal = {arXiv preprint arXiv:2402.09844},
    year = {2024},
    url = {https://arxiv.org/abs/2402.09844}
}

Model size: 193M params (Safetensors, tensor type F32)
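
As a rough guide to the download and memory footprint, the parameter count and tensor type reported for this model can be turned into a size estimate. This is a back-of-the-envelope sketch only: it assumes the rounded 193M figure and 4 bytes per F32 value, and it ignores file metadata, activations, and any optimizer state.

```python
# Rough size estimate from the figures reported on the card.
NUM_PARAMS = 193_000_000   # "193M params" (rounded, as reported)
BYTES_PER_PARAM = 4        # an F32 value occupies 4 bytes

weight_bytes = NUM_PARAMS * BYTES_PER_PARAM
weight_megabytes = weight_bytes / 1_000_000  # decimal megabytes

print(f"Approximate weight size: {weight_megabytes:.0f} MB")
# prints "Approximate weight size: 772 MB"
```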

Dataset used to train jat-project/jat

Evaluation results

  • IQM expert normalized total reward on Atari 57 (self-reported): 0.14 [0.14, 0.15]
  • IQM human normalized total reward on Atari 57 (self-reported): 0.38 [0.37, 0.39]
  • IQM expert normalized total reward on BabyAI (self-reported): 0.99 [0.99, 0.99]
  • IQM expert normalized total reward on MetaWorld (self-reported): 0.65 [0.64, 0.67]
  • IQM expert normalized total reward on MuJoCo (self-reported): 0.85 [0.83, 0.86]
  • Total reward on Alien (self-reported): 1518.70 +/- 568.14
  • Expert normalized total reward on Alien (self-reported): 0.08 +/- 0.03
  • Human normalized total reward on Alien (self-reported): 0.19 +/- 0.08
  • Total reward on Amidar (self-reported): 89.17 +/- 78.73
  • Expert normalized total reward on Amidar (self-reported): 0.04 +/- 0.04
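
The card does not spell out how the normalized and IQM figures are computed. The sketch below assumes the common convention: a normalized score rescales the raw reward so that a random policy maps to 0 and the reference (expert or human) maps to 1, and the IQM (interquartile mean) averages the middle 50% of per-task scores. The function names and all numbers in the example are hypothetical and are not taken from the JAT evaluation.

```python
from statistics import mean


def normalized_score(score, random_score, reference_score):
    """Rescale a raw reward so that random play maps to 0 and the reference to 1."""
    return (score - random_score) / (reference_score - random_score)


def interquartile_mean(values):
    """Mean of the middle 50% of values (bottom and top quarters are discarded)."""
    ordered = sorted(values)
    cut = int(len(ordered) * 0.25)  # number of values trimmed from each end
    return mean(ordered[cut:len(ordered) - cut])


# Hypothetical raw rewards and baselines for four tasks (illustrative numbers only).
raw = [120.0, 40.0, 900.0, 15.0]
random_baseline = [20.0, 0.0, 100.0, 5.0]
expert_baseline = [220.0, 80.0, 1100.0, 25.0]

normalized = [
    normalized_score(s, r, e)
    for s, r, e in zip(raw, random_baseline, expert_baseline)
]
aggregate = interquartile_mean(normalized)
```

Under this convention, a value of 1.0 means the agent matches the reference on that task. The IQM is less sensitive to a few very high or very low tasks than a plain mean.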