Featured in the release are:
* The MambaOut model, a cheeky arch inspired by SSM but without the SSM part; essentially a ConvNeXt with gating.
* Several timm-trained MambaOut variations with arch tweaks and ImageNet-12k pretrain to verify scaling and supplement the ported weights.
* The smallest MobileNetV4 yet, a 0.5x width scaled Conv-Small.
* Two impressive MobileNetV3-Large models, outperforming all previous, trained with the MNV4 Small recipe.
* 'Zepto', a new compact ConvNeXt variant even smaller than the previous Atto: 2.2M params, RMSNorm, and solid results for its size.
* Newly ported SigLIP SO400M/16 ViT multi-lingual weights, the largest i18n weights so far; the previous largest was B/16.
* Two ImageNet-1k fine-tuned SigLIP SO400M models at 378x378 resolution.
* An InternViT 300M weight port, a really solid ViT encoder distilled from the OpenGVLab 6B VL model's encoder.
* An assortment of very small, sub-1M param pretrained test models to improve library unit tests and serve low-resource applications.
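All of these load through timm's usual factory functions. A minimal sketch (the model name string below is an assumption; verify what's actually registered with `timm.list_models()`):

```python
import timm
import torch

# List the MambaOut variants available in your installed timm version.
print(timm.list_models('mambaout*', pretrained=True))

# Load one of the release models; the exact name/tag is an assumption,
# check the list_models() output above for what your version exposes.
model = timm.create_model('mambaout_base', pretrained=True).eval()

x = torch.randn(1, 3, 224, 224)  # dummy batch at a typical 224x224 input size
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # torch.Size([1, 1000]) for an ImageNet-1k classification head
```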
A 'small' MobileNet-V4 update: I just pushed weights for the smallest model I've trained in the series, a 0.5 width-multiplier version of the MobileNet-V4 Conv Small.
Now you may look at this and say, hey, why is this impressive? 64.8% top-1 at 2.2M params? MobileNetV3-Small 0.75 and MobileNet-V2 0.5 both have fewer params (at ~2M) and over 65% top-1, so what gives? Well, this is where MobileNet-V4 differs from previous versions of the model family: it trades off (gives up) a little parameter efficiency for some computational efficiency.
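If you want to check those parameter counts yourself, here's a quick sketch; the variant names are my guesses at the timm identifiers, so confirm them with `timm.list_models('mobilenet*')`:

```python
import timm

# Compare parameter counts across the small MobileNet variants discussed above.
# Variant names are assumptions; verify with timm.list_models('mobilenet*').
for name in ('mobilenetv4_conv_small_050', 'mobilenetv3_small_075', 'mobilenetv2_050'):
    m = timm.create_model(name, pretrained=False)
    n_params = sum(p.numel() for p in m.parameters())
    print(f'{name}: {n_params / 1e6:.2f}M params')
```

Note that parameter count alone doesn't show the other side of the tradeoff; the computational efficiency gains show up in MACs and latency, which you'd measure with an external profiler or on-device benchmarking.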