|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- HuggingFaceFW/fineweb |
|
- PleIAs/YouTube-Commons |
|
- allenai/WildChat-1M |
|
language: |
|
- de |
|
- en |
|
- ja |
|
- fr |
|
library_name: mlx |
|
tags: |
|
- moe |
|
- multimodal |
|
- j.o.s.i.e. |
|
--- |
|
|
|
# This will be the repo for J.O.S.I.E.v4o |
|
|
|
Like **OpenAI's GPT-4o**, it is natively multimodal, based on the **NExT-GPT** architecture combined with **RoPE**, **RMS normalisation**, and **MoE**, paired with OpenAI's **GPT-4o tokenizer**.
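As a rough illustration of one of the components above, here is a minimal RMS normalisation sketch in plain NumPy (the actual model uses MLX; function name and epsilon value are illustrative assumptions, not taken from this repo):

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: rescale by the reciprocal root-mean-square of the last axis.
    # Unlike LayerNorm, no mean is subtracted and no bias is added.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * weight

x = np.array([[1.0, 2.0, 3.0]])
w = np.ones(3)          # learnable scale, initialised to 1
y = rms_norm(x, w)      # after normalisation, mean(y**2) is ~1
```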
|
This is a *future project* and will take its time.
|
|
|
Furthermore, I will probably build a **UI application** for the model as well.
|
|
|
Further updates coming soon!
|
|
|
|
|
First architecture overview:
|
|
|
The first beta will use the already pretrained ImageBind model. A linear input projection is needed because the ImageBind outputs do not match the model's hidden dimension.

Later on, this input projection will be removed.
|
|
|
<img src="Architecture_overview_beta3.png" width="100%" height="auto"/> |
|
|
|
|
|
Source code and more information will be available in my <a href="https://github.com/Goekdeniz-Guelmez/J.O.S.I.E.v4-o.git">GitHub repo</a>.