---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb
- PleIAs/YouTube-Commons
- allenai/WildChat-1M
- Salesforce/xlam-function-calling-60k
- ShareGPT4Video/ShareGPT4Video
- OpenGVLab/ShareGPT-4o
- TempoFunk/webvid-10M
- MBZUAI/VideoInstruct-100K
- Isaak-Carter/j.o.s.i.e.v4.0.1o
- NousResearch/dolma-v1_7-c4
- NousResearch/dolma-v1_7-cc_en_head
- nyu-visionx/Cambrian-10M
- LargeWorldModel/ultrachat_qa_mix_1M
- LargeWorldModel/ultrachat_qa_mix_512K
- LargeWorldModel/ultrachat_qa_mix_256K
- LargeWorldModel/ultrachat_qa_mix_128K
- nkp37/OpenVid-1M
language:
- de
- en
library_name: mlx
tags:
- moe
- multimodal
- vision
- audio
- endtoend
- j.o.s.i.e.
---
# STILL IN BETA!!!
# The newest text-to-text version is Beta 2.3.1
- `Isaak-Carter/j.o.s.i.e.v4o-7b-stage1-beta3.2`
- The Q4_K_M quantized version is available through Ollama: `ollama pull goekdenizguelmez/j.o.s.i.e.v4o-7b-stage1-beta3.2`
# This will be the repo for J.O.S.I.E.v4o
Like **OpenAI's GPT-4o**, it is natively multimodal. It is based on **NExT-GPT**, combined with **RoPE**, **RMS normalization**, and **MoE**, and paired with the **GPT-4o tokenizer** from OpenAI.
This is a *future project* and will take its time.
Furthermore, I will probably build a **UI application** around the model too.
Further updates coming soon!!!
Source code and more info will be available on my <a href="https://github.com/Goekdeniz-Guelmez/J.O.S.I.E.-v4o.git">GitHub Repo</a>
# Update 1:
The model will go through multiple training stages:
- *Stage 1:* Instruction fine-tuning of the LLM on a custom dataset and prompt format.
- *Stage 2:* Encoder-side alignment using contrastive learning.
- *Stage 3:* Instruction fine-tuning of the full model.
- Changes and more stages will be coming.
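
To make the Stage 2 idea more concrete, here is a minimal sketch of a contrastive objective. The card does not say which loss is used, so this assumes a one-directional InfoNCE loss, a common choice for aligning encoder outputs with text embeddings. The names `l2_normalize` and `info_nce` and the `temperature` default are illustrative assumptions, not part of the J.O.S.I.E. code base.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0  # guard against zero vectors
    return [x / norm for x in vector]


def info_nce(anchors, positives, temperature=0.07):
    """Mean InfoNCE loss (assumed objective, not confirmed by the model card).

    anchors[i] (e.g. an image or audio embedding) should match positives[i]
    (e.g. the paired text embedding); every other positive in the batch
    acts as a negative.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of this anchor to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy batch: two encoder embeddings and their paired text embeddings.
encoder_out = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]
swapped_text = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(encoder_out, aligned_text)
loss_swapped = info_nce(encoder_out, swapped_text)
```

The loss is low when each encoder embedding is closest to its own pair and high when the pairs are mismatched, which is the signal that pulls the encoder outputs toward the LLM's embedding space.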
# Update 2:
Encoders are created and function as expected.
# Update 3:
The first encoder-side alignment training steps worked successfully.
# Update 4:
Created the full model and successfully ran inference with vision and audio.
# Update 5: Prompt Template
The prompt template used in this project is inspired by the ChatML template but includes several customized adjustments to fit the specific requirements of our application:
```
<|begin_of_text|>system
You are J.O.S.I.E. which is an acronym for "Just an Outstandingly Smart Intelligent Entity", a private and super-intelligent AI assistant, created by Gökdeniz Gülmez.<|end_of_text|>
<|begin_of_text|>main user "Gökdeniz Gülmez"
{{ .Prompt }}<|end_of_text|>
<|begin_of_text|>josie
{{ .Response }}<|end_of_text|>
```
1. **User Types and Access Levels:**
- **Main User:**
- The main user is identified as "Gökdeniz Gülmez." This designation can be personalized and updated with your name as needed. The system will recognize and prioritize the main user's commands and queries, ensuring they have full control and access to all functionalities and information within the smart home system.
- **Authorized User:**
- Authorized users are those who have been granted permission by the main user. They are designated as `authorized user "{name}"`. While these users can interact with the system, their access will be restricted to guest-level privileges, preventing them from controlling or accessing sensitive smart home information. This ensures a balance between usability and security.
- **Unauthorized User:**
- Unauthorized users are designated as `unauthorized user "name if possible else unknown"`. These users will have no access to J.O.S.I.E.'s capabilities. Any attempt to interact with the system by unauthorized users will result in immediate redirection to the main user for verification. Additional security measures may be enacted, such as alerts or system lockdowns, to protect the integrity of the smart home environment.
2. **Template Structure:**
- The template structure is designed to facilitate clear and structured interactions between users and the system. It consists of the following components:
- **System Message:**
`system`
This section includes any predefined system messages or configurations necessary for the interaction.
- **Main User Identification:**
`main user "Gökdeniz Gülmez"`
This line identifies the main user of the system, which can be dynamically updated.
- **User Prompt:**
`{{ .Prompt }}`
This placeholder is where the user’s input or command is placed.
- **J.O.S.I.E. Response:**
`josie`
`{{ .Response }}`
This section is dedicated to J.O.S.I.E.'s responses, providing structured and relevant information or actions based on the user’s prompt.
By incorporating these elements, the template ensures a secure, personalized, and efficient interaction model for managing and operating the smart home system through J.O.S.I.E.
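
The template and user types above can be sketched as a small prompt builder. This is an illustration under stated assumptions, not the project's actual code: `UserType`, `role_header`, and `build_prompt` are hypothetical names, and the only parts taken from the card are the special tokens, the role strings, and the fallback to `unknown` for unidentified unauthorized users.

```python
from enum import Enum

BOS = "<|begin_of_text|>"
EOS = "<|end_of_text|>"


class UserType(Enum):
    """The three user designations described in the card."""
    MAIN = "main user"
    AUTHORIZED = "authorized user"
    UNAUTHORIZED = "unauthorized user"


def role_header(user_type, name=None):
    """Return the role line, e.g. 'authorized user "Ada"'."""
    if name is None:
        # Only unauthorized users may be unnamed; the card falls back to "unknown".
        if user_type is not UserType.UNAUTHORIZED:
            raise ValueError("main and authorized users need a name")
        name = "unknown"
    return f'{user_type.value} "{name}"'


def build_prompt(system, user_type, prompt, name=None):
    """Assemble the text up to the point where the model starts its reply."""
    return (
        f"{BOS}system\n{system}{EOS}\n"
        f"{BOS}{role_header(user_type, name)}\n{prompt}{EOS}\n"
        f"{BOS}josie\n"
    )


# Hypothetical usage with a placeholder system message and user name.
example = build_prompt(
    system="You are J.O.S.I.E., a private AI assistant.",
    user_type=UserType.AUTHORIZED,
    prompt="What is the weather like today?",
    name="Ada",
)
```

The builder stops after the `josie` role line so that the model generates the response and the closing `<|end_of_text|>` token itself. Access restrictions for authorized and unauthorized users are not enforced here; the sketch only produces the role line the model sees.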
# Update 6:
Starting to train the first models (Llama3 8B, Qwen2 0.5B).
# Update 7:
The model will use Meta's ImageBind at first and switch to its own custom encoders later on.
# Update 8:
For voice mode, the `CAMB-AI/MARS5-TTS` model will be used.