Isaak Carter Augustus committed on
Commit: feb23b6
1 Parent(s): 6b7a2af

Update README.md

Files changed (1)
  1. README.md +59 -61
README.md CHANGED
@@ -31,84 +31,82 @@ tags:
  - j.o.s.i.e.
  ---

- # STILL IN BETA!!!

- # The newest text-to-text version is Beta 2.3.1
- - `Isaak-Carter/josiev4o-7b-stage1-v0.1`
- - Quants: <a href="https://huggingface.co/Isaak-Carter/J.O.S.I.E.v4o-7b-stage1-v0.1-gguf">Isaak-Carter/J.O.S.I.E.v4o-7b-stage1-v0.1-gguf</a>

- # This will be the repo for J.O.S.I.E.v4o

- Like **OpenAI's GPT-4o**, it is natively multimodal, based on **NExT-GPT** combined with **RoPE**, **RMS Normalisation**, and **MoE**, paired with the **GPT-4o Tokenizer** from OpenAI.
- This is a *future project* and will take its time.

- Furthermore, I will probably build a **UI application** for the model as well.

- Further updates coming soon!

- Source code and more information will be available on my <a href="https://github.com/Goekdeniz-Guelmez/J.O.S.I.E.-v4o.git">GitHub Repo</a>.

- # Update 1:
- The model will go through multiple training stages:

- - *Stage 1:* Instruction fine-tuning the LLM on a custom dataset and prompt format.
- - *Stage 2:* Encoder-side alignment using a contrastive learning technique.
- - *Stage 3:* Instruction fine-tuning the full model.
- - Changes and additional stages are coming.

- # Update 2:
- Encoders are created and function as expected.

- # Update 3:
- The first encoder-side alignment training steps worked successfully.

- # Update 4:
- Created the full model and successfully ran inference with vision and audio.

- # Update 5: Prompt Template
- The prompt template used in this project is inspired by the ChatML template but includes several custom adjustments to fit the specific requirements of this application:

- ```
- <|begin_of_text|>system
- You are J.O.S.I.E. which is an acronym for "Just an Outstandingly Smart Intelligent Entity", a private and super-intelligent AI assistant, created by Gökdeniz Gülmez.<|end_of_text|>
- <|begin_of_text|>main user "Gökdeniz Gülmez"
- {{ .Prompt }}<|end_of_text|>
- <|begin_of_text|>josie
- {{ .Response }}<|end_of_text|>
- ```

- 1. **User Types and Access Levels:**
-    - **Main User:**
-      - The main user is identified as "Gökdeniz Gülmez." This designation can be personalized and updated with your name as needed. The system will recognize and prioritize the main user's commands and queries, ensuring they have full control and access to all functionalities and information within the smart home system.
-    - **Authorized User:**
-      - Authorized users are those who have been granted permission by the main user. They are designated as `authorized user "{name}"`. While these users can interact with the system, their access will be restricted to guest-level privileges, preventing them from controlling or accessing sensitive smart home information. This ensures a balance between usability and security.
-    - **Unauthorized User:**
-      - Unauthorized users are designated as `unauthorized user "name if possible else unknown"`. These users will have no access to J.O.S.I.E.'s capabilities. Any attempt to interact with the system by unauthorized users will result in immediate redirection to the main user for verification. Additional security measures may be enacted, such as alerts or system lockdowns, to protect the integrity of the smart home environment.

- 2. **Template Structure:**
-    - The template structure is designed to facilitate clear and structured interactions between users and the system. It consists of the following components:
-      - **System Message:**
-        `system`
-        This section includes any predefined system messages or configurations necessary for the interaction.
-      - **Main User Identification:**
-        `main user "Gökdeniz Gülmez"`
-        This line identifies the main user of the system, which can be dynamically updated.
-      - **User Prompt:**
-        `{{ .Prompt }}`
-        This placeholder is where the user's input or command is placed.
-      - **J.O.S.I.E. Response:**
-        `josie`
-        `{{ .Response }}`
-        This section is dedicated to J.O.S.I.E.'s responses, providing structured and relevant information or actions based on the user's prompt.

- By incorporating these elements, the template ensures a secure, personalized, and efficient interaction model for managing and operating the smart home system through J.O.S.I.E.
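
As a quick illustration of how this template can be rendered in practice, here is a minimal Python sketch; the `build_prompt` helper, the abridged system message, and the example request are illustrative placeholders, not part of the project's actual code:

```python
# Illustrative only: a tiny helper that renders one turn of the prompt template
# shown above. The system message is abridged; adjust it to the full wording.

SYSTEM_MESSAGE = (
    'You are J.O.S.I.E., a private and super-intelligent AI assistant, '
    'created by Gökdeniz Gülmez.'
)

def build_prompt(user_role: str, user_message: str) -> str:
    """Return a prompt ending with the 'josie' header so the model can
    generate the response that follows it."""
    return (
        f"<|begin_of_text|>system\n{SYSTEM_MESSAGE}<|end_of_text|>\n"
        f'<|begin_of_text|>{user_role}\n{user_message}<|end_of_text|>\n'
        "<|begin_of_text|>josie\n"
    )

# Example call for the main user:
print(build_prompt('main user "Gökdeniz Gülmez"', "Dim the living room lights."))
```

The same helper would take `authorized user "{name}"` or `unauthorized user "unknown"` as the role string for the other access levels described above.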
 
 
- # Update 6:
- Starting to train the first models (Llama3 8B, Qwen2 0.5B).

- # Update 7:
- Will use Meta's ImageBind model at first, then switch to custom encoders later on.

- # Update 8:
- For voice mode, the `CAMB-AI/MARS5-TTS` model will be used.

+ # J.O.S.I.E. (Just a Smart and Intelligent Entity)

+ Welcome to the J.O.S.I.E. project repository! J.O.S.I.E. is a cutting-edge, super-intelligent AI assistant designed to revolutionize the way we interact with smart home systems and to provide general-purpose AI assistance. This document provides an overview of J.O.S.I.E.'s features, capabilities, and development roadmap.

+ ## Table of Contents

+ 1. [Introduction](#introduction)
+ 2. [Features](#features)
+ 3. [Training Stages](#training-stages)
+ 4. [Current Progress](#current-progress)
+ 5. [Source Code](#source-code)
+ 6. [Contributing](#contributing)
+ 7. [License](#license)
 
+ ## Introduction

+ J.O.S.I.E. stands for "Just a Smart and Intelligent Entity." It is not just a conversational AI assistant but a fully multimodal AI designed to understand and process images, videos, thermal images, depth maps, and audio in real time. J.O.S.I.E. is built to autonomously manage smart homes and provide general-purpose assistance, with advanced capabilities accessible only to the main user.

+ ## Features

+ - **Real-Time Processing:** J.O.S.I.E. operates in real time, ensuring quick and efficient responses.
+ - **Tool Calling:** Capable of calling various tools to perform tasks (only for the main user).
+ - **Short/Long-Term Memory:** Remembers past interactions and uses this data to provide a more personalized experience.
+ - **Secure Information Access:** Accesses top-secret information upon receiving a special password from the main user.
+ - **Contextual Greetings:** Greets users based on contextual data such as time of day, birthdays, and more.
+ - **Voice Interaction:** Will support real-time voice responses with a response time under 0.3 seconds.
+ - **Advanced Multimodal Capabilities:** Initially uses Meta's ImageBind model, transitioning to self-implemented encoders.
+ - **Uncensored Interaction:** Full, uncensored interaction capabilities are reserved for the main user.
+ - **Autonomous Smart Home Management:** Manages smart home devices and systems autonomously.
+ ## Training Stages

+ J.O.S.I.E.'s development is structured into several meticulously planned stages, each focusing on a different aspect of its capabilities:

+ ### Stage 1: **Genesis**
+ - **Objective:** Fine-tune the Large Language Model (LLM) with a custom dataset and prompt format. The LLMs used are Qwen2 7B and 0.5B; a rough fine-tuning sketch follows below.
+ - **Outcome:** A robust foundation for text-based interactions.
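
The following is a minimal, hypothetical sketch of what the Stage 1 objective could look like with Hugging Face `transformers`; the toy example text, optimizer settings, and training loop are placeholders rather than the project's actual pipeline:

```python
# Rough sketch of Stage 1 ("Genesis"): causal-LM instruction fine-tuning of a
# Qwen2 checkpoint on text already rendered into the project's prompt format.
# Dataset, hyperparameters, and special-token handling are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-0.5B"  # the smaller of the two models mentioned above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Toy "dataset": one instruction/response pair in the custom format.
examples = [
    '<|begin_of_text|>main user "Gökdeniz Gülmez"\nTurn off the lights.<|end_of_text|>\n'
    "<|begin_of_text|>josie\nDone, the lights are now off.<|end_of_text|>"
]

model.train()
for text in examples:
    batch = tokenizer(text, return_tensors="pt")
    # Standard causal-LM loss: the labels are the input ids themselves.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```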

+ ### Stage 2: **Fusion**
+ - **Objective:** Train the encoders separately, using transfer learning to align their input embeddings with the LLM's text embeddings (see the alignment sketch below).
+ - **Outcome:** Harmonized multimodal input processing.
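
One common way to implement this kind of encoder-side alignment is an InfoNCE-style contrastive loss between projected encoder outputs and the corresponding text embeddings; the sketch below is a generic illustration with made-up dimensions, not the project's actual training code:

```python
# Generic alignment sketch: a trainable projector maps frozen encoder outputs
# (image/audio/etc.) into the LLM's text-embedding space; matched pairs are
# pulled together with an InfoNCE-style contrastive loss. Sizes are made up.
import torch
import torch.nn.functional as F

batch_size, encoder_dim, text_dim = 8, 1024, 896

projector = torch.nn.Linear(encoder_dim, text_dim)           # the part being trained
modality_embeddings = torch.randn(batch_size, encoder_dim)   # frozen encoder outputs
text_embeddings = torch.randn(batch_size, text_dim)          # matching captions/transcripts

z = F.normalize(projector(modality_embeddings), dim=-1)
t = F.normalize(text_embeddings, dim=-1)

logits = (z @ t.T) / 0.07             # cosine similarity scaled by a temperature
targets = torch.arange(batch_size)    # the i-th text matches the i-th input
loss = F.cross_entropy(logits, targets)
loss.backward()
```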

+ ### Stage 3: **Synergy**
+ - **Objective:** Fine-tune the LLM for multimodal reasoning using a custom dataset.
+ - **Outcome:** Enhanced reasoning capabilities across text and other modalities.

+ ### Stage 4: **Vocalize**
+ - **Objective:** Fine-tune the decoder for audio output, giving J.O.S.I.E. a voice.
+ - **Outcome:** Synchronized text and audio responses.

+ ### Stage 5: **Convergence**
+ - **Objective:** Perform full model fine-tuning for seamless integration of all components.
+ - **Outcome:** A fully multimodal, real-time interactive AI assistant.

+ ## Current Progress

+ J.O.S.I.E. is currently in its beta stage, specifically in Stage 1. The model is being actively developed, and the current version is focused on fine-tuning the LLM with custom datasets.

+ ### Latest Beta Version: 2.3.1
+ - **Model:** [Isaak-Carter/josiev4o-7b-stage1-v0.1](https://huggingface.co/Isaak-Carter/josiev4o-7b-stage1-v0.1)
+ - **Quants:** [Isaak-Carter/J.O.S.I.E.v4o-7b-stage1-v0.1-gguf](https://huggingface.co/Isaak-Carter/J.O.S.I.E.v4o-7b-stage1-v0.1-gguf)

+ For a sneak peek at the current progress, visit the [GitHub Repo](https://github.com/Goekdeniz-Guelmez/J.O.S.I.E.-v4o.git).
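
To try the quantized Stage 1 checkpoint locally, a minimal sketch with `llama-cpp-python` might look like the following; the GGUF filename is a placeholder (check the quants repo for the actual file name), and the prompt format follows the template described in the project's earlier update notes:

```python
# Illustrative local-inference sketch with llama-cpp-python. Download a .gguf
# file from the Isaak-Carter/J.O.S.I.E.v4o-7b-stage1-v0.1-gguf repo first and
# point model_path at it; the filename below is only a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="./josiev4o-7b-stage1-v0.1.Q4_K_M.gguf", n_ctx=4096)

prompt = (
    '<|begin_of_text|>main user "Gökdeniz Gülmez"\n'
    "What can you do for me?<|end_of_text|>\n"
    "<|begin_of_text|>josie\n"
)
output = llm(prompt, max_tokens=128, stop=["<|end_of_text|>"])
print(output["choices"][0]["text"])
```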
 
+ ## Source Code

+ To see the latest updates on J.O.S.I.E.v4o, check out the [GitHub Repo](https://github.com/Goekdeniz-Guelmez/J.O.S.I.E.-v4o.git).
+
+ ## Contributing
+
+ I welcome contributions from you! To contribute to J.O.S.I.E., please fork the repository and create a pull request with your changes. Ensure that your code adheres to my coding standards and includes appropriate tests and comments.
+
+ ## License
+
+ J.O.S.I.E. is licensed under the Apache 2.0 License. See the [LICENSE](LICENSE) file for more details.
+
+ ---
+
+ Thank you for being part of the J.O.S.I.E. journey. Together, we are building the future of intelligent assistants!