LeroyDyer committed on
Commit
e245501
1 Parent(s): 56db493

Update README.md

Files changed (1)
  1. README.md +16 -0
README.md CHANGED
@@ -17,6 +17,22 @@ language:
17
  - **License:** apache-2.0
18
  - **Finetuned from model :** LeroyDyer/SpydazWeb_AI_HumanAI_007
19
 
20
+
21
+ ## The text-vision model works! The sound/vision text model works!
22
+
23
+ Is a different architecture really needed to build multimodal models?
24
+ Is it needed for their pretraining?
25
+ Or are people using a Vision Transformer just to cut corners on expensive training?
26
+
27
+ In fact, a simple transformer model can handle all modalities! It is a neural network, after all!
28
+ The problem has not changed; it is only a matter of framing the input in a text-based format. With the SpydazWeb models, we use Base64 encoding!
29
+
30
+ This enables encoding and decoding of an image, so a model CAN generate an image using Base64 as its representation! (Yes, it needs a large context!)
31
+ Let's go!
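
To make the Base64 idea concrete, here is a minimal sketch of the round trip from raw bytes to text and back, using only Python's standard `base64` module. The helper names (`encode_bytes_to_base64`, `decode_base64_to_bytes`, `base64_length`) and the sample payload are hypothetical and chosen for illustration; this is not the SpydazWeb pipeline itself. It also shows why the context gets large: Base64 emits 4 characters for every 3 input bytes.

```python
import base64


def encode_bytes_to_base64(data: bytes) -> str:
    """Turn raw file bytes (image, audio, ...) into ASCII text."""
    return base64.b64encode(data).decode("ascii")


def decode_base64_to_bytes(text: str) -> bytes:
    """Recover the original bytes; validate=True rejects non-Base64 characters."""
    return base64.b64decode(text, validate=True)


def base64_length(n_bytes: int) -> int:
    """Length of the padded Base64 string for n_bytes of input (4 chars per 3 bytes)."""
    return 4 * ((n_bytes + 2) // 3)


# Hypothetical payload standing in for the bytes of a small image file.
sample = bytes(range(256)) * 4

encoded = encode_bytes_to_base64(sample)
decoded = decode_base64_to_bytes(encoded)

assert decoded == sample                           # the round trip is lossless
assert len(encoded) == base64_length(len(sample))  # about 33% longer than the input
```

For a real file, the same helpers would apply to the bytes read with `open(path, "rb").read()`; the resulting string is what a text-only model would see or be asked to produce.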
32
+
33
+
34
+
35
+
36
  To create a pipeline for encoding and decoding files (sound or images) to and from Base64, we need to account for the following:
37
 
38
  Generalized File Handling: