Model classes???
I'm quite confused. There's interesting and useful information here, but right off the bat it's not clear what classes 1-4 are. How do you identify which class a model belongs to? Any examples or explanations would be appreciated.
For example, what class would this model be, and why? https://huggingface.co/Lewdiculous/Violet_Twilight-v0.2-GGUF-IQ-Imatrix
Here's what an AI told me when I asked what a model class was:
How to Determine the Class of a Language Model You Are Using
To determine the class of the language model you are using, you should pay attention to several key aspects:
1. Model Architecture
- Type of Model: Find out which family your model belongs to. For example, GPT (Generative Pre-trained Transformer) models use causal modeling, where the next word is predicted based on previous words. BERT (Bidirectional Encoder Representations from Transformers) models use masked modeling, where some words are replaced with a special token [MASK], and the model must predict these words.
- Number of n-grams: If your model is based on n-grams (e.g., unigrams, bigrams, etc.), this also indicates its class. Unigram models consider words independently, while bigram and trigram models take into account one or two previous words, respectively.
2. Training and Data
- Volume of Data: Large language models (LLMs), such as GPT-3 or BERT, are trained on vast amounts of data (hundreds of billions of words). If your model was trained on such a volume of data, it likely belongs to the LLM class.
- Training Methods: Pay attention to the training methodology of the model. Some models may use specific techniques, such as reinforcement learning or fine-tuning for specific tasks.
3. Functionality and Application
- Tasks: Determine what tasks your model can perform. If it is capable of generating text or engaging in dialogue, this may indicate the use of causal modeling. If the model is designed for information extraction or classification, this could be a sign of masked modeling.
4. Documentation and Specifications
- Official Documentation: Check the documentation or description of the model. Developers usually specify the class of the model and its architecture. This is the most reliable way to learn about your language model's characteristics.
In summary, to determine the class of your language model, you need to analyze its architecture, training methods, and functional capabilities, as well as refer to official documentation.
For all models I have created in my repo, the class is noted just before the link to this page.
For a rough idea of the class of any model (from any repo), here is a rough list:
Class 1: Generally any model, fine-tune, or merge (the exception being pass-through merges).
Class 2: Some fine-tunes / merges for specific use case(s), and/or a pass-through merge of 2 models.
Class 3/4: These are very specific use-case models, often modified models, merges, or pass-through merges of 2+ models, as well as modified / augmented models (e.g., with the Brainstorm adapter).
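As a rough illustration only, the list above can be sketched as a small lookup. Everything here is an assumption made for the example: the `ModelTraits` fields and the `rough_class` function are hypothetical names (no repo exposes this metadata), "2+ models" is read as "more than 2", and the boundaries between classes are fuzzier in practice than this code suggests.

```python
from dataclasses import dataclass


@dataclass
class ModelTraits:
    # Hypothetical fields for this sketch; fill them in by reading the model card.
    passthrough_merge: bool = False  # built as a pass-through merge
    merged_models: int = 1           # number of source models in the merge
    specific_use_case: bool = False  # fine-tune / merge aimed at a narrow use case
    augmented: bool = False          # modified / augmented (e.g. Brainstorm adapter)


def rough_class(traits: ModelTraits) -> str:
    """Return a rough class label following the list above (a guess, not a rule)."""
    # Assumption: augmented models and pass-through merges of more than
    # 2 models land in Class 3/4.
    if traits.augmented or (traits.passthrough_merge and traits.merged_models > 2):
        return "3/4"
    # Pass-through merge of 2 models, or a specific-use-case fine-tune / merge.
    if traits.passthrough_merge or traits.specific_use_case:
        return "2"
    # Generally any other model, fine-tune, or regular merge.
    return "1"


print(rough_class(ModelTraits()))
print(rough_class(ModelTraits(passthrough_merge=True, merged_models=2)))
print(rough_class(ModelTraits(augmented=True)))
```

Treat the result as a starting point; if the model misbehaves, its behavior is a better guide than this lookup.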
That being said, another way to look at the classes is by model behavior. If you are getting repeats, "gibberish", repeated paragraphs, "long-winded endless output", or other issues, then applying Class 3 or 4 settings can correct or stop these issues.
These issues can occur with any model.
Another way to look at it: Class 3/4 settings curtail model behavior ("trim it"), which can improve instruction following in multi-turn chat and other operations in certain use cases.
Thank you, I really appreciate the reply and this clears things up.