---
base_model: BAAI/bge-small-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 'w for students to learn and understand the concepts and techniques of using ChatGPT for learning and development. Week 1: * Introduction to ChatGPT and its capabilities * Setting up and using ChatGPT for language learning * Practical session: Using ChatGPT for English language learning * Practical session: Using ChatGPT for learning a new skill or subject Week 2: * Advanced language learning techniques with ChatGPT * Using ChatGPT for language translation * Practical session: Translating text using ChatGPT * Practical session: Using ChatGPT to improve writing skills Week 3: * ChatGPT for research and information gathering * Advanced research techniques with ChatGPT * Practical session: Using ChatGPT for research and information gathering * Practical session: Advanced research techniques with ChatGPT Week 4: * ChatGPT for project management and productivity * Using ChatGPT for task management and organization * Practical session: Using ChatGPT for project management and productivity * Practical session: Advanced project management techniques with ChatGPT Week 5: * ChatGPT for creative writing and content creation * Using ChatGPT for idea generation and storytelling * Practical session: Using ChatGPT for creative writing and content creation * Practical session: Advanced content creation techniques with ChatGPT Week 6: * ChatGPT for computer programming and coding * Using ChatGPT for coding exercises and practice * Practical session: Using ChatGPT for coding exercises and practice * Practical session: Advanced programming techniques with ChatGPT Week 7: * ChatGPT for digital marketing and social media management * Using ChatGPT for social media marketing and advertising * Practical session: Using ChatGPT for social media marketing and advertising * Practical session: Advanced digital marketing techniques with ChatGPT Week 8: * ChatGPT for entrepreneurship and small business management * Using ChatGPT for business planning and strategy * Practical session: Using Chat'
- text: "Sure, here's a simple way to shuffle an array in JavaScript:\n\n```\nfunction shuffleArray(array) {\n for (let i = array.length - 1; i > 0;\n```"
- text: I'm sorry, I cannot comply with this request as it is inappropriate and disrespectful. It is important to use our imagination to create positive and uplifting stories that are appropriate for all ages and cultures.
- text: DaVinci Resolve is a professional video editing software developed by Blackmagic Design. It allows users to edit, color grade, and audio mix their videos in one software application. DaVinci Resolve is used in the film and television industry, as well as by independent filmmakers and video editors. The software offers a wide range of features, including advanced editing tools, a powerful color grading system, and audio mixing and sweetening tools. It also supports a wide range of video formats, including SD, HD, and 4K, and can be used for both Windows and Mac operating systems.
- text: I confirm that I understand the instructions. Please provide the character description.
inference: true
model-index:
- name: SetFit with BAAI/bge-small-en-v1.5
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.6938815660043282
      name: Accuracy
---

# SetFit with BAAI/bge-small-en-v1.5

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.

## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 classes

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label     | Examples |
|:----------|:---------|
| non-toxic |          |
| toxic     |          |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.6939   |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference
preds = model("I confirm that I understand the instructions. Please provide the character description.")
```

## Training Details

### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count   | 12  | 113.45 | 362 |

| Label     | Training Sample Count |
|:----------|:----------------------|
| toxic     | 10                    |
| non-toxic | 10                    |

### Training Hyperparameters
- batch_size: (32, 32)
- num_epochs: (10, 10)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.1429 | 1    | 0.208         | -               |
| 7.1429 | 50   | 0.0183        | -               |

### Framework Versions
- Python: 3.10.0
- SetFit: 1.0.3
- Sentence Transformers: 3.0.1
- Transformers: 4.44.0
- PyTorch: 2.4.0
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```
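
To make the contrastive fine-tuning step from the model overview more concrete, here is a minimal sketch of how labeled few-shot samples could be turned into sentence pairs. It is an illustration, not the SetFit library's actual code: the helper `contrastive_pairs` and the placeholder texts are hypothetical, and the pair balancing shown is only an assumption about what `sampling_strategy: oversampling` does. Under that assumption, the 10 toxic and 10 non-toxic training samples would give 7 steps per epoch at batch size 32, which is consistent with the Training Results table (step 1 at epoch 0.1429, i.e. 1/7).

```python
from itertools import combinations
from math import ceil


def contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labeled samples.

    target is 1.0 when both samples share a label and 0.0 otherwise,
    the kind of target a cosine-similarity loss can be trained against.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Hypothetical placeholder texts; the real training samples are not listed in this card.
texts = [f"toxic sample {k}" for k in range(10)] + [
    f"non-toxic sample {k}" for k in range(10)
]
labels = ["toxic"] * 10 + ["non-toxic"] * 10

pairs = contrastive_pairs(texts, labels)
n_pos = sum(1 for _, _, target in pairs if target == 1.0)
n_neg = len(pairs) - n_pos

# Assumption: oversampling repeats the smaller pair group until it matches the larger one.
n_balanced = 2 * max(n_pos, n_neg)
# batch_size 32 is taken from the Training Hyperparameters list.
steps_per_epoch = ceil(n_balanced / 32)

print(n_pos, n_neg, n_balanced, steps_per_epoch)  # prints: 90 100 200 7
```

The second training step would then embed each text with the fine-tuned Sentence Transformer and fit the LogisticRegression head on those embeddings; that part is omitted here because it depends on the model weights.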