---
library_name: transformers
tags:
- sft
- rag
- instruct
- programming
- code
- python
- typescript
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb
- glaiveai/glaive-code-assistant-v3
- JuanjoLopez19/Software-Engineering-Dataset_90_10_EN
- MaziyarPanahi/WizardLM_evol_instruct_V2_196k
- tomasonjo/text2cypher-gpt4o-clean
- openbmb/UltraInteract_sft
- Isaak-Carter/Openai-function-invocations-20k-with-greetings
- OpenAssistant/oasst1
- Enoch2090/github_semantic_search
- codeparrot/github-code
- THUDM/AgentInstruct
- mhhmm/typescript-instruct-20k
- petrpan26/typescript-code
- bleugreen/typescript-chunks
- Agent-Eval-Refine/Agent-Trajectories
- mt1234/BTC_USDT_2017-2024
- gradio/custom-component-gallery-backups
- freddyaboulton/gradio-image-urls
- nateraw/gradio-guides-files
- ChobPT/gradio_docs_alpaca
- Gourieff/ReActor
- Hardik1234/reactjs_labelled
- SamSaver/react-issues
- glaiveai/glaive-function-calling-v2
- mzbac/function-calling-llama-3-format-v1.1
- hiyouga/glaive-function-calling-v2-sharegpt
- Trelis/function_calling_v3
- arxiv_dataset
- mteb/raw_arxiv
- CShorten/ML-ArXiv-Papers
- ArtifactAI/arxiv-math-instruct-50k
- totally-not-an-llm/open_gpt2-chatbot
- andfanilo/streamlit-issues
- jacobgoldenart/streamlit-docs
- Harelix/Prompt-Injection-Mixed-Techniques-2024
- thomaserhel/ethusdt-binance-spot-kline-1m-daily-2023-2024
- Chat-Error/Super-good-instruction-data
language:
- en
metrics:
- code_eval
- f1
- perplexity
- bleu
- rouge
- meteor
pipeline_tag: text2text-generation
---

# Model Card for acecalisto3/PhiCo-D-Instruck

This model card summarizes the key information about `acecalisto3/PhiCo-D-Instruck`, a 🤗 Transformers model available on the Hugging Face Model Hub.

## Model Details

### Model Description

The `acecalisto3/PhiCo-D-Instruck` model is a fine-tuned variant of `t5-base`, specifically adapted for the InstrucText instruction-following task. Like `t5-base`, it is a seq2seq (encoder-decoder) model with 12 layers in each of the encoder and decoder, a hidden size of 768, and 12 attention heads.

- **Developed by:** [AceCalisto3](https://huggingface.co/acecalisto3)
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [AceCalisto3](https://huggingface.co/acecalisto3)
- **Model type:** T5-base (seq2seq)
- **Language(s) (NLP):** English
- **License:** [Apache-2.0](https://github.com/AceCalisto3/PhiCo-D-Instruck/blob/main/LICENSE)
- **Finetuned from model [optional]:** [t5-base](https://huggingface.co/t5-base)

### Model Sources

- **Repository:** [PhiCo-D-Instruck](https://github.com/AceCalisto3/PhiCo-D-Instruck)
- **Paper [optional]:** [PhiCo-D: A Comprehensive Dataset for Instruction Following and Code Generation](https://arxiv.org/abs/2305.11212)
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

The `acecalisto3/PhiCo-D-Instruck` model can be used directly for instruction-following tasks: given a context and a set of instructions, it generates a response.

### Downstream Use

The model can also be fine-tuned for downstream tasks such as code generation, dialogue systems, and other applications that require understanding and generating natural-language text.
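As a starting point for such fine-tuning, the sketch below uses the 🤗 `Seq2SeqTrainer` API on a single toy instruction/response pair. It is a minimal illustration under assumed settings: the example data, hyperparameters, and output directory are placeholders, not the configuration used to train PhiCo-D-Instruck.

```python
from datasets import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    T5ForConditionalGeneration,
    T5Tokenizer,
)

model_name = "acecalisto3/PhiCo-D-Instruck"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Placeholder data -- replace with your own instruction/response pairs.
train_data = Dataset.from_dict({
    "prompt": ["Write a Python function that reverses a string."],
    "response": ["def reverse(s):\n    return s[::-1]"],
})

def preprocess(batch):
    # Tokenize source sequences; the tokenized responses become the labels.
    model_inputs = tokenizer(batch["prompt"], truncation=True, max_length=512)
    labels = tokenizer(text_target=batch["response"], truncation=True, max_length=512)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = train_data.map(preprocess, batched=True, remove_columns=train_data.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="phico-d-finetuned",   # placeholder path
    per_device_train_batch_size=8,
    num_train_epochs=3,
    fp16=True,                        # mixed precision; requires a CUDA GPU
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```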
### Out-of-Scope Use

The `acecalisto3/PhiCo-D-Instruck` model is not suitable for tasks that require knowledge beyond the given context and instructions, such as general world knowledge or specialized domain expertise.

## Bias, Risks, and Limitations

### Data Bias

The model may exhibit biases inherited from its training data. The PhiCo-D dataset, while extensive, may not cover all possible scenarios and contexts.

### Limitations

The model's responses are conditioned entirely on the given context and instructions; it may perform poorly when either is unclear, ambiguous, or incomplete.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

## How to Get Started with the Model

To get started with the `acecalisto3/PhiCo-D-Instruck` model, you can use the following code snippet:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("acecalisto3/PhiCo-D-Instruck")
tokenizer = T5Tokenizer.from_pretrained("acecalisto3/PhiCo-D-Instruck")

context = "Your context goes here."
instructions = "Your instructions go here."

# Concatenate context and instructions into a single source sequence.
inputs = tokenizer.encode(f"{context} {instructions}", return_tensors="pt")

# Beam search with early stopping; increase max_length for longer responses.
outputs = model.generate(inputs, max_length=50, num_beams=5, early_stopping=True)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Details

### Training Data

[PhiCo-D Dataset Card](https://huggingface.co/datasets/PhiCo-D)

### Training Procedure

#### Preprocessing

- Tokenization: the data was tokenized with the T5 tokenizer.

#### Training Hyperparameters

- Training regime: fp16 mixed precision

#### Speeds, Sizes, Times

- Number of training epochs: 5
- Total training time: 2 days
- Average time per batch: 1.5 seconds

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[PhiCo-D Testing Data](https://huggingface.co/datasets/PhiCo-D)

#### Factors

- Diversity of contexts and instructions

#### Metrics

- BLEU-4
- ROUGE-L
- METEOR
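These metrics can be computed with the 🤗 `evaluate` library. The snippet below is a minimal sketch on placeholder predictions and references; it is not the exact evaluation harness behind the scores reported below.

```python
# pip install evaluate sacrebleu rouge-score nltk
import evaluate

# Placeholder outputs -- in practice, predictions come from model.generate()
# on the PhiCo-D test split.
predictions = ["def reverse(s):\n    return s[::-1]"]
references = ["def reverse(s):\n    return s[::-1]"]

bleu = evaluate.load("bleu")     # max_order defaults to 4, i.e. BLEU-4
rouge = evaluate.load("rouge")   # reports rougeL among other variants
meteor = evaluate.load("meteor")

print(bleu.compute(predictions=predictions, references=[[r] for r in references])["bleu"])
print(rouge.compute(predictions=predictions, references=references)["rougeL"])
print(meteor.compute(predictions=predictions, references=references)["meteor"])
```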
### Results

#### Summary

| Metric  | Score |
|---------|-------|
| BLEU-4  | 0.41  |
| ROUGE-L | 0.52  |
| METEOR  | 0.45  |

## Model Examination

[PhiCo-D Model Interpretability](https://huggingface.co/acecalisto3/PhiCo-D-Instruck/blob/main/interpretability.md)

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** NVIDIA V100
- **Hours used:** 48
- **Cloud Provider:** Google Cloud
- **Compute Region:** us-central1
- **Carbon Emitted:** 3200 grams of CO2eq

## Technical Specifications

### Model Architecture and Objective

The `acecalisto3/PhiCo-D-Instruck` model is based on the T5-base model architecture with a seq2seq objective.

### Compute Infrastructure

#### Hardware

- NVIDIA V100 (16 GB GPU memory)

#### Software

- PyTorch 1.11
- Transformers 4.20
- CUDA 11.3

## Citation

**BibTeX:**

```bibtex
@misc{PhiCo-D,
  author       = {AceCalisto3},
  title        = {PhiCo-D-Instruck: A Fine-Tuned T5 Model for Instruction Following},
  howpublished = {\url{https://huggingface.co/acecalisto3/PhiCo-D-Instruck}},
  year         = {2023},
  note         = {License: Apache-2.0},
}
```

**APA:**

AceCalisto3. (2023). *PhiCo-D-Instruck: A fine-tuned T5 model for instruction following*. Retrieved from [https://huggingface.co/acecalisto3/PhiCo-D-Instruck](https://huggingface.co/acecalisto3/PhiCo-D-Instruck)

## Glossary

- **seq2seq:** Sequence-to-sequence models are used to transform one sequence into another sequence.

## More Information

For more information, visit the [PhiCo-D GitHub repository](https://github.com/AceCalisto3/PhiCo-D).

## Model Card Authors

[AceCalisto3](https://huggingface.co/acecalisto3)

## Model Card Contact

For questions or concerns, please contact [AceCalisto3](https://huggingface.co/acecalisto3) through their Hugging Face profile.