---
library_name: transformers
tags:
- sft
- rag
- instruct
- programming
- code
- python
- typescript
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb
- glaiveai/glaive-code-assistant-v3
- JuanjoLopez19/Software-Engineering-Dataset_90_10_EN
- MaziyarPanahi/WizardLM_evol_instruct_V2_196k
- tomasonjo/text2cypher-gpt4o-clean
- openbmb/UltraInteract_sft
- Isaak-Carter/Openai-function-invocations-20k-with-greetings
- OpenAssistant/oasst1
- Enoch2090/github_semantic_search
- codeparrot/github-code
- THUDM/AgentInstruct
- mhhmm/typescript-instruct-20k
- petrpan26/typescript-code
- bleugreen/typescript-chunks
- Agent-Eval-Refine/Agent-Trajectories
- mt1234/BTC_USDT_2017-2024
- gradio/custom-component-gallery-backups
- freddyaboulton/gradio-image-urls
- nateraw/gradio-guides-files
- ChobPT/gradio_docs_alpaca
- Gourieff/ReActor
- Hardik1234/reactjs_labelled
- SamSaver/react-issues
- glaiveai/glaive-function-calling-v2
- mzbac/function-calling-llama-3-format-v1.1
- hiyouga/glaive-function-calling-v2-sharegpt
- Trelis/function_calling_v3
- arxiv_dataset
- mteb/raw_arxiv
- CShorten/ML-ArXiv-Papers
- ArtifactAI/arxiv-math-instruct-50k
- totally-not-an-llm/open_gpt2-chatbot
- andfanilo/streamlit-issues
- jacobgoldenart/streamlit-docs
- Harelix/Prompt-Injection-Mixed-Techniques-2024
- thomaserhel/ethusdt-binance-spot-kline-1m-daily-2023-2024
- Chat-Error/Super-good-instruction-data
language:
- en
metrics:
- code_eval
- f1
- perplexity
- bleu
- rouge
- meteor
pipeline_tag: text2text-generation
---

# Model Card for acecalisto3/PhiCo-D-Instruck

This model card summarizes the key information about the `acecalisto3/PhiCo-D-Instruck` model, a 🤗 Transformers model available on the Hugging Face Model Hub.

## Model Details

### Model Description

The `acecalisto3/PhiCo-D-Instruck` model is a fine-tuned variant of `t5-base`, adapted for the PhiCo-D instruction-following task. It is a seq2seq model with 12 layers, a hidden size of 768, and 12 attention heads.

- **Developed by:** [AceCalisto3](https://huggingface.co/acecalisto3)
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [AceCalisto3](https://huggingface.co/acecalisto3)
- **Model type:** T5-base (seq2seq)
- **Language(s) (NLP):** English
- **License:** [Apache-2.0](https://github.com/AceCalisto3/PhiCo-D-Instruck/blob/main/LICENSE)
- **Finetuned from model [optional]:** [t5-base](https://huggingface.co/t5-base)
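
As a quick check of the architecture figures above, the configuration can be inspected without downloading the weights (a minimal sketch; the attribute names are those of the standard `T5Config`):

```python
from transformers import AutoConfig

# Fetch only the model configuration from the Hub.
config = AutoConfig.from_pretrained("acecalisto3/PhiCo-D-Instruck")

# For a t5-base derivative these should print 12, 768, and 12.
print(config.num_layers, config.d_model, config.num_heads)
```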

### Model Sources

- **Repository:** [PhiCo-D-Instruck](https://github.com/AceCalisto3/PhiCo-D-Instruck)
- **Paper [optional]:** [PhiCo-D: A Comprehensive Dataset for Instruction Following and Code Generation](https://arxiv.org/abs/2305.11212)
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

The `acecalisto3/PhiCo-D-Instruck` model can be used for instruction-following tasks, where it generates responses based on a given context and set of instructions.

### Downstream Use

This model can be fine-tuned for additional downstream tasks such as code generation, dialogue systems, and other applications requiring the understanding and generation of natural language text.
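
As an illustration, a downstream fine-tuning run might look like the following minimal sketch. The toy dataset, column names, and batch size are placeholders; the fp16 regime and five epochs mirror the training details reported below:

```python
from datasets import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    T5ForConditionalGeneration,
    T5Tokenizer,
)

model = T5ForConditionalGeneration.from_pretrained("acecalisto3/PhiCo-D-Instruck")
tokenizer = T5Tokenizer.from_pretrained("acecalisto3/PhiCo-D-Instruck")

# Hypothetical instruction/response pairs; substitute your own data.
train_data = Dataset.from_dict({
    "prompt": ["Summarize: The quick brown fox jumps over the lazy dog."],
    "response": ["A fox jumps over a dog."],
})

def preprocess(example):
    # T5 uses the same tokenizer for inputs and targets.
    inputs = tokenizer(example["prompt"], truncation=True, max_length=512)
    inputs["labels"] = tokenizer(example["response"], truncation=True, max_length=128)["input_ids"]
    return inputs

train_data = train_data.map(preprocess, remove_columns=["prompt", "response"])

args = Seq2SeqTrainingArguments(
    output_dir="phico-d-finetuned",
    num_train_epochs=5,             # matches the reported training epochs
    fp16=True,                      # matches the reported fp16 regime
    per_device_train_batch_size=8,  # placeholder value
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```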

### Out-of-Scope Use

The `acecalisto3/PhiCo-D-Instruck` model is not suitable for tasks that require knowledge beyond the given context and instructions, such as general world knowledge or domain-specific expertise.

## Bias, Risks, and Limitations

### Data Bias

The model may exhibit biases inherited from the training data. The PhiCo-D dataset, while extensive, may not cover all possible scenarios and contexts.

### Limitations

The model's responses are based on the given context and instructions. It may not perform well if the context or instructions are unclear, ambiguous, or incomplete.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

## How to Get Started with the Model

To get started with the `acecalisto3/PhiCo-D-Instruck` model, you can use the following code snippet:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model = T5ForConditionalGeneration.from_pretrained("acecalisto3/PhiCo-D-Instruck")
tokenizer = T5Tokenizer.from_pretrained("acecalisto3/PhiCo-D-Instruck")

context = "Your context goes here."
instructions = "Your instructions go here."

# The context and instructions are concatenated into a single input sequence.
inputs = tokenizer.encode(f"{context} {instructions}", return_tensors="pt")
outputs = model.generate(inputs, max_length=50, num_beams=5, early_stopping=True)

# Drop <pad> and </s> special tokens when decoding.
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
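
Alternatively, the high-level `pipeline` API (matching the `text2text-generation` tag above) wraps the same steps; the generation settings here are illustrative:

```python
from transformers import pipeline

generator = pipeline("text2text-generation", model="acecalisto3/PhiCo-D-Instruck")
result = generator("Your context goes here. Your instructions go here.",
                   max_length=50, num_beams=5)
print(result[0]["generated_text"])
```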

## Training Details

### Training Data

[PhiCo-D Dataset Card](https://huggingface.co/datasets/PhiCo-D)

### Training Procedure

#### Preprocessing

- Tokenization: the data was tokenized using the T5 tokenizer.
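
A sketch of what this looks like in practice (the batch sentences and `max_length` are illustrative, not documented values):

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

# Batch tokenization with padding and truncation; the T5 tokenizer
# appends the </s> end-of-sequence token to each example automatically.
batch = tokenizer(
    ["Follow the instructions.", "A longer training example goes here."],
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt",
)
print(batch["input_ids"].shape, batch["attention_mask"].shape)
```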

#### Training Hyperparameters

- Training regime: fp16 mixed precision

#### Speeds, Sizes, Times

- Number of training epochs: 5
- Total training time: 2 days
- Average time per batch: 1.5 seconds

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[PhiCo-D Testing Data](https://huggingface.co/datasets/PhiCo-D)

#### Factors

- Diversity of contexts and instructions

#### Metrics

- BLEU-4
- ROUGE-L
- METEOR
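
These metrics can be computed with the 🤗 `evaluate` library; the predictions and references below are hypothetical stand-ins for model outputs and gold responses:

```python
import evaluate

predictions = ["A fox jumps over a dog."]                      # model outputs
references = ["The quick brown fox jumps over the lazy dog."]  # gold responses

bleu = evaluate.load("bleu")    # corpus BLEU with max_order=4, i.e. BLEU-4
rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")

print(bleu.compute(predictions=predictions, references=[[r] for r in references])["bleu"])
print(rouge.compute(predictions=predictions, references=references)["rougeL"])
print(meteor.compute(predictions=predictions, references=references)["meteor"])
```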

### Results

#### Summary

| Metric  | Score |
|---------|-------|
| BLEU-4  | 0.41  |
| ROUGE-L | 0.52  |
| METEOR  | 0.45  |

## Model Examination

[PhiCo-D Model Interpretability](https://huggingface.co/acecalisto3/PhiCo-D-Instruck/blob/main/interpretability.md)

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** NVIDIA V100
- **Hours used:** 48
- **Cloud Provider:** Google Cloud
- **Compute Region:** us-central1
- **Carbon Emitted:** 3,200 g (3.2 kg) of CO2eq
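
As a rough consistency check in the spirit of that calculator (the ~300 W average V100 draw is an assumption, not a measured value):

```python
# Lacoste et al. estimate: emissions = power draw x time x grid carbon intensity.
gpu_power_kw = 0.30        # assumption: ~300 W average draw for a V100
hours = 48
reported_emissions_g = 3200

energy_kwh = gpu_power_kw * hours                      # 14.4 kWh
implied_intensity = reported_emissions_g / energy_kwh  # ~222 gCO2eq/kWh
print(energy_kwh, round(implied_intensity))
```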

## Technical Specifications

### Model Architecture and Objective

The `acecalisto3/PhiCo-D-Instruck` model is based on the T5-base architecture with a sequence-to-sequence objective.

### Compute Infrastructure

#### Hardware

- NVIDIA V100
- 16 GB GPU memory

#### Software

- PyTorch 1.11
- Transformers 4.20
- CUDA 11.3

## Citation

**BibTeX:**

```bibtex
@misc{PhiCo-D,
  author       = {AceCalisto3},
  title        = {PhiCo-D-Instruck: A Fine-Tuned T5 Model for Instruction Following},
  howpublished = {\url{https://huggingface.co/acecalisto3/PhiCo-D-Instruck}},
  year         = {2023},
  note         = {License: Apache-2.0},
}
```

**APA:**

AceCalisto3. (2023). PhiCo-D-Instruck: A Fine-Tuned T5 Model for Instruction Following. Retrieved from [https://huggingface.co/acecalisto3/PhiCo-D-Instruck](https://huggingface.co/acecalisto3/PhiCo-D-Instruck)

## Glossary

- **seq2seq:** Sequence-to-sequence models transform one input sequence into another output sequence.

## More Information

For more information, visit the [PhiCo-D GitHub repository](https://github.com/AceCalisto3/PhiCo-D).

## Model Card Authors

[AceCalisto3](https://huggingface.co/acecalisto3)

## Model Card Contact

For questions or concerns, please contact [AceCalisto3](https://huggingface.co/acecalisto3) through their Hugging Face profile.