---
license: creativeml-openrail-m
datasets:
- GAIR/o1-journey
language:
- en
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
library_name: transformers
pipeline_tag: text-generation
tags:
- Qwen2.5
- Llama-Cpp
- CoT
- o1-journey
- text-generation-inference
- safetensors
- Ollama
---
### Acrux-500M-o1-Journey Model Files

The **Acrux-500M-o1-Journey** is a lightweight, instruction-tuned language model fine-tuned from the **Qwen2.5-0.5B-Instruct** base model. At roughly 500 million parameters, it is designed for **cost-effective deployment** and **fast text generation** while maintaining solid quality on instruction-following tasks.
| **File Name**              | **Size**  | **Description**                           | **Upload Status** |
|----------------------------|-----------|-------------------------------------------|-------------------|
| `.gitattributes`           | 1.57 kB   | Git attributes for managing LFS files.    | Uploaded          |
| `README.md`                | 195 Bytes | Model overview and documentation.         | Updated           |
| `added_tokens.json`        | 657 Bytes | Custom tokens for the tokenizer.          | Uploaded          |
| `config.json`              | 859 Bytes | Model configuration file.                 | Uploaded          |
| `generation_config.json`   | 280 Bytes | Configuration for text generation.        | Uploaded          |
| `merges.txt`               | 1.82 MB   | Merge rules for byte-pair encoding (BPE). | Uploaded          |
| `pytorch_model.bin`        | 988 MB    | Model weights (PyTorch format).           | Uploaded (LFS)    |
| `special_tokens_map.json`  | 644 Bytes | Mapping for special tokens.               | Uploaded          |
| `tokenizer.json`           | 11.4 MB   | Full tokenizer configuration.             | Uploaded (LFS)    |
| `tokenizer_config.json`    | 7.73 kB   | Additional tokenizer settings.            | Uploaded          |
| `vocab.json`               | 2.78 MB   | Vocabulary for the tokenizer.             | Uploaded          |
### **Key Features:**

1. **Compact Size with Efficient Performance:**
   The small parameter count (500M) enables faster inference and reduced hardware requirements.

2. **Instruction Optimization:**
   Fine-tuned to follow prompts effectively, making it suitable for interactive applications and prompt-based tasks (see the chat-style sketch after this list).

3. **Domain-Specific Training:**
   Trained on the **GAIR/o1-journey** dataset, providing tailored capabilities for specific, instruction-driven use cases.
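Because the base model is an Instruct variant of Qwen2.5, the tokenizer is expected to ship a chat template, which is the most natural way to exercise the instruction tuning. The snippet below is a minimal sketch; the system prompt and decoding settings are illustrative assumptions, not values from this repository:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Acrux-500M-o1-Journey"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Illustrative chat-style prompt; the system message is an example, not part of the model card.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "List three tips for writing clear documentation."},
]

# Render the conversation with the tokenizer's chat template and generate a reply.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```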
---
### **Training Details:**

- **Base Model:** [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)
- **Dataset Used for Fine-Tuning:** [GAIR/o1-journey](https://huggingface.co/datasets/GAIR/o1-journey)
  - A compact dataset of roughly 1.42k samples focused on instruction-driven generation (a quick loading sketch follows below).
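To inspect the fine-tuning data, the dataset can be pulled directly from the Hub with the `datasets` library. This is a minimal sketch; the split name and record layout are assumptions to verify against the dataset card:

```python
from datasets import load_dataset

# Load the GAIR/o1-journey dataset from the Hugging Face Hub.
# The "train" split is assumed; check the dataset card for the actual schema.
dataset = load_dataset("GAIR/o1-journey", split="train")

print(dataset)     # overall size and column names
print(dataset[0])  # one raw record, to see the instruction/response structure
```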
---
### **Capabilities:**

1. **Instruction Following:**
   - Generates accurate and coherent responses to user instructions.
   - Handles summarization, question-answering, and conversational tasks.

2. **Fast Inference:**
   - Ideal for real-time applications, as the smaller model size reduces latency (see the lower-precision loading sketch after this list).

3. **Interactive AI Development:**
   - Suitable for chatbots, virtual assistants, and instructional interfaces.
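For lower latency, the model is small enough to run comfortably on CPU, and on a GPU it can be loaded in half precision. The snippet below is an illustrative sketch rather than a benchmarked recommendation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Acrux-500M-o1-Journey"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# float16 on GPU roughly halves memory use and typically speeds up generation;
# fall back to full precision on CPU.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

inputs = tokenizer("Summarize why smaller models can respond faster.", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```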
---
### **Usage Instructions:**

1. **Setup:**
   Download all model files, ensuring compatibility with the Hugging Face Transformers library (a snapshot-download sketch follows below).
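If you prefer an explicit local copy rather than relying on the automatic cache, the whole repository can be fetched with `huggingface_hub`. A minimal sketch, where the local directory name is just an example:

```python
from huggingface_hub import snapshot_download

# Download every file listed in the table above into a local folder.
# "./acrux-500m-o1-journey" is an arbitrary example path.
local_dir = snapshot_download(
    repo_id="prithivMLmods/Acrux-500M-o1-Journey",
    local_dir="./acrux-500m-o1-journey",
)
print("Model files downloaded to:", local_dir)
```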
2. **Loading the Model:**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model weights from the Hugging Face Hub.
model_name = "prithivMLmods/Acrux-500M-o1-Journey"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
```
3. **Sample Text Generation:**
```python
input_text = "Explain the concept of machine learning in simple terms."
inputs = tokenizer(input_text, return_tensors="pt")

# do_sample=True is required for temperature to take effect; otherwise decoding is greedy.
outputs = model.generate(**inputs, max_length=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
4. **Optimize Generation:**
   Adjust parameters in `generation_config.json` for better control of the output (a sketch of passing the same settings directly to `generate()` follows after this list), such as:
   - `temperature` for randomness.
   - `top_p` for nucleus-sampling diversity.
   - `max_length` for maximum output length.
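The same knobs can also be set per call without editing `generation_config.json`. Continuing from the loading and generation snippets above, the values here are illustrative defaults, not tuned recommendations:

```python
from transformers import GenerationConfig

# Override generation settings for a single call; these numbers are examples only.
gen_config = GenerationConfig(
    do_sample=True,   # enable sampling so temperature/top_p are applied
    temperature=0.7,  # higher = more random output
    top_p=0.9,        # nucleus sampling: keep the smallest token set covering 90% probability
    max_length=100,   # cap on total sequence length (prompt + generated tokens)
)
outputs = model.generate(**inputs, generation_config=gen_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```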
---