
Model Card for CryptoTrader-LM

The model predicts a trading decision—buy, sell, or hold—for either Bitcoin (BTC) or Ethereum (ETH) based on cryptocurrency news and historical price data. This model is fine-tuned using LoRA on the Ministral-8B-Instruct-2410 base model, specifically for the FinNLP @ COLING-2025 Cryptocurrency Trading Challenge.

Model Details

Model Description

This model is a LoRA (Low-Rank Adaptation) fine-tune of Ministral-8B-Instruct-2410, designed to predict daily cryptocurrency trading decisions (buy, sell, or hold) from real-time news articles and BTC/ETH price data. Its goal is to maximize profitability by making informed trading decisions under volatile market conditions.

Uses

Direct Use

The model predicts daily trading decisions for BTC or ETH from real-time financial news and historical cryptocurrency price data. It is designed for participants in the FinNLP Cryptocurrency Trading Challenge, but it can also be applied to other cryptocurrency trading contexts.
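
For illustration, here is a minimal sketch of how a daily prompt might be assembled from news and price data before being wrapped in the [INST]...[/INST] format used in the usage example below. The field names and layout are assumptions, not an official prompt schema:

from datetime import date

def build_prompt(asset, prices, headlines, day):
    # Assemble a daily trading prompt (hypothetical schema)
    price_block = ", ".join(f"{p:.2f}" for p in prices)
    news_block = "\n".join(f"- {h}" for h in headlines)
    body = (
        f"Date: {day.isoformat()}\n"
        f"Asset: {asset}\n"
        f"Recent closing prices (USD): {price_block}\n"
        f"News headlines:\n{news_block}\n"
        "Decide: buy, sell, or hold?"
    )
    return f"[INST]{body}[/INST]"  # Ministral instruction format

prompt = build_prompt(
    "BTC",
    [60213.50, 61894.20, 60977.80],
    ["Bitcoin ETF inflows surge", "Exchange outage sparks sell-off fears"],
    date(2024, 10, 1),
)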

Downstream Use

The model can be integrated into automated crypto trading systems or agent-based trading platforms (such as FinMem), or used in research on financial decision-making models.
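
As a sketch of such an integration, assuming a predict_decision helper that wraps the inference code from the usage example below and a hypothetical broker object standing in for any execution API, a daily loop could look like this:

import time

def run_daily_loop(predict_decision, broker, asset="BTC"):
    # Hypothetical integration loop; `broker` is a stand-in for any execution API
    while True:
        decision = predict_decision(asset)  # expected: "buy", "sell", or "hold"
        if decision == "buy":
            broker.market_buy(asset, usd_amount=100)
        elif decision == "sell":
            broker.market_sell(asset, usd_amount=100)
        # "hold" -> do nothing
        time.sleep(24 * 60 * 60)  # the model targets daily, not high-frequency, decisions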

Out-of-Scope Use

This model is not designed for:

  • Predicting trading decisions for assets other than Bitcoin (BTC) or Ethereum (ETH).
  • High-frequency trading (HFT); the model is optimized for daily decision-making, not minute-by-minute trading.
  • Use in non-financial domains. It is not suitable for generic text-generation tasks or sentiment analysis outside of financial contexts.

Bias, Risks, and Limitations

Bias

The model is fine-tuned on specific data (cryptocurrency news and price data) and may not generalize well to other financial markets or different news sources. There could be biases based on the news outlets and timeframes present in the training data.

Risks

  • Market Volatility: Cryptocurrency markets are inherently volatile. The model’s predictions are based on past data and news, which may not always predict future market conditions accurately.
  • Decision-making: The model outputs trading signals, not financial advice; users should apply appropriate risk management techniques and not rely solely on the model for financial decisions.

Limitations

  • The model’s evaluation is primarily focused on profitability (Sharpe Ratio), and it may not account for other factors such as market liquidity, transaction fees, or slippage.
  • The model may not perform well in scenarios with significant market regime changes, such as sudden regulatory shifts or unexpected global events.

Recommendations

  • Risk Management: Users should complement the model’s predictions with traditional risk management strategies and not use the model in isolation for trading.
  • Bias Awareness: Be aware of potential biases in the news sources and timeframe used in training. The model may underrepresent certain news sources or overemphasize specific types of news.

How to Get Started with the Model

To start using the model for predictions, you can follow the example code below:

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
from huggingface_hub import login

# Authenticate to download the base model and adapter weights
login("YOUR TOKEN HERE")

PROMPT = "[INST]YOUR PROMPT HERE[/INST]"  # Ministral instruction format
MAX_LENGTH = 32768  # Context length expected by the model. Do not change.
DEVICE = "cpu"  # Set to "cuda" if a GPU is available

model_id = "agarkovv/CryptoTrader-LM"  # LoRA adapter
base_model_id = "mistralai/Ministral-8B-Instruct-2410"  # base model (tokenizer source)

# AutoPeftModelForCausalLM loads the base model and applies the LoRA adapter on top
model = AutoPeftModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

model = model.to(DEVICE)
model.eval()

# Tokenize the prompt, truncating to the model's context window
inputs = tokenizer(
    PROMPT, return_tensors="pt", padding=False, max_length=MAX_LENGTH, truncation=True
)
inputs = {key: value.to(model.device) for key, value in inputs.items()}

# Generate the trading decision
res = model.generate(
    **inputs,
    use_cache=True,
    max_new_tokens=MAX_LENGTH,
)
output = tokenizer.decode(res[0], skip_special_tokens=True)
print(output)
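
The generated text contains the decision in free form. One simple, admittedly brittle, way to extract it is to scan for the decision keywords; this parsing heuristic is an assumption, not part of the official pipeline:

import re

def parse_decision(generated):
    # Extract the last buy/sell/hold keyword from the model output (heuristic)
    matches = re.findall(r"\b(buy|sell|hold)\b", generated.lower())
    return matches[-1] if matches else "hold"  # default to the most conservative action

print(parse_decision(output))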

Training Details

Training Data

The model was fine-tuned on cryptocurrency market data, including:

  • Cryptocurrency to USD exchange rates for Bitcoin (BTC) and Ethereum (ETH).
  • News articles: Textual data related to cryptocurrency markets, including news URLs, titles, sources, and publication dates. The dataset was provided in JSON format, where each entry corresponds to a piece of news relevant to the crypto market.

Data Period:

  • Training Data: 2022-01-01 to 2024-10-15.

The model was trained to correlate news sentiment, content, and cryptocurrency price trends, aiming to predict optimal trading decisions.

Training Procedure

Preprocessing

  1. Text Preprocessing: The raw news data was normalized, tokenized, and stripped of irrelevant tokens (such as stop words and special characters).
  2. Price Data Normalization: Historical price data was converted to percentage changes over time, making it easier for the model to capture price trends.
  3. Data Alignment: News articles were aligned with the corresponding time periods of the price data so the model could learn from both sources simultaneously (a sketch of steps 2 and 3 follows this list).
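
A minimal sketch of steps 2 and 3, assuming pandas DataFrames with a date column for prices and a published_at column for news (the column names are assumptions):

import pandas as pd

prices = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"]),
    "close": [46200.0, 47700.0, 46450.0],
})
news = pd.DataFrame({
    "published_at": pd.to_datetime(["2022-01-02 09:30", "2022-01-02 17:05"]),
    "title": ["BTC rallies on ETF rumors", "Miners report record hash rate"],
})

# Step 2: normalize prices to daily percentage changes
prices["pct_change"] = prices["close"].pct_change()

# Step 3: align each day's news with that day's price record
daily_news = (
    news.assign(date=news["published_at"].dt.normalize())
        .groupby("date")["title"].apply(list)
        .rename("headlines")
        .reset_index()
)
aligned = prices.merge(daily_news, on="date", how="left")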

Training Hyperparameters

  • Batch size: 1
  • Learning rate: 5e-5
  • Epochs: 3
  • Precision: Mixed precision (FP16), which helped speed up training while conserving memory.
  • Optimizer: AdamW
  • LoRA Parameters: rank 8, alpha 16, dropout 0.1 (see the configuration sketch below)
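
For reference, the listed settings correspond to a PEFT / Transformers configuration roughly like the sketch below. The target modules are an assumption (the card does not specify them); everything else mirrors the hyperparameters above:

from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=8,                                  # LoRA rank
    lora_alpha=16,                        # scaling factor
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # assumption: typical attention projections
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="cryptotrader-lora",
    per_device_train_batch_size=1,
    learning_rate=5e-5,
    num_train_epochs=3,
    fp16=True,                            # mixed precision
    optim="adamw_torch",                  # AdamW optimizer
)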

Speeds, Sizes, Times

  • Training Time: Approximately 3 hours on a 4x A100 GPU setup.
  • Model Size: 8B parameters (base model: Ministral-8B-Instruct-2410).
  • Checkpoint Size: ~16GB (roughly the size of the 8B base model in FP16; the LoRA adapter itself contributes only a small fraction of the parameters).

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on a validation set of cryptocurrency market data (both price data and news articles) covering time periods not seen during training.

Factors

The model’s evaluation primarily focuses on:

  • Profitability: The model’s ability to make profitable trading decisions.
  • Volatility Handling: How well the model adapts to market volatility.
  • Timeliness: The ability to react to time-sensitive news.

Metrics

  • Sharpe Ratio (SR): The main evaluation metric for the challenge, measuring the risk-adjusted return of the model’s trading decisions (a computation sketch follows this list).
  • Profit and Loss (PnL): The net profit or loss generated by the model’s trading decisions over a given time period.
  • Accuracy: The percentage of correct trading decisions (buy/sell/hold) compared to the optimal strategy.
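
For reference, a standard annualized Sharpe Ratio over a series of daily strategy returns can be computed as below; this generic formulation is not necessarily the exact variant used by the challenge organizers:

import numpy as np

def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=365):
    # Annualized Sharpe Ratio: mean excess return over its standard deviation
    excess = np.asarray(daily_returns) - risk_free_rate / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

# Toy example over ten daily returns
print(sharpe_ratio([0.01, -0.004, 0.007, 0.0, 0.012, -0.008, 0.005, 0.003, -0.002, 0.006]))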

Results

The model achieved a Sharpe Ratio of 0.94 on the validation set, indicating a strong risk-adjusted return. The model demonstrated consistent profitability over the testing period and effectively managed news-based volatility.

Summary

  • Sharpe Ratio: 0.94
  • Accuracy: 72%
  • Profitability: The model’s decisions resulted in an average 8% profit over the testing period.

Model Examination [optional]

Initial interpretability studies show that the model places significant weight on news headlines containing strong market sentiment indicators (e.g., "surge", "plummet"). Further analysis is recommended to explore how different types of news (e.g., regulatory updates vs. technical analysis) influence model decisions.

Environmental Impact

Carbon emissions and energy consumption estimates during model training:

  • Hardware Type: 4x NVIDIA A100 GPUs.
  • Hours used: ~3 hours of total training time.
  • Cloud Provider: AWS.
  • Compute Region: US-East.
  • Carbon Emitted: Approximately 1.1 kg CO2e, as estimated using the Machine Learning Impact calculator.

Technical Specifications

Model Architecture and Objective

  • Model Architecture: LoRA fine-tuned version of Ministral-8B-Instruct-2410, a transformer-based model optimized for instruction-following tasks.
  • Objective: To predict daily trading decisions (buy/sell/hold) for BTC/ETH based on financial news and cryptocurrency price data.

Compute Infrastructure

Hardware

  • Training Hardware: 4x NVIDIA A100 GPUs with 40GB of VRAM.
  • Inference Hardware: Can be run on a single GPU with at least 24GB of VRAM (see the loading sketch below).
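
A sketch of half-precision, single-GPU loading using standard Transformers/PEFT options (device_map="auto" requires the accelerate package):

import torch
from peft import AutoPeftModelForCausalLM

# Load the adapter together with its base model in FP16 on the available GPU(s)
model = AutoPeftModelForCausalLM.from_pretrained(
    "agarkovv/CryptoTrader-LM",
    torch_dtype=torch.float16,
    device_map="auto",
)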

Software

  • Framework: PEFT (Parameter Efficient Fine-Tuning) with Hugging Face Transformers.
  • Deep Learning Libraries: PyTorch, Hugging Face Transformers.
  • Python Version: 3.10

Citation

If you use this model in your work, please cite it as follows:

BibTeX:

@misc{CryptoTrader-LM,
  author = {300k/ns team},
  title = {CryptoTrader-LM: A LoRA-tuned Ministral-8B Model for Cryptocurrency Trading Decisions},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/agarkovv/CryptoTrader-LM}},
}

APA:

300k/ns team. (2024). CryptoTrader-LM: A LoRA-tuned Ministral-8B Model for Cryptocurrency Trading Decisions. Hugging Face. https://huggingface.co/agarkovv/CryptoTrader-LM

Glossary [optional]

  • LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning method that freezes the pretrained weight matrices and learns small low-rank update matrices (ΔW = BA, with rank r much smaller than the original matrix dimensions), allowing quicker and more memory-efficient fine-tuning.
  • BTC: The ticker symbol for Bitcoin, a decentralized cryptocurrency.
  • ETH: The ticker symbol for Ethereum, a decentralized cryptocurrency and blockchain platform.
  • Sharpe Ratio (SR): A measure of risk-adjusted return, used to evaluate the performance of an investment or trading strategy.
  • PnL (Profit and Loss): The financial gain or loss realized from trading over a specific time period.

More Information [optional]

For more information on the training process, model performance, or any specific details, please contact the model authors.

Model Card Authors [optional]

  • 300k/ns
  • Contact via Telegram: @allocfree

Model Card Contact

For any inquiries, please contact via Telegram: @allocfree

Framework Versions

  • PEFT: v0.13.2
  • Transformers: v4.33.3
  • PyTorch: v2.1.0

