---
license: mit
language:
- en
tags:
- bittensor
---
MIT License
Copyright (c) 2024 Taoshi Inc
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# Background
The models provided here were created using the open source modeling techniques
in https://github.com/taoshidev/time-series-prediction-subnet (TSPS).
They were trained using `runnable/miner_training.py` and tested against
existing models using `runnable/miner_testing.py`.
> **Note**<br>
This model requires the Feature Set Creator (FSC) functionality added in the
latest release of the TSPS.
# Build Strategy
This section outlines the strategy used to build the models.
## Understanding Dataset Used
The dataset used to build the models can be generated using
`runnable/generate_historical_data.py`. The model was trained on a lookback
period from June 2023 to January 2024 at the 5m interval. Recent data was
used because it correlates more closely with current market and
macroeconomic conditions.
Data from January 2024 to February 2024 was used for testing to determine the
performance of the models. Testing was run using `runnable/miner_testing.py`
with live historical data sources.
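As a rough illustration of the split described above, the sketch below counts the 5m candles in each period and splits rows chronologically. The exact boundary dates (the first day of each month, in UTC) and the helper names `candle_count` and `split_by_time` are assumptions made for this example; they are not taken from the TSPS code.

```python
from datetime import datetime, timedelta, timezone

INTERVAL = timedelta(minutes=5)

# Assumed boundaries: the text above names only months, not exact dates.
TRAIN_START = datetime(2023, 6, 1, tzinfo=timezone.utc)
TEST_START = datetime(2024, 1, 1, tzinfo=timezone.utc)
TEST_END = datetime(2024, 2, 1, tzinfo=timezone.utc)


def candle_count(start, end, interval=INTERVAL):
    """Number of whole candles in the half-open range [start, end)."""
    return (end - start) // interval


def split_by_time(rows, cutoff):
    """Split (timestamp, ...) rows chronologically, without shuffling."""
    train = [row for row in rows if row[0] < cutoff]
    test = [row for row in rows if row[0] >= cutoff]
    return train, test


train_candles = candle_count(TRAIN_START, TEST_START)
test_candles = candle_count(TEST_START, TEST_END)
```

Splitting on a timestamp cutoff rather than at random keeps every test candle later than every training candle, which avoids leaking future prices into training.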
## Understanding Model Creation
The model currently uses only the following features to predict:
- close
- high
- low
- volume
- time of day
- time of week
- time of month
Other features from a wide range of feature sources will be added to the TSPS
infrastructure in the near future as improvements to the FSC.
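This README does not state how the three time features in the list above are encoded. One plausible encoding, shown here only as an assumption, is the fraction of the current day, week and month that has elapsed at the candle's timestamp. The function name `time_features` is hypothetical and is not part of the FSC.

```python
import calendar
from datetime import datetime, timezone


def time_features(ts):
    """Return (time_of_day, time_of_week, time_of_month), each in [0, 1).

    Assumption: each feature is the elapsed fraction of its period.
    The encoding actually used by the FSC may differ.
    """
    seconds_in_day = ts.hour * 3600 + ts.minute * 60 + ts.second
    time_of_day = seconds_in_day / 86400
    # weekday() is 0 for Monday, so the week starts at Monday 00:00.
    time_of_week = (ts.weekday() + time_of_day) / 7
    days_in_month = calendar.monthrange(ts.year, ts.month)[1]
    time_of_month = (ts.day - 1 + time_of_day) / days_in_month
    return time_of_day, time_of_week, time_of_month


# Example: noon UTC on Monday, 1 January 2024.
features = time_features(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))
```

Together with close, high, low and volume, these three values would account for the `feature_count=7` used in the model configuration.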
A variety of windows and parameters were tested and eliminated. The final
strategy to derive this model was the following:
```
model = BaseMiningModel(
    filename="model_v5_1.h5",
    mode="w",
    feature_count=7,
    sample_count=500,
    prediction_feature_count=1,
    prediction_count=10,
    prediction_length=100,
    layers=[
        [1024, 0],
        [1024, 0.3],
    ],
    learning_rate=0.000001,
    dtype=Policy("mixed_float16"),
)
```
The LSTM model has two stacked layers of 1024 units each, with a 0.3 dropout
rate on the second layer.
## Understanding Training Decisions
Training was done with 500 samples per scenario and 128 scenarios per batch,
with 20 training epochs and 10 passes over the entire dataset. Additional
epochs and passes were not found to improve the model's predictions.
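To make the batch layout concrete, the sketch below slices a list of feature rows into scenarios of 500 consecutive samples and groups them 128 per batch. It assumes that scenarios start at every consecutive row, that the close is stored at index 0 of each row, and that the 10 targets are the closes 10, 20, ..., 100 steps after the last input row. The names `make_scenario` and `iter_batches` are hypothetical; `runnable/miner_training.py` may build its batches differently.

```python
def make_scenario(rows, start, sample_count=500, prediction_count=10,
                  prediction_length=100, close_index=0):
    """Return (inputs, targets) for the scenario beginning at row `start`."""
    spacing = prediction_length // prediction_count
    end = start + sample_count
    inputs = rows[start:end]
    # Assumed targets: closes at evenly spaced steps after the last input row.
    targets = [rows[end - 1 + spacing * k][close_index]
               for k in range(1, prediction_count + 1)]
    return inputs, targets


def iter_batches(rows, batch_size=128, sample_count=500, prediction_length=100):
    """Yield lists of scenarios; the final batch may be smaller."""
    last_start = len(rows) - sample_count - prediction_length
    batch = []
    for start in range(last_start + 1):
        batch.append(make_scenario(rows, start, sample_count=sample_count,
                                   prediction_length=prediction_length))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Under these assumptions a full batch holds 128 inputs of shape 500 x 7 and 128 target vectors of length 10.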
## Strategy to Predict
The strategy to predict 100 closes into the future was to make 10
predictions evenly spaced along the length of the prediction space, and then
linearly interpolate between consecutive predictions. By doing so, the model
could learn to predict the general shape of the market movement, rather than
predicting all 100 closes individually.
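The interpolation step can be sketched as follows. This is a minimal example, assuming the 10 predictions sit at steps 10, 20, ..., 100 and that the last observed close anchors step 0; the README does not state the exact placement, and the function name `interpolate_predictions` is hypothetical.

```python
def interpolate_predictions(last_close, predictions, prediction_length=100):
    """Expand sparse predictions into one close per future step.

    Assumption: predictions are evenly spaced and end at prediction_length,
    with the last observed close used as the value at step 0.
    """
    count = len(predictions)
    if count == 0 or prediction_length % count != 0:
        raise ValueError("prediction_length must be a multiple of the prediction count")
    spacing = prediction_length // count
    anchors = [last_close] + list(predictions)
    closes = []
    for step in range(1, prediction_length + 1):
        left = (step - 1) // spacing  # anchor just before this step
        fraction = (step - left * spacing) / spacing
        start, end = anchors[left], anchors[left + 1]
        closes.append(start + (end - start) * fraction)
    return closes


# Example: 10 sparse predictions expanded to 100 closes.
sparse = [110.0, 120.0, 130.0, 140.0, 150.0, 160.0, 170.0, 180.0, 190.0, 200.0]
closes = interpolate_predictions(100.0, sparse)
```

Each sparse prediction is reproduced exactly at its own step, and the steps in between lie on the straight line joining their two neighbours.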