model_v5 / README.md

taoshi-mbrown

Update README.md

d5bca6b verified 7 months ago

	---
	license: mit
	language:
	- en
	tags:
	- bittensor
	---
	MIT License

	Copyright (c) 2024 Taoshi Inc

	Permission is hereby granted, free of charge, to any person obtaining a copy
	of this software and associated documentation files (the "Software"), to deal
	in the Software without restriction, including without limitation the rights
	to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
	copies of the Software, and to permit persons to whom the Software is
	furnished to do so, subject to the following conditions:

	The above copyright notice and this permission notice shall be included in all
	copies or substantial portions of the Software.

	THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
	IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
	FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
	AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
	LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
	OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
	SOFTWARE.

	# Background

	The models provided here were built with the open source modeling techniques
	in https://github.com/taoshidev/time-series-prediction-subnet (TSPS). They
	were trained with `runnable/miner_training.py` and tested against existing
	models with `runnable/miner_testing.py`.

	> Note<br>
	This model requires the Feature Set Creator (FSC) functionality added in the
	latest release of the TSPS.

	# Build Strategy

	This section outlines the strategy used to build the models.

	## Understanding Dataset Used

	The dataset used to build the models can be generated with
	`runnable/generate_historical_data.py`. The model was trained on 5m interval
	data from a lookback period between June 2023 and January 2024. Recent data
	was used because it more closely reflects current market and macroeconomic
	conditions.

	Data from between January 2024 and February 2024 was used to test the
	performance of the models. Testing was run with `runnable/miner_testing.py`
	against live historical data sources.
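
As a rough illustration of the split described above, the sketch below partitions timestamped rows into a training range and a testing range and counts the 5m candles each range would hold. The exact boundary days are assumptions (only months are named here), and `split_by_date` and `expected_candles` are hypothetical helpers, not part of TSPS.

```python
from datetime import datetime, timedelta, timezone

# Assumed boundaries: only the months are stated, not the exact days.
TRAIN_START = datetime(2023, 6, 1, tzinfo=timezone.utc)
TEST_START = datetime(2024, 1, 1, tzinfo=timezone.utc)
TEST_END = datetime(2024, 2, 1, tzinfo=timezone.utc)
INTERVAL = timedelta(minutes=5)


def split_by_date(rows):
    """Split (timestamp, close) rows into train and test lists by timestamp."""
    train = [r for r in rows if TRAIN_START <= r[0] < TEST_START]
    test = [r for r in rows if TEST_START <= r[0] < TEST_END]
    return train, test


def expected_candles(start, end, interval=INTERVAL):
    """Number of whole intervals in the half-open range [start, end)."""
    return (end - start) // interval
```

Keeping the two ranges disjoint means no candle is used for both training and testing.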


	## Understanding Model Creation

	The model currently uses only the following features to predict:
	- close
	- high
	- low
	- volume
	- time of day
	- time of week
	- time of month
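
How TSPS encodes the three time features is not described here. One plausible encoding, shown purely as an assumption, expresses each as a fraction of its period in the range [0, 1). `time_features` is a hypothetical helper and the Monday week start is also an assumption.

```python
import calendar
from datetime import datetime, timezone


def time_features(ts):
    """Return (time_of_day, time_of_week, time_of_month) as fractions in [0, 1).

    Assumed encoding for illustration only; TSPS may compute these differently.
    """
    seconds_into_day = ts.hour * 3600 + ts.minute * 60 + ts.second
    time_of_day = seconds_into_day / 86400
    # weekday() returns 0 for Monday, so the week is taken to start on Monday.
    time_of_week = (ts.weekday() + time_of_day) / 7
    days_in_month = calendar.monthrange(ts.year, ts.month)[1]
    time_of_month = (ts.day - 1 + time_of_day) / days_in_month
    return time_of_day, time_of_week, time_of_month
```

Together with close, high, low, and volume, these three values would make up the seven inputs per candle.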

	Other features from a wide range of feature sources will be added to the
	TSPS infrastructure in the near future as improvements to the FSC.

	A variety of windows and parameters were tested and eliminated. The final
	configuration used to derive this model was the following:

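	```
	model = BaseMiningModel(
	    filename="model_v5_1.h5",
	    mode="w",
	    feature_count=7,             # the seven input features listed above
	    sample_count=500,            # input samples per scenario
	    prediction_feature_count=1,
	    prediction_count=10,         # sparse predictions produced by the model
	    prediction_length=100,       # closes covered by those predictions
	    layers=[
	        # one entry per layer; the second value is the dropout rate
	        [1024, 0],
	        [1024, 0.3],
	    ],
	    learning_rate=0.000001,
	    dtype=Policy("mixed_float16"),
	)
	```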

	The LSTM model has two stacked layers. Per the configuration above, the 0.3
	dropout rate is set on the second layer only.
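
For a sense of scale, the sketch below applies the parameter-count formula for a standard LSTM layer with a single bias vector to the configuration above. It assumes that the first value in each `layers` entry is the layer's unit count and that the first layer sees the 7 input features directly; `lstm_param_count` is a hypothetical helper, and any non-LSTM layers in the actual model are not counted.

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters in one standard LSTM layer with a bias term.

    Each of the four gates holds an input kernel, a recurrent kernel,
    and a bias vector.
    """
    return 4 * (units * (input_dim + units) + units)


# Assumed reading of the config: 7 input features, two layers of 1024 units.
first_layer = lstm_param_count(input_dim=7, units=1024)
second_layer = lstm_param_count(input_dim=1024, units=1024)
total_lstm_params = first_layer + second_layer
```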

	## Understanding Training Decisions

	Training used 500 samples per scenario and 128 scenarios per batch, with 20
	training epochs and 10 passes over the entire dataset. Additional epochs and
	passes did not improve the model's predictions.
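
To make these numbers concrete, the sketch below shows one way a series could be cut into training scenarios: each scenario pairs 500 consecutive input samples with the 100 closes that follow, and scenarios are grouped 128 at a time. The helper names and the `stride` parameter are assumptions for illustration; the actual TSPS data pipeline may differ.

```python
SAMPLE_COUNT = 500        # input samples per scenario
PREDICTION_LENGTH = 100   # future closes covered by each scenario
BATCH_SIZE = 128          # scenarios per batch


def make_scenarios(rows, closes, stride=1):
    """Yield (inputs, future_closes) pairs from aligned rows and closes.

    `rows` holds one feature vector per 5m candle and `closes` holds the
    matching close prices. `stride` is an assumed parameter.
    """
    last_start = len(rows) - SAMPLE_COUNT - PREDICTION_LENGTH
    for start in range(0, last_start + 1, stride):
        split = start + SAMPLE_COUNT
        yield rows[start:split], closes[split:split + PREDICTION_LENGTH]


def batched(scenarios, batch_size=BATCH_SIZE):
    """Group scenarios into lists of at most `batch_size` items."""
    batch = []
    for scenario in scenarios:
        batch.append(scenario)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

A series shorter than 600 candles yields no scenarios, since a scenario needs 500 inputs plus 100 future closes.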

	## Strategy to Predict

	To predict 100 closes into the future, the model produces 10 predictions
	evenly spaced along the length of the prediction space, and the values
	between them are filled in by linear interpolation. This lets the model learn
	the general shape of the market movement rather than predict all 100 closes
	individually.
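
The interpolation step can be sketched as follows. This is a minimal illustration, not the TSPS implementation: it assumes that prediction k (counting from 1) lands on future step k * 10 and that the last known close anchors step 0. `interpolate_predictions` is a hypothetical helper.

```python
def interpolate_predictions(last_close, predictions, prediction_length=100):
    """Expand sparse predictions into one value per future step.

    Assumes prediction k (1-based) lands on step k * spacing and that the
    last known close anchors step 0. Both placements are assumptions.
    """
    count = len(predictions)
    if count == 0 or prediction_length % count != 0:
        raise ValueError(
            "prediction_length must be a positive multiple of the prediction count"
        )
    spacing = prediction_length // count
    anchors = [last_close] + list(predictions)
    closes = []
    for i in range(count):
        left, right = anchors[i], anchors[i + 1]
        # Walk from just after the left anchor up to and including the right one.
        for step in range(1, spacing + 1):
            closes.append(left + (right - left) * step / spacing)
    return closes
```

With 10 predictions and a prediction length of 100, every tenth returned value equals one of the model's predictions and the nine values before it lie on the straight line from the previous anchor.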