---
license: cc-by-4.0
---

# Mistral-Astronomy-7b-v0.1

Mistral-Astronomy-7b-v0.1, developed by Phanerozoic, is a specialized language model fine-tuned on "Astronomy 2e" by OpenStax. It delivers detailed, accurate answers to astronomy questions and improves markedly on the base OpenHermes 2.5 model in this domain.

## Model Description
- **Developed by:** Phanerozoic
- **License for Training Data:** Creative Commons Attribution 4.0 International (CC BY 4.0)
- **Finetuned from model:** OpenHermes 2.5

## License Details
The content of "Astronomy 2e" by OpenStax, used for training Mistral-Astronomy-7b-v0.1, is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). This license allows for the following:
- **Share:** Permission to copy and redistribute the material in any medium or format for any purpose, even commercially.
- **Adapt:** Freedom to remix, transform, and build upon the material for any purpose, even commercially.
- **Attribution Requirements:** Users must give appropriate credit, provide a link to the license, and indicate if changes were made. These requirements must be fulfilled in any reasonable manner but not in any way that suggests the licensor endorses the user or their use.
- **No Additional Restrictions:** Users may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

## Direct Use
Mistral-Astronomy-7b-v0.1 is intended for astronomy enthusiasts, educators, researchers, and anyone seeking accurate, detailed astronomical knowledge.
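
For local inference with the Hugging Face `transformers` library, a minimal loading sketch might look like the following. The repository id and the prompt format shown are assumptions, not published specifics:

```python
# Minimal inference sketch; the repository id and prompt style are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Phanerozoic/Mistral-Astronomy-7b-v0.1"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # requires the accelerate package
    torch_dtype="auto",
)

prompt = "You: What is the main sequence on the Hertzsprung-Russell diagram?\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```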

## Downstream Use
The model is ideal for applications requiring specialized astronomical knowledge, such as virtual planetariums, educational software, and research assistance tools.

## Out-of-Scope Use
While specialized in astronomy, Mistral-Astronomy-7b-v0.1 may not perform optimally in non-astronomical contexts and is not intended for general language tasks.

## Performance Comparison
Mistral-Astronomy-7b-v0.1 scores a higher (i.e., worse) perplexity on Wikitext than OpenHermes 2.5, but it produces more accurate and detailed responses within astronomy, with some trade-offs in response formatting.

## Bias, Risks, and Limitations
The model's focus on astronomy means that its performance in other domains may be limited. Users should consider this when applying the model outside its specialized area.

## Custom Stopping Strings Usage
To enhance response clarity and structure, the following custom stopping strings are used (a usage sketch follows the list):
- "},"
- "User:"
- "You:"
- "\"\n"
- "\nUser"
- "\nUser:"

## Training Data
Approximately 1000 question-and-answer sets derived from "Astronomy 2e" by OpenStax were used for training, ensuring context-specific and structured training input.
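
The exact record format is not published here; purely as an illustration, one structured Q&A entry might resemble the following (field names and wording are assumptions, not actual dataset contents):

```python
# Illustrative only: a plausible shape for one training record, not the real dataset format.
example_record = {
    "question": "Why do we always see the same side of the Moon from Earth?",
    "answer": "The Moon is tidally locked to Earth: its rotation period equals its orbital "
              "period, so the same hemisphere always faces us.",
}
```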

## Training Hyperparameters
The model was fine-tuned with a LoRA adapter using the settings below (a configuration sketch follows the list):
- **Training Regime:** FP32
- **Warmup Steps:** 1
- **Per Device Train Batch Size:** 1
- **Gradient Accumulation Steps:** 32
- **Max Steps:** 1000
- **Learning Rate:** 0.0002
- **Logging Steps:** 1
- **Save Steps:** 1
- **LoRA Alpha:** 32
- **LoRA Rank (Dimension Count):** 16
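
For reference, this is a hedged sketch of how those settings could map onto a `peft` + `transformers` configuration; the output path and trainer wiring are assumptions, not the author's actual training script:

```python
# Configuration sketch only; mirrors the hyperparameters listed above.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,             # LoRA rank ("Dimension Count")
    lora_alpha=32,    # LoRA alpha
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="mistral-astronomy-7b-v0.1",  # assumed output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=32,
    max_steps=1000,
    learning_rate=2e-4,
    warmup_steps=1,
    logging_steps=1,
    save_steps=1,
    fp16=False,
    bf16=False,  # FP32 training regime
)
```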

## Compute Infrastructure
- **Hardware Type:** RTX 6000 Ada GPU
- **Training Duration:** ~15 minutes

## Acknowledgments and Attribution
Special thanks to the OpenStax team for "Astronomy 2e," the foundational content for the Mistral-Astronomy-7b-v0.1 model. This work is based on "Astronomy 2e" by OpenStax, which is licensed under [Creative Commons Attribution 4.0 International License (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/). Changes were made to the original text for the purpose of creating this language model. This acknowledgment does not imply endorsement by OpenStax or the original authors.

Further appreciation is extended to the Mistral and OpenHermes 2.5 teams for their foundational work in language modeling.