---
license: cc-by-4.0
---

# Mistral-Astronomy-7b-v0.1

Mistral-Astronomy-7b-v0.1, developed by Phanerozoic, is a specialized language model fine-tuned on "Astronomy 2e" by OpenStax. This model is adept at providing detailed and accurate responses in the field of astronomy, significantly improving upon the capabilities of the base OpenHermes 2.5 model in this specific domain.

## Model Description
- **Developed by:** Phanerozoic
- **License for Training Data:** Creative Commons Attribution 4.0 International (CC BY 4.0)
- **Finetuned from model:** OpenHermes 2.5

## License Details
The content of "Astronomy 2e" by OpenStax, used for training Mistral-Astronomy-7b-v0.1, is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). This license allows for the following:
- **Share:** Permission to copy and redistribute the material in any medium or format for any purpose, even commercially.
- **Adapt:** Freedom to remix, transform, and build upon the material for any purpose, even commercially.
- **Attribution Requirements:** Users must give appropriate credit, provide a link to the license, and indicate if changes were made. These requirements may be fulfilled in any reasonable manner, but not in any way that suggests the licensor endorses the user or their use.
- **No Additional Restrictions:** Users may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

## Direct Use
Mistral-Astronomy-7b-v0.1 is particularly suitable for astronomy enthusiasts, educators, researchers, and anyone seeking accurate and detailed knowledge in astronomy.

## Downstream Use
The model is ideal for applications requiring specialized astronomical knowledge, such as virtual planetariums, educational software, and research-assistance tools.

## Out-of-Scope Use
While specialized in astronomy, Mistral-Astronomy-7b-v0.1 may not perform optimally in non-astronomical contexts and is not intended for general language tasks.

## Performance Comparison
Although it has a higher (worse) perplexity score on WikiText than OpenHermes 2.5, Mistral-Astronomy-7b-v0.1 provides more accurate and detailed responses in the field of astronomy, at the cost of some formatting regressions.

## Bias, Risks, and Limitations
The model's focus on astronomy means that its performance in other domains may be limited. Users should consider this when applying the model outside its specialized area.

## Custom Stopping Strings Usage
To enhance response clarity and structure, the following custom stopping strings are used at generation time (a usage sketch follows the list):
- "},"
- "User:"
- "You:"
- "\"\n"
- "\nUser"
- "\nUser:"

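A minimal sketch of applying these stopping strings with Hugging Face `transformers`; the repository id `phanerozoic/Mistral-Astronomy-7b-v0.1` and the example prompt are assumptions for illustration:

```python
# Minimal sketch: stop generation as soon as the decoded continuation
# contains one of the model card's stopping strings.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          StoppingCriteria, StoppingCriteriaList)

STOP_STRINGS = ["},", "User:", "You:", "\"\n", "\nUser", "\nUser:"]

class StopOnStrings(StoppingCriteria):
    """Return True once any stop string appears in the generated text."""

    def __init__(self, tokenizer, stop_strings, prompt_len):
        self.tokenizer = tokenizer
        self.stop_strings = stop_strings
        self.prompt_len = prompt_len  # number of prompt tokens to skip

    def __call__(self, input_ids, scores, **kwargs):
        new_text = self.tokenizer.decode(input_ids[0][self.prompt_len:])
        return any(s in new_text for s in self.stop_strings)

repo_id = "phanerozoic/Mistral-Astronomy-7b-v0.1"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "User: What causes a total solar eclipse?\nYou:"
inputs = tokenizer(prompt, return_tensors="pt")
stopping = StoppingCriteriaList(
    [StopOnStrings(tokenizer, STOP_STRINGS, inputs.input_ids.shape[1])]
)
output = model.generate(**inputs, max_new_tokens=256,
                        stopping_criteria=stopping)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that the criterion halts generation as soon as a stop string appears; trimming the stop string itself from the output is left to the caller.
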
## Training Data
Approximately 1,000 question-and-answer pairs derived from "Astronomy 2e" by OpenStax were used for training, ensuring context-specific and structured training input. A hypothetical example of such a pair is sketched below.

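The exact record format is not documented here; purely as an illustration (the field names and text are invented, not taken from the actual dataset), a pair derived from the textbook might look like:

```python
# Hypothetical training record; field names and content are invented
# for illustration and do not come from the actual dataset.
example_pair = {
    "question": "Why do stars twinkle when viewed from Earth's surface?",
    "answer": "Turbulence in Earth's atmosphere continually refracts "
              "incoming starlight, so a star's apparent position and "
              "brightness fluctuate slightly, which we see as twinkling.",
}
```
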
## Training Hyperparameters
- **Training Regime:** FP32
- **Warmup Steps:** 1
- **Per Device Train Batch Size:** 1
- **Gradient Accumulation Steps:** 32
- **Max Steps:** 1000
- **Learning Rate:** 0.0002
- **Logging Steps:** 1
- **Save Steps:** 1

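For concreteness, the listed settings map onto Hugging Face `TrainingArguments` roughly as follows; the output directory is an assumption, and any setting not listed above keeps the library default:

```python
# Sketch of the listed hyperparameters as Hugging Face TrainingArguments.
# output_dir is an assumption; unlisted settings keep library defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-astronomy-7b-v0.1",  # assumed
    per_device_train_batch_size=1,
    gradient_accumulation_steps=32,
    warmup_steps=1,
    max_steps=1000,
    learning_rate=2e-4,
    logging_steps=1,
    save_steps=1,
    fp16=False,  # FP32 training regime: mixed precision disabled
    bf16=False,
)
```

With a per-device batch size of 1 and 32 gradient-accumulation steps, the effective batch size is 32 sequences per optimizer step.
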
## Compute Infrastructure
- **Hardware Type:** RTX 6000 Ada GPU
- **Training Duration:** ~15 minutes

## Acknowledgments and Attribution
Special thanks to the OpenStax team for "Astronomy 2e," the foundational content for the Mistral-Astronomy-7b-v0.1 model. This work is based on "Astronomy 2e" by OpenStax, which is licensed under the [Creative Commons Attribution 4.0 International License (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/). Changes were made to the original text for the purpose of creating this language model. This acknowledgment does not imply endorsement by OpenStax or the original authors.

Further appreciation is extended to the Mistral and OpenHermes 2.5 teams for their foundational work in language modeling.