Thalesian committed
Commit a63b0e0 • 1 parent: 68062e3

Update README.md

Files changed (1): README.md (+52 -7)
README.md CHANGED
@@ -6,27 +6,72 @@ model-index:
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # t5-small-p-l-akk-en-20240910-174859

- This model was trained from scratch on the None dataset.

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
 
  results: []
  ---

+ A model for translating cuneiform to English using [Google's t5-small](https://huggingface.co/google-t5/t5-small) as a baseline.
 
+ - Akkadian: π’„Ώ π’ˆΎ π’Œ— π’ƒΆ π’Œ“ 𒐉 π’†š π’€€ π’ˆΎ 𒆳 π’†Έ π’„­ 𒇻 𒁺 π’…… 𒆳 𒁀 π’€€ 𒍝 𒆳 π’Š“ π’…ˆ 𒁀 π’‡· π’€€ 𒆳 𒁲 𒁺 π’€€ π’†· π’€€ 𒁲 π’Œ· π’ˆ¨ π’Œ π’‰Œ 𒃻 π’…† 𒁲 π’€€ 𒇉 π’Š’ π’Œ‘ π’Š’ π’Š­ 𒆳 π’ˆ¨ π’„΄ π’Š‘ 𒀝 π’‹€ π’Š© π’†· π’‹’ 𒉑 𒃻 π’‹— π’ˆ¨ π’Œ π’‹— 𒉑 π’Œ‘ π’ŠΊ 𒍝 π’€€ π’€€ π’ˆΎ π’Œ· π’…€ π’€Έ π’‹© π’Œ’ π’†·'
+ - English: 'in the month kislimu the fourth day i marched to the land habhu i conquered the lands bazu sarbaliu and didualu together with the cities on the banks of the river ruru of the land mehru i brought forth their booty and possessions and brought them to my city assur'
+ - Prediction: 'in the mo nth tammuz iv i conquered the land s que and que i conquered the land s que and bi t yakin i conquered the cities f ro m the river i conquered and plundered the cities on the bo rd er of the land elam'

+ Note that the training loss does not reflect full training: this model was trained at expanding context sizes (56 -> 512) restricted to complete sequences. It was trained on cuneiform -> English, transliteration, and grouping in both directions to reinforce itself. It is an instruct model, so it requires an instruction prompt to interpret data.

+ # akk-111m

+ This model was trained from scratch on the [Akkademia dataset](https://github.com/gaigutherz/Akkademia).
+ It achieves the following categorical cross-entropy results on the evaluation set (512 tokens):
+ - Loss: 0.0753

+ Cuneiform -> English BLEU score
+ - 500 tokens: 38.91
+ - 100 tokens: 43.13

+ Transliterated -> English BLEU score
+ - 500 tokens: 37.02
+ - 100 tokens: 41.67

+ Cuneiform -> Transliteration BLEU score
+ - 500 tokens: 94.31
+ - 100 tokens: 94.36

+ Cuneiform -> Transliteration accuracy
+ - 100 tokens: 50% (note that a single missed character sharply reduces exact-match accuracy in seq2seq models; see the BLEU scores for a position-tolerant measure)
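To make the accuracy caveat concrete, here is an illustrative sketch (not the project's evaluation code) contrasting exact-match accuracy with a simple unigram-overlap score; the helper names and the five-token example are invented for illustration.

```python
from collections import Counter

def exact_match(pred: list[str], ref: list[str]) -> float:
    """Sequence-level exact match: all-or-nothing credit."""
    return 1.0 if pred == ref else 0.0

def unigram_overlap(pred: list[str], ref: list[str]) -> float:
    """Fraction of reference tokens reproduced in the prediction
    (a crude stand-in for the n-gram overlap BLEU is built from)."""
    if not ref:
        return 0.0
    pred_counts = Counter(pred)
    matched = sum(min(count, pred_counts[tok]) for tok, count in Counter(ref).items())
    return matched / len(ref)

reference = ["a", "na", "URU", "as", "sur"]   # hypothetical 5-sign sequence
prediction = ["a", "na", "E2", "as", "sur"]   # one sign wrong

exact_match(prediction, reference)     # 0.0 - exact match gives no credit
unigram_overlap(prediction, reference) # 0.8 - still credits 4 of 5 tokens
```

One wrong sign drops exact match to zero, while overlap-based scores still credit the four correct tokens, which is why the BLEU numbers above are the more informative measure.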
 
  ## Model description

+ This is an instruct model, meaning it is capable of multiple tasks. It is intended primarily for translation and transliteration, but it can also be used for reverse translation.

+ ### Translation Instructions:
+ - "Translate Akkadian cuneiform to English" + cuneiform signs -> English
+ - "Translate Akkadian simple transliteration to English" + simple transliteration -> English
+ - "Translate Akkadian grouped transliteration to English" + transliteration with special symbols -> English
+ - "Translate English to Akkadian cuneiform" + English -> Akkadian cuneiform signs
+ - "Translate English to simple Akkadian transliteration" + English -> Akkadian simple transliteration with no special symbols
+ - "Translate English to grouped Akkadian transliteration" + English -> Akkadian transliteration grouped into words with special symbols

+ ### Transliteration Instructions:
+ - "Transliterate Akkadian cuneiform to simple Latin characters" + cuneiform signs -> transliteration with no special symbols
+ - "Transliterate Akkadian cuneiform to grouped Latin characters" + cuneiform signs -> transliteration with special symbols/subscripts
+ - "Group Akkadian transliteration into likely words" + simple transliteration -> transliteration with special symbols/subscripts

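The instruction strings above can be wired into a small prompt helper. A minimal sketch, assuming the instruction is simply prepended to the input; the task keys, the `build_prompt` helper, and the `": "` separator are assumptions for illustration, not this card's documented API:

```python
# Illustrative only: the instruction strings come from the lists above,
# but the dictionary keys and the ": " separator are assumptions.
TASK_PREFIXES = {
    "cuneiform->en": "Translate Akkadian cuneiform to English",
    "simple->en": "Translate Akkadian simple transliteration to English",
    "grouped->en": "Translate Akkadian grouped transliteration to English",
    "en->cuneiform": "Translate English to Akkadian cuneiform",
    "en->simple": "Translate English to simple Akkadian transliteration",
    "en->grouped": "Translate English to grouped Akkadian transliteration",
    "cuneiform->simple": "Transliterate Akkadian cuneiform to simple Latin characters",
    "cuneiform->grouped": "Transliterate Akkadian cuneiform to grouped Latin characters",
    "group-words": "Group Akkadian transliteration into likely words",
}

def build_prompt(task: str, text: str) -> str:
    """Prepend the instruction for `task` to the raw input text."""
    return f"{TASK_PREFIXES[task]}: {text}"
```

With a seq2seq toolkit such as Hugging Face transformers, the resulting string would be tokenized and passed to `generate()` as usual; consult the repository for the exact prompt format the checkpoint expects.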
55
 
  ## Intended uses & limitations

+ This model is designed to facilitate the translation and transliteration of Akkadian cuneiform. It may have limited facility in the reverse direction (e.g. translating English to Akkadian cuneiform), but these use cases are untested.
59
 
  ## Training and evaluation data

+ Data from the [Akkademia project](https://github.com/gaigutherz/Akkademia), previously published in [PNAS Nexus](https://academic.oup.com/pnasnexus/article/2/5/pgad096/7147349), was used. More information on the training data, as well as the test and validation splits, can be found in both the GitHub repository and the published methodology.
63
 
  ## Training procedure

+ Because of the unequal distribution of sequence lengths (many short sequences alongside long ones), the data was trained with different padded lengths:
+ - An initial few epochs with a max length of 56 tokens
+ - A follow-up series of epochs at 128 tokens
+ - The same for 256 tokens
+ - A final 5 epochs at 512 tokens

+ The original t5-small model had its tokenizer vocabulary and embedding layers expanded to cover the additional linguistic data. Cuneiform signs were split by spaces and fed directly into the model, following the instructions detailed above.
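Because the signs are space-separated, the per-sign split described above is plain whitespace tokenization; a minimal sketch (the `split_signs` helper name is an assumption):

```python
def split_signs(cuneiform: str) -> list[str]:
    """Split a space-separated cuneiform string into individual signs,
    each of which can then be added to the tokenizer vocabulary."""
    return [sign for sign in cuneiform.split(" ") if sign]
```

In transformers terms, new signs are typically added with `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`; treat that pairing as the standard recipe rather than this card's documented procedure.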
  ### Training hyperparameters
  The following hyperparameters were used during training: