ViravirastSHZ/Hafez_Bert

Aminrhmni committed on Jul 29

Commit 2f6bb0f • 1 Parent(s): ad38ac0

Update README.md

Files changed (1)

README.md +3 -3

@@ -13,11 +13,11 @@ The paragraph describes the development of a language model named "Hafez," which
 <b>Model Type:</b> Hafez is based on the BERT architecture, which is a popular model for natural language processing (NLP).
-Cultural Reference: The model is named after Hafez, a renowned Persian poet known for his deeply emotional and philosophical verses. This choice of name suggests a connection to Persian literature and an intention to handle language in a way that may resonate with the cultural significance of the poet. (NLP).
-Training Data: The model has been trained on a substantial dataset comprising over 12 billion tokens. The text used to train the Hafez language model is comprised of two parts: 90% consists of educational materials, including research papers, dissertations, and theses, while the remaining 10% includes general texts. This careful selection of content aims to provide the model with a strong foundation in academic language and discourse.
-Text Cleaning and Preprocessing: The training data underwent a cleaning and preprocessing phase, which is essential for ensuring that the data is of high quality and suitable for training a machine learning model. The cleaning and preparation were conducted using "Viravirast text tools," which are likely specialized tools designed for text processing in this context.
+<b>Cultural Reference:</b> The model is named after Hafez, a renowned Persian poet known for his deeply emotional and philosophical verses. This choice of name suggests a connection to Persian literature and an intention to handle language in a way that may resonate with the cultural significance of the poet. (NLP).
+<b>Training Data:</b> The model has been trained on a substantial dataset comprising over 12 billion tokens. The text used to train the Hafez language model is comprised of two parts: 90% consists of educational materials, including research papers, dissertations, and theses, while the remaining 10% includes general texts. This careful selection of content aims to provide the model with a strong foundation in academic language and discourse.
+<b>Text Cleaning and Preprocessing:</b> The training data underwent a cleaning and preprocessing phase, which is essential for ensuring that the data is of high quality and suitable for training a machine learning model. The cleaning and preparation were conducted using "Viravirast text tools," which are likely specialized tools designed for text processing in this context.
 ### How to use