phanerozoic committed
Commit 847ce97
1 Parent(s): c943865

Update README.md
Files changed (1): README.md (+68, -3)

---
license: cc-by-nc-4.0
language:
- en
tags:
- bert
- named-entity-recognition
- conll2003
widget:
- text: |
    Enter your text here for Named Entity Recognition.
  example_title: "NER Prediction"
---

# BERT-NER-Classifier

The BERT-NER-Classifier is a token-classification model based on the `bert-base-uncased` architecture, fine-tuned for Named Entity Recognition (NER) on the CoNLL-2003 dataset to identify persons, organizations, locations, and miscellaneous entities in text.

- **Developed by**: phanerozoic
- **Model type**: BertForTokenClassification
- **Source model**: `bert-base-uncased`
- **License**: cc-by-nc-4.0
- **Languages**: English

## Model Details

The BERT-NER-Classifier uses BERT's self-attention mechanism, which weighs the importance of each word relative to the others in its context, with a token-classification head on top for NER.

### Configuration
- **Attention probs dropout prob**: 0.1
- **Hidden act**: gelu
- **Hidden size**: 768
- **Number of attention heads**: 12
- **Number of hidden layers**: 12

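These values can be read straight from the published configuration. A minimal sketch, assuming the model is hosted under the repo id `phanerozoic/BERT-NER-Classifier` (an assumption based on the model name):

```python
# Minimal sketch: load the config and check the values listed above.
# The repo id is assumed from the model name; adjust if it differs.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("phanerozoic/BERT-NER-Classifier")

print(config.hidden_size)                   # 768
print(config.hidden_act)                    # gelu
print(config.num_attention_heads)           # 12
print(config.num_hidden_layers)             # 12
print(config.attention_probs_dropout_prob)  # 0.1
```
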
## Training and Evaluation Data

The model was trained and evaluated on the CoNLL-2003 dataset, a standard NER benchmark of English news text annotated with named entities.

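For reference, a short sketch (not from the original card) of loading CoNLL-2003 with the `datasets` library and inspecting one annotated sentence:

```python
# Load CoNLL-2003 and look at one annotated example.
from datasets import load_dataset

dataset = load_dataset("conll2003")
example = dataset["train"][0]
print(example["tokens"])    # pre-tokenized words
print(example["ner_tags"])  # integer indices into the tag vocabulary below
print(dataset["train"].features["ner_tags"].feature.names)
# ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']
```
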
## Training Procedure

Model training was guided by an automated script that explored the hyperparameter space, iteratively training and evaluating the model to identify the most effective settings; a fine-tuning sketch with the final settings appears after the hyperparameter list below.

- **Initial exploratory training**: various combinations of epochs, batch sizes, and learning rates were tried.
- **Refinement and focused training**: once the best-performing hyperparameters were identified, the model was trained three additional times to confirm stability and consistency of performance.

### Optimal Hyperparameters Identified
- **Epochs**: 3
- **Batch size**: 32
- **Learning rate**: 3e-5

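Below is a minimal fine-tuning sketch using these settings. It is an illustrative reconstruction, not the author's actual search script; the first-subword label alignment shown is one common convention:

```python
# Illustrative fine-tuning with the hyperparameters above; not the
# author's original script. Labels are aligned to the first subword of
# each word; other subwords and special tokens get -100 (ignored in loss).
from datasets import load_dataset
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          DataCollatorForTokenClassification, Trainer,
                          TrainingArguments)

dataset = load_dataset("conll2003")
label_names = dataset["train"].features["ner_tags"].feature.names
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize_and_align(batch):
    enc = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
    all_labels = []
    for i, tags in enumerate(batch["ner_tags"]):
        prev_word, row = None, []
        for word_id in enc.word_ids(batch_index=i):
            row.append(-100 if word_id is None or word_id == prev_word
                       else tags[word_id])
            prev_word = word_id
        all_labels.append(row)
    enc["labels"] = all_labels
    return enc

tokenized = dataset.map(tokenize_and_align, batched=True)
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(label_names))

args = TrainingArguments(
    output_dir="bert-ner-classifier",
    num_train_epochs=3,               # optimal epochs
    per_device_train_batch_size=32,   # optimal batch size
    learning_rate=3e-5,               # optimal learning rate
    evaluation_strategy="epoch",
)
trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["validation"],
                  data_collator=DataCollatorForTokenClassification(tokenizer),
                  tokenizer=tokenizer)
trainer.train()
```
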
### Performance
The refined training runs produced a model with strong validation metrics:
- **Validation Precision**: 0.8702
- **Validation Recall**: 0.8209
- **Validation F1 Score**: 0.8430

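The card does not state the evaluation code; entity-level precision, recall, and F1 for CoNLL-style NER are conventionally computed with `seqeval`, as in this small sketch:

```python
# Entity-level metrics on BIO-tagged sequences, as conventionally
# computed for CoNLL-style NER with seqeval (an assumption; the exact
# evaluation code for this model is not published in the card).
from seqeval.metrics import f1_score, precision_score, recall_score

y_true = [["B-PER", "I-PER", "O", "B-ORG", "O"]]
y_pred = [["B-PER", "I-PER", "O", "O", "O"]]
print(precision_score(y_true, y_pred))  # 1.0   (the one predicted entity is correct)
print(recall_score(y_true, y_pred))     # 0.5   (one of two gold entities found)
print(f1_score(y_true, y_pred))         # 0.667 (harmonic mean of the two)
```
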
## Usage

The model identifies named entities in English text and is most effective on material similar to the CoNLL-2003 data it was trained on. A minimal inference example follows.

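A quick inference sketch with the `transformers` pipeline; the repo id is assumed from the model name:

```python
# Quick inference sketch; the repo id is assumed from the model name.
from transformers import pipeline

ner = pipeline("token-classification",
               model="phanerozoic/BERT-NER-Classifier",
               aggregation_strategy="simple")  # merge subwords into entity spans

for entity in ner("George Washington lived in Philadelphia."):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
# Expected groups: PER for "george washington", LOC for "philadelphia"
# (the uncased tokenizer lowercases input text).
```
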
## Limitations

While the model performs well on text similar to its training data (CoNLL-2003), performance may vary on text from other domains or in other languages. Future enhancements could expand the training data to more diverse text sources.

## Acknowledgments

Thanks to the developers of the BERT architecture and to the Hugging Face team; their tools and frameworks were instrumental in the development of this model.