Update README.md
README.md
CHANGED
@@ -93,7 +93,7 @@ We compared our merged model's performance on the hallucination detection benchm

 Scores from arize/phoenix

-As shown in the table, our merged model achieves competitive performance, with an F1 score of 0.
+As shown in the table, our merged model achieves competitive performance, with an F1 score of 0.83, matching or outperforming several state-of-the-art language models on this hallucination detection task.

 ## Model description
