Merge branch 'main' of hf.co:Lihuchen/pearl_small
README.md
CHANGED
@@ -6,25 +6,30 @@ tags:
 - Phrase Representation
 - String Matching
 - Fuzzy Join
-pipeline_tag: sentence-similarity
 ---
 ## PEARL-small
-
 [Learning High-Quality and General-Purpose Phrase Representations](https://arxiv.org/pdf/2401.10407.pdf). <br>
-[Lihu Chen](chenlihu.com), [Gaël Varoquaux](https://gael-varoquaux.info/), [Fabian M. Suchanek](https://suchanek.name/)
+[Lihu Chen](https://chenlihu.com), [Gaël Varoquaux](https://gael-varoquaux.info/), [Fabian M. Suchanek](https://suchanek.name/).
+<br> Accepted by EACL Findings 2024 <br>
 
 PEARL-small is finetuned on [E5-small](https://huggingface.co/intfloat/e5-small-v2),
-which can yield better representations for
+which can yield better representations for phrases and strings. <br>
+If you require semantic similarity computation for strings, our PEARL model might be a helpful tool.<br>
+It offers powerful embeddings suitable for tasks like string matching, entity retrieval, entity clustering, and fuzzy join.
+
+🤗 [PEARL-small](https://huggingface.co/Lihuchen/pearl_small) 🤗 [PEARL-base](https://huggingface.co/Lihuchen/pearl_base)
+<br>
+
 
-| Model |Size| PPDB | PPDB filtered |Turney|BIRD|YAGO|UMLS|CoNLL|BC5CDR|AutoFJ|
+| Model |Size|Avg| PPDB | PPDB filtered |Turney|BIRD|YAGO|UMLS|CoNLL|BC5CDR|AutoFJ|
 |-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
-| FastText |-|
-| Sentence-BERT |110M| 94.6 | 66.8 | 50.4 | 62.6 | 21.6|23.6|25.5|48.4| 57.2|
-| Phrase-BERT |110M| 96.8 | 68.7 | 57.2 | 68.8 |23.7|26.1|35.4| 59.5|66.9|
-| E5-small |34M| 96.0| 56.8|55.9| 63.1|43.3| 42.0|27.6| 53.7|74.8|
-|E5-base|110M|
-|PEARL-small|34M|
-|PEARL-base|110M|97.3|72.2|59.7|72.6|50.7|45.8|39.3|69.4|77.1|
+| FastText |-| 40.3| 94.4 | 61.2 | 59.6 | 58.9 |16.9|14.5|3.0|0.2| 53.6|
+| Sentence-BERT |110M|50.1| 94.6 | 66.8 | 50.4 | 62.6 | 21.6|23.6|25.5|48.4| 57.2|
+| Phrase-BERT |110M|54.5| 96.8 | 68.7 | 57.2 | 68.8 |23.7|26.1|35.4| 59.5|66.9|
+| E5-small |34M|57.0| 96.0| 56.8|55.9| 63.1|43.3| 42.0|27.6| 53.7|74.8|
+|E5-base|110M| 61.1| 95.4|65.6|59.4|66.3| 47.3|44.0|32.0| 69.3|76.1|
+|PEARL-small|34M| 62.5| 97.0|70.2|57.9|68.1| 48.1|44.5|42.4|59.3|75.2|
+|PEARL-base|110M|64.8|97.3|72.2|59.7|72.6|50.7|45.8|39.3|69.4|77.1|
 
 ## Usage
 
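
The fuzzy-join use case named in the added README text can be pictured as a nearest-neighbour lookup over phrase embeddings. The following is a minimal sketch and not the README's own usage code: the 3-dimensional vectors are made-up toy values standing in for PEARL embeddings, and `cosine`, `fuzzy_join`, and the `threshold` value are hypothetical names and settings chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def fuzzy_join(left, right, threshold=0.8):
    """Match each left string to its most similar right string.

    `left` and `right` map strings to embedding vectors. A left string
    whose best similarity falls below `threshold` is mapped to None.
    """
    matches = {}
    for name, vec in left.items():
        best, best_sim = None, -1.0
        for cand, cand_vec in right.items():
            sim = cosine(vec, cand_vec)
            if sim > best_sim:
                best, best_sim = cand, sim
        matches[name] = best if best_sim >= threshold else None
    return matches


# Toy vectors (assumed for illustration); real embeddings would come from the model.
left = {"NYC": [0.9, 0.1, 0.0], "LA": [0.1, 0.9, 0.1]}
right = {
    "New York City": [0.8, 0.2, 0.1],
    "Los Angeles": [0.0, 1.0, 0.0],
    "Paris": [0.0, 0.1, 0.9],
}
result = fuzzy_join(left, right)
print(result)
```

With real embeddings the same loop applies unchanged; only the vectors and a threshold tuned on held-out pairs would differ.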