English transformer pipeline based on en_core_web_trf, plus an entity recognizer (the entity_ruler component) based on the ESCO taxonomy. The transformer is configured as Transformer(name='roberta-base', piece_encoder='byte-bpe', stride=104, type='roberta', width=768, window=144, vocab_size=50265). Components: transformer, tagger, parser, attribute_ruler, lemmatizer, ner, entity_ruler.
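The window=144 and stride=104 settings suggest that long inputs are split into overlapping spans before they reach the transformer. The sketch below only illustrates that idea under a simplifying assumption: a new span starts every `stride` units, covers at most `window` units, and the last span is clipped to the end of the input. The function name `strided_windows` is hypothetical and this is not the library's actual implementation.

```python
from typing import List, Tuple


def strided_windows(n_units: int, window: int = 144, stride: int = 104) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of overlapping spans covering n_units.

    Illustrative only: assumes a span starts every `stride` units and
    spans at most `window` units, clipped to the end of the input.
    """
    spans = []
    start = 0
    while start < n_units:
        end = min(start + window, n_units)
        spans.append((start, end))
        if end == n_units:
            break  # the input is fully covered
        start += stride
    return spans


if __name__ == "__main__":
    # With these defaults, consecutive full spans overlap by window - stride = 40 units.
    for span in strided_windows(300):
        print(span)
```

Under these assumptions, any input of at most 144 units fits in a single span.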
Feature | Description |
---|---|
Name | en_core_web_trf_esco_ner |
Version | 3.7.3 |
spaCy | >=3.7.2,<3.8.0 |
Default Pipeline | transformer , tagger , parser , attribute_ruler , lemmatizer , ner , entity_ruler |
Components | transformer , tagger , parser , attribute_ruler , lemmatizer , ner , entity_ruler |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston) ClearNLP Constituent-to-Dependency Conversion (Emory University) WordNet 3.0 (Princeton University) roberta-base (Yinhan Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and Mike Lewis and Luke Zettlemoyer and Veselin Stoyanov) |
License | MIT |
Author | robipolli@gmail.com |
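A minimal usage sketch, assuming the package named in the table above is already installed alongside a compatible spaCy version. The helper `entities_by_label` and the example sentence are hypothetical, and the entities actually returned depend on the model.

```python
from collections import defaultdict

PIPELINE_NAME = "en_core_web_trf_esco_ner"  # package name from the table above


def entities_by_label(doc):
    """Group entity texts by label for any object exposing spaCy-style `ents`."""
    grouped = defaultdict(list)
    for ent in doc.ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


if __name__ == "__main__":
    try:
        import spacy

        # spacy.load raises OSError when the package is not installed.
        nlp = spacy.load(PIPELINE_NAME)
    except (ImportError, OSError) as exc:
        print(f"Pipeline not available: {exc}")
    else:
        # Hypothetical input sentence, chosen only for illustration.
        doc = nlp("We are hiring a data engineer in Rome with experience in Python.")
        print(nlp.pipe_names)
        print(entities_by_label(doc))
```

Entities from the statistical ner component and from the entity_ruler both end up in `doc.ents`, so grouping by label is one way to separate the ESCO matches from the OntoNotes labels.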
Label Scheme
View label scheme (113 labels for 4 components)
Component | Labels |
---|---|
tagger | $ , '' , , , -LRB- , -RRB- , . , : , ADD , AFX , CC , CD , DT , EX , FW , HYPH , IN , JJ , JJR , JJS , LS , MD , NFP , NN , NNP , NNPS , NNS , PDT , POS , PRP , PRP$ , RB , RBR , RBS , RP , SYM , TO , UH , VB , VBD , VBG , VBN , VBP , VBZ , WDT , WP , WP$ , WRB , XX , ```` |
parser | ROOT , acl , acomp , advcl , advmod , agent , amod , appos , attr , aux , auxpass , case , cc , ccomp , compound , conj , csubj , csubjpass , dative , dep , det , dobj , expl , intj , mark , meta , neg , nmod , npadvmod , nsubj , nsubjpass , nummod , oprd , parataxis , pcomp , pobj , poss , preconj , predet , prep , prt , punct , quantmod , relcl , xcomp |
ner | CARDINAL , DATE , EVENT , FAC , GPE , LANGUAGE , LAW , LOC , MONEY , NORP , ORDINAL , ORG , PERCENT , PERSON , PRODUCT , QUANTITY , TIME , WORK_OF_ART |
entity_ruler | ESCO |
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 99.86 |
TOKEN_P | 99.57 |
TOKEN_R | 99.58 |
TOKEN_F | 99.57 |
TAG_ACC | 98.13 |
SENTS_P | 94.89 |
SENTS_R | 85.79 |
SENTS_F | 90.11 |
DEP_UAS | 95.26 |
DEP_LAS | 93.91 |
ENTS_P | 90.08 |
ENTS_R | 90.30 |
ENTS_F | 90.19 |
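The F columns can be cross-checked against the precision and recall columns. The sketch below assumes the usual definition of the F-score as the harmonic mean of precision and recall, and recomputes it from the rounded values in the table above. Because the inputs are already rounded, small differences in the last digit are possible.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the accuracy table, in percent.
reported = {
    "ENTS": (90.08, 90.30, 90.19),
    "SENTS": (94.89, 85.79, 90.11),
}

if __name__ == "__main__":
    for name, (p, r, f_reported) in reported.items():
        print(f"{name}_F recomputed: {f_score(p, r):.2f} (reported: {f_reported})")
```

The gap between SENTS_P and SENTS_R explains why SENTS_F sits closer to the lower recall value than a simple average would.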
Evaluation results
- NER Precision (self-reported): 0.901
- NER Recall (self-reported): 0.903
- NER F Score (self-reported): 0.902
- TAG (XPOS) Accuracy (self-reported): 0.981
- Unlabeled Attachment Score (UAS) (self-reported): 0.953
- Labeled Attachment Score (LAS) (self-reported): 0.939
- Sentences F-Score (self-reported): 0.901