metadata
library_name: sklearn
tags:
- sklearn
- skops
- tabular-classification
model_format: pickle
model_file: model.joblib
Model description
[More Information Needed]
Intended uses & limitations
[More Information Needed]
Training Procedure
Hyperparameters
The model is trained with the hyperparameters listed below.
Hyperparameter | Value |
---|---|
memory | |
steps | [('feature_extraction', ColumnTransformer(transformers=[('abbreviations', <__main__.ELFAbbreviationTransformer object at 0x7f38e082e4f0>, 0), ('tokenizer', CountVectorizer(binary=True, lowercase=False, tokenizer=<__main__.LegalEntityTokenizer object at 0x7f38e082ee50>), 0)])), ('classifier', ComplementNB())] |
verbose | False |
feature_extraction | ColumnTransformer(transformers=[('abbreviations', <__main__.ELFAbbreviationTransformer object at 0x7f38e082e4f0>, 0), ('tokenizer', CountVectorizer(binary=True, lowercase=False, tokenizer=<__main__.LegalEntityTokenizer object at 0x7f38e082ee50>), 0)]) |
classifier | ComplementNB() |
feature_extraction__n_jobs | |
feature_extraction__remainder | drop |
feature_extraction__sparse_threshold | 0.3 |
feature_extraction__transformer_weights | |
feature_extraction__transformers | [('abbreviations', <__main__.ELFAbbreviationTransformer object at 0x7f38e082e4f0>, 0), ('tokenizer', CountVectorizer(binary=True, lowercase=False, tokenizer=<__main__.LegalEntityTokenizer object at 0x7f38e082ee50>), 0)] |
feature_extraction__verbose | False |
feature_extraction__verbose_feature_names_out | True |
feature_extraction__abbreviations | <__main__.ELFAbbreviationTransformer object at 0x7f38e082e4f0> |
feature_extraction__tokenizer | CountVectorizer(binary=True, lowercase=False, tokenizer=<__main__.LegalEntityTokenizer object at 0x7f38e082ee50>) |
feature_extraction__abbreviations__elf_abbreviations | <__main__.ELFAbbreviations object at 0x7f38f438b670> |
feature_extraction__abbreviations__jurisdiction | PL |
feature_extraction__abbreviations__use_endswith | True |
feature_extraction__abbreviations__use_lowercasing | True |
feature_extraction__tokenizer__analyzer | word |
feature_extraction__tokenizer__binary | True |
feature_extraction__tokenizer__decode_error | strict |
feature_extraction__tokenizer__dtype | <class 'numpy.int64'> |
feature_extraction__tokenizer__encoding | utf-8 |
feature_extraction__tokenizer__input | content |
feature_extraction__tokenizer__lowercase | False |
feature_extraction__tokenizer__max_df | 1.0 |
feature_extraction__tokenizer__max_features | |
feature_extraction__tokenizer__min_df | 1 |
feature_extraction__tokenizer__ngram_range | (1, 1) |
feature_extraction__tokenizer__preprocessor | |
feature_extraction__tokenizer__stop_words | |
feature_extraction__tokenizer__strip_accents | |
feature_extraction__tokenizer__token_pattern | (?u)\b\w\w+\b |
feature_extraction__tokenizer__tokenizer | <__main__.LegalEntityTokenizer object at 0x7f38e082ee50> |
feature_extraction__tokenizer__vocabulary | |
classifier__alpha | 1.0 |
classifier__class_prior | |
classifier__fit_prior | True |
classifier__norm | False |
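
The table describes a two-step pipeline: a `ColumnTransformer` that sends column 0 to both an abbreviation transformer and a binary `CountVectorizer`, followed by a `ComplementNB` classifier. The sketch below builds a pipeline with the same shape. The custom classes `ELFAbbreviationTransformer` and `LegalEntityTokenizer` are not included in this card, so `AbbreviationFlags`, `whitespace_tokenizer`, the sample company names, and the labels `LLC` and `JSC` are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import Pipeline


class AbbreviationFlags(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for ELFAbbreviationTransformer.

    Emits one 0/1 column per known abbreviation.
    """

    def __init__(self, abbreviations=("sp. z o.o.", "s.a."),
                 use_endswith=True, use_lowercasing=True):
        self.abbreviations = abbreviations
        self.use_endswith = use_endswith
        self.use_lowercasing = use_lowercasing

    def fit(self, X, y=None):
        return self  # nothing to learn in this sketch

    def transform(self, X):
        rows = []
        for name in X:
            text = name.lower() if self.use_lowercasing else name
            if self.use_endswith:
                rows.append([float(text.endswith(a)) for a in self.abbreviations])
            else:
                rows.append([float(a in text) for a in self.abbreviations])
        return np.array(rows)


def whitespace_tokenizer(text):
    """Hypothetical stand-in for LegalEntityTokenizer."""
    return text.split()


pipeline = Pipeline(steps=[
    ("feature_extraction", ColumnTransformer(transformers=[
        # The scalar column index 0 hands each transformer a 1-D array of strings.
        ("abbreviations", AbbreviationFlags(), 0),
        ("tokenizer", CountVectorizer(binary=True, lowercase=False,
                                      tokenizer=whitespace_tokenizer,
                                      # None avoids the unused-pattern warning
                                      token_pattern=None), 0),
    ])),
    ("classifier", ComplementNB()),
])

# Toy data, invented for illustration only.
X_train = np.array([["Alfa sp. z o.o."], ["Beta S.A."],
                    ["Gamma sp. z o.o."], ["Delta S.A."]], dtype=object)
y_train = ["LLC", "JSC", "LLC", "JSC"]

pipeline.fit(X_train, y_train)
prediction = pipeline.predict(np.array([["Omega sp. z o.o."]], dtype=object))
print(prediction)
```

Both transformers read the same column, so the classifier sees the abbreviation flags and the token indicators side by side. `ComplementNB` requires non-negative features, which both stand-ins satisfy.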
Model Plot
The model plot is below.
Pipeline(steps=[('feature_extraction', ColumnTransformer(transformers=[('abbreviations', <__main__.ELFAbbreviationTransformer object at 0x7f38e082e4f0>, 0), ('tokenizer', CountVectorizer(binary=True, lowercase=False, tokenizer=<__main__.LegalEntityTokenizer object at 0x7f38e082ee50>), 0)])), ('classifier', ComplementNB())])
Evaluation Results
The evaluation metrics and their values are listed below.
Metric | Value |
---|---|
f1 | 0.971647 |
f1 macro | 0.522164 |
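
A large gap between an overall F1 score and the macro-averaged F1 score often indicates class imbalance: the macro average weights every class equally, so rare classes that are predicted poorly pull it down. The card does not state how the `f1` row is averaged. The sketch below assumes a micro average (which equals accuracy for single-label classification) and uses invented toy labels `A`, `B`, and `C` to show how the two numbers can diverge.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1; for single-label data this is plain accuracy."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy, heavily imbalanced data (hypothetical): class A dominates.
y_true = ["A"] * 96 + ["B"] * 2 + ["C"] * 2
y_pred = ["A"] * 96 + ["A"] * 2 + ["C", "A"]

micro = micro_f1(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
print(f"micro F1: {micro:.4f}")
print(f"macro F1: {macro:.4f}")
```

In this toy case the dominant class is predicted almost perfectly, while class `B` is never predicted and contributes an F1 of 0 to the macro average.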
How to Get Started with the Model
[More Information Needed]
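
As a hedged starting point, the metadata names `model.joblib` as the model file, so loading it with `joblib` is a plausible first step. This is a minimal sketch, not official usage: the helper name `load_model` is hypothetical. The pickle also references classes defined in `__main__` (`ELFAbbreviationTransformer`, `LegalEntityTokenizer`, `ELFAbbreviations`), which this card does not provide; they would need to be defined in the loading script before the call succeeds.

```python
from pathlib import Path

import joblib


def load_model(path="model.joblib"):
    """Load a joblib-serialized object, failing early if the file is missing."""
    model_path = Path(path)
    if not model_path.is_file():
        raise FileNotFoundError(f"No model file at {model_path}")
    # joblib.load unpickles the file, which can execute arbitrary code,
    # so only call this on files from a source you trust.
    return joblib.load(model_path)
```

Once the custom classes are defined in the running script, `load_model()` would return the fitted pipeline, and `predict` could then be called on an array with the entity names in column 0.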
Model Card Authors
This model card was written by the following authors:
[More Information Needed]
Model Card Contact
You can contact the model card authors through the following channels: [More Information Needed]
Citation
Below you can find information related to citation.
BibTeX:
[More Information Needed]