---
license: apache-2.0
language:
- en
tags:
- generated_from_trainer
datasets:
- glue
metrics:
- accuracy
model-index:
- name: yujiepan/bert-base-uncased-sst2-int8-unstructured80
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: GLUE SST2
      type: glue
      config: sst2
      split: validation
      args: sst2
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.91284
base_model:
- google-bert/bert-base-uncased
base_model_relation: quantized
---
# bert-base-uncased-sst2-unstructured80-int8-ov
* Model creator: [Google](https://huggingface.co/google-bert)
* Original model: [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)
## Description
This model was produced by jointly applying unstructured magnitude pruning, quantization, and distillation to [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) while fine-tuning on the GLUE SST2 dataset.
It achieves the following results on the evaluation set:
- Torch accuracy: **0.9128**
- OpenVINO IR accuracy: **0.9128**
- Sparsity in transformer block linear layers: **0.80**
The model was converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
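As a rough illustration of what unstructured magnitude pruning at 0.80 sparsity means, the sketch below zeroes the 80% of weights with the smallest absolute values in a toy list. The function names and weight values are hypothetical and are not taken from NNCF, which raises sparsity gradually during training rather than pruning in one shot.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = int(round(len(weights) * sparsity))
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:num_to_prune]:
        pruned[i] = 0.0
    return pruned


def measured_sparsity(weights):
    """Fraction of entries that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Hypothetical weights, for illustration only.
example_weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2, -0.02, 0.6, 0.1, -0.4]
pruned_weights = magnitude_prune(example_weights, sparsity=0.8)
print(pruned_weights)
print(measured_sparsity(pruned_weights))
```

Because the pruning is unstructured, the surviving weights can sit anywhere in a layer; only the overall fraction of zeros is controlled.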
## Compatibility
The provided OpenVINO™ IR model is compatible with:
* OpenVINO version 2024.3.0 and higher
* Optimum Intel 1.19.0 and higher
## Optimization Parameters
Optimization was performed using `nncf` with the following `nncf_config.json` file:
```
[
    {
        "algorithm": "quantization",
        "preset": "mixed",
        "overflow_fix": "disable",
        "initializer": {
            "range": {
                "num_init_samples": 300,
                "type": "mean_min_max"
            },
            "batchnorm_adaptation": {
                "num_bn_adaptation_samples": 0
            }
        },
        "scope_overrides": {
            "activations": {
                "{re}.*matmul_0": {
                    "mode": "symmetric"
                }
            }
        },
        "ignored_scopes": [
            "{re}.*Embeddings.*",
            "{re}.*__add___[0-1]",
            "{re}.*layer_norm_0",
            "{re}.*matmul_1",
            "{re}.*__truediv__*"
        ]
    },
    {
        "algorithm": "magnitude_sparsity",
        "ignored_scopes": [
            "{re}.*NNCFEmbedding.*",
            "{re}.*LayerNorm.*",
            "{re}.*pooler.*",
            "{re}.*classifier.*"
        ],
        "sparsity_init": 0.0,
        "params": {
            "power": 3,
            "schedule": "polynomial",
            "sparsity_freeze_epoch": 10,
            "sparsity_target": 0.8,
            "sparsity_target_epoch": 9,
            "steps_per_epoch": 2105,
            "update_per_optimizer_step": true
        }
    }
]
```
For more information on optimization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html).
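To give a feel for how the `magnitude_sparsity` parameters above interact, the following sketch evaluates a polynomial sparsity schedule using the values from the config. The formula is a common polynomial-decay form and the `concave=True` default is an assumption; NNCF's actual scheduler may differ in detail. The `steps_per_epoch` check assumes the 67,349-example SST-2 training split and a batch size of 32 on a single device.

```python
import math

# Values copied from the magnitude_sparsity section of nncf_config.json.
SPARSITY_INIT = 0.0
SPARSITY_TARGET = 0.8
TARGET_EPOCH = 9
POWER = 3
STEPS_PER_EPOCH = 2105


def polynomial_sparsity(epoch, step=0, concave=True):
    """Sparsity level `step` optimizer steps into `epoch` (illustrative formula)."""
    # Fractional progress towards the target epoch, capped at 1.0 afterwards.
    progress = min((epoch + step / STEPS_PER_EPOCH) / TARGET_EPOCH, 1.0)
    span = SPARSITY_TARGET - SPARSITY_INIT
    if concave:
        # Fast growth early, flattening out near the target (assumed default).
        return SPARSITY_TARGET - span * (1.0 - progress) ** POWER
    # Slow growth early, accelerating towards the target.
    return SPARSITY_INIT + span * progress ** POWER


# Assumption: 67,349 training examples / batch size 32, rounded up.
assert math.ceil(67349 / 32) == STEPS_PER_EPOCH

for epoch in (0, 3, 6, 9, 10):
    print(epoch, round(polynomial_sparsity(epoch), 4))
```

With `update_per_optimizer_step` enabled, the level would be re-evaluated at every step rather than once per epoch, which is why the sketch accepts a `step` argument.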
## Running Model Training
1. Install required packages:
```
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install "optimum[openvino,nncf]"
pip install datasets sentencepiece scipy scikit-learn protobuf evaluate
pip install wandb # optional
```
2. Run model training:
```
NNCFCFG=/path/to/nncf_config.json
python run_glue.py \
--lr_scheduler_type cosine_with_restarts \
--cosine_lr_scheduler_cycles 11 6 \
--record_best_model_after_epoch 9 \
--load_best_model_at_end True \
--metric_for_best_model accuracy \
--model_name_or_path textattack/bert-base-uncased-SST-2 \
--teacher_model_or_path yoshitomo-matsubara/bert-large-uncased-sst2 \
--distillation_temperature 2 \
--task_name sst2 \
--nncf_compression_config $NNCFCFG \
--distillation_weight 0.95 \
--output_dir /tmp/bert-base-uncased-sst2-int8-unstructured80 \
--overwrite_output_dir \
--run_name bert-base-uncased-sst2-int8-unstructured80 \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 32 \
--per_device_eval_batch_size 32 \
--learning_rate 5e-05 \
--optim adamw_torch \
--num_train_epochs 17 \
--logging_steps 1 \
--evaluation_strategy steps \
--eval_steps 250 \
--save_strategy steps \
--save_steps 250 \
--save_total_limit 1 \
--fp16 \
--seed 1
```
For more details, refer to the [training configuration and script](https://gist.github.com/yujiepan-work/5d7e513a47b353db89f6e1b512d7c080).
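The `--distillation_temperature 2` and `--distillation_weight 0.95` flags in the command above can be read through a standard knowledge-distillation loss, sketched below in plain Python. This is a common textbook formulation, not necessarily the one implemented in the linked script; the function names and the logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, weight=0.95):
    """Weighted sum of a soft-label KL term and a hard-label cross-entropy term."""
    # Hard-label term: cross-entropy of the student against the true label.
    cross_entropy = -math.log(softmax(student_logits)[label])
    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s))
             for t, s in zip(teacher_probs, student_probs) if t > 0.0)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    kl *= temperature ** 2
    return weight * kl + (1.0 - weight) * cross_entropy


# Hypothetical two-class (negative/positive) logits for one SST-2 sentence.
loss = distillation_loss(student_logits=[0.5, 1.5],
                         teacher_logits=[-1.0, 3.0],
                         label=1)
print(loss)
```

Under this formulation, a weight of 0.95 means the student is driven mostly by the teacher's softened predictions, with the ground-truth labels contributing the remaining 5%.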
## Usage examples
* [OpenVINO notebooks](https://github.com/openvinotoolkit/openvino_notebooks):
- [Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sparsity-optimization/sparsity-optimization.ipynb)
## Limitations
Check the original model card for [limitations](https://huggingface.co/google-bert/bert-base-uncased).
## Legal information
The original model is distributed under the [apache-2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in the [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) model card.
## Disclaimer
Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See [Intel’s Global Human Rights Principles](https://www.intel.com/content/dam/www/central-libraries/us/en/documents/policy-human-rights.pdf). Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.