---
license: apache-2.0
language:
- en
tags:
- generated_from_trainer
datasets:
- glue
metrics:
- accuracy
model-index:
- name: yujiepan/bert-base-uncased-sst2-int8-unstructured80
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: GLUE SST2
      type: glue
      config: sst2
      split: validation
      args: sst2
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.91284
base_model:
- google-bert/bert-base-uncased
base_model_relation: quantized
---

# bert-base-uncased-sst2-unstructured80-int8-ov

 * Model creator: [Google](https://huggingface.co/google-bert)
 * Original model: [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)

## Description

This model was produced by applying unstructured magnitude pruning, quantization, and distillation simultaneously to [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) while fine-tuning on the GLUE SST2 dataset.
It achieves the following results on the evaluation set:
- Torch accuracy: **0.9128**
- OpenVINO IR accuracy: **0.9128**
- Sparsity in transformer block linear layers: **0.80**

The model was converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
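
The reported sparsity figure can be sanity-checked by counting zeroed weights in the encoder's linear layers of the pruned Torch checkpoint. A minimal sketch (the checkpoint path below is hypothetical):

```python
import torch
from transformers import AutoModelForSequenceClassification

# Hypothetical local path to the pruned Torch checkpoint; adjust as needed.
model = AutoModelForSequenceClassification.from_pretrained("/path/to/torch-checkpoint")

zeros = total = 0
for name, module in model.named_modules():
    # Count zeros only in transformer-block linear layers, matching the figure above.
    if isinstance(module, torch.nn.Linear) and "encoder" in name:
        zeros += (module.weight == 0).sum().item()
        total += module.weight.numel()

print(f"Encoder linear-layer sparsity: {zeros / total:.2f}")  # expect ~0.80
```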

## Compatibility

The provided OpenVINO™ IR model is compatible with:

* OpenVINO version 2024.3.0 and higher
* Optimum Intel 1.19.0 and higher

## Optimization Parameters

Optimization was performed using `nncf` with the following `nncf_config.json` file:

```json
[
    {
        "algorithm": "quantization",
        "preset": "mixed",
        "overflow_fix": "disable",
        "initializer": {
            "range": {
                "num_init_samples": 300,
                "type": "mean_min_max"
            },
            "batchnorm_adaptation": {
                "num_bn_adaptation_samples": 0
            }
        },
        "scope_overrides": {
            "activations": {
                "{re}.*matmul_0": {
                    "mode": "symmetric"
                }
            }
        },
        "ignored_scopes": [
            "{re}.*Embeddings.*",
            "{re}.*__add___[0-1]",
            "{re}.*layer_norm_0",
            "{re}.*matmul_1",
            "{re}.*__truediv__*"
        ]
    },
    {
        "algorithm": "magnitude_sparsity",
        "ignored_scopes": [
            "{re}.*NNCFEmbedding.*",
            "{re}.*LayerNorm.*",
            "{re}.*pooler.*",
            "{re}.*classifier.*"
        ],
        "sparsity_init": 0.0,
        "params": {
            "power": 3,
            "schedule": "polynomial",
            "sparsity_freeze_epoch": 10,
            "sparsity_target": 0.8,
            "sparsity_target_epoch": 9,
            "steps_per_epoch": 2105,
            "update_per_optimizer_step": true
        }
    }
]
```
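
The `magnitude_sparsity` section ramps the sparsity level from 0.0 to 0.8 over the first 9 epochs (updated every optimizer step, per `update_per_optimizer_step`) and freezes it at epoch 10. As an illustration only, a concave polynomial ramp with `power: 3` behaves as below; NNCF's own scheduler is the authoritative implementation and may differ in detail:

```python
def sparsity_at(step: int,
                steps_per_epoch: int = 2105,
                target_epoch: int = 9,
                init: float = 0.0,
                target: float = 0.8,
                power: int = 3) -> float:
    """Illustrative concave polynomial ramp, not NNCF's exact formula."""
    progress = min(step / (target_epoch * steps_per_epoch), 1.0)
    return target - (target - init) * (1.0 - progress) ** power

print(sparsity_at(0))                   # 0.0   at the start of training
print(round(sparsity_at(4 * 2105), 3))  # ~0.663 mid-ramp
print(sparsity_at(9 * 2105))            # 0.8   once the target epoch is reached
```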

For more information on optimization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html).
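
For reference, a configuration like this is attached to a PyTorch model through NNCF's Torch frontend; the training script in the next section does this via its `--nncf_compression_config` flag. A minimal sketch, assuming `model` is the fine-tuned Torch model and `train_dataloader` yields SST-2 batches (NNCF may additionally require an `input_info` section in the config):

```python
from nncf import NNCFConfig
from nncf.torch import create_compressed_model, register_default_init_args

nncf_config = NNCFConfig.from_json("nncf_config.json")
# The quantizer range initializer calibrates on samples from this loader.
nncf_config = register_default_init_args(nncf_config, train_dataloader)
# Wraps the model with fake-quantize ops and sparsity masks for fine-tuning.
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)
```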

## Running Model Training

1. Install required packages:

```sh
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install optimum[openvino,nncf]
pip install datasets sentencepiece scipy scikit-learn protobuf evaluate
pip install wandb # optional
```

2. Run model training:

```sh
NNCFCFG=/path/to/nncf_config.json
python run_glue.py \
  --lr_scheduler_type cosine_with_restarts \
  --cosine_lr_scheduler_cycles 11 6 \
  --record_best_model_after_epoch 9 \
  --load_best_model_at_end True \
  --metric_for_best_model accuracy \
  --model_name_or_path textattack/bert-base-uncased-SST-2 \
  --teacher_model_or_path yoshitomo-matsubara/bert-large-uncased-sst2 \
  --distillation_temperature 2 \
  --task_name sst2 \
  --nncf_compression_config $NNCFCFG \
  --distillation_weight 0.95 \
  --output_dir /tmp/bert-base-uncased-sst2-int8-unstructured80 \
  --overwrite_output_dir \
  --run_name bert-base-uncased-sst2-int8-unstructured80 \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --per_device_eval_batch_size 32 \
  --learning_rate 5e-05 \
  --optim adamw_torch \
  --num_train_epochs 17 \
  --logging_steps 1 \
  --evaluation_strategy steps \
  --eval_steps 250 \
  --save_strategy steps \
  --save_steps 250 \
  --save_total_limit 1 \
  --fp16 \
  --seed 1
```

For more details, refer to the [training configuration and script](https://gist.github.com/yujiepan-work/5d7e513a47b353db89f6e1b512d7c080).
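
The `--teacher_model_or_path`, `--distillation_temperature 2`, and `--distillation_weight 0.95` flags control knowledge distillation from the BERT-large teacher. The exact objective is defined in the linked script; a standard formulation consistent with those flags is sketched below:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      distill_weight: float = 0.95) -> torch.Tensor:
    """Standard KD sketch; the authoritative loss lives in the linked script."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2  # rescale so KD gradients match the CE term's scale
    return distill_weight * kd + (1.0 - distill_weight) * ce
```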

## Usage examples

* [OpenVINO notebooks](https://github.com/openvinotoolkit/openvino_notebooks):
  - [Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sparsity-optimization/sparsity-optimization.ipynb)
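
Beyond the notebooks, a minimal inference sketch with Optimum Intel might look like this (the model ID below assumes this repository's Hugging Face path; replace it if it differs):

```python
from optimum.intel import OVModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

# Assumed repository id for this model card.
model_id = "OpenVINO/bert-base-uncased-sst2-unstructured80-int8-ov"

model = OVModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("A gorgeous, witty, seductive movie."))
```

On 4th Gen Intel® Xeon® processors, the notebook linked above additionally shows how to exploit the 80% sparsity via sparse weight decompression.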

## Limitations

Check the original model card for [limitations](https://huggingface.co/google-bert/bert-base-uncased).

## Legal information

The original model is distributed under the [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in the [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) model card.

## Disclaimer

Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See [Intel’s Global Human Rights Principles](https://www.intel.com/content/dam/www/central-libraries/us/en/documents/policy-human-rights.pdf). Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.