---
license: apache-2.0
language:
- en
tags:
- generated_from_trainer
datasets:
- glue
metrics:
- accuracy
model-index:
- name: yujiepan/bert-base-uncased-sst2-int8-unstructured80
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: GLUE SST2
      type: glue
      config: sst2
      split: validation
      args: sst2
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.91284
base_model:
- google-bert/bert-base-uncased
base_model_relation: quantized
---

# bert-base-uncased-sst2-unstructured80-int8-ov

* Model creator: [Google](https://huggingface.co/google-bert)
* Original model: [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)

## Description

This model was produced by applying unstructured magnitude pruning, quantization, and distillation jointly to [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) while fine-tuning on the GLUE SST2 dataset.
It achieves the following results on the evaluation set:

- Torch accuracy: **0.9128**
- OpenVINO IR accuracy: **0.9128**
- Sparsity in transformer block linear layers: **0.80**

The model was converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
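
To make the 0.80 sparsity figure concrete, here is a minimal sketch of unstructured magnitude pruning on a toy weight list. It illustrates the general technique only and is not NNCF's implementation: NNCF raises sparsity gradually during training, whereas this sketch prunes once. The helper names `magnitude_prune` and `sparsity_of` and the weight values are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    Hypothetical helper for illustration; not part of NNCF or OpenVINO.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = int(round(len(weights) * sparsity))
    # Rank weight positions by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_positions = set(order[:num_to_prune])
    # "Unstructured" means any individual weight may be zeroed,
    # with no constraint on rows, columns, or blocks.
    return [0.0 if i in pruned_positions else w for i, w in enumerate(weights)]


def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Toy weights (made up for this example).
weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2, -0.15, 0.6, -0.02, 0.1]
pruned = magnitude_prune(weights, 0.8)
print(pruned)
print(sparsity_of(pruned))
```

With a target of 0.8, only the two largest-magnitude weights survive. In the real model the same idea is applied to the linear layers of the transformer blocks, while embeddings, layer norms, the pooler, and the classifier are excluded.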

## Compatibility

The provided OpenVINO™ IR model is compatible with:

* OpenVINO version 2024.3.0 and higher
* Optimum Intel 1.19.0 and higher

## Optimization Parameters

Optimization was performed using `nncf` with the following `nncf_config.json` file:

```
[
    {
        "algorithm": "quantization",
        "preset": "mixed",
        "overflow_fix": "disable",
        "initializer": {
            "range": {
                "num_init_samples": 300,
                "type": "mean_min_max"
            },
            "batchnorm_adaptation": {
                "num_bn_adaptation_samples": 0
            }
        },
        "scope_overrides": {
            "activations": {
                "{re}.*matmul_0": {
                    "mode": "symmetric"
                }
            }
        },
        "ignored_scopes": [
            "{re}.*Embeddings.*",
            "{re}.*__add___[0-1]",
            "{re}.*layer_norm_0",
            "{re}.*matmul_1",
            "{re}.*__truediv__*"
        ]
    },
    {
        "algorithm": "magnitude_sparsity",
        "ignored_scopes": [
            "{re}.*NNCFEmbedding.*",
            "{re}.*LayerNorm.*",
            "{re}.*pooler.*",
            "{re}.*classifier.*"
        ],
        "sparsity_init": 0.0,
        "params": {
            "power": 3,
            "schedule": "polynomial",
            "sparsity_freeze_epoch": 10,
            "sparsity_target": 0.8,
            "sparsity_target_epoch": 9,
            "steps_per_epoch": 2105,
            "update_per_optimizer_step": true
        }
    }
]
```

For more information on optimization, see the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html).

## Running Model Training

1. Install required packages:

```
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install optimum[openvino,nncf]
pip install datasets sentencepiece scipy scikit-learn protobuf evaluate
pip install wandb # optional
```

2. Run model training:

```
NNCFCFG=/path/to/nncf_config.json
python run_glue.py \
  --lr_scheduler_type cosine_with_restarts \
  --cosine_lr_scheduler_cycles 11 6 \
  --record_best_model_after_epoch 9 \
  --load_best_model_at_end True \
  --metric_for_best_model accuracy \
  --model_name_or_path textattack/bert-base-uncased-SST-2 \
  --teacher_model_or_path yoshitomo-matsubara/bert-large-uncased-sst2 \
  --distillation_temperature 2 \
  --task_name sst2 \
  --nncf_compression_config $NNCFCFG \
  --distillation_weight 0.95 \
  --output_dir /tmp/bert-base-uncased-sst2-int8-unstructured80 \
  --overwrite_output_dir \
  --run_name bert-base-uncased-sst2-int8-unstructured80 \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --per_device_eval_batch_size 32 \
  --learning_rate 5e-05 \
  --optim adamw_torch \
  --num_train_epochs 17 \
  --logging_steps 1 \
  --evaluation_strategy steps \
  --eval_steps 250 \
  --save_strategy steps \
  --save_steps 250 \
  --save_total_limit 1 \
  --fp16 \
  --seed 1
```

For more details, refer to the [training configuration and script](https://gist.github.com/yujiepan-work/5d7e513a47b353db89f6e1b512d7c080).

## Usage examples

* [OpenVINO notebooks](https://github.com/openvinotoolkit/openvino_notebooks):
  - [Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sparsity-optimization/sparsity-optimization.ipynb)

## Limitations

Check the original model card for [limitations](https://huggingface.co/google-bert/bert-base-uncased).

## Legal information

The original model is distributed under the [apache-2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in the [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) model card.

## Disclaimer

Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights.
See [Intel’s Global Human Rights Principles](https://www.intel.com/content/dam/www/central-libraries/us/en/documents/policy-human-rights.pdf). Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.