File size: 7,221 Bytes
186701e
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
# Model Complexity Analysis

We provide a `tools/analysis_tools/get_flops.py` script to help with the complexity analysis for models of MMYOLO.
Currently, it provides the interfaces to compute parameter, activation and flops of the given model,
and supports printing the related information layer-by-layer in terms of network structure or table.

The commands as follows:

```shell
python tools/analysis_tools/get_flops.py
    ${CONFIG_FILE} \                           # config file path
    [--shape ${IMAGE_SIZE}] \                  # input image size (int), default 640*640
    [--show-arch ${ARCH_DISPLAY}] \            # print related information by network layers
    [--not-show-table ${TABLE_DISPLAY}] \      # print related information by table
    [--cfg-options ${CFG_OPTIONS}]             # config file option
# [] stands for optional parameter, do not type [] when actually entering the command line
```

Let's take the `rtmdet_s_syncbn_fast_8xb32-300e_coco.py` config file in RTMDet as an example to show how this script can be used:

## Usage Example 1: Print Flops, Parameters and related information by table

```shell
python tools/analysis_tools/get_flops.py  configs/rtmdet/rtmdet_s_syncbn_fast_8xb32-300e_coco.py
```

Output:

```python
==============================
Input shape: torch.Size([640, 640])
Model Flops: 14.835G
Model Parameters: 8.887M
==============================
```

| module                            | #parameters or shape | #flops  | #activations |
| :-------------------------------- | :------------------- | :------ | :----------: |
| model                             | 8.887M               | 14.835G |   35.676M    |
| backbone                          | 4.378M               | 5.416G  |   22.529M    |
| backbone.stem                     | 7.472K               | 0.765G  |    6.554M    |
| backbone.stem.0                   | 0.464K               | 47.514M |    1.638M    |
| backbone.stem.1                   | 2.336K               | 0.239G  |    1.638M    |
| backbone.stem.2                   | 4.672K               | 0.478G  |    3.277M    |
| backbone.stage1                   | 42.4K                | 0.981G  |    7.373M    |
| backbone.stage1.0                 | 18.56K               | 0.475G  |    1.638M    |
| backbone.stage1.1                 | 23.84K               | 0.505G  |    5.734M    |
| backbone.stage2                   | 0.21M                | 1.237G  |    4.915M    |
| backbone.stage2.0                 | 73.984K              | 0.473G  |    0.819M    |
| backbone.stage2.1                 | 0.136M               | 0.764G  |    4.096M    |
| backbone.stage3                   | 0.829M               | 1.221G  |    2.458M    |
| backbone.stage3.0                 | 0.295M               | 0.473G  |    0.41M     |
| backbone.stage3.1                 | 0.534M               | 0.749G  |    2.048M    |
| backbone.stage4                   | 3.29M                | 1.211G  |    1.229M    |
| backbone.stage4.0                 | 1.181M               | 0.472G  |    0.205M    |
| backbone.stage4.1                 | 0.657M               | 0.263G  |    0.307M    |
| backbone.stage4.2                 | 1.452M               | 0.476G  |    0.717M    |
| neck                              | 3.883M               | 4.366G  |    8.141M    |
| neck.reduce_layers.2              | 0.132M               | 52.634M |    0.102M    |
| neck.reduce_layers.2.conv         | 0.131M               | 52.429M |    0.102M    |
| neck.reduce_layers.2.bn           | 0.512K               | 0.205M  |      0       |
| neck.top_down_layers              | 0.491M               | 1.23G   |    4.506M    |
| neck.top_down_layers.0            | 0.398M               | 0.638G  |    1.638M    |
| neck.top_down_layers.1            | 92.608K              | 0.593G  |    2.867M    |
| neck.downsample_layers            | 0.738M               | 0.472G  |    0.307M    |
| neck.downsample_layers.0          | 0.148M               | 0.236G  |    0.205M    |
| neck.downsample_layers.1          | 0.59M                | 0.236G  |    0.102M    |
| neck.bottom_up_layers             | 1.49M                | 0.956G  |    2.15M     |
| neck.bottom_up_layers.0           | 0.3M                 | 0.48G   |    1.434M    |
| neck.bottom_up_layers.1           | 1.19M                | 0.476G  |    0.717M    |
| neck.out_layers                   | 1.033M               | 1.654G  |    1.075M    |
| neck.out_layers.0                 | 0.148M               | 0.945G  |    0.819M    |
| neck.out_layers.1                 | 0.295M               | 0.472G  |    0.205M    |
| neck.out_layers.2                 | 0.59M                | 0.236G  |    51.2K     |
| neck.upsample_layers              |                      | 1.229M  |      0       |
| neck.upsample_layers.0            |                      | 0.41M   |      0       |
| neck.upsample_layers.1            |                      | 0.819M  |      0       |
| bbox_head.head_module             | 0.625M               | 5.053G  |    5.006M    |
| bbox_head.head_module.cls_convs   | 0.296M               | 2.482G  |    2.15M     |
| bbox_head.head_module.cls_convs.0 | 0.295M               | 2.481G  |    2.15M     |
| bbox_head.head_module.cls_convs.1 | 0.512K               | 0.819M  |      0       |
| bbox_head.head_module.cls_convs.2 | 0.512K               | 0.205M  |      0       |
| bbox_head.head_module.reg_convs   | 0.296M               | 2.482G  |    2.15M     |
| bbox_head.head_module.reg_convs.0 | 0.295M               | 2.481G  |    2.15M     |
| bbox_head.head_module.reg_convs.1 | 0.512K               | 0.819M  |      0       |
| bbox_head.head_module.reg_convs.2 | 0.512K               | 0.205M  |      0       |
| bbox_head.head_module.rtm_cls     | 30.96K               | 86.016M |    0.672M    |
| bbox_head.head_module.rtm_cls.0   | 10.32K               | 65.536M |    0.512M    |
| bbox_head.head_module.rtm_cls.1   | 10.32K               | 16.384M |    0.128M    |
| bbox_head.head_module.rtm_cls.2   | 10.32K               | 4.096M  |     32K      |
| bbox_head.head_module.rtm_reg     | 1.548K               | 4.301M  |    33.6K     |
| bbox_head.head_module.rtm_reg.0   | 0.516K               | 3.277M  |    25.6K     |
| bbox_head.head_module.rtm_reg.1   | 0.516K               | 0.819M  |     6.4K     |
| bbox_head.head_module.rtm_reg.2   | 0.516K               | 0.205M  |     1.6K     |

## Usage Example 2: Print related information by network layers

```shell
python tools/analysis_tools/get_flops.py  configs/rtmdet/rtmdet_s_syncbn_fast_8xb32-300e_coco.py --show-arch
```

Due to the complex structure of RTMDet, the output is long.
The following shows only the output from bbox_head.head_module.rtm_reg section:

```python
(rtm_reg): ModuleList(
        #params: 1.55K, #flops: 4.3M, #acts: 33.6K
        (0): Conv2d(
          128, 4, kernel_size=(1, 1), stride=(1, 1)
          #params: 0.52K, #flops: 3.28M, #acts: 25.6K
        )
        (1): Conv2d(
          128, 4, kernel_size=(1, 1), stride=(1, 1)
          #params: 0.52K, #flops: 0.82M, #acts: 6.4K
        )
        (2): Conv2d(
          128, 4, kernel_size=(1, 1), stride=(1, 1)
          #params: 0.52K, #flops: 0.2M, #acts: 1.6K
        )
```