DinoVd'eau is a fine-tuned version of microsoft/resnet-50. It achieves the following results on the test set:

Loss: nan
F1 Micro: 0.0002
F1 Macro: 0.0002
ROC AUC: 0.4995
Accuracy: 0.0003
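
The micro and macro F1 scores above aggregate per-class results in different ways: micro pools the counts of all classes before computing F1, while macro averages the per-class F1 values. The following is a minimal sketch of that difference, assuming 0/1 indicator rows as input; the function name `f1_micro_macro` and the toy data are hypothetical and not taken from the training code.

```python
def f1_micro_macro(y_true, y_pred):
    """Return (micro_f1, macro_f1) for multilabel 0/1 indicator rows."""
    n_classes = len(y_true[0])
    tp = [0] * n_classes  # true positives per class
    fp = [0] * n_classes  # false positives per class
    fn = [0] * n_classes  # false negatives per class
    for true_row, pred_row in zip(y_true, y_pred):
        for c, (t, p) in enumerate(zip(true_row, pred_row)):
            if t == 1 and p == 1:
                tp[c] += 1
            elif t == 0 and p == 1:
                fp[c] += 1
            elif t == 1 and p == 0:
                fn[c] += 1

    def f1(tp_, fp_, fn_):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    micro = f1(sum(tp), sum(fp), sum(fn))  # pool counts over classes
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in range(n_classes)) / n_classes
    return micro, macro


# Toy example with two classes and three images (hypothetical data).
y_true = [[1, 0], [1, 0], [0, 1]]
y_pred = [[1, 0], [1, 0], [0, 0]]
micro, macro = f1_micro_macro(y_true, y_pred)
```

In this toy case the frequent class is predicted perfectly and the rare class is missed entirely, so the macro score is pulled down further than the micro score.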

Model description

DinoVd'eau is a model built on top of the dinov2 model for underwater multilabel image classification. The classification head is a combination of linear, ReLU, batch normalization, and dropout layers.
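
A head of this kind could be sketched in PyTorch as follows. This is only an illustration: the layer order, the hidden width of 512, and the dropout rate of 0.5 are assumptions, and `make_head` is a hypothetical helper name. The input width of 2048 corresponds to the pooled features of a ResNet-50 backbone, and 31 is the number of classes listed in the table of this card.

```python
import torch
from torch import nn


def make_head(in_features: int, hidden: int, num_classes: int,
              p_drop: float = 0.5) -> nn.Sequential:
    """Build a small multilabel head (layer order and sizes are assumptions)."""
    return nn.Sequential(
        nn.Linear(in_features, hidden),
        nn.BatchNorm1d(hidden),
        nn.ReLU(),
        nn.Dropout(p_drop),
        nn.Linear(hidden, num_classes),  # one logit per class
    )


head = make_head(in_features=2048, hidden=512, num_classes=31)
head.eval()  # use running batch-norm statistics and disable dropout
with torch.no_grad():
    logits = head(torch.randn(4, 2048))  # 4 pooled feature vectors
```

The head outputs raw logits, one per class, so that each class can be scored independently of the others.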

The source code for training the model can be found in this Git repository.

Developed by: lombardata, credits to César Leblanc and Victor Illien

Intended uses & limitations

You can use the raw model to classify diverse marine species, encompassing coral morphotype classes taken from the Global Coral Reef Monitoring Network (GCRMN), habitat classes, and seagrass species.
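
Because the task is multilabel, each image can receive several classes at once. One common way to turn the model's raw outputs into labels is to apply a sigmoid to every logit and keep the classes above a threshold. The sketch below assumes that approach; the helper `logits_to_labels`, the threshold of 0.5, and the example logits are hypothetical, and the three class names are simply borrowed from the table in this card.

```python
import torch


def logits_to_labels(logits, class_names, threshold=0.5):
    """Map a batch of logits to lists of predicted class names."""
    probs = torch.sigmoid(logits)  # independent probability per class
    results = []
    for row in probs:
        results.append(
            [name for name, p in zip(class_names, row.tolist()) if p >= threshold]
        )
    return results


class_names = ["Fish", "Rock", "Sand"]          # subset of the real classes
logits = torch.tensor([[2.0, -1.0, 0.3],        # hypothetical model outputs
                       [-3.0, -2.0, -0.5]])
labels = logits_to_labels(logits, class_names)
```

An image may end up with no label at all when every probability falls below the threshold, as in the second row of the example.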

Training and evaluation data

Details on the number of images for each class are given in the following table:

Class	train	val	test	Total
Acropore_branched	1469	464	475	2408
Acropore_digitised	568	160	160	888
Acropore_sub_massive	150	50	43	243
Acropore_tabular	999	297	293	1589
Algae_assembly	2546	847	845	4238
Algae_drawn_up	367	126	127	620
Algae_limestone	1652	557	563	2772
Algae_sodding	3148	984	985	5117
Atra/Leucospilota	1084	348	360	1792
Bleached_coral	219	71	70	360
Blurred	191	67	62	320
Dead_coral	1979	642	643	3264
Fish	2018	656	647	3321
Homo_sapiens	161	62	59	282
Human_object	157	58	55	270
Living_coral	406	154	141	701
Millepore	385	127	125	637
No_acropore_encrusting	441	130	154	725
No_acropore_foliaceous	204	36	46	286
No_acropore_massive	1031	336	338	1705
No_acropore_solitary	202	53	48	303
No_acropore_sub_massive	1401	433	422	2256
Rock	4489	1495	1473	7457
Rubble	3092	1030	1001	5123
Sand	5842	1939	1938	9719
Sea_cucumber	1408	439	447	2294
Sea_urchins	327	107	111	545
Sponge	269	96	105	470
Syringodium_isoetifolium	1212	392	391	1995
Thalassodendron_ciliatum	782	261	260	1303
Useless	579	193	193	965

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Number of Epochs: 150
Learning Rate: 0.001
Train Batch Size: 32
Eval Batch Size: 32
Optimizer: Adam
LR Scheduler Type: ReduceLROnPlateau with a patience of 5 epochs and a factor of 0.1
Freeze Encoder: Yes
Data Augmentation: Yes
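
The settings above could be wired together in PyTorch roughly as shown below. This is a sketch under stated assumptions, not the actual training script: the tiny `encoder` is a stand-in for the real backbone, `BCEWithLogitsLoss` is an assumed loss for multilabel targets, and the image size is arbitrary. Only the learning rate, batch size, optimizer, scheduler settings, and the frozen encoder come from the list above.

```python
import torch
from torch import nn

# Stand-in encoder (assumption); the real model uses a pretrained backbone.
encoder = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
head = nn.Linear(8, 31)  # 31 classes, as in the class table
model = nn.Sequential(encoder, head)

# Freeze Encoder: Yes
for param in encoder.parameters():
    param.requires_grad = False

# Optimizer: Adam, Learning Rate: 0.001, applied to trainable parameters only
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

# LR Scheduler: ReduceLROnPlateau, patience 5, factor 0.1
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5
)

criterion = nn.BCEWithLogitsLoss()  # assumed multilabel loss

# One illustrative step on random data with Train Batch Size: 32
images = torch.randn(32, 3, 16, 16)
targets = torch.randint(0, 2, (32, 31)).float()
loss = criterion(model(images), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
scheduler.step(loss.item())  # in practice, pass the validation loss per epoch
```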

Data Augmentation

Data were augmented using the following transformations:

Train Transforms

PreProcess: No additional parameters
Resize: probability=1.00
RandomHorizontalFlip: probability=0.25
RandomVerticalFlip: probability=0.25
ColorJiggle: probability=0.25
RandomPerspective: probability=0.25
Normalize: probability=1.00

Val Transforms

PreProcess: No additional parameters
Resize: probability=1.00
Normalize: probability=1.00

Training results

Epoch	Validation Loss	Accuracy	F1 Macro	F1 Micro	Learning Rate
1	nan	0.0	0.0	0.0	0.001
2	nan	0.000693000693000693	0.00031409501374165687	0.00040576181781294376	0.001
3	nan	0.0017325017325017325	0.0007850525985241011	0.0010049241282283187	0.001
4	nan	0.0	0.0	0.0	0.001
5	nan	0.0010395010395010396	0.00047177229124076113	0.0006430178973314757	0.001
6	nan	0.0003465003465003465	0.00015712153350616704	0.000206782464846981	0.001
7	nan	0.0	0.0	0.0	0.0001
8	nan	0.0003465003465003465	0.00015710919088766695	0.0002061218179944347	0.0001
9	nan	0.0	0.0	0.0	0.0001
10	nan	0.000693000693000693	0.00031441597233139445	0.0004230565838180856	0.0001
11	nan	0.0	0.0	0.0	0.0001
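
The learning-rate drop at epoch 7 is consistent with how a plateau scheduler reacts to a validation loss that is always NaN: a comparison with NaN is never true, so no epoch counts as an improvement and the rate is cut once the patience is exceeded. The sketch below is a simplified re-implementation of that plateau rule in plain Python, not the library code; it ignores the improvement threshold and cooldown options, and `simulate_plateau_lr` is a hypothetical name.

```python
import math


def simulate_plateau_lr(losses, lr=1e-3, factor=0.1, patience=5):
    """Return the learning rate in effect during each epoch."""
    best = math.inf
    bad_epochs = 0
    lrs = []
    for loss in losses:
        lrs.append(lr)          # rate used while this epoch runs
        if loss < best:         # always False when loss is NaN
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            lr *= factor        # cut the rate after too many bad epochs
            bad_epochs = 0
    return lrs


# Eleven epochs of NaN validation loss, as in the table above.
lrs = simulate_plateau_lr([math.nan] * 11)
```

Under these assumptions the first six epochs run at 0.001 and epochs 7 to 11 at 0.0001, which matches the Learning Rate column.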

CO2 Emissions

The estimated CO2 emissions for training this model are documented below:

Emissions: 0.12280230273705112 grams of CO2
Source: Code Carbon
Training Type: fine-tuning
Geographical Location: Brest, France
Hardware Used: NVIDIA Tesla V100 PCIe 32 GB

Framework Versions

Transformers: 4.41.1
Pytorch: 2.3.0+cu121
Datasets: 2.19.1
Tokenizers: 0.19.1

lombardata/resnet-50-2024_09_13-batch-size32_epochs150_freeze