p1atdev committed on
Commit
3a31d34
1 Parent(s): 543e4a6

Update README.md

Files changed (1)
  1. README.md +57 -4
README.md CHANGED
@@ -2,6 +2,7 @@
2
  license: apache-2.0
3
  tags:
4
  - generated_from_trainer
 
5
  metrics:
6
  - accuracy
7
  - f1
@@ -9,6 +10,7 @@ base_model: google/siglip-base-patch16-512
9
  model-index:
10
  - name: siglip-tagger-test-2
11
  results: []
 
12
  ---
13
 
14
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -24,15 +26,66 @@ It achieves the following results on the evaluation set:
24
 
25
  ## Model description
26
 
27
- More information needed
28
 
29
  ## Intended uses & limitations
30
 
31
- More information needed
32
 
33
  ## Training and evaluation data
34
 
35
- More information needed
 
36
 
37
  ## Training procedure
38
 
@@ -79,4 +132,4 @@ The following hyperparameters were used during training:
79
  - Transformers 4.37.2
80
  - Pytorch 2.1.2+cu118
81
  - Datasets 2.16.1
82
- - Tokenizers 0.15.0
 
2
  license: apache-2.0
3
  tags:
4
  - generated_from_trainer
5
+ - siglip
6
  metrics:
7
  - accuracy
8
  - f1
 
10
  model-index:
11
  - name: siglip-tagger-test-2
12
  results: []
13
+ pipeline_tag: image-classification
14
  ---
15
 
16
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
26
 
27
  ## Model description
28
 
29
+ This is an experimental model that predicts Danbooru tags for images.
30
+
31
+ ## Example
32
+
33
+ ```py
34
+ from PIL import Image
35
+
36
+ import torch
37
+ from transformers import (
38
+     AutoModelForImageClassification,
39
+     AutoImageProcessor,
40
+ )
41
+ import numpy as np
42
+
43
+ MODEL_NAME = "p1atdev/siglip-tagger-test-2"
44
+
45
+ model = AutoModelForImageClassification.from_pretrained(
46
+     MODEL_NAME, torch_dtype=torch.bfloat16, trust_remote_code=True
47
+ )
48
+ model.eval()
49
+ processor = AutoImageProcessor.from_pretrained(MODEL_NAME)
50
+
51
+ image = Image.open("sample.jpg") # load your image
52
+ inputs = processor(image, return_tensors="pt").to(model.device, model.dtype)
53
+
54
+ logits = model(**inputs).logits.detach().cpu().float()[0]
55
+ logits = np.clip(logits, 0.0, 1.0)
56
+
57
+ results = {
58
+     model.config.id2label[i]: logit for i, logit in enumerate(logits) if logit > 0
59
+ }
60
+ results = sorted(results.items(), key=lambda x: x[1], reverse=True)
61
+
62
+ for tag, score in results:
63
+     print(f"{tag}: {score*100:.2f}%")
64
+ # 1girl: 100.00%
65
+ # outdoors: 100.00%
66
+ # sky: 100.00%
67
+ # solo: 100.00%
68
+ # school uniform: 96.88%
69
+ # skirt: 92.97%
70
+ # day: 89.06%
71
+ # ...
72
+ ```
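The post-processing at the end of the example above (clip the scores to [0, 1], keep the positive ones, sort from highest to lowest) can be tried without downloading the model. The following is a minimal sketch in plain Python. The helper name `scores_to_tags`, its `threshold` parameter, and the sample labels and values are assumptions made for illustration, not part of the model card or real model output.

```python
def scores_to_tags(logits, id2label, threshold=0.0):
    """Clip raw scores to [0, 1], keep those above `threshold`, sort descending."""
    # Clip each raw score into the [0, 1] range, as the example does with np.clip.
    clipped = [min(max(float(x), 0.0), 1.0) for x in logits]
    # Map label index -> tag name, dropping scores at or below the threshold.
    kept = {id2label[i]: s for i, s in enumerate(clipped) if s > threshold}
    # Highest score first.
    return sorted(kept.items(), key=lambda kv: kv[1], reverse=True)


# Made-up labels and scores for illustration only (hypothetical values).
id2label = {0: "1girl", 1: "sky", 2: "cat", 3: "skirt"}
logits = [1.25, 0.5, -0.75, 0.875]

tags = scores_to_tags(logits, id2label)
for tag, score in tags:
    print(f"{tag}: {score * 100:.2f}%")
```

Raising `threshold` (for example to 0.6) would be one way to keep only the more confident tags; the value to use is a judgment call that the model card does not specify.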
73
 
74
  ## Intended uses & limitations
75
 
76
+ This model is for research use only and is not recommended for production.
77
+
78
+ For production use, please use the wd-v1-4-tagger series by SmilingWolf instead:
79
+
80
+ - [SmilingWolf/wd-v1-4-moat-tagger-v2](https://huggingface.co/SmilingWolf/wd-v1-4-moat-tagger-v2)
81
+ - [SmilingWolf/wd-v1-4-swinv2-tagger-v2](https://huggingface.co/SmilingWolf/wd-v1-4-swinv2-tagger-v2)
82
+
83
+ etc.
84
 
85
  ## Training and evaluation data
86
 
87
+ 5,000 high-quality images from Danbooru. They were shuffled and split into train and eval sets at 4500:500.
88
+
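The shuffle-and-split step described above could look roughly like the following. This is a minimal sketch, not the author's actual script: the helper name `shuffle_and_split`, the fixed seed, and the integer stand-ins for images are all assumptions.

```python
import random


def shuffle_and_split(items, n_eval, seed=0):
    """Shuffle a copy of `items`, then return (train, eval) with `n_eval` eval items."""
    items = list(items)
    # A seeded Random instance keeps the split reproducible (the seed is an assumption).
    random.Random(seed).shuffle(items)
    return items[n_eval:], items[:n_eval]


# Integer IDs stand in for the 5,000 images (hypothetical placeholders).
image_ids = range(5000)
train_ids, eval_ids = shuffle_and_split(image_ids, n_eval=500)
print(len(train_ids), len(eval_ids))  # 4500 500
```

With the Datasets library, `Dataset.train_test_split(test_size=500, shuffle=True, seed=...)` would be one way to get a comparable split; whether the author used it is not stated.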
89
 
90
  ## Training procedure
91
 
 
132
  - Transformers 4.37.2
133
  - Pytorch 2.1.2+cu118
134
  - Datasets 2.16.1
135
+ - Tokenizers 0.15.0