File size: 4,756 Bytes
0b7b08a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
import webdataset as wds
import glob
import os
from tqdm import tqdm
import orjson as json
import itertools
from PIL import Image
import numpy as np
from typing import List
import cv2
import random
from tqdm.contrib.concurrent import process_map
from copy import deepcopy

class Generator():
    def __init__(self, dataset_name):
        self.dataset_name = dataset_name
        self.is_end = False


class VisualGenomeGenerator(Generator):
    def __init__(self, root: str):
        super().__init__(dataset_name="vg")
        data = json.loads(open(os.path.join(root, "region_descriptions.json")).read())
        image_data = json.loads(open(os.path.join(root, "image_data.json")).read())
        self.image_id_to_filename = {}
        self.image_id_to_wh = {}
        for image in image_data:
            image_id = image["image_id"]
            subfolder, filename = image['url'].split("/")[-2:]
            self.image_id_to_filename[image_id] = os.path.join(root, subfolder, filename)
            self.image_id_to_wh[image_id] = (image["width"], image["height"])
        self.regions = []
        total = 0
        total_image = 0
        used_image = 0
        for xx in tqdm(data):
            total_image += 1
            flag = False
            for region in xx['regions']:
                total += 1
                region_w = int(region["width"])
                region_h = int(region["height"])
                x = int(region["x"])
                y = int(region["y"])
                image_w = self.image_id_to_wh[region["image_id"]][0]
                image_h = self.image_id_to_wh[region["image_id"]][1]
                region_w /= image_w
                region_h /= image_h
                x /= image_w
                y /= image_h
                if region_w * region_h < 1 / (16*16*4):
                    continue
                if " is" in region["phrase"] or " are" in region["phrase"]:
                    continue
                region["norm_xywh"] = (x, y, region_w, region_h)
                self.regions.append(region)
                flag = True
            if flag:
                used_image += 1
        random.shuffle(self.regions)
        print("valid region", len(self.regions), total, len(self.regions) / total)
        print("valid image", used_image, total_image, used_image / total_image)

    def __len__(self):
        return len(self.regions)

    def __iter__(self):
        for region in self.regions:
            image_id = region["image_id"]
            phrase = region["phrase"]
            try:
                image = Image.open(self.image_id_to_filename[image_id])
            except:
                continue
            image = image.resize((224, 224))
            x, y, region_w, region_h = region["norm_xywh"]
            x1 = int(x * 224)
            y1 = int(y * 224)
            x2 = int(x1 + region_w * 224)
            y2 = int(y1 + region_h * 224)
            yield [self.dataset_name, image, phrase, np.array([x1, y1, x2, y2]), image_id]
        self.is_end = True


def handle(args):
    dataset_name = "vg"
    iii, regions, image_id_to_filename = args
    if iii == 0:
        print(regions[:10])
    os.makedirs(os.path.join(OUT_DIR, str(iii)), exist_ok=True)
    with wds.ShardWriter(os.path.join(OUT_DIR, str(iii), "%06d.tar"), maxcount=8500) as sink:
        sink.verbose = False
        for i, region in enumerate(tqdm(regions, disable=(iii != 0))):
            image_id = region["image_id"]
            phrase = region["phrase"]
            image = Image.open(image_id_to_filename[image_id])
            image = image.resize((224, 224))
            x, y, region_w, region_h = region["norm_xywh"]
            x1 = int(x * 224)
            y1 = int(y * 224)
            x2 = int(x1 + region_w * 224)
            y2 = int(y1 + region_h * 224)
            dataset_name, image, caption, xyxy, image_id = [dataset_name, image, phrase, np.array([x1, y1, x2, y2]), image_id]
            sink.write({"__key__": f"{dataset_name}_{i}_containBox", "jpg": image, "txt": caption, "boxes.pyd": xyxy, "logits.pyd": xyxy})
            if i % 200 == 0 and iii == 0:
                tqdm.write(f"{caption} {xyxy}")


if __name__ == "__main__":
    OUT_DIR = "/gpfs/u/home/LMCG/LMCGljnn/scratch-shared/junyan/raw/vg_0826"
    os.makedirs(OUT_DIR, exist_ok=True)
    visual_genome_generator = VisualGenomeGenerator("/gpfs/u/home/LMCG/LMCGljnn/scratch/datasets/raw/vg")
    N_PROC = 150
    data_list = []
    for i in range(N_PROC):
        data_list.append([i, [], deepcopy(visual_genome_generator.image_id_to_filename)])
    for i, region in enumerate(visual_genome_generator.regions):
        data_list[i % N_PROC][1].append(region)
    process_map(handle, data_list, max_workers=N_PROC, disable=True)