jzju/dit-doclaynet

	---
	library_name: transformers
	pipeline_tag: image-segmentation
	tags:
	- vision
	- image-segmentation
	- dit
	datasets:
	- ds4sd/DocLayNet-v1.1
	widget:
	- src: >-
	    https://upload.wikimedia.org/wikipedia/commons/c/c3/LibreOffice_Writer_6.3.png
	  example_title: Wiki
	---

	Fine-tuned from `microsoft/dit-base` on `ds4sd/DocLayNet-v1.1` for 4 epochs. Model setup and label construction:

	```
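	import cv2
	import numpy as np
	from datasets import load_dataset
	from transformers import BeitForSemanticSegmentation

	model = BeitForSemanticSegmentation.from_pretrained("microsoft/dit-base", num_labels=11)
	ds = load_dataset("ds4sd/DocLayNet-v1.1")
	def make_label(d):  # d: one dataset example (helper name is illustrative)
	    mask = np.zeros([11, 1025, 1025])  # one binary channel per class
	    for b, c in zip(d["bboxes"], d["category_id"]):
	        b = [np.clip(int(bb), 0, 1025) for bb in b]  # box as [x, y, w, h]
	        mask[c - 1][b[1]:b[1]+b[3], b[0]:b[0]+b[2]] = 1  # category ids are 1-based
	    # Downsample each channel to the 56x56 label resolution
	    mask = [cv2.resize(a, dsize=(56, 56), interpolation=cv2.INTER_AREA) for a in mask]
	    d["label"] = np.stack(mask)
	    return d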
	```
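
The box-to-mask step above can be checked on synthetic data without downloading the dataset. The following is a minimal NumPy-only sketch, not the training code itself: the helper name `rasterize_boxes`, the constants, and the example boxes are all hypothetical, the mask uses `uint8` rather than the float array above to save memory, and the 56x56 `cv2.resize` downsampling is left out.

```python
import numpy as np

NUM_CLASSES = 11   # matches num_labels=11 in the snippet above
PAGE_SIZE = 1025   # mask side length used in the snippet above


def rasterize_boxes(bboxes, category_ids, num_classes=NUM_CLASSES, size=PAGE_SIZE):
    """Paint each [x, y, w, h] box into the channel of its 1-based category id."""
    # uint8 is an assumption made here to keep the array small
    mask = np.zeros((num_classes, size, size), dtype=np.uint8)
    for box, cat in zip(bboxes, category_ids):
        # Truncate to int and clip every component to [0, size], as above
        x, y, w, h = (int(np.clip(int(v), 0, size)) for v in box)
        mask[cat - 1, y:y + h, x:x + w] = 1
    return mask


# Synthetic example (made-up values): one box of class 1, one of class 3
bboxes = [[10.0, 20.0, 100.0, 50.0], [500.0, 600.0, 30.0, 40.0]]
category_ids = [1, 3]
mask = rasterize_boxes(bboxes, category_ids)

print(mask.shape)          # (11, 1025, 1025)
print(int(mask[0].sum()))  # 100 * 50 = 5000 pixels set in the class-1 channel
print(int(mask[2].sum()))  # 30 * 40 = 1200 pixels set in the class-3 channel
```

Because each class gets its own channel, overlapping boxes of different classes do not overwrite one another, which suggests a multi-label target rather than a single class index per pixel.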