KB-VQA

KB-VQA/my_model/config/captioning_config.py


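	import torch

	# Configuration parameters for the captioning model

	# Model selection
	MODEL_TYPE = "i_blip"
	MODEL_PATH = "m7mdal7aj/captioner"

	# Image preprocessing and generation limits
	MAX_IMAGE_SIZE = 1024
	MIN_LENGTH = 150
	MAX_NEW_TOKENS = 400

	# Quantization flags (only one of the two should be True)
	LOAD_IN_8BIT = False
	LOAD_IN_4BIT = True

	# Model loading options
	TORCH_DTYPE = torch.float16
	DEVICE_MAP = "auto"
	LOW_CPU_MEM_USAGE = True

	# Decoding
	SKIP_SPECIAL_TOKENS = True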


	PROMPT = (
	    'Provide a comprehensive and detailed description of the following image. '
	    'Focus on identifying and describing every element in the scene, including all people '
	    '(along with their gender, age, color and any prominent feature), animals along with their breed '
	    'and all objects, their count, their positions, and any actions or interactions taking place. '
	    'Pay special attention to the positioning of limbs and hands, and any objects they might be holding or interacting with. '
	    'Describe texts, colors, textures, setting, atmosphere, mood, and any indicators of the time of day, '
	    'such as the quality of light, shadows. '
	    'Ensure to capture both the obvious and subtle elements for a complete understanding of the image. '
	    'Answer as if you were looking at the image'
	)
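
The loading-related constants above could be gathered into a single keyword dictionary before a model is loaded. The sketch below is only an illustration of that idea, not the project's actual loader: `LoadOptions`, `validate`, and `build_load_kwargs` are hypothetical names, the dtype is kept as a plain string so the example runs without `torch`, and the dictionary keys simply mirror the constant names in lowercase. It also rejects the case where both quantization flags are set, since 4-bit and 8-bit loading cannot be requested together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadOptions:
    """Hypothetical container mirroring the loading-related constants."""
    load_in_8bit: bool = False
    load_in_4bit: bool = True
    torch_dtype: str = "float16"  # stand-in for torch.float16 (assumption: avoids a torch import here)
    device_map: str = "auto"
    low_cpu_mem_usage: bool = True


def validate(opts: LoadOptions) -> None:
    """Reject settings that request 4-bit and 8-bit quantization at once."""
    if opts.load_in_8bit and opts.load_in_4bit:
        raise ValueError("LOAD_IN_8BIT and LOAD_IN_4BIT are mutually exclusive")


def build_load_kwargs(opts: LoadOptions) -> dict:
    """Collect the options into one keyword dictionary for a model loader."""
    validate(opts)
    kwargs = {
        "torch_dtype": opts.torch_dtype,
        "device_map": opts.device_map,
        "low_cpu_mem_usage": opts.low_cpu_mem_usage,
    }
    # Add only the quantization flag that is actually enabled.
    if opts.load_in_4bit:
        kwargs["load_in_4bit"] = True
    elif opts.load_in_8bit:
        kwargs["load_in_8bit"] = True
    return kwargs


load_kwargs = build_load_kwargs(LoadOptions())
print(load_kwargs)
```

With the defaults shown, the resulting dictionary carries the 4-bit flag and omits the 8-bit one. Whether the project passes these values to its loader in this form is an assumption.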