import subprocess
import tempfile
import time
from pathlib import Path

import cv2
import gradio as gr

from inferer import Inferer

pipeline = Inferer("nateraw/yolov6s", device='cuda')
print(f"GPU on? {'🟢' if pipeline.device.type != 'cpu' else '🔴'}")


def fn_image(image, conf_thres, iou_thres):
    return pipeline(image, conf_thres, iou_thres)


def fn_video(video_file, conf_thres, iou_thres, start_sec, duration):
    start_timestamp = time.strftime("%H:%M:%S", time.gmtime(start_sec))
    end_timestamp = time.strftime("%H:%M:%S", time.gmtime(start_sec + duration))

    suffix = Path(video_file).suffix

    # Cut the requested clip out of the uploaded video without re-encoding.
    # Arguments are passed as a list so that paths containing spaces are not split apart.
    clip_temp_file = tempfile.NamedTemporaryFile(suffix=suffix)
    subprocess.call(
        [
            "ffmpeg", "-y",
            "-ss", start_timestamp,
            "-i", str(video_file),
            "-to", end_timestamp,
            "-c", "copy",
            clip_temp_file.name,
        ]
    )

    # Reader of clip file
    cap = cv2.VideoCapture(clip_temp_file.name)

    # This is an intermediary temp file where we'll write the video to.
    # Unfortunately, gradio doesn't play too nicely with videos right now, so we have to do some
    # hackiness with ffmpeg at the end of the function here.
    with tempfile.NamedTemporaryFile(suffix=".mp4") as temp_file:
        out = cv2.VideoWriter(temp_file.name, cv2.VideoWriter_fourcc(*"MP4V"), 120, (1280, 720))

        num_frames = 0
        max_frames = duration * 120
        while cap.isOpened():
            try:
                ret, frame = cap.read()
                if not ret:
                    break
            except Exception as e:
                print(e)
                continue

            # Run detection on the frame and write the annotated result
            out.write(pipeline(frame, conf_thres, iou_thres))
            num_frames += 1
            print("Processed {} frames".format(num_frames))
            if num_frames == max_frames:
                break

        # Release the reader and writer so the intermediary file is fully flushed
        cap.release()
        out.release()

        # Aforementioned hackiness: re-encode to H.264 so the result plays in the browser
        out_file = tempfile.NamedTemporaryFile(suffix="out.mp4", delete=False)
        subprocess.run(
            [
                "ffmpeg", "-y",
                "-loglevel", "quiet", "-stats",
                "-i", temp_file.name,
                "-c:v", "libx264",
                out_file.name,
            ]
        )

    return out_file.name


image_interface = gr.Interface(
    fn=fn_image,
    inputs=[
        "image",
        gr.Slider(0, 1, value=0.5, label="Confidence Threshold"),
        gr.Slider(0, 1, value=0.5, label="IOU Threshold"),
    ],
    outputs=gr.Image(type="file"),
    examples=[["example_1.jpg", 0.5, 0.5], ["example_2.jpg", 0.25, 0.45], ["example_3.jpg", 0.25, 0.45]],
    title="Human Detection",
    description=(
        "Gradio demo for Human detection on images. To use it, simply upload your image or click one of the"
        " examples to load them."
    ),
    allow_flagging=False,
    allow_screenshot=False,
)

video_interface = gr.Interface(
    fn=fn_video,
    inputs=[
        gr.Video(type="file"),
        gr.Slider(0, 1, value=0.25, label="Confidence Threshold"),
        gr.Slider(0, 1, value=0.45, label="IOU Threshold"),
        gr.Slider(0, 120, value=0, label="Start Second", step=1),
        gr.Slider(0, 120 if pipeline.device.type != 'cpu' else 60, value=120, label="Duration", step=1),
    ],
    outputs=gr.Video(type="file", format="mp4"),
    examples=[
        ["example_1.mp4", 0.25, 0.45, 0, 2],
        ["example_2.mp4", 0.25, 0.45, 5, 3],
        ["example_3.mp4", 0.25, 0.45, 6, 3],
        ["classroom.mp4", 0.25, 0.45, 5, 3],
    ],
    title="Human Detection",
    description=(
        "Gradio demo for Human detection on videos. To use it, simply upload your video or click one of the"
        " examples to load them."
    ),
    allow_flagging=False,
    allow_screenshot=False,
)

webcam_interface = gr.Interface(
    fn_image,
    inputs=[
        gr.Image(source='webcam', streaming=True),
        gr.Slider(0, 1, value=0.5, label="Confidence Threshold"),
        gr.Slider(0, 1, value=0.5, label="IOU Threshold"),
    ],
    outputs=gr.Image(type="file"),
    live=True,
    title="Human Detection",
    description=(
        "Gradio demo for Human detection on real time webcam. To use it, simply allow the browser to access"
        " your webcam."
    ),
    allow_flagging=False,
    allow_screenshot=False,
)

if __name__ == "__main__":
    gr.TabbedInterface(
        [video_interface, image_interface, webcam_interface],
        ["Run on Videos!", "Run on Images!", "Run on Webcam!"],
    ).launch()