I'm working with Python 3.8.10, OpenCV 4.3.0 and CUDA 10.2 on Ubuntu 20.04. I trained YOLOv3 and generated a weights file for 23 objects that I want to detect in my images. It all works fine, and in Python I can draw beautiful boxes around objects whose detection confidence lies above a certain threshold value.
However, it takes more than half a second to loop through all outputs provided by
outputs = net.forward(outputLayers)
when filtering for results above a certain confidence level.
Here's my loop:
boxes = []
confs = []
class_ids = []
for output in outputs:
    for detect in output:
        # Each row: [center_x, center_y, w, h, objectness, class scores...]
        scores = detect[5:]
        class_id = np.argmax(scores)
        conf = scores[class_id]
        if conf > 0.7:
            # Box coordinates are normalized, so scale them to the image size
            center_x = int(detect[0] * width)
            center_y = int(detect[1] * height)
            w = int(detect[2] * width)
            h = int(detect[3] * height)
            # Convert the box center to its top-left corner
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])
            confs.append(float(conf))
            class_ids.append(class_id)
It takes so long because of the size of outputs. It seems that net.forward(outputLayers) returns every possible detection, regardless of confidence. In my case, that is more than 30000 elements that I have to loop through.
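One CPU-side workaround I have considered is replacing the Python loop with vectorized NumPy operations. This is only a minimal sketch, assuming each element of outputs is a 2-D array of shape (N, 5 + number of classes) as in my loop; filter_detections is a hypothetical helper name, not part of OpenCV, and I have not measured how much time it saves:

```python
import numpy as np

def filter_detections(outputs, width, height, conf_threshold=0.7):
    """Hypothetical helper: same filtering as the loop, done per array."""
    boxes, confs, class_ids = [], [], []
    for output in outputs:
        output = np.asarray(output)
        scores = output[:, 5:]                      # class scores per row
        ids = np.argmax(scores, axis=1)             # best class per row
        best = scores[np.arange(len(scores)), ids]  # score of that class
        keep = best > conf_threshold                # boolean mask
        if not np.any(keep):
            continue
        det = output[keep]
        # Scale normalized coordinates; astype(int) truncates like int()
        center_x = (det[:, 0] * width).astype(int)
        center_y = (det[:, 1] * height).astype(int)
        w = (det[:, 2] * width).astype(int)
        h = (det[:, 3] * height).astype(int)
        x = (center_x - w / 2).astype(int)
        y = (center_y - h / 2).astype(int)
        boxes.extend(np.stack([x, y, w, h], axis=1).tolist())
        confs.extend(best[keep].astype(float).tolist())
        class_ids.extend(ids[keep].tolist())
    return boxes, confs, class_ids
```

It would be called as boxes, confs, class_ids = filter_detections(outputs, width, height). This still copies all rows from the GPU to the CPU first, so it does not answer the actual question below.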
Is there any way to throw out detections below a certain confidence level while the model still resides on the GPU? net.forward() doesn't seem to allow any filtering, as far as I could find out. Any ideas would be highly appreciated!