Template matching from screen capture

Question

I'm new to Python and want to learn it, so I decided to write a program that does template matching on desktop (screen capture) input.

Can anyone help with this? How do I run template matching against a stream captured from the desktop? This is what I have so far:

import time

import cv2
import mss
import numpy

template = cv2.imread('template.jpg', 0)
w, h = template.shape[::-1]

with mss.mss() as sct:
    # Part of the screen to capture
    monitor = {"top": 40, "left": 0, "width": 800, "height": 640}

    while "Screen capturing":
        last_time = time.time()

        # Get raw pixels from the screen, save it to a Numpy array
        img = numpy.array(sct.grab(monitor))

        # Display the picture
        # cv2.imshow("OpenCV/Numpy normal", img)

        # Display the picture in grayscale
        cv2.imshow('OpenCV/Numpy grayscale', cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY))

        # Print fps
        print("fps: {}".format(1 / (time.time() - last_time)))

        # Search template in stream

        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break

score 5 · Accepted Answer · answered Sep 28 '20 at 20:42

The first thing I noticed is that you did not apply any edge detection to your template image. The edge-detection is not necessary but useful for finding the features of the template image.

Assume I have the following image:

To detect the above template image precisely, I apply an edge-detection algorithm to it:

template = cv2.imread("three.png")
template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
template = cv2.Canny(template, 50, 200)

I apply the same edge detection to each frame captured from the desktop:

img = np.array(sct.grab(mon))  # mss returns BGRA pixels
gray = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
edged = cv2.Canny(gray, 50, 200)

Then match the template against the captured frame and take the best-scoring location:

result = cv2.matchTemplate(edged, template, cv2.TM_CCOEFF)
(_, maxVal, _, maxLoc) = cv2.minMaxLoc(result)

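If the template was found in the frame, convert the match location into rectangle coordinates. Here found is the best (maxVal, maxLoc, r) tuple, r is the ratio between the original and the resized frame width, and w and h are the template's width and height (all of them are set in the full code below):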

(_, maxLoc, r) = found
(startX, startY) = (int(maxLoc[0] * r), int(maxLoc[1] * r))
(endX, endY) = (int((maxLoc[0] + w) * r), int((maxLoc[1] + h) * r))
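
The factor r maps coordinates found in a resized frame back to the original frame: if the frame was shrunk to half its width, r is 2.0 and every coordinate is doubled. The same arithmetic as a small standalone function (the name scale_box is hypothetical, introduced only for this sketch):

```python
def scale_box(max_loc, template_size, r):
    """Map a match found in a resized frame back to original-frame coordinates.

    max_loc: (x, y) top-left corner of the match in the resized frame
    template_size: (w, h) of the template in pixels
    r: original frame width divided by resized frame width
    """
    x, y = max_loc
    w, h = template_size
    start = (int(x * r), int(y * r))
    end = (int((x + w) * r), int((y + h) * r))
    return start, end

# Match at (10, 20) in a frame shrunk to half size, with a 30x40 template.
start, end = scale_box((10, 20), (30, 40), 2.0)
print(start, end)  # (20, 40) (80, 120)
```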

Finally draw the rectangle for displaying the location:

cv2.rectangle(img, (startX, startY), (endX, endY), (180, 105, 255), 2)

Result:

As shown above, the template (the digit 3) is matched in the stream from the desktop.

Code:

import time
import cv2
import numpy as np
import imutils
from mss import mss

template = cv2.imread("three.png")
template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
template = cv2.Canny(template, 50, 200)
(h, w) = template.shape[:2]

start_time = time.time()
mon = {'top': 200, 'left': 200, 'width': 200, 'height': 200}
with mss() as sct:
    while True:
        last_time = time.time()
        # mss returns BGRA pixels; np.array() gives a writable copy to draw on
        img = np.array(sct.grab(mon))
        gray = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
        
        found = None

        for scale in np.linspace(0.2, 1.0, 20)[::-1]:
            resized = imutils.resize(gray, width=int(gray.shape[1] * scale))
            r = gray.shape[1] / float(resized.shape[1])

            if resized.shape[0] < h or resized.shape[1] < w:
                break

            # Edge map of the resized frame, matched against the template's edge map
            edged = cv2.Canny(resized, 50, 200)
            result = cv2.matchTemplate(edged, template, cv2.TM_CCOEFF)
            (_, maxVal, _, maxLoc) = cv2.minMaxLoc(result)

            if found is None or maxVal > found[0]:
                found = (maxVal, maxLoc, r)

        # found stays None if the template is larger than the captured frame
        if found is not None:
            (_, maxLoc, r) = found
            (startX, startY) = (int(maxLoc[0] * r), int(maxLoc[1] * r))
            (endX, endY) = (int((maxLoc[0] + w) * r), int((maxLoc[1] + h) * r))
            cv2.rectangle(img, (startX, startY), (endX, endY), (180, 105, 255), 2)

        print('The loop took: {0}'.format(time.time()-last_time))
        cv2.imshow('test', img)

        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break

Can you elaborate on why Canny (edge detection) is necessary here? If I understand correctly, Canny helps remove noise so that only the relevant pixels of the image being matched are kept. So, in a sense, it just removes the background according to the thresholds you choose? — ferreiradev, May 21 '22 at 02:33
Canny is not necessary here. If you re-read my answer, you will see the line "The edge-detection is not necessary but useful for finding the features of the template image." You can use other edge-detection operators such as Sobel or Scharr, optionally after Gaussian-blur smoothing. My aim was to find the features of the currently selected area, so I applied edge detection. For more, see https://docs.opencv.org/4.x/d2/d96/tutorial_py_table_of_contents_imgproc.html — Ahmet, May 21 '22 at 17:39

CopyPasteIt · Answer 2 · 2019-11-05T14:09:09.310

Let untitled.png be a file storing the image to search for on the screen.

Here is a working program. I used the following to put it together:

OpenCV Org: Template Matching

Taking screenshots with OpenCV and Python

OpenCV python: ValueError: too many values to unpack

import os
import time
import winsound          # Windows only, used for the beep

import cv2 as cv
import numpy as np
import pyautogui
from matplotlib import pyplot as plt

os.chdir("C:\\Users\\Mike\\Desktop")
# The image to search for, loaded in colour (BGR)
img_piece = cv.imread('untitled.png')
h, w = img_piece.shape[:2]

meth = 'cv.TM_CCOEFF'
method = cv.TM_CCOEFF

while True:
    # pyautogui returns an RGB PIL image; convert it to OpenCV's BGR order
    pic = pyautogui.screenshot()
    screen = cv.cvtColor(np.array(pic), cv.COLOR_RGB2BGR)
    # Apply template matching: search for img_piece inside the screenshot
    res = cv.matchTemplate(screen, img_piece, method)
    min_val, max_val, min_loc, max_loc = cv.minMaxLoc(res)
    top_left = max_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)
    # TM_CCOEFF scores are not normalized, so this threshold is specific to this image
    if max_val > 66000000.0:
        cv.rectangle(screen, top_left, bottom_right, (0, 0, 255), 2)
        print(max_val, top_left, bottom_right)
        winsound.Beep(888, 111)
        plt.subplot(121), plt.imshow(res, cmap='gray')
        plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
        plt.subplot(122), plt.imshow(cv.cvtColor(screen, cv.COLOR_BGR2RGB))
        plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
        plt.suptitle(meth)
        plt.show()
        break
    time.sleep(1)

score 1 · Answer 3 · answered May 22 '19 at 20:21

Below is the basic code to perform template matching. Place it below img = numpy.array(sct.grab(monitor)) and it will run on every frame.

# create grayscale of image - because template is also grayscale
# (mss returns BGRA pixels)
gray = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
# perform match
res = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF)
# get coordinates of best match (for TM_CCOEFF this is the maximum)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
top_left = max_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
# draw red rectangle over original screen capture
cv2.rectangle(img, top_left, bottom_right, (0, 0, 255), 3)
# display image
cv2.imshow('Result', img)

You can find some more info on matchTemplate here
