Get image(rect on a plane) with perspective correction from camera (2d image)

Question

I want to get the original image from my camera.

This is the image that my camera get. The image that i want is the purple rectangle.

I want to crop the purple rectangle and correct the prespective. This is the image i expect to get.

The image size is unknown. It can be wide or tall.

How can I do this in OpenCV? Any tips, guides? Note that for each marker, I already have the coordinates of each marker corner.(this info might help)

Edit. Some progress.

I learnt that the function I need are getPerspectiveTransform and warpPerspective.

I use both methods with this.

    if (ids.size() == 4)
    {
        array<Point2f, 4> srcCorners;           // corner that we want 
        array<Point2f, 4> srcCornersSmall; 
        array<Point2f, 4> dstCorners;           // destination corner   

        //id  8 14 18 47
        for (size_t i = 0; i < ids.size(); i++)
        {
            // first corner
            if (ids[i] == 8)
            {
                srcCorners[0] = corners[i][0];      // get the first point
                srcCornersSmall[0] = corners[i][2];
            }
            // second corner
            else if (ids[i] == 14)
            {
                srcCorners[1] = corners[i][1];      // get the second point
                srcCornersSmall[1] = corners[i][3];
            }
            // third corner
            else if (ids[i] == 18)
            {
                srcCorners[2] = corners[i][2];      // get the thirt point
                srcCornersSmall[2] = corners[i][0];
            }
            // fourth corner
            else if (ids[i] == 47)
            {
                srcCorners[3] = corners[i][3];      // get the fourth point
                srcCornersSmall[3] = corners[i][1];
            }
        }

        dstCorners[0] = Point2f(0.0f, 0.0f);
        dstCorners[1] = Point2f(256.0f, 0.0f);
        dstCorners[2] = Point2f(256.0f, 256.0f);
        dstCorners[3] = Point2f(0.0f, 256.0f);

        // get perspectivetransform
        Mat M = getPerspectiveTransform(srcCorners, dstCorners);

        // warp perspective
        Mat dst;
        Size dsize = Size(cvRound(dstCorners[2].x), cvRound(dstCorners[2].y));
        warpPerspective(imageCopy, dst, M, dsize);

        // show
        imshow("perspective transformed", dst);

    }

While I do get the image that I want(almost), the image is not in the correct width/height ratio.

This is the output that I get.

How do I correct the width height ratio?

Do you have all of the corners of at least one of the corners squares? If so you can get the homography that maps that squished rectangle from the corner to an actual square, and apply that to the whole image. — alkasm, Jan 04 '19 at 18:59
Yes. I have all the corners of the marker. I see what you mean. That also means i dont have to use getPerspectiveTransform right? Im gonna try that now. — Syaiful Nizam Yahya, Jan 04 '19 at 19:05
@AlexanderReynolds Ok. I have the homography matrix now. What do I do with it? How do I get the aspect ratio? — Syaiful Nizam Yahya, Jan 04 '19 at 20:00
You can do two things; you can simply check the aspect ratio of the box with the corners (just take the h/w) and then stretch the image so that those values are square. For e.g. if a corner square has a height of 10 pixels and a width of 15 pixels, then you can just use `resize()` and stretch the height by 1.5x. Or shrink the width by 0.66x. So you can multiply/divide by your aspect ratio. Alternatively, if you have the homography that maps a rectangle to a square, you *could* just use `warpPerspective()` on the image with that homography, but you'd need to know the destination size. — alkasm, Jan 04 '19 at 20:46
You can calculate the destination size too...but resize might be the better option here for simplicity? I would personally measure the (pixel) distance from two opposite corners (i.e. the diagonal of the sheet) and use that to scale the final image so there's not tons extra or fewer pixels. — alkasm, Jan 04 '19 at 20:46
I would like to try to avoid measuring the pixel distance and would prefer working on points. Reason being, the actual situation is not like the sample image I provided. There might be pictures inside the 4 markers. I have uploaded my code here https://gist.github.com/syaifulnizamyahya/12deb740a31880923b1512e16d054362. — Syaiful Nizam Yahya, Jan 04 '19 at 21:03
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/186205/discussion-between-syaiful-nizam-yahya-and-alexander-reynolds). — Syaiful Nizam Yahya, Jan 04 '19 at 21:07

score 0 · Answer 1 · answered Jan 07 '19 at 04:53

Finally got it. The idea is to draw the marker as white box on a black image. Then crop the image that we want and draw it in a new image. Since the correct size for the new image is unknown, we just set the size as square. The new image should be black image with white boxes at the corner. Starting from (0,0) we then cross the image and check for the pixel value. The pixel value should be white. If the pixel value is black, we are outside the white box. Trace back the pixel value along x and y because the white box might be tall or wide. Once we find the bottom right of the white box, we have the size of the white box. Rescale this white box to square. Use the same function to rescale the image.

This is the image captured by camera

Draw the marker as white box in a black image.

Crop and warped into a square.

Get the width and height of the white box in top left corner. Once we have the scale function, apply it.

In case anyone interested, here are the codes.

// Get3dRectFrom2d.cpp : This file contains the 'main' function. Program execution begins and ends there.
//

#include "pch.h"
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/aruco.hpp>

#define CAMERA_WINDOW "Simple ArUco"

using namespace std;
using namespace cv;

static bool readCameraParameters(string filename, Mat &camMatrix, Mat &distCoeffs) {
    FileStorage fs(filename, FileStorage::READ);
    if (!fs.isOpened())
        return false;
    fs["camera_matrix"] >> camMatrix;
    fs["distortion_coefficients"] >> distCoeffs;
    return true;
}

int main()
{
    Mat camMatrix, distCoeffs;
    string cameraSettings = "camera.txt";
    bool estimatePose = false;
    bool showRejected = true;
    if (readCameraParameters(cameraSettings, camMatrix, distCoeffs))
    {
        estimatePose = true;
    }
    Ptr<aruco::Dictionary> dictionary =
        aruco::getPredefinedDictionary(aruco::PREDEFINED_DICTIONARY_NAME(aruco::DICT_4X4_50));
    Ptr<aruco::DetectorParameters> detectorParams = aruco::DetectorParameters::create();
    float markerLength = 3.75f;
    float markerSeparation = 0.5f;
    double totalTime = 0;
    int totalIterations = 0;
    VideoCapture inputVideo(0);

    if (!inputVideo.isOpened())
    {
        cout << "cannot open camera";
    }

    double prevW = -1, prevH = -1;
    double increment = 0.1;

    while (inputVideo.grab())
    {
        Mat image, imageCopy;
        inputVideo.retrieve(image);

        double tick = (double)getTickCount();

        vector< int > ids;
        vector< vector< Point2f > > corners, rejected;
        vector< Vec3d > rvecs, tvecs;

        // detect markers and estimate pose
        aruco::detectMarkers(image, dictionary, corners, ids, detectorParams, rejected);
        if (estimatePose && ids.size() > 0)
            aruco::estimatePoseSingleMarkers(corners, markerLength, camMatrix, distCoeffs, rvecs,
                tvecs);

        double currentTime = ((double)getTickCount() - tick) / getTickFrequency();
        totalTime += currentTime;
        totalIterations++;
        if (totalIterations % 30 == 0) {
            cout << "Detection Time = " << currentTime * 1000 << " ms "
                << "(Mean = " << 1000 * totalTime / double(totalIterations) << " ms)" << endl;
        }

        // draw results
        image.copyTo(imageCopy);
        if (ids.size() > 0) {
            aruco::drawDetectedMarkers(imageCopy, corners, ids);

            if (estimatePose) {
                for (unsigned int i = 0; i < ids.size(); i++)
                    aruco::drawAxis(imageCopy, camMatrix, distCoeffs, rvecs[i], tvecs[i],
                        markerLength * 0.5f);
            }
        }

        if (ids.size() == 4)
        {
            if (true)
            {
                // process the image
                array<Point2f, 4> srcCorners;           // corner that we want 
                array<Point2f, 4> dstCorners;           // destination corner   
                vector<Point> marker0;          // marker corner
                vector<Point> marker1;          // marker corner
                vector<Point> marker2;          // marker corner
                vector<Point> marker3;          // marker corner

                //id  8 14 18 47
                for (size_t i = 0; i < ids.size(); i++)
                {
                    // first corner
                    if (ids[i] == 8)
                    {
                        srcCorners[0] = corners[i][0];      // get the first point
                        //srcCornersSmall[0] = corners[i][2];

                        marker0.push_back(corners[i][0]);
                        marker0.push_back(corners[i][1]);
                        marker0.push_back(corners[i][2]);
                        marker0.push_back(corners[i][3]);
                    }
                    // second corner
                    else if (ids[i] == 14)
                    {
                        srcCorners[1] = corners[i][1];      // get the second point
                        //srcCornersSmall[1] = corners[i][3];

                        marker1.push_back(corners[i][0]);
                        marker1.push_back(corners[i][1]);
                        marker1.push_back(corners[i][2]);
                        marker1.push_back(corners[i][3]);
                    }
                    // third corner
                    else if (ids[i] == 18)
                    {
                        srcCorners[2] = corners[i][2];      // get the thirt point
                        //srcCornersSmall[2] = corners[i][0];

                        marker2.push_back(corners[i][0]);
                        marker2.push_back(corners[i][1]);
                        marker2.push_back(corners[i][2]);
                        marker2.push_back(corners[i][3]);
                    }
                    // fourth corner
                    else if (ids[i] == 47)
                    {
                        srcCorners[3] = corners[i][3];      // get the fourth point
                        //srcCornersSmall[3] = corners[i][1];

                        marker3.push_back(corners[i][0]);
                        marker3.push_back(corners[i][1]);
                        marker3.push_back(corners[i][2]);
                        marker3.push_back(corners[i][3]);
                    }
                }

                // create a black image with the same size of cam image
                Mat mask = Mat::zeros(imageCopy.size(), CV_8UC1);
                Mat dstImage = Mat::zeros(imageCopy.size(), CV_8UC1);

                // draw white fill on marker corners
                {
                    int num = (int)marker0.size();
                    if (num != 0)
                    {
                        const Point * pt4 = &(marker0[0]);
                        fillPoly(mask, &pt4, &num, 1, Scalar(255, 255, 255), 8);
                    }
                }
                {
                    int num = (int)marker1.size();
                    if (num != 0)
                    {
                        const Point * pt4 = &(marker1[0]);
                        fillPoly(mask, &pt4, &num, 1, Scalar(255, 255, 255), 8);
                    }
                }
                {
                    int num = (int)marker2.size();
                    if (num != 0)
                    {
                        const Point * pt4 = &(marker2[0]);
                        fillPoly(mask, &pt4, &num, 1, Scalar(255, 255, 255), 8);
                    }
                }
                {
                    int num = (int)marker3.size();
                    if (num != 0)
                    {

                        const Point * pt4 = &(marker3[0]);
                        fillPoly(mask, &pt4, &num, 1, Scalar(255, 255, 255), 8);
                    }
                }

                // draw the mask
                imshow("black white lines", mask);

                // we dont have the correct size/aspect ratio
                double width = 256.0f, height = 256.0f;
                dstCorners[0] = Point2f(0.0f, 0.0f);
                dstCorners[1] = Point2f(width, 0.0f);
                dstCorners[2] = Point2f(width, height);
                dstCorners[3] = Point2f(0.0f, height);

                // get perspectivetransform
                Mat M = getPerspectiveTransform(srcCorners, dstCorners);

                // warp perspective
                Mat dst;
                Size dsize = Size(cvRound(dstCorners[2].x), cvRound(dstCorners[2].y));
                warpPerspective(mask, dst, M, dsize);

                // show warped image
                imshow("perspective transformed", dst);

                // get width and length of the first marker
                // start from (0,0) and cross 
                int cx = 0, cy = 0; // track our current coordinate
                Scalar v, vx, vy; // pixel value at coordinate
                bool cont = true;
                while (cont)
                {
                    v = dst.at<uchar>(cx, cy); // get pixel value at current coordinate
                    if (cx > 1 && cy > 1) 
                    {
                        vx = dst.at<uchar>(cx - 1, cy);
                        vy = dst.at<uchar>(cx, cy - 1);
                    }

                    // if pixel not black, continue crossing
                    if ((int)v.val[0] != 0)
                    {
                        cx++;
                        cy++;
                    }

                    // current pixel is black
                    // if previous y pixel is not black, means that we need to walk the pixel right
                    else if ((int)((Scalar)dst.at<uchar>(cx, cy - 1)).val[0] != 0)
                    {
                        cx = cx + 1;
                    }
                    // if previous x pixel is not black, means that we need to walk the pixel down
                    else if ((int)((Scalar)dst.at<uchar>(cx - 1, cy)).val[0] != 0)
                    {
                        cy = cy + 1;
                    }

                    // the rest is the same with previous 2, only with higher previous pixel to check
                    // need to do this because sometimes pixels is jagged
                    else if ((int)((Scalar)dst.at<uchar>(cx, cy - 2)).val[0] != 0)
                    {
                        cx = cx + 1;
                    }
                    else if ((int)((Scalar)dst.at<uchar>(cx - 2, cy)).val[0] != 0)
                    {
                        cy = cy + 1;
                    }

                    else if ((int)((Scalar)dst.at<uchar>(cx, cy - 3)).val[0] != 0)
                    {
                        cx = cx + 1;
                    }
                    else if ((int)((Scalar)dst.at<uchar>(cx - 3, cy)).val[0] != 0)
                    {
                        cy = cy + 1;
                    }

                    else if ((int)((Scalar)dst.at<uchar>(cx, cy - 4)).val[0] != 0)
                    {
                        cx = cx + 1;
                    }
                    else if ((int)((Scalar)dst.at<uchar>(cx - 4, cy)).val[0] != 0)
                    {
                        cy = cy + 1;
                    }

                    else if ((int)((Scalar)dst.at<uchar>(cx, cy - 5)).val[0] != 0)
                    {
                        cx = cx + 1;
                    }
                    else if ((int)((Scalar)dst.at<uchar>(cx - 5, cy)).val[0] != 0)
                    {
                        cy = cy + 1;
                    }

                    else
                    {
                        cx = cx - 1;
                        cy = cy - 1;
                        cont = false;
                    }

                    // reached the end of the picture
                    if (cx >= dst.cols)
                    {
                        cont = false;
                    }
                    else if (cy >= dst.rows)
                    {
                        cont = false;
                    }
                }

                if (cx == cy)
                {
                    //we have perfect square
                }
                if (cx > cy)
                {
                    // wide
                    width = (height * ((double)cx / (double)cy));
                }
                else
                {
                    // tall
                    height = (width * ((double)cy / (double)cx));
                }

                // we dont want the size varied too much every frame, 
                // so limits the increment or decrement for every frame
                // initialize first usage
                if (prevW<0)
                {
                    prevW = width;
                }
                if (prevH<0)
                {
                    prevH = height;
                }

                if (width > prevW + increment)
                {
                    width = prevW + increment;
                }
                else if (width < prevW - increment)
                {
                    width = prevW - increment;
                }
                prevW = width;

                if (height > prevH + increment)
                {
                    height = prevH + increment;
                }
                else if (height < prevH - increment)
                {
                    height = prevH - increment;
                }
                prevH = height;

                // show resized image
                Size s(width, height);
                Mat resized;
                resize(dst, resized, s);
                imshow("resized", resized);
            }
        }

        if (showRejected && rejected.size() > 0)
            aruco::drawDetectedMarkers(imageCopy, rejected, noArray(), Scalar(100, 0, 255));

        imshow("out", imageCopy);
        if (waitKey(1) == 27) {
            break;
        }
    }
    cout << "Hello World!\n";
    cin.ignore();
    return 0;
}

I'm more interested in a mathematical solution but for now, this suffice. If you guys know a much better approach(faster) let me know.

Did you find a mathematical solution? I'm facing the same problem. — Timo, Jul 06 '23 at 00:54

Get image(rect on a plane) with perspective correction from camera (2d image)

1 Answers1

Linked