3d reconstruction from 2 images without info about the camera

I am new to this field and I am trying to build a 3D model of a simple scene from two 2D images, but I don't have any information about the cameras. I know there are three options:

  • I have two images and I know the model of my camera (the intrinsics) loaded from an XML file, e.g. loadXMLFromFile() => stereoRectify() => reprojectImageTo3D() (a minimal sketch of this calibrated path is given after this list)

  • I don't have it, but I can calibrate my camera => stereoCalibrate() => stereoRectify() => reprojectImageTo3D()

  • I can't calibrate the camera (this is my case, because I don't have the camera that took the two images). In that case I need to find pairs of keypoints in both images with SURF or SIFT (actually any blob detector will do), then compute the descriptors of these keypoints, then match the keypoints of the right image and the left image according to their descriptors, and then find the fundamental matrix from them. The processing is much harder and would be something like this:

    1. detect keypoints (SURF, SIFT) =>
    2. extract descriptors (SURF, SIFT) =>
    3. compare and match the descriptors (BruteForce, Flann-based approaches) =>
    4. find the fundamental matrix (findFundamentalMat()) from those pairs =>
    5. stereoRectifyUncalibrated() =>
    6. reprojectImageTo3D()
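
For comparison, here is a minimal sketch of the calibrated path from the first two options. The file name "stereo_params.yml" and the node names (K1, D1, K2, D2, R, T) are made-up assumptions for illustration; the disparity map itself would still come from a block matcher such as StereoBM:

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>

// Read intrinsics/extrinsics from a file, rectify, and reproject a disparity map to 3D.
cv::Mat reconstructCalibrated(const cv::Mat& disparity, cv::Size imageSize)
{
    cv::Mat K1, D1, K2, D2, R, T;   // camera matrices, distortion coefficients, relative pose
    cv::FileStorage fs("stereo_params.yml", cv::FileStorage::READ);
    fs["K1"] >> K1; fs["D1"] >> D1;
    fs["K2"] >> K2; fs["D2"] >> D2;
    fs["R"]  >> R;  fs["T"]  >> T;

    cv::Mat R1, R2, P1, P2, Q;
    cv::stereoRectify(K1, D1, K2, D2, imageSize, R, T, R1, R2, P1, P2, Q);

    cv::Mat xyz;
    cv::reprojectImageTo3D(disparity, xyz, Q, true);   // Q comes straight from stereoRectify()
    return xyz;
}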

I used the last approach and my questions are:

1) Is it right?

2) If it is OK, I have a doubt about the last step, stereoRectifyUncalibrated() => reprojectImageTo3D(). The signature of the reprojectImageTo3D() function is:

void reprojectImageTo3D(InputArray disparity, OutputArray _3dImage, InputArray Q, bool handleMissingValues=false, int depth=-1 )
cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true) (in my code)

Parameters:

  • disparity – input single-channel 8-bit unsigned, 16-bit signed, 32-bit signed or 32-bit floating-point disparity image
  • _3dImage – output 3-channel floating-point image of the same size as disparity. Each element of _3dImage(x,y) contains the 3D coordinates of the point (x,y) computed from the disparity map
  • Q – 4x4 perspective transformation matrix that can be obtained with stereoRectify()
  • handleMissingValues – indicates whether the function should handle missing values (i.e. points where the disparity was not computed). If handleMissingValues=true, then pixels with the minimal disparity that corresponds to outliers (see StereoBM::operator()) are transformed into 3D points with a very large Z value (currently set to 10000)
  • ddepth – the optional output array depth. If it is -1, the output image will have CV_32F depth. ddepth can also be set to CV_16S, CV_32S or CV_32F
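
For reference, the Q matrix returned by stereoRectify() has the following structure, where f is the rectified focal length, (c_x, c_y) the principal point of the first camera, c_x' the x-coordinate of the principal point of the second camera, and T_x the baseline between the cameras:

Q = [ 1   0    0        -c_x
      0   1    0        -c_y
      0   0    0           f
      0   0  -1/T_x   (c_x - c_x')/T_x ]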

How can I get the Q matrix? Is it possible to obtain the Q matrix from F, H1 and H2, or in some other way?

3) Is there another way to obtain the xyz coordinates without calibrating the cameras?

My code is:

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <stdio.h>
#include <iostream>
#include <vector>
#include <conio.h>
#include <opencv/cv.h>
#include <opencv/cxcore.h>
#include <opencv/cvaux.h>

using namespace cv;
using namespace std;
int main(int argc, char *argv[]){
    // Read the images
    Mat imgLeft = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
    Mat imgRight = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );
    // check
    if (!imgLeft.data || !imgRight.data)
            return 0;
    // 1] find pair keypoints on both images (SURF, SIFT):::::::::::::::::::::::::::::
    // vector of keypoints
    std::vector<cv::KeyPoint> keypointsLeft;
    std::vector<cv::KeyPoint> keypointsRight;
    // Construct the SIFT feature detector object
    cv::SiftFeatureDetector sift(
            0.01, // feature threshold
            10);  // threshold to reduce
                  // sensitivity to lines
    // Detection of the SIFT features
    sift.detect(imgLeft,keypointsLeft);
    sift.detect(imgRight,keypointsRight);
    std::cout << "Number of SURF points (1): " << keypointsLeft.size() << std::endl;
    std::cout << "Number of SURF points (2): " << keypointsRight.size() << std::endl;
    // 2] compute descriptors of these keypoints (SURF,SIFT) ::::::::::::::::::::::::::
    // Construction of the SURF descriptor extractor
    cv::SurfDescriptorExtractor surfDesc;
    // Extraction of the SURF descriptors
    cv::Mat descriptorsLeft, descriptorsRight;
    surfDesc.compute(imgLeft,keypointsLeft,descriptorsLeft);
    surfDesc.compute(imgRight,keypointsRight,descriptorsRight);
    std::cout << "descriptor matrix size: " << descriptorsLeft.rows << " by " << descriptorsLeft.cols << std::endl;
    // 3] matching keypoints from image right and image left according to their descriptors (BruteForce, Flann based approaches)
    // Construction of the matcher
    cv::BruteForceMatcher<cv::L2<float> > matcher;
    // Match the two image descriptors
    std::vector<cv::DMatch> matches;
    matcher.match(descriptorsLeft,descriptorsRight, matches);
    std::cout << "Number of matched points: " << matches.size() << std::endl;

    // 4] find the fundamental mat ::::::::::::::::::::::::::::::::::::::::::::::::::::
    // Convert 1 vector of keypoints into
    // 2 vectors of Point2f for compute F matrix
    // with cv::findFundamentalMat() function
    std::vector<int> pointIndexesLeft;
    std::vector<int> pointIndexesRight;
    for (std::vector<cv::DMatch>::const_iterator it= matches.begin(); it!= matches.end(); ++it) {
         // Get the indexes of the selected matched keypoints
         pointIndexesLeft.push_back(it->queryIdx);
         pointIndexesRight.push_back(it->trainIdx);
    }
    // Convert keypoints into Point2f
    std::vector<cv::Point2f> selPointsLeft, selPointsRight;
    cv::KeyPoint::convert(keypointsLeft,selPointsLeft,pointIndexesLeft);
    cv::KeyPoint::convert(keypointsRight,selPointsRight,pointIndexesRight);
    /* check by drawing the points
    std::vector<cv::Point2f>::const_iterator it= selPointsLeft.begin();
    while (it!=selPointsLeft.end()) {
            // draw a circle at each corner location
            cv::circle(imgLeft,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    }
    it= selPointsRight.begin();
    while (it!=selPointsRight.end()) {
            // draw a circle at each corner location
            cv::circle(imgRight,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    } */
    // Compute F matrix from n>=8 matches
    cv::Mat fundemental= cv::findFundamentalMat(
            cv::Mat(selPointsLeft), // points in first image
            cv::Mat(selPointsRight), // points in second image
            CV_FM_RANSAC);       // RANSAC method
    std::cout << "F-Matrix size= " << fundemental.rows << "," << fundemental.cols << std::endl;
    /* draw the left points corresponding epipolar lines in right image
    std::vector<cv::Vec3f> linesLeft;
    cv::computeCorrespondEpilines(
            cv::Mat(selPointsLeft), // image points
            1,                      // in image 1 (can also be 2)
            fundemental,            // F matrix
            linesLeft);             // vector of epipolar lines
    // for all epipolar lines
    for (vector<cv::Vec3f>::const_iterator it= linesLeft.begin(); it!=linesLeft.end(); ++it) {
        // draw the epipolar line between first and last column
        cv::line(imgRight,cv::Point(0,-(*it)[2]/(*it)[1]),cv::Point(imgRight.cols,-((*it)[2]+(*it)[0]*imgRight.cols)/(*it)[1]),cv::Scalar(255,255,255));
    }
    // draw the left points corresponding epipolar lines in left image
    std::vector<cv::Vec3f> linesRight;
    cv::computeCorrespondEpilines(cv::Mat(selPointsRight),2,fundemental,linesRight);
    for (vector<cv::Vec3f>::const_iterator it= linesRight.begin(); it!=linesRight.end(); ++it) {
        // draw the epipolar line between first and last column
        cv::line(imgLeft,cv::Point(0,-(*it)[2]/(*it)[1]), cv::Point(imgLeft.cols,-((*it)[2]+(*it)[0]*imgLeft.cols)/(*it)[1]), cv::Scalar(255,255,255));
    }
    // Display the images with points and epipolar lines
    cv::namedWindow("Right Image Epilines");
    cv::imshow("Right Image Epilines",imgRight);
    cv::namedWindow("Left Image Epilines");
    cv::imshow("Left Image Epilines",imgLeft);
    */
    // 5] stereoRectifyUncalibrated()::::::::::::::::::::::::::::::::::::::::::::::::::
    //H1, H2 – The output rectification homography matrices for the first and for the second images.
    cv::Mat H1, H2; // 3x3 rectifying homographies, filled in by stereoRectifyUncalibrated()
    // Pass the point sets in the same (left, right) order used for findFundamentalMat()
    cv::stereoRectifyUncalibrated(selPointsLeft, selPointsRight, fundemental, imgLeft.size(), H1, H2);

    // create the image in which we will save our disparities
    Mat imgDisparity16S = Mat( imgLeft.rows, imgLeft.cols, CV_16S );
    Mat imgDisparity8U = Mat( imgLeft.rows, imgLeft.cols, CV_8UC1 );
    // Call the constructor for StereoBM
    int ndisparities = 16*5;      // < Range of disparity >
    int SADWindowSize = 5;        // < Size of the block window > Must be odd. Is the 
                                  // size of averaging window used to match pixel  
                                  // blocks(larger values mean better robustness to
                                  // noise, but yield blurry disparity maps)
    StereoBM sbm( StereoBM::BASIC_PRESET,
        ndisparities,
        SADWindowSize );
    // Calculate the disparity image
    sbm( imgLeft, imgRight, imgDisparity16S, CV_16S );
    // Check its extreme values
    double minVal; double maxVal;
    minMaxLoc( imgDisparity16S, &minVal, &maxVal );
    printf("Min disp: %f Max value: %f n", minVal, maxVal);
    // Display it as a CV_8UC1 image
    imgDisparity16S.convertTo( imgDisparity8U, CV_8UC1, 255/(maxVal - minVal));
    namedWindow( "windowDisparity", CV_WINDOW_NORMAL );
    imshow( "windowDisparity", imgDisparity8U );

    // 6] reprojectImageTo3D() :::::::::::::::::::::::::::::::::::::::::::::::::::::
    //Mat xyz;
    //cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true);
    //How can I get the Q matrix? Is possibile to obtain the Q matrix with 
    //F, H1 and H2 or in another way?
    //Is there another way for obtain the xyz coordinates?
    cv::waitKey();
    return 0;
}

What stereoRectifyUncalibrated() computes is just a planar perspective transformation, not a rectification transformation in object space. It would be necessary to convert this planar transformation into an object-space transformation to extract the Q matrix, and I think that requires some of the camera calibration parameters (such as the camera intrinsics). There may be some ongoing research on this topic.

You may have to add some steps to estimate the camera intrinsics and to extract the relative orientation of the cameras in order to make your flow work properly. I think the camera calibration parameters are vital for extracting a proper 3D structure of the scene, unless an active-lighting method is used.

Also, bundle-adjustment-based solutions are needed to refine all the estimated values into more accurate ones.
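
As a rough sketch of where the intrinsics would enter the pipeline from the question (continuing from the question's variables imgLeft and fundemental; the focal length and principal point below are guesses, not calibrated values): with an assumed camera matrix K, the fundamental matrix can be upgraded to an essential matrix E = K^T * F * K, whose decomposition gives the relative rotation and the translation direction that a metric rectification needs.

// Assumed (not calibrated) intrinsics: focal length roughly the image width,
// principal point at the image centre.
double f  = imgLeft.cols;
double cx = imgLeft.cols / 2.0, cy = imgLeft.rows / 2.0;
cv::Mat K = (cv::Mat_<double>(3,3) << f, 0, cx,   0, f, cy,   0, 0, 1);
cv::Mat E = K.t() * fundemental * K;   // essential matrix, only as good as the guessed K
// Decomposing E (e.g. via SVD) yields R and a unit-length t, i.e. the relative pose up to scale.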

  1. The procedure looks OK to me.

  2. As far as I know, in image-based 3D modelling the cameras are calibrated either explicitly or implicitly. You don't want to calibrate the camera explicitly, but you will make use of those parameters anyway. Matching corresponding pairs of points is definitely a heavily used approach.
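
A sketch of that implicit route, assuming a guessed camera matrix K as above and the OpenCV 3.x functions findEssentialMat() and recoverPose() (not available in the 2.x API used in the question): the matched points alone give a relative pose up to scale, from which sparse xyz coordinates can be triangulated.

// selPointsLeft / selPointsRight are the matched 2D points from the question's code.
cv::Mat E = cv::findEssentialMat(selPointsLeft, selPointsRight, K, cv::RANSAC);
cv::Mat R, t;
cv::recoverPose(E, selPointsLeft, selPointsRight, K, R, t);   // t has unit length: scale is unknown
cv::Mat P1 = K * cv::Mat::eye(3, 4, CV_64F);                  // projection matrix of the first camera
cv::Mat P2(3, 4, CV_64F);                                     // projection matrix of the second camera
R.copyTo(P2(cv::Rect(0, 0, 3, 3)));
t.copyTo(P2(cv::Rect(3, 0, 1, 3)));
P2 = K * P2;
cv::Mat points4D;
cv::triangulatePoints(P1, P2, selPointsLeft, selPointsRight, points4D);  // homogeneous 3D points
// Dividing each column by its fourth coordinate gives (x, y, z), defined up to an overall scale.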

I think you need to use stereoRectify() to rectify your images and obtain Q. This function needs two parameters (R and T), the rotation and translation between the two cameras. So you could compute these parameters with solvePnP. That function needs some 3D real-world coordinates of a given object together with the 2D points in the images and their correspondences.
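
Following that suggestion, a minimal sketch (with a guessed camera matrix K, zero distortion, and hypothetical inputs objectPoints, imagePointsLeft and imagePointsRight for a reference object visible in both images) would be:

// Pose of each camera with respect to the reference object.
cv::Mat rvecL, tvecL, rvecR, tvecR;
cv::solvePnP(objectPoints, imagePointsLeft,  K, cv::Mat(), rvecL, tvecL);
cv::solvePnP(objectPoints, imagePointsRight, K, cv::Mat(), rvecR, tvecR);
cv::Mat RL, RR;
cv::Rodrigues(rvecL, RL);
cv::Rodrigues(rvecR, RR);
// Relative pose between the two cameras.
cv::Mat R = RR * RL.t();
cv::Mat T = tvecR - R * tvecL;
// Rectify with the (assumed) intrinsics and the recovered R, T to obtain Q.
cv::Mat R1, R2, P1, P2, Q;
cv::stereoRectify(K, cv::Mat(), K, cv::Mat(), imgLeft.size(), R, T, R1, R2, P1, P2, Q);
// Q can now be passed to reprojectImageTo3D() together with a disparity map.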