Discrete Fourier Transform

2023-05-16

English original: Discrete Fourier Transform

Goal

What is a Fourier transform and why use it?
How to do it in OpenCV?
Usage of functions such as copyMakeBorder(), merge(), dft(), getOptimalDFTSize(), log() and normalize().

Source code

#include "opencv2/core.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
using namespace cv;
using namespace std;
static void help(char ** argv)
{
    cout << endl
        <<  "This program demonstrates the use of the discrete Fourier transform (DFT). " << endl
        <<  "The DFT of an image is taken and its power spectrum is displayed."  << endl << endl
        <<  "Usage:"                                                                      << endl
        << argv[0] << " [image_name -- default lena.jpg]" << endl << endl;
}
int main(int argc, char ** argv)
{
    help(argv);
    const char* filename = argc >=2 ? argv[1] : "lena.jpg";
    Mat I = imread( samples::findFile( filename ), IMREAD_GRAYSCALE);
    if( I.empty()){
        cout << "Error opening image" << endl;
        return EXIT_FAILURE;
    }
    Mat padded;                            //expand input image to optimal size
    int m = getOptimalDFTSize( I.rows );
    int n = getOptimalDFTSize( I.cols ); // on the border add zero values
    copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
    Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
    Mat complexI;
    merge(planes, 2, complexI);         // Add to the expanded another plane with zeros
    dft(complexI, complexI);            // this way the result may fit in the source matrix
    // compute the magnitude and switch to logarithmic scale
    // => log(1 + sqrt(Re(DFT(I))^2 + Im(DFT(I))^2))
    split(complexI, planes);                   // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
    magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
    Mat magI = planes[0];
    magI += Scalar::all(1);                    // switch to logarithmic scale
    log(magI, magI);
    // crop the spectrum, if it has an odd number of rows or columns
    magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
    // rearrange the quadrants of Fourier image  so that the origin is at the image center
    int cx = magI.cols/2;
    int cy = magI.rows/2;
    Mat q0(magI, Rect(0, 0, cx, cy));   // Top-Left - Create a ROI per quadrant
    Mat q1(magI, Rect(cx, 0, cx, cy));  // Top-Right
    Mat q2(magI, Rect(0, cy, cx, cy));  // Bottom-Left
    Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
    Mat tmp;                           // swap quadrants (Top-Left with Bottom-Right)
    q0.copyTo(tmp);
    q3.copyTo(q0);
    tmp.copyTo(q3);
    q1.copyTo(tmp);                    // swap quadrant (Top-Right with Bottom-Left)
    q2.copyTo(q1);
    tmp.copyTo(q2);
    normalize(magI, magI, 0, 1, NORM_MINMAX); // Transform the matrix with float values into a
                                            // viewable image form (float between values 0 and 1).
    imshow("Input Image"       , I   );    // Show the result
    imshow("spectrum magnitude", magI);
    waitKey();
    return EXIT_SUCCESS;
}

Explanation

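The Fourier transform decomposes an image into its sine and cosine components. In other words, it transforms an image from the spatial domain to the frequency domain. The idea is that any function can be approximated exactly by a sum of infinitely many sine and cosine functions. The Fourier transform is a way of doing this. Mathematically, the Fourier transform of a two-dimensional image is: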
$$F(k,l) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} f(i,j)\, e^{-i2\pi\left(\frac{ki}{N}+\frac{lj}{N}\right)}$$
$$e^{ix} = \cos{x} + i\sin{x}$$
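Here $f$ is the image value in the spatial domain and $F$ is its value in the frequency domain. The result of the transformation is complex. It can be displayed either as a real image and an imaginary image, or as a magnitude image and a phase image. In image processing algorithms it is usually the magnitude image that is of interest, because it contains the information we need about the geometric structure of the image. If, however, you intend to modify the image in this form and then transform it back, you need to keep both the magnitude and the phase.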
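
To make the formula concrete, here is a minimal sketch that evaluates it directly with the standard library, using Euler's formula for the exponential. The function name naiveDft2D and the row-major square layout are assumptions made for this illustration. It is far slower than a real FFT and is not how cv::dft() works internally.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive 2D DFT of a square N x N image stored row-major in `f`.
// Returns F(k, l) for all k, l in the same row-major layout.
// Illustrative only: O(N^4), so usable just for tiny inputs.
std::vector<std::complex<double>> naiveDft2D(const std::vector<double>& f, int N)
{
    assert(N > 0 && f.size() == static_cast<std::size_t>(N) * N);
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> F(f.size());
    for (int k = 0; k < N; ++k)
        for (int l = 0; l < N; ++l) {
            std::complex<double> sum(0.0, 0.0);
            for (int i = 0; i < N; ++i)
                for (int j = 0; j < N; ++j) {
                    // e^{-i 2 pi (k i / N + l j / N)} = cos(a) - i sin(a)
                    const double a = 2.0 * pi * (double(k) * i + double(l) * j) / N;
                    sum += f[i * N + j] * std::complex<double>(std::cos(a), -std::sin(a));
                }
            F[k * N + l] = sum;
        }
    return F;
}
```

For a constant image, all of the energy ends up in F(0, 0), the zero-frequency term, and every other coefficient is zero up to rounding.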

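In this example I will show how to calculate and display the magnitude image of a Fourier transform. Digital images are discrete: they can take values only from a given domain. For example, the values of a basic grayscale image usually lie between 0 and 255. The Fourier transform therefore needs to be discrete too, which gives the Discrete Fourier Transform (DFT). You will want to use it whenever you need to determine the structure of an image from a geometric point of view. Here are the steps to follow (for a grayscale input image $I$):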

Expand the image to an optimal size

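The performance of a DFT depends on the image size. It tends to be fastest when the image dimensions are products of powers of two, three and five. To get the best performance it is therefore common to pad the image with border values until it has such a size. getOptimalDFTSize() returns this optimal size, and copyMakeBorder() expands the borders of the image (the appended pixels are initialized to zero):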

    Mat padded;                            //expand input image to optimal size
    int m = getOptimalDFTSize( I.rows );
    int n = getOptimalDFTSize( I.cols ); // on the border add zero values
    copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
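
To illustrate what an "optimal size" means here, the following sketch searches for the smallest size at or above n whose only prime factors are 2, 3 and 5. The helper names isProductOf235 and nextSizeOf235 are made up for this example. The linear search shows the idea only and is not claimed to be how getOptimalDFTSize() is implemented.

```cpp
#include <cassert>

// True when n has no prime factors other than 2, 3 and 5.
bool isProductOf235(int n)
{
    assert(n > 0);
    const int primes[] = {2, 3, 5};
    for (int p : primes)
        while (n % p == 0)
            n /= p;
    return n == 1;
}

// Smallest size >= n that is a product of powers of 2, 3 and 5.
int nextSizeOf235(int n)
{
    assert(n > 0);
    while (!isProductOf235(n))
        ++n;
    return n;
}
```

With this sketch an image with 13 rows would be padded to 15 rows (3 * 5), and one with 17 rows to 18 (2 * 3 * 3).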

Make room for both the real and the imaginary values

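The result of a Fourier transform is complex. This means that for each image value the result consists of two values (one per component). Moreover, the range of the frequency domain is much larger than that of its spatial counterpart, so we usually store the values in at least a float format. We therefore convert the input image to this type and expand it with another channel to hold the complex values: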

    Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
    Mat complexI;
    merge(planes, 2, complexI);         // Add to the expanded another plane with zeros
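
The same idea can be sketched without OpenCV: each 8-bit gray value becomes one complex float sample whose imaginary part starts at zero. The function name toComplex and the use of std::complex<float> are assumptions for this illustration. In OpenCV the two-channel Mat produced by merge() plays this role.

```cpp
#include <cassert>
#include <complex>
#include <vector>

// Convert 8-bit gray values to complex<float> samples with a zero imaginary part.
std::vector<std::complex<float>> toComplex(const std::vector<unsigned char>& gray)
{
    std::vector<std::complex<float>> out;
    out.reserve(gray.size());
    for (unsigned char g : gray)
        out.emplace_back(static_cast<float>(g), 0.0f);
    assert(out.size() == gray.size());
    return out;
}
```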

Compute the Discrete Fourier Transform

The calculation can be done in place (the input object is also the output object):

    dft(complexI, complexI);            // this way the result may fit in the source matrix

Transform the real and imaginary values to magnitude

A complex number has a real part (Re) and an imaginary part (Im). The results of a DFT are complex numbers. The magnitude of a DFT is:
$$M = \sqrt{ {Re(DFT(I))}^2 + {Im(DFT(I))}^2 }$$
Translated to OpenCV code:

    split(complexI, planes);                   // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
    magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
    Mat magI = planes[0];

Switch to a logarithmic scale

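It turns out that the dynamic range of the Fourier coefficients is too large to be displayed on the screen. On a linear scale we cannot tell the smaller and the larger values apart: the high values all turn into white points and the low values into black. To make the gray scale usable for visualization, we can transform our linear scale to a logarithmic one: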
$$M_1 = \log{(1 + M)}$$
Translated to OpenCV code:

    magI += Scalar::all(1);                    // switch to logarithmic scale
    log(magI, magI);
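
As a plain C++ illustration of the magnitude and logarithm steps together, the sketch below computes log(1 + sqrt(Re^2 + Im^2)) element by element. The function name logMagnitude and the use of separate real and imaginary vectors are assumptions for this example, standing in for the two planes in the OpenCV code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise log(1 + sqrt(re^2 + im^2)) for two equally sized planes.
std::vector<double> logMagnitude(const std::vector<double>& re,
                                 const std::vector<double>& im)
{
    assert(re.size() == im.size());
    std::vector<double> out(re.size());
    for (std::size_t i = 0; i < re.size(); ++i) {
        const double m = std::hypot(re[i], im[i]); // M = sqrt(Re^2 + Im^2)
        out[i] = std::log(1.0 + m);                // M1 = log(1 + M)
    }
    return out;
}
```

The compression is substantial: a magnitude of 1,000,000 maps to about 13.8 and a magnitude of 10 to about 2.4, so both remain distinguishable after normalization.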

Crop and rearrange

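In the first step we expanded the image. Note that the code below does not remove that padding; it only crops the spectrum to an even number of rows and columns, so that it splits into four equal quadrants.
For visualization purposes we may also rearrange the quadrants of the result, so that the origin (0, 0) corresponds to the image center.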

    // crop the spectrum, if it has an odd number of rows or columns
    magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
    // rearrange the quadrants of Fourier image  so that the origin is at the image center
    int cx = magI.cols/2;
    int cy = magI.rows/2;
    Mat q0(magI, Rect(0, 0, cx, cy));   // Top-Left - Create a ROI per quadrant
    Mat q1(magI, Rect(cx, 0, cx, cy));  // Top-Right
    Mat q2(magI, Rect(0, cy, cx, cy));  // Bottom-Left
    Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
    Mat tmp;                           // swap quadrants (Top-Left with Bottom-Right)
    q0.copyTo(tmp);
    q3.copyTo(q0);
    tmp.copyTo(q3);
    q1.copyTo(tmp);                    // swap quadrant (Top-Right with Bottom-Left)
    q2.copyTo(q1);
    tmp.copyTo(q2);
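
The quadrant swap can also be written on a plain row-major buffer, which may make the index arithmetic easier to follow. This is a sketch under the assumption of even dimensions (as guaranteed by the crop above). The name swapQuadrants is hypothetical and not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Swap diagonally opposite quadrants of a row-major rows x cols image in place.
// Both dimensions must be even.
void swapQuadrants(std::vector<float>& img, int rows, int cols)
{
    assert(rows > 0 && cols > 0 && rows % 2 == 0 && cols % 2 == 0);
    assert(img.size() == static_cast<std::size_t>(rows) * cols);
    const int cy = rows / 2;
    const int cx = cols / 2;
    for (int y = 0; y < cy; ++y)
        for (int x = 0; x < cx; ++x) {
            // top-left <-> bottom-right
            std::swap(img[y * cols + x], img[(y + cy) * cols + (x + cx)]);
            // top-right <-> bottom-left
            std::swap(img[y * cols + (x + cx)], img[(y + cy) * cols + x]);
        }
}
```

After the swap, the zero-frequency coefficient that was stored at index (0, 0) sits at (rows/2, cols/2), that is, at the center of the displayed spectrum.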

Normalize

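This is again done for visualization purposes. We now have the magnitudes, but they are still outside our image display range of 0 to 1. We normalize the values to this range using the cv::normalize() function.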

    normalize(magI, magI, 0, 1, NORM_MINMAX); // Transform the matrix with float values into a
                                            // viewable image form (float between values 0 and 1).
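
Min-max normalization to the range 0 to 1 amounts to subtracting the minimum and dividing by the range. The sketch below shows this on a plain vector. The name normalizeMinMax is hypothetical, and mapping a constant input to all zeros is a choice made for this example, not a statement about how cv::normalize() treats that case.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Rescale values in place so that the minimum maps to 0 and the maximum to 1.
void normalizeMinMax(std::vector<float>& v)
{
    assert(!v.empty());
    const auto mm = std::minmax_element(v.begin(), v.end());
    const float lo = *mm.first;   // copy before the loop modifies the data
    const float hi = *mm.second;
    const float range = hi - lo;
    for (float& x : v)
        x = (range > 0.0f) ? (x - lo) / range : 0.0f; // constant input: all zeros (assumption)
}
```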

Result

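One application idea is to determine the geometric orientation present in an image. For example, let us find out whether a piece of text is horizontal or not. Looking at some text you will notice that the text lines form roughly horizontal lines and the letters form roughly vertical lines. These two main components of a text snippet can also be seen in its Fourier transform. Let us use a horizontal and a rotated image of a text.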

In the case of horizontal text:
[Image: result for the horizontal text]

In the case of rotated text:
[Image: result for the rotated text]
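You can see that the most influential components of the frequency domain (the brightest dots on the magnitude image) follow the geometric rotation of the objects in the image. From this we can calculate the offset and rotate the image to correct any misalignment.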
