I am looking for a simple algorithm for fast DCT and IDCT of matrix [NxM]

Question

I am looking for a simple algorithm to perform fast DCT (type 2) of a matrix of any size [NxM], and also an algorithm for the inverse transformation IDCT (also called DCT type 3).

I need a DCT-2D algorithm, but even a DCT-1D algorithm is good enough because I can use DCT-1D to implement DCT-2D (and IDCT-1D to implement IDCT-2D ).

PHP code is preferable, but any algorithm that is clear enough will do.

My current PHP script for implementing DCT/IDCT is very slow whenever matrix size is more than [200x200].

I was hopping to find a way to preform DCT of up to [4000x4000] within less than 20 seconds. Does anyone know how to do it?

If it's not too many lines, could you show us what you've done so far? — Teepeemm, Mar 31 '14 at 18:19
So far, I am, more or less, calculating DCT and IDCT by their formal definition, which is very slow. — algo-rithm, Mar 31 '14 at 18:31
There is one called `kiss-fft` which is reputably smaller (less code) than other implementations. — rwong, Apr 01 '14 at 03:58
rwong, thanks, I will look at it. Do you know a fast/simple way to convert DFT to DCT and vice versa? — algo-rithm, Apr 01 '14 at 04:35
which DCT's you want (1,2,3,4) ? I recommend to use FFT to compute DCT because fast DCT is extremly complicated to code often even slower then through FFT. also you want direct or normalized ones ? also DFCT algorithms are very poorly documented !!! (never saw functional implementation just incomplete equations) — Spektre, Apr 01 '14 at 06:49
also you can find many fast cosine transforms on the net but all of them are just for constant size like 8x8 which does not help at all ... to Implement direct DFCT or iDFCT you have to implement also DST, iDST and the recursion is not so easily separable as in DFT,iDFT you will get 3 therms instead of 2 ... — Spektre, Apr 01 '14 at 07:33

score 2 · Accepted Answer · edited Jun 20 '20 at 09:12

Here is mine computation of 1D FDCT and IFDCT by FFT with the same length:

//---------------------------------------------------------------------------
void DFCTrr(double *dst,double *src,double *tmp,int n)
    {
    // exact normalized DCT II by N DFFT
    int i,j;
    double nn=n,a,da=(M_PI*(nn-0.5))/nn,a0,b0,a1,b1,m;
    for (j=  0,i=n-1;i>=0;i-=2,j++) dst[j]=src[i];
    for (j=n-1,i=n-2;i>=0;i-=2,j--) dst[j]=src[i];
    DFFTcr(tmp,dst,n);
    m=2.0*sqrt(2.0);
    for (a=0.0,j=0,i=0;i<n;i++,j+=2,a+=da)
        {
        a0=tmp[j+0]; a1= cos(a);
        b0=tmp[j+1]; b1=-sin(a);
        a0=(a0*a1)-(b0*b1);
        if (i) a0*=m; else a0*=2.0;
        dst[i]=a0;
        }
    }
//---------------------------------------------------------------------------
void iDFCTrr(double *dst,double *src,double *tmp,int n)
    {
    // exact normalized DCT III = iDCT II by N iDFFT
    int i,j;
    double nn=n,a,da=(M_PI*(nn-0.5))/nn,a0,m,aa,bb;
    m=1.0/sqrt(2.0);
    for (a=0.0,j=0,i=0;i<n;i++,j+=2,a+=da)
        {
        a0=src[i];
        if (i) a0*=m;
        aa= cos(a)*a0;
        bb=+sin(a)*a0;
        tmp[j+0]=aa;
        tmp[j+1]=bb;
        }
    m=src[0]*0.25;
    iDFFTrc(src,tmp,n);
    for (j=  0,i=n-1;i>=0;i-=2,j++) dst[i]=src[j]-m;
    for (j=n-1,i=n-2;i>=0;i-=2,j--) dst[i]=src[j]-m;
    }
//---------------------------------------------------------------------------

dst is destination vector [n]
src is source vector [n]
tmp is temp vector [2n]

These arrays should not overlap !!! It is taken from mine transform class so I hope did not forget to copy something.

XXXrr means destination is real and source is also real domain
XXXrc means destination is real and source is complex domain
XXXcr means destination is complex and source is real domain

All data are double arrays, for complex domain first number is Real and second Imaginary part so array is 2N size. Both functions use FFT and iFFT if you need code also for them comment me. Just to be sure I added not fast implementation of them below. It is much easier to copy that because fast ones use too much of the transform class hierarchy

slow DFT,iDFT implementations for testing:

//---------------------------------------------------------------------------
void transform::DFTcr(double *dst,double *src,int n)
    {
    int i,j;
    double a,b,a0,_n,q,qq,dq;
    dq=+2.0*M_PI/double(n); _n=2.0/double(n);
    for (q=0.0,j=0;j<n;j++,q+=dq)
        {
        a=0.0; b=0.0;
        for (qq=0.0,i=0;i<n;i++,qq+=q)
            {
            a0=src[i];
            a+=a0*cos(qq);
            b+=a0*sin(qq);
            }
        dst[j+j  ]=a*_n;
        dst[j+j+1]=b*_n;
        }
    }
//---------------------------------------------------------------------------
void transform::iDFTrc(double *dst,double *src,int n)
    {
    int i,j;
    double a,a0,a1,b0,b1,q,qq,dq;
    dq=+2.0*M_PI/double(n);
    for (q=0.0,j=0;j<n;j++,q+=dq)
        {
        a=0.0;
        for (qq=0.0,i=0;i<n;i++,qq+=q)
            {
            a0=src[i+i  ]; a1=+cos(qq);
            b0=src[i+i+1]; b1=-sin(qq);
            a+=(a0*a1)-(b0*b1);
            }
        dst[j]=a*0.5;
        }
    }
//---------------------------------------------------------------------------

So for testing just rewrite the names to DFFTcr and iDFFTrc (or use them to compare to your FFT,iFFT) when the code works properly then implement your own FFT,iFFT For more info on that see:

How to compute Discrete Fourier Transform?

2D DFCT

resize src matrix to power of 2

by adding zeros, to use fast algorithm the size must be always power of 2 !!!
allocate NxN real matrices tmp,dst and 1xN complex vector t
transform lines by DFCTrr
```
 DFCT(tmp.line(i),src.line(i),t,N)
```
transpose tmp matrix
transform lines by DFCTrr
```
 DFCT(dst.line(i),tmp.line(i),t,N)
```
transpose dst matrix
normalize dst by multiply matrix by 0.0625

2D iDFCT

Is the same as above but use iDFCTrr and multiply by 16.0 instead.

[Notes]

Be sure before implementing your own FFT and iFFT that they give the same result as mine otherwise the DCT/iDCT will not work properly !!!

Spektre, many thanks, but I really need [NxM]. Do you know how I can modify your code [NxN] to get [NxM]? — algo-rithm, Apr 01 '14 at 13:17
to apply any kind of Fast algorithm both N and M must be power of 2, you can do the 2D transform on NxM the same way as on NxN just adjust the transpose function ... (only thing that could change are the normalization constants ... most algorithm s use NM,1/NM but I like 16,1/16 more because it does not change the magnitude of signal) — Spektre, Apr 01 '14 at 15:02
I've tried to compile and run the DFTcr function, but it seems that dst[j+j+1]=b*_n causes an exception at some point inside the loop, when the dst array size is exceeded. — algo-rithm, Apr 02 '14 at 16:21
didn't you forget to allocate the complex array twice as long then real array ? — Spektre, Apr 02 '14 at 16:25
when your DCT/iDCT will work then do not forget to implement fast FT,iFT so your DCT,iDCT will became also fast — Spektre, Apr 03 '14 at 06:14

I am looking for a simple algorithm for fast DCT and IDCT of matrix [NxM]

1 Answers1

Linked