Numpy Functions:
In this case, dct already has built-in support for applying the transform along a particular axis. (NumPy itself does not provide a dct, so this is presumably scipy.fftpack.dct, which operates on NumPy arrays.) Nearly all NumPy functions, and most SciPy functions built on them, operate on complete arrays or can be told to operate along a particular axis.
So you can simply use the axis parameter of dct:
dct(X, axis=2)
you will get an equivalent result:
>>> (dct(X, axis=2) == np.array(list(map(dct, X)))).all()  # list() is needed on Python 3, where map is lazy
True
which is also more than 35 times faster than using map for our array of shape (625, 4, 4):
%timeit dct(X, axis=2)
1000 loops, best of 3: 157 µs per loop
%timeit np.array(list(map(dct, X)))
100 loops, best of 3: 5.76 ms per loop
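The equivalence above can be checked with a small self-contained script. This is only a sketch: it assumes dct is scipy.fftpack.dct, and the random array is a stand-in for X with the same shape. np.allclose is used instead of == so that tiny floating-point differences would not matter.

```python
import numpy as np
from scipy.fftpack import dct  # assumption: the dct in question is SciPy's

# Stand-in data with the same shape as X in the answer
rng = np.random.default_rng(0)
X = rng.standard_normal((625, 4, 4))

# One call over the whole array, transforming along the last axis
fast = dct(X, axis=2)

# Slice-by-slice version; list() is needed on Python 3, where map is lazy
slow = np.array(list(map(dct, X)))

same = np.allclose(fast, slow)
print(fast.shape, same)
```

Since dct transforms along the last axis by default, calling it on each (4, 4) slice does the same work as axis=2 on the full array, just with a Python-level loop around it.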
General Python Functions:
In other cases, you can vectorize a Python function with either np.vectorize or np.frompyfunc. For instance, if you have a demo function that performs a scalar operation:
def foo(x):  # stand-in for a function that only works on scalars
    return x**2
>>> X = np.arange(8, dtype=np.float32).reshape(-1,2,2)
>>> foo_arr = np.vectorize(foo)
>>> foo_arr(X)
array([[[  0.,   1.],
        [  4.,   9.]],
       [[ 16.,  25.],
        [ 36.,  49.]]])
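The discussion here might also be helpful. As noted there, vectorizing a non-NumPy function this way does not actually make it any faster: np.vectorize is essentially a Python-level loop over the elements and is provided for convenience, not performance.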
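For completeness, here is a minimal sketch of the np.frompyfunc route mentioned earlier. The function foo is the same hypothetical scalar stand-in as above. Note that a ufunc built with np.frompyfunc returns an array of dtype object, so converting the result back to a numeric dtype is usually wanted.

```python
import numpy as np

def foo(x):
    # Hypothetical scalar-only function, standing in for arbitrary Python code
    return x ** 2

X = np.arange(8, dtype=np.float32).reshape(-1, 2, 2)

# np.frompyfunc(func, nin, nout) builds a ufunc that returns object arrays
foo_ufunc = np.frompyfunc(foo, 1, 1)
result = foo_ufunc(X).astype(np.float64)  # convert from dtype=object

# Same values as the plain array expression
expected = X.astype(np.float64) ** 2
print(np.array_equal(result, expected))
```

For something as simple as squaring, applying X ** 2 to the array directly avoids the Python-level loop altogether; the wrappers are mainly useful when the function cannot be expressed with array operations.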