Compare the speed of execution of NumPy's np.fft.fft
algorithm and that of the direct implementation of the equation
$$ F_k = \sum_{m=0}^{n-1}f_m\exp\left( -\frac{2\pi i m k}{n} \right), \quad k=0,1,2,\cdots, n-1 $$
Hints: treat the direct equation as a matrix multiplication (dot product) of an array of $n$ function values (random ones will do) with the $n \times n$ array with entries $\exp(-2\pi i m k / n)$ ($m, k = 0, 1, \cdots, n-1$). Use IPython's %timeit
magic.
The two methods for calculating the DFT can be timed using the IPython %timeit
magic function:
In [x]: import numpy as np
In [x]: n = 512
In [x]: # Our input function is just random numbers
In [x]: f = np.random.rand(n)
In [x]: # Time the NumPy (Cooley-Tukey) DFT algorithm
In [x]: %timeit np.fft.fft(f)
100000 loops, best of 3: 13.1 us per loop
In [x]: # Now calculate the DFT by direct summation
In [x]: k = np.arange(n)
In [x]: m = k.reshape((n, 1))
In [x]: w = np.exp(-2j * np.pi * m * k / n)
In [x]: %timeit np.dot(w, f)
1000 loops, best of 3: 354 us per loop
In [x]: # Check the two methods produce the same result
In [x]: ftfast = np.fft.fft(f)
In [x]: ftslow = np.dot(w, f)
In [x]: np.allclose(ftfast, ftslow)
Out[x]: True
The Cooley-Tukey algorithm implemented by np.fft.fft
is found to be almost 30 times faster than the direct method. In fact, this algorithm can be shown to scale as $\mathcal{O}(n\log n)$ compared with $\mathcal{O}(n^2)$ for direct summation.