The Discrete Fourier Transform: Definition as an inner product


Sampled discrete time signals

Consider an N-length finite duration discrete time signal, x[n], specified at integer sample time indices n.

The signal x[n] comprises N samples, and therefore runs from n = 0, 1, 2, \cdots, N-1. In general, x[n] may be complex valued, i.e., x \in \mathbb{C}_N, though in most audio-related applications it will be purely real.

The Discrete Fourier Transform: Definition as an inner product

The DFT, X[k], of the signal x[n] is obtained by evaluating the inner product of x[n] with an orthogonal set of N complex exponentials \phi_k

(1)   \begin{align*} X[k] = \langle x,\phi_k \rangle , \hspace{0.5cm} \phi_k = e^{2\pi i kn/N} \end{align*}

where integer k = 0, 1, 2, \cdots, N-1, and i is the imaginary number (i = \sqrt{-1}).

Using the definition on the Mathematical tools page, we can evaluate X[k] via

(2)   \begin{align*} X[k] = \sum_{n=0}^{N-1} x[n] e^{-2\pi i k n/N}, \hspace{0.5cm} k = 0, 1, 2, ... , N-1 \end{align*}

for sample time index n, frequency bin index k, and number of time samples included in the transformation N (which is also equal to the number of frequency bins). Note that:

  • n, k, and N are all integers.
  • A conjugation operation acts on \phi_{k}, within the summation of equation 2 (as expected).
  • There is no explicit definition of `frequency’ when the DFT is defined in this way — i.e., it is agnostic to sample rate (see below for more on this).
  • For real valued signals x[n], only half of the spectrum X[k] is unique — see here[LINK] for more on this.

The values X[k] are sometimes referred to as complex amplitudes, since in general they are complex valued for a real valued signal x[n] — i.e., each X[k] value has a magnitude and phase, or equivalently, a real and imaginary part. Analogous with earlier examples, each value of X[k] tells us something about `how much’ of the given complex exponential \phi_{k} is present within x[n].