Chapter 2

Discovering the DFT

Shift invariance as commutativity, simultaneous diagonalization, the constructive derivation of the Fourier basis, and the DFT as a consequence of the algebra.

In this chapter
  1. 5Commutativity and Shift Invariance
  2. 6Simultaneous Diagonalization
  3. 7Eigenvectors of S*
  4. 8The DFT Emerges

5Commutativity and Shift Invariance

In Chapter 1 we showed that every circulant is a polynomial in $S$, and therefore all circulants commute with $S$ and with each other. Now we prove the converse: commuting with $S$ is not just a consequence of being circulant — it is a characterization.

Shift-invariance characterization
$$ SM = MS \iff M \text{ is circulant} \tag{7} $$

Proof sketch (Bamieh Lemma 3.1). The forward direction ($\Leftarrow$) is immediate: if $M = p(S)$ for some polynomial $p$, then $SM = Sp(S) = p(S)S = MS$ since polynomials in the same variable commute.

For the reverse direction ($\Rightarrow$), suppose $SM = MS$. Let $e_0 = (1, 0, \ldots, 0)^T$ and write $h = Me_0$ (the first column of $M$). We claim $M = C_h$. To see this, note that $e_k = S^k e_0$ for every $k$, so:

$$ Me_k = M S^k e_0 = S^k M e_0 = S^k h $$

The $k$-th column of $M$ is $S^k h$ — a shifted copy of $h$. This is exactly the column structure of a circulant matrix. Therefore $M = C_h$.

The equation $SM = MS$ has a direct interpretation. Recall that $S$ implements the group action of $\ZN$ on signals: $S$ shifts by 1, $S^2$ shifts by 2, and so on. The condition $SM = MS$ says: "shifting the input first and then applying $M$ gives the same result as applying $M$ first and then shifting." In other words, $M$ is shift-invariant (or translation-equivariant) — it treats every position on the ring identically.

Key idea: Convolutions are exactly the linear operators that “don’t care where you are on the ring.” They treat every position the same way. This is the group-theoretic characterization: a linear operator is a convolution if and only if it commutes with the group action. Since the faithful representation $k \mapsto S^k$ (§2) lets $\ZN$ act on $\C^N$ through powers of the single generator $S$, commuting with $S$ alone suffices.

This characterization is deeper than the polynomial one. It says that convolutions are not just an algebraic convenience — they are the natural linear operators on signals with translation symmetry. If your domain has a symmetry and you want your operator to respect it, the result is necessarily a convolution.


6Simultaneous Diagonalization

We have established that all circulant matrices commute. Now we invoke a fundamental theorem of linear algebra to understand the consequences.

Simultaneous diagonalization
$$ \text{If } AB = BA \text{ and } A, B \text{ are diagonalizable, then } \exists \text{ invertible } P \text{ such that } P^{-1}AP \text{ and } P^{-1}BP \text{ are both diagonal.} \tag{8} $$

In words: commuting diagonalizable matrices can be simultaneously diagonalized — they share a common eigenbasis. The matrix $P$ whose columns are the shared eigenvectors diagonalizes both matrices at once.

There is a stronger result that is crucial for us:

Uniqueness from distinct eigenvalues
$$ \text{If } A \text{ has } N \text{ distinct eigenvalues and } AB = BA, \text{ then } B = q(A) \text{ for some polynomial } q. \tag{9} $$

When $A$ has all distinct eigenvalues, each eigenspace is one-dimensional, and the eigenvectors are determined up to scalar multiples. Any matrix $B$ that commutes with $A$ must preserve each eigenspace, and since each is one-dimensional, $B$ acts as a scalar on each. This means $B$ is a polynomial in $A$ (by Lagrange interpolation on the eigenvalues).

Simultaneous Diagonalization

This is the single most important theorem in the course. Everything follows from it. Since all circulants commute, they all share the same eigenvectors. And since $S$ has $N$ distinct eigenvalues (as we will show momentarily), the shared eigenbasis is uniquely determined by the algebra. The shared eigenbasis is the Fourier basis.

Let us state the logical chain explicitly:

  1. All circulants are polynomials in $S$ (Eq. 4).
  2. Therefore all circulants commute with each other (Eq. 6).
  3. Commuting diagonalizable matrices share a common eigenbasis (Eq. 8).
  4. If $S$ has distinct eigenvalues, the eigenbasis is unique (Eq. 9).
  5. Therefore: find the eigenvectors of $S$, and you have found the eigenvectors of every circulant.

This is the plan. We now carry it out.


7Eigenvectors of S*

Our goal is to find the eigenvectors shared by all circulants. Since every circulant is a polynomial in $S$, we just need the eigenvectors of $S$. Following Bamieh, we work with $S^*$ (the left-shift) instead — the eigenvectors are the same (they are eigenvectors of $S$ with conjugate eigenvalues), and the derivation is slightly cleaner.

We seek all vectors $w \in \C^N$ and scalars $\rho \in \C$ satisfying $S^* w = \rho w$. Writing this out component by component:

Eigenvalue equation for S*
$$ S^* w^{(m)} = \rho_m \, w^{(m)} \tag{10} $$

Since $(S^* w)[n] = w[(n+1) \bmod N]$, the eigenvalue equation reads:

$$ w[n+1] = \rho \cdot w[n] \quad \text{for all } n = 0, 1, \ldots, N-2 $$

together with the wraparound condition $w[0] = \rho \cdot w[N-1]$. The recurrence $w[n+1] = \rho \cdot w[n]$ has the unique solution:

$$ w[k] = \rho^k \cdot w[0] $$

Normalize by setting $w[0] = 1$. Then $w = (1, \rho, \rho^2, \ldots, \rho^{N-1})^T$. Now apply the wraparound condition:

$$ w[0] = \rho \cdot w[N-1] \implies 1 = \rho \cdot \rho^{N-1} = \rho^N $$

So $\rho$ must be an $N$-th root of unity: $\rho^N = 1$. There are exactly $N$ such roots:

N-th roots of unity
$$ \rho_m = e^{i 2\pi m / N}, \quad m = 0, 1, \ldots, N-1 \tag{11} $$

These are $N$ points equally spaced on the unit circle in the complex plane. For $\Z_8$: $\rho_0 = 1$, $\rho_1 = e^{i\pi/4}$, $\rho_2 = e^{i\pi/2} = i$, $\rho_3 = e^{i3\pi/4}$, $\rho_4 = e^{i\pi} = -1$, $\rho_5 = e^{i5\pi/4}$, $\rho_6 = e^{i3\pi/2} = -i$, $\rho_7 = e^{i7\pi/4}$.

The corresponding eigenvectors are:

Eigenvectors (Fourier modes)
$$ w^{(m)} = \begin{pmatrix} 1 \\ \rho_m \\ \rho_m^2 \\ \vdots \\ \rho_m^{N-1} \end{pmatrix} = \begin{pmatrix} 1 \\ e^{i2\pi m/N} \\ e^{i2\pi m \cdot 2/N} \\ \vdots \\ e^{i2\pi m(N-1)/N} \end{pmatrix} \tag{12} $$

Since the $N$ roots of unity are all distinct, $S^*$ (and therefore $S$) has $N$ distinct eigenvalues. By our simultaneous diagonalization theorem, the eigenbasis $\{w^{(0)}, w^{(1)}, \ldots, w^{(N-1)}\}$ is uniquely determined (up to normalization of each vector). This is the Fourier basis.

Let us unpack what each eigenvector looks like. The $m$-th eigenvector $w^{(m)}$ has entries that rotate around the unit circle, making $m$ complete revolutions as the index $k$ runs from $0$ to $N-1$:

Key idea: We did not choose the Fourier basis. The eigenvalue equation determines it. The recurrence $w[k+1] = \rho \cdot w[k]$ has only one solution: a geometric sequence. The periodicity condition $\rho^N = 1$ restricts $\rho$ to the $N$-th roots of unity. The Fourier basis is the unique eigenbasis of the shift operator.

Eigenvisual Explorer demo — eigenvectors of S* sweeping from N=8 to N=65, showing spiral patterns
Figure 2.0. Eigenvectors of $S^*$ as $N$ increases from 8 to 65. Each blue dot is one entry of $w^{(m)}$ plotted on the unit circle; the red line is the eigenvalue $\rho_m$. As $m$ increases, the dots trace increasingly tight spirals — higher-frequency modes wrapping the circle multiple times. Inspired by Bamieh (2018), Figure 4.1. Open the interactive app →

8The DFT Emerges

We now have the eigenvectors of $S^*$. But Chapter 1 defined circulants as polynomials in $S$ (the right-shift): $C_a = \sum_k a_k S^k$ (Eq. 4). To find eigenvalues, we need to connect $S$ to $S^*$.

The bridge is simple. $S$ and $S^*$ are inverse shifts on $\ZN$: $S \cdot S^* = I$. Since $S^N = I$ (shifting $N$ times is the identity), we have $S = (S^*)^{-1} = (S^*)^{N-1}$, and more generally:

$$ S^k = (S^*)^{N-k} $$

Intuition: on a ring of $N$ elements, shifting a signal clockwise by $k$ steps lands on the same position as shifting counterclockwise by $N - k$ steps — the ring wraps around. (This is the same duality captured by the reversal operator $R$ from §3: $RSR = S^*$, so right-shifting by $k$ and left-shifting by $N-k$ are conjugate operations.)

Now apply $S^k$ to an eigenvector $w^{(m)}$:

$$ S^k w^{(m)} = (S^*)^{N-k} w^{(m)} = \rho_m^{N-k} \, w^{(m)} $$

The exponent simplifies cleanly: $\rho_m^{N-k} = e^{i2\pi m(N-k)/N} = \underbrace{e^{i2\pi m}}_{=\,1} \cdot e^{-i2\pi mk/N} = e^{-i2\pi mk/N}$. Summing over $k$:

$$ C_a w^{(m)} = \sum_{k=0}^{N-1} a_k S^k w^{(m)} = \sum_{k=0}^{N-1} a_k \, \rho_m^{-k} \, w^{(m)} = \underbrace{\left(\sum_{k=0}^{N-1} a_k \, e^{-i2\pi mk/N}\right)}_{\lambda_m} w^{(m)} $$

This holds for every $m = 0, 1, \ldots, N-1$. Stacking all $N$ eigenvector equations into a single matrix equation:

$$ C_a \underbrace{\begin{pmatrix} \vert & \vert & & \vert \\ w^{(0)} & w^{(1)} & \cdots & w^{(N-1)} \\ \vert & \vert & & \vert \end{pmatrix}}_{V} = \underbrace{\begin{pmatrix} \vert & \vert & & \vert \\ w^{(0)} & w^{(1)} & \cdots & w^{(N-1)} \\ \vert & \vert & & \vert \end{pmatrix}}_{V} \underbrace{\begin{pmatrix} \lambda_0 & & \\ & \lambda_1 & \\ & & \ddots & \\ & & & \lambda_{N-1} \end{pmatrix}}_{\operatorname{diag}(\lambda_0, \ldots, \lambda_{N-1})} $$

This is the standard eigendecomposition $C_a V = V \Lambda$. The diagonal entries $\lambda_m$ are the eigenvalues of $C_a$, and they are:

Eigenvalue = DFT
$$ \hat{a}_m = \lambda_m = \sum_{k=0}^{N-1} a_k \, e^{-i 2\pi mk / N} = \text{DFT}(a)_m \tag{13} $$

This is the Discrete Fourier Transform of the sequence $a$. It appeared not because we postulated it, but because we asked: "what are the eigenvalues of a circulant matrix?" The answer is: the DFT of its defining kernel.

Assembling all the eigenvectors into a matrix $W$ (the DFT matrix, whose precise form we define in the next chapter), the diagonalization reads:

Spectral decomposition of circulants
$$ C_a = W^{-1} \operatorname{diag}(\hat{a}_0, \hat{a}_1, \ldots, \hat{a}_{N-1}) \, W \tag{14} $$

Here $W$ is the DFT matrix — the matrix that transforms a signal from the "spatial" domain to the "frequency" domain. This is simultaneously the diagonalization of $C_a$, the spectral decomposition of the convolution operator, and the definition of the DFT.

Worked example. For $\Z_8$ with our running kernel $h = (3, 1, 0, 0, 0, 0, 0, 2)^T$, the DFT values are $\hat{h}_m = \sum_{k=0}^{7} h_k e^{-i2\pi mk/8}$. Since only $h_0 = 3$, $h_1 = 1$, and $h_7 = 2$ are nonzero:

$$ \hat{h}_m = 3 + e^{-i2\pi m/8} + 2 \, e^{-i2\pi m \cdot 7/8} = 3 + e^{-i\pi m/4} + 2 \, e^{i\pi m/4} $$

where we used $e^{-i2\pi m \cdot 7/8} = e^{-i 7\pi m/4} = e^{i\pi m/4}$ (discarding full-revolution factors). Computing a few values:

The eigenvalue $\hat{h}_4 = 0$ means that the circulant $C_h$ has a nontrivial kernel — the Fourier mode $w^{(4)}$ is annihilated by convolution with $h$. This is a frequency-4 oscillation (alternating $+1, -1, +1, -1, \ldots$) that is invisible to this particular filter. Knowing the eigenvalues tells you exactly which frequencies the filter amplifies, attenuates, or kills.

Key idea: The DFT exists because $S$ has $N$ distinct eigenvalues. It is a consequence of diagonalizing the shift operator. Any time a cyclic group acts on a vector space, simultaneous diagonalization determines the Fourier basis.

Let us step back and appreciate the logic. We started with the simplest possible algebraic structure: a cyclic group $\ZN$ acting on $\C^N$ via the shift $S$. We asked: what are the linear operators that commute with this action? Answer: circulants (polynomials in $S$). We asked: can we simultaneously diagonalize all of them? Answer: yes, because they commute. We asked: what is the shared eigenbasis? Answer: geometric sequences with roots-of-unity ratios — the Fourier basis. We asked: what are the eigenvalues? Answer: the DFT of the kernel.

Each step followed from the previous one. The DFT is the unique answer to the question "how do you diagonalize convolutions on a cyclic group?"

In the next chapter, we package this into a clean framework — the DFT matrix $W$, the spectral filtering recipe, and the three equivalent characterizations of convolution on $\ZN$ — and then ask what happens when we leave the ring behind and move to general graphs.