Constructing Markov Chains for Target Distributions: Gaussian, Cauchy, Gumbel, and Planck’s Law via Detailed Balance

Published on February 11, 2025

אם ירצה ה׳

TL;DR

1. Mathematician

Metropolis-Hastings construction of Markov chains with four target distributions: Gaussian, Planck (geometric), Cauchy, Gumbel. Detailed balance is verified in each case; ergodicity is argued informally. The Gumbel case uses the full Hastings ratio $\min{1, \pi(y)Q(y,x)/\pi(x)Q(x,y)}$ with an asymmetric exponential proposal. The Cauchy case notes that a Gaussian random-walk proposal is not geometrically ergodic on a Cauchy target. The paper is expository, not a research contribution: the point is that one recipe handles targets of very different character (continuous vs. discrete, finite moments vs. none, symmetric vs. skewed).

2. Physicist

Four worked examples of building a Markov chain whose equilibrium distribution is what you want. The useful one is Planck: the chain gives you Bose-Einstein occupation $\langle n \rangle = 1/(e^{\beta h\nu} - 1)$ from detailed balance on absorption and emission transitions, and the classical $8\pi\nu^2/c^3$ mode-counting then produces Planck's law. The Cauchy case is the one to read if you work with heavy-tailed fluctuations and want to see that the standard MCMC machinery still works, with the caveat that tail mixing is slow. The Gaussian case is standard; the Gumbel case is useful for extreme-value modeling.

3. Engineer or practitioner

If you need to sample from a weird distribution, Metropolis-Hastings works. Pick a proposal (symmetric if the target is symmetric, asymmetric if not), apply the Metropolis or Hastings acceptance rule, run the chain. The paper shows this explicitly for four distributions you might actually encounter: normal errors, photon statistics, heavy-tailed returns, extreme-value maxima. Warning on the Cauchy case: a Gaussian proposal samples the bulk fine but struggles in the tails, so budget for long runs or scale your proposal to the target.

4. Curious reader

A Markov chain is a random process that hops between states according to fixed rules. If you set the rules right, the fraction of time the chain spends in each state matches any probability distribution you pick. The paper walks through how to set those rules for four different distributions, including one (Planck's law for blackbody radiation) that was a founding result of quantum mechanics. The method is the same each time; what changes is which features of the target distribution make it tricky, and how the construction accommodates them.

Constructing Transition Matrices for Various Distributions

0. Introduction

Metropolis-Hastings gives a constructive answer to a basic question: given a target probability distribution π, does there exist a Markov chain with π as its stationary measure? The answer is yes, and the construction is explicit. What this paper shows is that the same recipe handles four distributions of very different character: a well-behaved Gaussian, a quantized Planck mode, a heavy-tailed Cauchy with no finite mean, and an asymmetric Gumbel. Each case raises a different obstacle. The Gaussian tests the continuum limit. The Planck case is genuinely discrete. The Cauchy case tests whether a symmetric proposal can track a heavy tail. The Gumbel case forces an asymmetric proposal. Taken together they illustrate that the Metropolis-Hastings recipe is indifferent to these distinctions: detailed balance plus ergodicity is all that is needed.

a. Gaussian Distribution

Step 1: Discrete State Space

States are $x_i = i \cdot \Delta$ for $i \in \mathbb{Z}$. In the limit $\Delta \to 0$ the chain becomes continuous.

Step 2: Target Stationary Distribution

$$\pi(x_i) \propto \exp\left(-\frac{x_i^2}{2\sigma^2}\right).$$

Step 3: Detailed Balance

For transition matrix $B$ with entries $P(x_i \to x_j)$, detailed balance requires

$$\pi(x_i) P(x_i \to x_j) = \pi(x_j) P(x_j \to x_i).$$

Only nearest-neighbor jumps are permitted: from $x_i$ to $x_{i \pm 1}$.

Step 4: Symmetric Proposal, Metropolis Acceptance

Use a symmetric proposal $Q(x_i, x_j) = 1/2$ for neighbors. The Metropolis acceptance rule is

$$A(x_i \to x_j) = \min\left{1, \frac{\pi(x_j)}{\pi(x_i)}\right}.$$

This satisfies detailed balance because $A(x \to y) / A(y \to x) = \pi(y)/\pi(x)$ when the min resolves correctly, which is exactly what detailed balance requires given symmetric $Q$.

Step 5: Stationary Distribution

Detailed balance with the given $\pi$ already guarantees that the Gaussian is the stationary distribution of $B$. No continuum argument is needed for the paper's main claim.

Remark on the continuum limit. A Kramers-Moyal expansion of the transition kernel would recover a Fokker-Planck equation with drift $D_1 \propto -x$ and diffusion $D_2$ constant, giving Gaussian stationary density $\exp(-x^2/2\sigma^2)$. Carrying that calculation out requires computing $\langle \Delta x \rangle$ and $\langle \Delta x^2 \rangle$ per step from the acceptance rule, which is a separate exercise. We state the result and move on.

Step 6: Role of $B$

The matrix $B$ allows only neighbor moves, its acceptance probabilities satisfy detailed balance with respect to $\pi$, and the chain is irreducible and aperiodic, so it converges to the Gaussian.

b. Planck's Law

Two things are happening in this section, and the paper needs to keep them straight. The Markov chain derives the Bose-Einstein occupation statistics for a single cavity mode. Getting from there to Planck's spectral law requires the classical mode-counting in a 3D cavity, which is not Markov chain business. Both steps are shown below, but only the first is what the transition matrix accomplishes.

Step 1: Quantized Energy States

For a photon mode of frequency $\nu$, occupation number $n = 0, 1, 2, \ldots$ gives energy $E_n = nh\nu$.

Step 2: Target Stationary Distribution

$$\pi(n) \propto \exp(-\beta n h\nu), \quad \beta = 1/kT.$$

Normalizing, with $q = \exp(-\beta h\nu)$, gives the geometric distribution $\pi(n) = (1-q) q^n$.

Step 3: Transitions Between Occupation States

Allow absorption ($n \to n+1$) and emission ($n+1 \to n$) with symmetric proposal $Q(n, n+1) = Q(n+1, n)$. Detailed balance requires

$$\pi(n) P(n \to n+1) = \pi(n+1) P(n+1 \to n).$$

Since $Q$ is symmetric, acceptance probabilities must satisfy $A(n+1 \to n) / A(n \to n+1) = \pi(n)/\pi(n+1) = \exp(\beta h\nu)$. Because $\exp(\beta h\nu) > 1$, the Metropolis rule gives $A(n+1 \to n) = 1$ and $A(n \to n+1) = \exp(-\beta h\nu)$.

Step 4: Mean Occupation

The chain converges to $\pi(n) = (1-q) q^n$, and the mean energy per mode is

$$\langle E \rangle = h\nu \sum_{n=0}^{\infty} n (1-q) q^n = h\nu \cdot (1-q) \cdot \frac{q}{(1-q)^2} = h\nu \cdot \frac{q}{1-q}.$$

Now multiply numerator and denominator by $\exp(\beta h\nu)$:

$$\frac{q}{1-q} = \frac{\exp(-\beta h\nu)}{1 - \exp(-\beta h\nu)} = \frac{1}{\exp(\beta h\nu) - 1}.$$

So $\langle E \rangle = h\nu / (\exp(h\nu/kT) - 1)$. This is the Bose-Einstein mean occupation energy. The Markov chain has done its work; everything above follows from detailed balance plus the geometric stationary distribution.

Step 5: From Bose-Einstein to Planck

To get the spectral energy density, count cavity modes. For a 3D cavity the density of states is $g(\nu) = 8\pi\nu^2 / c^3$. This is classical electromagnetism, not Markov chain output. Multiplying:

$$u(\nu) = g(\nu) \langle E \rangle = \frac{8\pi h \nu^3}{c^3} \cdot \frac{1}{\exp(h\nu/kT) - 1}.$$

Planck's law. The transition matrix gave the $1/(e^{h\nu/kT}-1)$ factor; the $8\pi \nu^3/c^3$ factor came from counting modes.

c. Cauchy Distribution

Step 1: State Space and Target

Continuous state space $\mathbb{R}$, with $\pi(x) \propto 1/(1+x^2)$.

Step 2: Transition Kernel

Symmetric Gaussian proposal

$$Q(x, y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y-x)^2}{2\sigma^2}\right).$$

Acceptance

$$A(x, y) = \min\left{1, \frac{\pi(y)}{\pi(x)}\right} = \min\left{1, \frac{1+x^2}{1+y^2}\right}.$$

Transition kernel $P(x, dy) = Q(x, y) A(x, y) dy$.

Step 3: Detailed Balance

Since $Q(x,y) = Q(y,x)$, detailed balance reduces to the Metropolis condition, which the acceptance rule above satisfies by construction.

Step 4: Ergodicity

Gaussian proposal has support on all of $\mathbb{R}$, so the chain is irreducible. The proposal density is continuous and acceptance is not identically zero, so the chain is aperiodic.

Caveat. A Gaussian random-walk proposal on a Cauchy target is not geometrically ergodic under standard scaling: the Cauchy tails are heavier than the Gaussian proposal can reach efficiently. The chain converges but slowly in the tails. For practical sampling this matters; for the existence claim here it does not.

Conclusion (Cauchy)

The kernel satisfies detailed balance with the Cauchy as stationary distribution, and the chain is ergodic. Cauchy is useful in robust statistics and in physical or financial systems where extreme values are common and no mean exists.

d. Gumbel Distribution

Step 1: State Space and Target

Continuous $\mathbb{R}$, standard Gumbel for maxima: $\pi(x) \propto \exp(-x - e^{-x})$. Mode at $x = 0$, right tail heavier than left.

Step 2: Asymmetric Proposal

$$Q(x, y) = \begin{cases} \frac{1}{\sigma_1} \exp\left(-\frac{y-x}{\sigma_1}\right), & y > x, \ \frac{1}{\sigma_2} \exp\left(\frac{y-x}{\sigma_2}\right), & y \leq x, \end{cases}$$

with $\sigma_1 \neq \sigma_2$.

Step 3: Hastings Acceptance Probability

The correct Hastings acceptance is

$$A(x, y) = \min\left{1, \frac{\pi(y) Q(y, x)}{\pi(x) Q(x, y)}\right}.$$

This differs from the Metropolis form because $Q$ is not symmetric. The proposal ratio $Q(y,x)/Q(x,y)$ corrects for the asymmetry.

Proposal ratio, $y > x$. Here $Q(x,y) = (1/\sigma_1) \exp(-(y-x)/\sigma_1)$ is the upward branch. The reverse move $y \to x$ has $x < y$, so $Q(y,x) = (1/\sigma_2) \exp((x-y)/\sigma_2) = (1/\sigma_2) \exp(-(y-x)/\sigma_2)$. So

$$\frac{Q(y,x)}{Q(x,y)} = \frac{\sigma_1}{\sigma_2} \exp\left(-(y-x)\left(\frac{1}{\sigma_2} - \frac{1}{\sigma_1}\right)\right).$$

Proposal ratio, $y \leq x$. Here $Q(x,y) = (1/\sigma_2) \exp((y-x)/\sigma_2)$, and the reverse $y \to x$ has $x > y$, so $Q(y,x) = (1/\sigma_1) \exp(-(x-y)/\sigma_1) = (1/\sigma_1) \exp((y-x)/\sigma_1)$. So

$$\frac{Q(y,x)}{Q(x,y)} = \frac{\sigma_2}{\sigma_1} \exp\left((y-x)\left(\frac{1}{\sigma_1} - \frac{1}{\sigma_2}\right)\right).$$

Substituting into the Hastings acceptance gives the full expression in each case.

Step 4: Detailed Balance

By construction of the Hastings ratio, $\pi(x) Q(x,y) A(x,y) = \pi(y) Q(y,x) A(y,x)$. The asymmetry of $Q$ is absorbed by the $Q(y,x)/Q(x,y)$ factor in the acceptance.

Step 5: Ergodicity

The proposal assigns positive density to every $y \in \mathbb{R}$ for every starting $x$, so the chain is irreducible. Continuous proposal density prevents periodicity.

Conclusion (Gumbel)

The asymmetric exponential proposal with Hastings acceptance gives a chain whose stationary distribution is the Gumbel. The asymmetry of the proposal is what lets the chain track the skew of the target without poor mixing in the heavier right tail. Gumbel is the extreme-value distribution for many maxima-of-samples settings: flood levels, drawdowns, equipment failure.

Conclusion

Four distributions, one recipe. The Gaussian case is the textbook warm-up. The Planck case shows that the same machinery works on discrete quantized states and recovers Bose-Einstein statistics (the classical mode-counting then gives Planck's law, but that step is separate). The Cauchy case shows that a symmetric proposal is enough even when the target has no mean, at the cost of slow tail mixing. The Gumbel case shows what changes when skew forces an asymmetric proposal: the Hastings correction $Q(y,x)/Q(x,y)$ appears in the acceptance, and detailed balance still holds.

What the collection demonstrates is that Metropolis-Hastings is indifferent to the features that make each distribution awkward in its own way. Detailed balance is a local condition on pairs of states, and it does not care about tail weight, discreteness, or skew. Ergodicity needs separate attention in each case (tail mixing for Cauchy in particular), but existence of a chain with the target as stationary distribution is never in doubt.