4.7.8. Statistical Mechanics: Ensembles, the Microcanonical Ensemble, and the Canonical Ensemble#

This lesson covers the first few sections of Chapter 3 of Chandler, Introduction to Modern Statistical Mechanics.

The central transition is:

Thermodynamics tells us what equilibrium macrostates do. Statistical mechanics explains why, by assigning probabilities to microscopic states.

4.7.8.1. Learning Goals#

After this lesson, students will be able to:

Distinguish a microscopic state from a macroscopic thermodynamic state.
Explain the idea of an ensemble and the difference between time averages and ensemble averages.
State the microcanonical postulate: all accessible microstates with fixed \(E\), \(V\), and \(N\) are equally likely.
Define the number of accessible states, \(\Omega(N,V,E)\), and connect it to entropy:

(4.1119)#\[\begin{equation} S = k_B \ln \Omega(N,V,E) \end{equation}\]

Explain how the second law emerges statistically from counting microstates.
Derive the Boltzmann distribution for a system in contact with a heat bath.
Define the canonical partition function:

(4.1120)#\[\begin{equation} Q(\beta,N,V)=\sum_\nu e^{-\beta E_\nu} \end{equation}\]

Use \(Q\) to compute probabilities and average energy in a simple model.

4.7.8.2. Coding Concepts#

The code in this notebook uses NumPy for arrays and random sampling and Matplotlib for plotting. It is intended to support physical intuition; the statistical mechanical ideas are the main focus.

4.7.8.3. From Thermodynamics to Statistical Mechanics#

In Chapters 1 and 2, we described equilibrium using macroscopic thermodynamic variables such as \(E\), \(S\), \(V\), \(T\), \(p\), and \(\mu\).

But a real molecular system is described microscopically by many more variables. For a classical system of \(N\) particles, a microscopic state is specified by positions and momenta:

(4.1121)#\[\begin{equation} \Gamma = (\mathbf{r}_1,\ldots,\mathbf{r}_N,\mathbf{p}_1,\ldots,\mathbf{p}_N) \end{equation}\]

A trajectory is a path through this high-dimensional state space.

The goal of statistical mechanics is to replace an impossible deterministic calculation with a probabilistic one.

4.7.8.4. Microstates and Macrostates#

A microstate is a detailed microscopic specification of the system.

A macrostate is specified by a much smaller number of variables, such as \(E\), \(V\), and \(N\).

Many microstates can correspond to the same macrostate. For example, an ideal gas with fixed \(E\), \(V\), and \(N\) can have many different arrangements of positions and momenta.

The central statistical mechanical question is:

Given a macrostate, how should we assign probabilities to the compatible microstates?
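To make the many-to-one mapping from microstates to macrostates concrete, the sketch below enumerates every microstate of a small toy system and groups them by energy. The toy system (four two-state units whose energy is the number of excited units, in arbitrary units) is an assumption chosen for illustration and is not taken from Chandler.

```python
from itertools import product
from collections import Counter

# Hypothetical toy system: 4 two-state units, each 0 (ground) or 1 (excited).
n_units = 4
microstates = list(product([0, 1], repeat=n_units))

# Label each macrostate by its total energy (the number of excited units)
# and count how many microstates share that label.
macrostate_counts = Counter(sum(state) for state in microstates)

for energy in sorted(macrostate_counts):
    print(f"E = {energy}: {macrostate_counts[energy]} microstates")
```

The sixteen microstates collapse onto only five energy values, and the middle energy is compatible with the most microstates.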

4.7.8.5. Time Averages and Ensemble Averages#

Suppose we measure an observable \(G\) at \(M\) successive times along a trajectory, obtaining the values \(G_a\). The time average has the form

(4.1122)#\[\begin{equation} G_{\mathrm{obs}} = \frac{1}{M}\sum_{a=1}^{M}G_a \end{equation}\]

Statistical mechanics replaces this with an ensemble average:

(4.1123)#\[\begin{equation} \langle G \rangle = \sum_\nu P_\nu G_\nu \end{equation}\]

where \(P_\nu\) is the probability of microscopic state \(\nu\).

The assumption that a sufficiently long time average equals the ensemble average is the ergodic hypothesis. Systems that satisfy it are called ergodic.

4.7.8.6. Demonstration: Time Average Approaching an Ensemble Average#

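The code below simulates a simple trajectory among discrete states, each with a value \(G_\nu\). Every step is drawn independently from the probabilities \(P_\nu\), so this is an idealized trajectory that samples all states without getting trapped.

As the trajectory samples the states, the running time average approaches the ensemble average.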

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(123)

# A simple set of states with assigned observable values
G_states = np.array([0.0, 1.0, 2.0, 5.0, 8.0])
P_states = np.array([0.10, 0.20, 0.35, 0.25, 0.10])
P_states = P_states / P_states.sum()

n_steps = 5000
visited_states = rng.choice(len(G_states), size=n_steps, p=P_states)
G_traj = G_states[visited_states]

running_average = np.cumsum(G_traj) / np.arange(1, n_steps + 1)
ensemble_average = np.sum(P_states * G_states)

plt.figure(figsize=(7, 4))
plt.plot(running_average, label="Running time average")
plt.axhline(ensemble_average, linestyle="--", label="Ensemble average")
plt.xlabel("Step")
plt.ylabel("Average value of $G$")
plt.title("Time average approaches ensemble average")
plt.legend()
plt.show()

[Figure: running time average of \(G\) versus step, with the ensemble average shown as a dashed horizontal line.]

4.7.8.6.1. Discussion:#

What assumption is being made when we replace a time average by an ensemble average?
What could go wrong in a molecular simulation that does not sample all important regions of state space?
How is this related to rare conformational transitions in protein simulations?

4.7.9. Part 1: The Microcanonical Ensemble#

The microcanonical ensemble describes an isolated system with fixed:

(4.1124)#\[\begin{equation} E, \qquad V, \qquad N \end{equation}\]

The statistical postulate is:

All accessible microscopic states consistent with fixed \(E\), \(V\), and \(N\) are equally likely.

If there are \(\Omega(N,V,E)\) accessible microstates, then

(4.1125)#\[\begin{equation} P_\nu = \frac{1}{\Omega(N,V,E)} \end{equation}\]

for states \(\nu\) in the ensemble, and \(P_\nu=0\) for inaccessible states.
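As a minimal sketch of the postulate, the code below lists the accessible microstates of a small system at fixed energy, assigns each the probability \(1/\Omega\), and computes one ensemble average. The four-unit system and the chosen observable (whether the first unit is excited) are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy system: 4 two-state units with exactly 2 excited (fixed energy).
n_units = 4
n_excited_fixed = 2

accessible = [s for s in product([0, 1], repeat=n_units)
              if sum(s) == n_excited_fixed]

omega_count = len(accessible)   # number of accessible microstates
p_uniform = 1.0 / omega_count   # equal probability for each accessible state

# Ensemble average of the observable "first unit is excited".
avg_first_unit = sum(p_uniform * s[0] for s in accessible)

print(omega_count, p_uniform, avg_first_unit)
```

By symmetry the average equals the excited fraction, 2/4, which is what the uniform weighting gives.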

4.7.9.1. Entropy as Counting#

Chandler defines entropy statistically as

(4.1126)#\[\begin{equation} S = k_B \ln \Omega(N,V,E) \end{equation}\]

This formula explains why entropy is extensive.

If two independent subsystems have numbers of states \(\Omega_A\) and \(\Omega_B\), then the combined system has

(4.1127)#\[\begin{equation} \Omega_{A+B}=\Omega_A\Omega_B \end{equation}\]

Therefore,

(4.1128)#\[\begin{align} S_{A+B} &= k_B \ln(\Omega_A\Omega_B) \\ &= k_B \ln \Omega_A + k_B \ln \Omega_B \\ &= S_A + S_B \end{align}\]

The logarithm turns multiplication of microscopic possibilities into addition of entropy.
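The counting argument can be checked directly by enumeration. In the sketch below, the state labels and the subsystem sizes (3 and 5 states) are hypothetical placeholders; the entropies are expressed in units of \(k_B\).

```python
from itertools import product
from math import log, isclose

# Hypothetical independent subsystems with 3 and 5 accessible states.
states_A = ["a1", "a2", "a3"]
states_B = ["b1", "b2", "b3", "b4", "b5"]

# Every state of A can be paired with every state of B.
states_AB = list(product(states_A, states_B))

omega_A, omega_B, omega_AB = len(states_A), len(states_B), len(states_AB)

# Entropies in units of k_B.
S_A, S_B, S_AB = log(omega_A), log(omega_B), log(omega_AB)

print(omega_AB, omega_A * omega_B)
print(S_AB, S_A + S_B)
```

The number of combined states is the product of the subsystem counts, and the entropies add to within floating-point rounding.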

4.7.9.2. Demonstration: A Two-State Spin Model#

Consider \(N\) independent spins. Each spin can be in one of two states: a ground state with energy \(0\) or an excited state with energy \(\epsilon\). If \(n\) spins are excited, the energy is

(4.1129)#\[\begin{equation} E = n\epsilon \end{equation}\]

The number of microstates with this energy is

(4.1130)#\[\begin{equation} \Omega(n)=\binom{N}{n} \end{equation}\]

The entropy is therefore

(4.1131)#\[\begin{equation} S(n)=k_B\ln\binom{N}{n} \end{equation}\]

from math import comb

N = 60
n_excited = np.arange(N + 1)
Omega = np.array([comb(N, int(n)) for n in n_excited], dtype=float)
S_over_kB = np.log(Omega)
E_over_epsilon = n_excited

plt.figure(figsize=(7, 4))
plt.plot(E_over_epsilon, S_over_kB, marker="o", markersize=3)
plt.xlabel("Energy, $E/\\epsilon$")
plt.ylabel("Entropy, $S/k_B$")
plt.title("Entropy from counting microstates")
plt.show()

[Figure: entropy \(S/k_B\) versus energy \(E/\epsilon\) for the two-state spin model with \(N=60\).]

4.7.9.2.1. Discussion:#

For what value of \(n\) is \(\Omega(n)\) largest?
Why does this macrostate dominate the equilibrium behavior for large \(N\)?
Does the most probable macrostate have the lowest energy?
What does this example teach us about entropy maximization?

4.7.9.3. Temperature from the Slope of Entropy#

Thermodynamics defines temperature through

(4.1132)#\[\begin{equation} \left(\frac{\partial S}{\partial E}\right)_{N,V}=\frac{1}{T} \end{equation}\]

Using \(S=k_B\ln\Omega\), this becomes

(4.1133)#\[\begin{equation} \beta = \frac{1}{k_B T} = \left(\frac{\partial \ln \Omega}{\partial E}\right)_{N,V} \end{equation}\]

Thus temperature is determined by how rapidly the number of accessible microstates grows with energy.
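For the two-state spin model above, this slope can be estimated numerically and compared with a closed form. Applying Stirling's approximation to \(\ln\binom{N}{n}\) gives \(\beta\epsilon \approx \ln[(N-n)/n]\). The sketch below compares that estimate with a central finite difference; the values \(N=1000\) and \(n=200\) and the helper name `ln_omega` are illustrative assumptions.

```python
from math import lgamma, log

def ln_omega(n_spins, n_up):
    """ln of the binomial coefficient C(n_spins, n_up), via log-gamma."""
    return lgamma(n_spins + 1) - lgamma(n_up + 1) - lgamma(n_spins - n_up + 1)

# Hypothetical parameters chosen for illustration.
n_spins = 1000
n_up = 200

# Central finite difference of ln(Omega) with respect to E/epsilon = n.
beta_eps_numeric = (ln_omega(n_spins, n_up + 1)
                    - ln_omega(n_spins, n_up - 1)) / 2.0

# Stirling estimate of the same slope.
beta_eps_stirling = log((n_spins - n_up) / n_up)

print(beta_eps_numeric, beta_eps_stirling)
```

The two values agree closely for large \(N\). The Stirling form also shows that the slope is positive when fewer than half the spins are excited and changes sign at \(n=N/2\).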

4.7.9.4. Important Subtlety: Negative Temperature#

For ordinary macroscopic systems, \(\Omega(N,V,E)\) usually increases with \(E\), so \(T>0\).

But for systems with an upper bound on energy, such as a collection of spins in a magnetic field, \(\Omega\) can eventually decrease as energy increases.

In that case,

(4.1134)#\[\begin{equation} \left(\frac{\partial \ln \Omega}{\partial E}\right)_{N,V}<0 \end{equation}\]

and the formal temperature is negative.

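This does not mean colder than absolute zero. A negative-temperature system is hotter than any positive-temperature system: its entropy increases when it gives up energy, so heat flows from it to any system at positive temperature.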

# Estimate beta from the numerical derivative of S/kB with respect to E/epsilon
# This gives beta * epsilon, since beta = d ln(Omega) / dE.

dS_dE = np.gradient(S_over_kB, E_over_epsilon)

plt.figure(figsize=(7, 4))
plt.plot(E_over_epsilon, dS_dE, marker="o", markersize=3)
plt.axhline(0.0, linestyle="--")
plt.xlabel("Energy, $E/\\epsilon$")
plt.ylabel("$\\beta \\epsilon = \\partial \\ln \\Omega / \\partial (E/\\epsilon)$")
plt.title("Temperature is related to the slope of entropy")
plt.show()

[Figure: \(\beta\epsilon\) versus \(E/\epsilon\) for the two-state spin model, with a dashed horizontal line at zero.]

4.7.10. Part 2: The Canonical Ensemble#

The microcanonical ensemble applies naturally to an isolated system.

Most chemical systems are not isolated. They exchange energy with their surroundings and are more naturally described at fixed:

(4.1135)#\[\begin{equation} T, \qquad V, \qquad N \end{equation}\]

This leads to the canonical ensemble.

The system energy is allowed to fluctuate, but the temperature is fixed by contact with a heat bath.

4.7.10.1. Deriving the Boltzmann Distribution#

Imagine a small system in state \(\nu\) with energy \(E_\nu\), in contact with a large heat bath.

The total energy is fixed:

(4.1136)#\[\begin{equation} E_{\mathrm{tot}} = E_{\mathrm{bath}} + E_\nu \end{equation}\]

If the system has energy \(E_\nu\), the bath has energy

(4.1137)#\[\begin{equation} E_{\mathrm{bath}} = E_{\mathrm{tot}} - E_\nu \end{equation}\]

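Because the system plus bath is isolated, all of its microstates with energy \(E_{\mathrm{tot}}\) are equally likely. The probability of the system being in state \(\nu\) is therefore proportional to the number of bath states compatible with that energy: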

(4.1138)#\[\begin{equation} P_\nu \propto \Omega_{\mathrm{bath}}(E_{\mathrm{tot}}-E_\nu) \end{equation}\]

4.7.10.2. Taylor Expansion of the Bath Entropy#

Write the probability in terms of the logarithm:

(4.1139)#\[\begin{equation} P_\nu \propto \exp\left[\ln \Omega_{\mathrm{bath}}(E_{\mathrm{tot}}-E_\nu)\right] \end{equation}\]

For a very large bath, \(E_\nu\) is small compared to \(E_{\mathrm{tot}}\), so expand to first order in \(E_\nu\) about \(E_{\mathrm{tot}}\):

(4.1140)#\[\begin{equation} \ln \Omega_{\mathrm{bath}}(E_{\mathrm{tot}}-E_\nu) \approx \ln \Omega_{\mathrm{bath}}(E_{\mathrm{tot}}) - E_\nu \left(\frac{\partial \ln \Omega}{\partial E}\right)_{\mathrm{bath}} \end{equation}\]

Using

(4.1141)#\[\begin{equation} \left(\frac{\partial \ln \Omega}{\partial E}\right)_{\mathrm{bath}} =\beta \end{equation}\]

we obtain

(4.1142)#\[\begin{equation} P_\nu \propto e^{-\beta E_\nu} \end{equation}\]

This is the Boltzmann distribution.
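The argument can be tested numerically on a model bath. The sketch below assumes a bath of two-state spins (the same counting as the spin model earlier in this lesson) coupled to a single two-level system with the same gap \(\epsilon\). The bath size, the number of energy quanta, and the helper name `ln_omega` are hypothetical choices.

```python
from math import lgamma, exp

def ln_omega(n_spins, n_up):
    """ln of the number of bath microstates with n_up excited spins."""
    return lgamma(n_spins + 1) - lgamma(n_up + 1) - lgamma(n_spins - n_up + 1)

# Hypothetical bath: 10000 spins sharing 2000 quanta of energy epsilon.
n_bath_spins = 10_000
n_quanta = 2_000

# System in its excited state leaves the bath with one fewer quantum.
# Ratio P_1 / P_0 from counting bath states directly:
ratio_counting = exp(ln_omega(n_bath_spins, n_quanta - 1)
                     - ln_omega(n_bath_spins, n_quanta))

# beta * epsilon of the bath from the slope of ln(Omega) (central difference).
beta_eps = (ln_omega(n_bath_spins, n_quanta + 1)
            - ln_omega(n_bath_spins, n_quanta - 1)) / 2.0

# Boltzmann prediction for the same ratio.
ratio_boltzmann = exp(-beta_eps)

print(ratio_counting, ratio_boltzmann)
```

For a bath this large the two ratios agree to about four decimal places, and the agreement improves as the bath grows.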

4.7.10.3. The Canonical Partition Function#

Normalization requires

(4.1143)#\[\begin{equation} \sum_\nu P_\nu = 1 \end{equation}\]

Therefore,

(4.1144)#\[\begin{equation} P_\nu = \frac{e^{-\beta E_\nu}}{Q} \end{equation}\]

where

(4.1145)#\[\begin{equation} Q(\beta,N,V)=\sum_\nu e^{-\beta E_\nu} \end{equation}\]

is the canonical partition function.

The partition function is the normalization constant, but it is also much more: it contains the thermodynamics of the system.
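A minimal sketch of the normalization step for an arbitrary list of energy levels follows. The function name `boltzmann_probabilities` and the example energies are hypothetical. Subtracting the lowest energy before exponentiating is a common numerical precaution; it rescales \(Q\) by a constant factor but leaves the probabilities unchanged.

```python
import numpy as np

def boltzmann_probabilities(energies, beta):
    """Return canonical probabilities P_nu = exp(-beta * E_nu) / Q."""
    energies = np.asarray(energies, dtype=float)
    # Shift energies so the largest Boltzmann factor is 1 (avoids overflow).
    shifted = energies - energies.min()
    weights = np.exp(-beta * shifted)
    return weights / weights.sum()

# Hypothetical energy levels in units where k_B = 1.
example_energies = np.array([0.0, 1.0, 1.0, 2.5])
example_probs = boltzmann_probabilities(example_energies, beta=1.0)

print(example_probs, example_probs.sum())
```

States with equal energy receive equal probability, and lower-energy states are always more probable at positive temperature.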

4.7.10.4. Ensemble Averages in the Canonical Ensemble#

For any observable \(G\), the canonical ensemble average is

(4.1146)#\[\begin{equation} \langle G \rangle = \sum_\nu P_\nu G_\nu \end{equation}\]

In particular, the average energy is

(4.1147)#\[\begin{align} \langle E \rangle &=\sum_\nu P_\nu E_\nu \\ &=\frac{\sum_\nu E_\nu e^{-\beta E_\nu}}{\sum_\nu e^{-\beta E_\nu}} \end{align}\]

Because \(\partial Q/\partial\beta = -\sum_\nu E_\nu e^{-\beta E_\nu}\), this can be written compactly as

(4.1148)#\[\begin{equation} \langle E \rangle = -\left(\frac{\partial \ln Q}{\partial \beta}\right)_{N,V} \end{equation}\]
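The derivative relation can be checked numerically. The sketch below compares the direct weighted average with a central finite difference of \(\ln Q\); the three energy levels, the value of \(\beta\), and the step size are arbitrary illustrative choices.

```python
import numpy as np

# Hypothetical energy levels in units where k_B = 1.
levels = np.array([0.0, 1.0, 3.0])

def ln_Q(beta):
    """Natural log of the canonical partition function."""
    return np.log(np.sum(np.exp(-beta * levels)))

def mean_energy(beta):
    """Average energy from the Boltzmann-weighted sum."""
    weights = np.exp(-beta * levels)
    return np.sum(levels * weights) / np.sum(weights)

beta0 = 0.7
step = 1e-5

# <E> = -d ln Q / d beta, estimated by a central finite difference.
energy_from_derivative = -(ln_Q(beta0 + step) - ln_Q(beta0 - step)) / (2 * step)
energy_direct = mean_energy(beta0)

print(energy_direct, energy_from_derivative)
```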

4.7.10.5. Demonstration: A Two-Level System in the Canonical Ensemble#

Consider one particle with two possible energies:

(4.1149)#\[\begin{equation} E_0=0, \qquad E_1=\epsilon \end{equation}\]

The partition function is

(4.1150)#\[\begin{equation} Q = 1 + e^{-\beta\epsilon} \end{equation}\]

The probability of the excited state is

(4.1151)#\[\begin{equation} P_1 = \frac{e^{-\beta\epsilon}}{1+e^{-\beta\epsilon}} \end{equation}\]

# Work in units where epsilon = 1 and kB = 1.

T = np.linspace(0.1, 10.0, 300)
beta = 1.0 / T
epsilon = 1.0

Q = 1.0 + np.exp(-beta * epsilon)
P_excited = np.exp(-beta * epsilon) / Q
E_avg = epsilon * P_excited

plt.figure(figsize=(7, 4))
plt.plot(T, P_excited, label="$P_1$")
plt.plot(T, E_avg / epsilon, linestyle="--", label="$\\langle E \\rangle/\\epsilon$")
plt.xlabel("Temperature, $T$ in units of $\\epsilon/k_B$")
plt.ylabel("Probability or reduced energy")
plt.title("Two-level system in the canonical ensemble")
plt.legend()
plt.show()

[Figure: \(P_1\) and \(\langle E\rangle/\epsilon\) versus temperature for the two-level system.]

4.7.10.5.1. Discussion:#

What happens to \(P_1\) as \(T\rightarrow 0\)?
What happens to \(P_1\) as \(T\rightarrow \infty\)?
Why does the average energy approach \(\epsilon/2\) at high temperature?
How would the result change if the excited state were degenerate?

4.7.10.6. Free Energy Preview#

Chandler notes that \(\ln Q\) behaves like a thermodynamic function.

The connection is the Helmholtz free energy:

(4.1152)#\[\begin{equation} A = -k_B T \ln Q \end{equation}\]

This is one of the most important bridges between microscopic statistical mechanics and macroscopic thermodynamics.

Once \(Q\) is known, thermodynamic quantities can be obtained from derivatives of \(A\).
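As a sketch of how this works for the two-level system, the code below computes \(A\) from \(Q\) and then obtains the entropy from the thermodynamic relation \(A = \langle E\rangle - TS\). It uses units where \(\epsilon = 1\) and \(k_B = 1\); the three temperatures are arbitrary sample points.

```python
import numpy as np

# Units where epsilon = 1 and k_B = 1; temperatures are illustrative.
T_samples = np.array([0.1, 1.0, 100.0])
beta_samples = 1.0 / T_samples

Q_two_level = 1.0 + np.exp(-beta_samples)               # partition function
A_two_level = -T_samples * np.log(Q_two_level)          # Helmholtz free energy
E_two_level = np.exp(-beta_samples) / Q_two_level       # average energy
S_two_level = (E_two_level - A_two_level) / T_samples   # from A = <E> - T S

for T, A, S in zip(T_samples, A_two_level, S_two_level):
    print(f"T = {T:6.1f}   A = {A:10.4f}   S/k_B = {S:.4f}")
```

At low temperature the entropy is close to zero because only the ground state is populated. At high temperature it approaches \(\ln 2\), the value for two equally likely states.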

4.7.10.7. Summary#

The first few sections of Chapter 3 introduce the statistical foundation of thermodynamics.

The key ideas are:

A microscopic state contains far more information than a thermodynamic macrostate.
An ensemble is a probability distribution over microstates.
The microcanonical ensemble assigns equal probability to all states with fixed \(E\), \(V\), and \(N\).
Entropy is the logarithm of the number of accessible states:

(4.1153)#\[\begin{equation} S = k_B \ln \Omega \end{equation}\]

The second law becomes a statement about counting: equilibrium corresponds to the macrostate compatible with the largest number of microstates.
A system in contact with a heat bath has probabilities

(4.1154)#\[\begin{equation} P_\nu = \frac{e^{-\beta E_\nu}}{Q} \end{equation}\]

The canonical partition function is

(4.1155)#\[\begin{equation} Q=\sum_\nu e^{-\beta E_\nu} \end{equation}\]

and it encodes the thermodynamics.

4.7.10.8. End-of-Class Check#

Students should be able to answer these questions without notes:

What is the difference between a microstate and a macrostate?
What is an ensemble average?
What is the microcanonical ensemble?
Why does \(S=k_B\ln\Omega\) make entropy extensive?
Why does removing an internal constraint increase entropy?
Why does a system in contact with a heat bath have probabilities proportional to \(e^{-\beta E_\nu}\)?
What is the canonical partition function?
How do you compute \(\langle E\rangle\) from \(Q\)?

4.7.10.9. Instructor Notes#

For a one-hour lecture, the most important derivation to do live is the canonical distribution:

(4.1156)#\[\begin{equation} P_\nu \propto \Omega_{\mathrm{bath}}(E_{\mathrm{tot}}-E_\nu) \rightarrow P_\nu \propto e^{-\beta E_\nu} \end{equation}\]

This derivation helps students see that the Boltzmann factor is not arbitrary. It comes from applying the microcanonical ensemble to a larger isolated system.

If time is short, skip the negative-temperature discussion and use it as a thought question or homework prompt.