Bayesian Statistics Lab#

In this hands-on lab, we continue our demonstration of “estimating the mass of a new fundamental particle”. We will simulate multiple experiments, each giving a noisy measurement of the particle’s mass, and sequentially update our posterior distribution after each experiment. We will then discuss what we should do when a new theoretical prior appears.

Physical Setup (Brief Recap)#

Let’s recap the notation from the notes.

We have a particle of true mass \(m_\text{true}\), measured in TeV. Each experiment yields an observed mass \(m_\text{obs}\) with Gaussian noise:

(289)#\[\begin{align} m_\text{obs} \;\sim\; \mathcal{N}(m_\text{true},\sigma_\text{expr}^2). \end{align}\]

Here, \(\sigma_\text{expr}\) is the detector resolution or statistical uncertainty.

We know that \(m_\text{true}\) lies in some range \([2,8]\) TeV—our initial theory suggests it cannot be outside this window. Hence, our initial prior is:

(290)#\[\begin{align} p(m) = \begin{cases} 1/(8 - 2), & 2 \le m \le 8, \\ 0, & \text{otherwise}. \end{cases} \end{align}\]

Each measurement modifies our belief (the prior) into a posterior via Bayes’ Theorem:

(291)#\[\begin{align} p(m \mid m_{\text{obs}}) \propto p(m_{\text{obs}} \mid m) \, p(m). \end{align}\]

Here, the likelihood \(p(m_\text{obs} \mid m)\) is given by the Gaussian formula:

(292)#\[\begin{align} p(m_{\text{obs}} \mid m) = \frac{1}{\sqrt{2\pi}\,\sigma_\text{expr}} \exp\left[-\frac{(m_{\text{obs}}-m)^2}{2\sigma_\text{expr}^2}\right]. \end{align}\]

Single Experiment Code (Grid Approximation)#

Please rewrite the code from the class notes into multiple functions so that we can perform numerical experiments.

import numpy as np
import matplotlib.pyplot as plt
# For reproducibility
np.random.seed(42)
# HANDSON: Define range for m (0 to 10 TeV)
ms = np.linspace(0, 10, 1001)  # uniform grid of candidate masses (the grid resolution is a free choice)
# HANDSON: Gaussian likelihood
def likelihood(x, x0, sigma):
    # Gaussian probability density of x centered at x0 with width sigma
    return np.exp(-0.5 * ((x - x0) / sigma)**2) / (np.sqrt(2 * np.pi) * sigma)
def run_expr(
    prior,        # Prior before being updated by the experiment
    m_true,       # Suppose this is the true mass
    sigma_expr,   # Detector resolution
    seed = None,  # For reproducibility
):
    if seed is not None:
        np.random.seed(seed)

    # HANDSON: By running an experiment, we obtain a single measurement
    # You may use np.random.normal().
    m_obs = np.random.normal(m_true, sigma_expr)

    # HANDSON: Posterior ~ prior * likelihood (evaluated on the global grid `ms`)
    post = prior * likelihood(m_obs, ms, sigma_expr)

    # HANDSON: Normalize the posterior so it integrates to one on the grid
    post /= np.sum(post) * (ms[1] - ms[0])

    return m_obs, post
# Uniform prior in [2, 8]
def prior0(ms, m_min=2, m_max=8):
    # HANDSON: Uniform in [2,8], zero outside
    return np.where((ms >= m_min) & (ms <= m_max), 1.0 / (m_max - m_min), 0.0)

# Compute prior
prior = prior0(ms)
# Run a numerical experiment
m_true     = 3.6  # Set the groundtruth
sigma_expr = 1.0  # Detector resolution

m_obs, post = run_expr(prior, m_true, sigma_expr)
plt.axvline(m_true, ls=':',  color='k',  label=r"$m_\text{true}="+f"{m_true:.3f}$TeV")
plt.axvline(m_obs,  ls='--', color='C1', label=r"$m_\text{obs}= "+f"{m_obs:.3f}$TeV")
plt.plot(ms, post, 'k', label="Posterior (1 experiment)")
plt.title("Posterior after a single measurement")
plt.xlabel("Mass (TeV)")
plt.ylabel("Probability Density")
plt.legend()

Multiple Experiments#

Now we simulate \(N\) experiments. Each experiment \(i\) provides a measurement \(m_{\text{obs},i}\) with the same uncertainty \(\sigma_i = \sigma_\text{expr}\). We update our posterior step by step:

  1. Start with the prior (initially uniform in \([2,8]\)).

  2. For each experiment \(i\), multiply the current posterior by the new likelihood.

  3. Normalize to get the updated posterior, which becomes the prior for the next experiment.

# For reproducibility
np.random.seed(42)
# HANDSON: Let's simulate multiple experiments
m_true     = 3.6  # Set the groundtruth
sigma_expr = 1.0  # Detector resolution
N          = 10   # Number of experiments

ms_obs = np.random.normal(m_true, sigma_expr, size=N)  # Draw N samples from a normal distribution.

print("Simulated experiment results:")
for i, m_obs in enumerate(ms_obs):
    print(f"\tExperiment {i+1}: observed mass = {m_obs:.3f} ± {sigma_expr:.3f}")
# HANDSON: compute the prior from prior0()
prior = prior0(ms)
# Perform sequential Bayesian updates using the same grid approach

plt.figure(figsize=(8,5))
plt.axvline(m_true, color='k', ls=':', label=r'$m_\text{true}$')
plt.plot(ms, prior, label="Initial Prior", lw=2)

for i, m_obs in enumerate(ms_obs):
    # HANDSON: compute the posterior from the simulated measurement m_obs
    post  = prior * likelihood(m_obs, ms, sigma_expr)   # prior * likelihood
    post /= np.sum(post) * (ms[1] - ms[0])              # normalize on the grid

    # Plot the posterior
    plt.plot(ms, post, label=f"Posterior after Exp {i+1}")

    # HANDSON: posterior becomes prior for next iteration
    prior = post

plt.title("Sequential Bayesian Updates of Particle Mass")
plt.xlabel("Mass (TeV)")
plt.ylabel("Probability Density")
plt.legend()

You will see each new experiment narrowing or shifting the distribution.
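This narrowing can be made quantitative. Multiplying \(N\) Gaussian likelihoods with a common resolution gives a combined likelihood that is itself Gaussian in \(m\), centered on the sample mean of the observations with width \(\sigma_\text{expr}/\sqrt{N}\); within the prior support, the posterior therefore tightens roughly as \(1/\sqrt{N}\):

\[\begin{align} \prod_{i=1}^{N} p(m_{\text{obs},i} \mid m) \;\propto\; \exp\left[-\frac{\left(m-\bar{m}_\text{obs}\right)^2}{2\,\sigma_\text{expr}^2/N}\right], \qquad \bar{m}_\text{obs} = \frac{1}{N}\sum_{i=1}^{N} m_{\text{obs},i}. \end{align}\]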

# HANDSON:
# Try to increase `N` to, e.g., 100 and plot the posterior every 10 experiments.
# What do you see?
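One possible way to set this up is to wrap the sequential update in a small helper and plot only every tenth posterior. This is a minimal sketch reusing `likelihood()` and `prior0()` from above; the helper name `sequential_update` and its defaults are our own choices, not from the class notes.

# Sketch (assumed helper, not from the class notes): run N sequential updates
# on the grid and plot the posterior every `every` experiments.
def sequential_update(ms, prior, m_true, sigma_expr, N, every=10, seed=42):
    np.random.seed(seed)                                   # reproducibility
    ms_obs = np.random.normal(m_true, sigma_expr, size=N)  # simulate N measurements
    post   = prior
    for i, m_obs in enumerate(ms_obs):
        post  = post * likelihood(m_obs, ms, sigma_expr)   # Bayes update
        post /= np.sum(post) * (ms[1] - ms[0])             # normalize on the grid
        if (i + 1) % every == 0:
            plt.plot(ms, post, label=f"Posterior after Exp {i+1}")
    plt.axvline(m_true, color='k', ls=':', label=r'$m_\text{true}$')
    plt.xlabel("Mass (TeV)")
    plt.ylabel("Probability Density")
    plt.legend()
    return post

plt.figure(figsize=(8,5))
post100 = sequential_update(ms, prior0(ms), m_true=3.6, sigma_expr=1.0, N=100)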
# HANDSON:
# Try to change `m_true` to a value outside the theory, e.g., 1.5, and plot the posterior.
# What do you see?
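Reusing the `sequential_update` sketch above, one way to try this case is shown below. Because the prior is exactly zero below 2 TeV, the posterior can never place probability there; interpreting what happens at the edge of the prior window is part of the exercise.

# Sketch: ground truth below the prior window [2, 8] TeV.
plt.figure(figsize=(8,5))
post_out = sequential_update(ms, prior0(ms), m_true=1.5, sigma_expr=1.0, N=10, every=1)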
# HANDSON:
# Suppose now there are two theories, one suggests the mass range [2,8] TeV, one suggests the mass range [1,5] TeV.
# How would you take this new theory into account?
# Try implementing your idea(s) and interpret the results.
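One possible approach (a sketch, not the only valid answer): if we are unsure which theory is correct, we can encode that uncertainty as a weighted mixture of the two uniform priors and rerun the sequential updates. The equal weights below are an assumption for illustration, as is the helper name `prior_mixture`.

# Sketch: mixture-of-theories prior.  Theory A allows [2, 8] TeV, theory B
# allows [1, 5] TeV; with no reason to favor either, weight them equally.
def prior_mixture(ms, wA=0.5, wB=0.5):
    pA = prior0(ms, m_min=2, m_max=8)   # theory A: uniform on [2, 8]
    pB = prior0(ms, m_min=1, m_max=5)   # theory B: uniform on [1, 5]
    return wA * pA + wB * pB            # normalized as long as wA + wB = 1

plt.figure(figsize=(8,5))
plt.plot(ms, prior_mixture(ms), lw=2, label="Mixture prior")
post_mix = sequential_update(ms, prior_mixture(ms), m_true=3.6, sigma_expr=1.0, N=10, every=2)

With a ground truth well inside the overlap region \([2,5]\) TeV, you should find that the likelihood quickly dominates and the step in the mixture prior at 5 TeV has little lasting effect on the posterior.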