Weighted and non-weighted least-squares fitting

Learning Scientific Programming with Python (2nd edition)

E8.25: Weighted and non-weighted least-squares fitting

To illustrate the use of curve_fit in weighted and unweighted least squares fitting, the following program fits the Lorentzian line shape function centered at $x_0$ with halfwidth at half-maximum (HWHM), $\gamma$, amplitude, $A$: $$ f(x) = \frac{A \gamma^2}{\gamma^2 + (x-x_0)^2}, $$ to some artificial noisy data. The fit parameters are $A$, $\gamma$ and $x_0$. The noise is such that a region of the data close to the line centre is much noisier than the rest.

import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

x0, A, gamma = 12, 3, 5

n = 200
x = np.linspace(1, 20, n)
yexact = A * gamma**2 / (gamma**2 + (x - x0) ** 2)

# Add some noise with a sigma of 0.5 apart from a particularly noisy region
# near x0 where sigma is 3
sigma = np.ones(n) * 0.5
sigma[np.abs(x - x0 + 1) < 1] = 3
noise = np.random.randn(n) * sigma
y = yexact + noise


def f(x, x0, A, gamma):
    """The Lorentzian entered at x0 with amplitude A and HWHM gamma."""
    return A * gamma**2 / (gamma**2 + (x - x0) ** 2)


def rms(y, yfit):
    return np.sqrt(np.sum((y - yfit) ** 2))


# Unweighted fit
p0 = 10, 4, 2
popt, pcov = curve_fit(f, x, y, p0)
yfit = f(x, *popt)
print("Unweighted fit parameters:", popt)
print("Covariance matrix:")
print(pcov)
print("rms error in fit:", rms(yexact, yfit))
print()

# Weighted fit
popt2, pcov2 = curve_fit(f, x, y, p0, sigma=sigma, absolute_sigma=True)
yfit2 = f(x, *popt2)
print("Weighted fit parameters:", popt2)
print("Covariance matrix:")
print(pcov2)
print("rms error in fit:", rms(yexact, yfit2))

plt.plot(x, yexact, label="Exact")
plt.errorbar(
    x,
    y,
    yerr=np.abs(noise),
    elinewidth=0.5,
    c="gray",
    marker="+",
    lw=0,
    label="Noisy data",
)
plt.plot(x, yfit, label="Unweighted fit")
plt.plot(x, yfit2, label="Weighted fit")
plt.ylim(-1, 4)
plt.legend(loc="lower center")
plt.show()

Weighted and unweighted least-squares fitting to a Lorentzian function.

As the figure above shows, the unweighted fit is seen to be thrown off by the noisy region. Data in this region are given a lower weight in the weighted fit and so the parameters are closer to their true values and the fit better. The output is:

Unweighted fit parameters: [12.61968131  2.50893152  5.95025464]
Covariance matrix:
[[ 0.17447292 -0.0042928   0.04156738]
 [-0.0042928   0.03051659 -0.09208499]
 [ 0.04156738 -0.09208499  0.54308245]]
rms error in fit: 3.737100164521528

Weighted fit parameters: [12.12134294  2.96625048  5.05555878]
Covariance matrix:
[[ 0.02135293 -0.00360315  0.0078179 ]
 [-0.00360315  0.01100727 -0.02128371]
 [ 0.0078179  -0.02128371  0.06642836]]
rms error in fit: 0.49516367688465607