Simple nonlinear least squares curve fitting in Python

The problem

Today we are going to test a very simple example of nonlinear least squares curve fitting using the scipy.optimize module.

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

Create data

Let’s assume we have the following points [xdata, ydata] and that we want to fit these data with the following model function using nonlinear least squares:

$F(p_1,p_2,x) = p_1\cos(p_2x) + p_2\sin(p_1x)$

For now, we are primarily interested in the following results:

• The fit parameters
• Sum of squared residuals
xdata = np.array([-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9])
ydata = np.array([0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001])

# Show data points
plt.plot(xdata,ydata,'*')
plt.xlabel('xdata')
plt.ylabel('ydata'); Define fit function

def func(x, p1,p2):
return p1*np.cos(p2*x) + p2*np.sin(p1*x)

Calculate and show fit parameters.

Use a starting guess of $p_1=1$ and $p_2=0.2$

The outputs of the curve_fit function are the following:

• popt : array of optimal values for the parameters so that the sum of the squared error of $f(xdata, *popt) - ydata$ is minimized

• pcov : 2d array of the estimated covariance of popt. The diagonals provide the variance of the parameter estimate. To compute one standard deviation errors on the parameters use $perr = np.sqrt(np.diag(pcov))$. If the Jacobian matrix at the solution doesn’t have a full rank, then ‘lm’ method returns a matrix filled with np.inf, on the other hand ‘trf’ and ‘dogbox’ methods use Moore-Penrose pseudoinverse to compute the covariance matrix.

popt, pcov = curve_fit(func, xdata, ydata,p0=(1.0,0.2))

print("Parameter estimation results:")
print("p1 = ",popt," | p2 = ",popt)
print("--------------------------")
print("Covariance matrix of the estimate:")
print(pcov)
Parameter estimation results:
p1 =  1.881850994  | p2 =  0.700229857403
--------------------------
Covariance matrix of the estimate:
[[  7.52408290e-04   1.00812823e-04]
[  1.00812823e-04   8.37695698e-05]]

Sum of squares of residuals

Since it’s not given by the curve_fit function, we have to compute it by hand

p1 = popt
p2 = popt
residuals = ydata - func(xdata,p1,p2)
fres = sum(residuals**2)

print("Residuals sum of squares:")
print(fres)
Residuals sum of squared:
0.0538126964188

Plot fitted curve along with data

curvex=np.linspace(-2,3,100)
curvey=func(curvex,p1,p2)
plt.plot(xdata,ydata,'*')
plt.plot(curvex,curvey,'r')
plt.xlabel('xdata')
plt.ylabel('ydata');  