A bidimensional example
Let's consider a small dataset built by adding some uniform noise to points lying on a segment, with x bounded between -6 and 6. The original equation is y = x + 2 + n, where n is a noise term.
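For reference, a dataset with these characteristics can be built as follows (a minimal sketch: the number of samples, the noise amplitude, and the seed are assumptions, not values taken from the original text):

```python
import numpy as np

# Hypothetical setup: points on the segment y = x + 2 for x in [-6, 6],
# perturbed with uniform noise (sample count and amplitude chosen arbitrarily)
np.random.seed(1000)

nb_samples = 200
X = np.linspace(-6.0, 6.0, nb_samples)
Y = X + 2.0 + np.random.uniform(-1.5, 1.5, size=nb_samples)
```

The arrays X and Y and the constant nb_samples defined here are used by all the snippets that follow.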
In the following figure, there's a plot with a candidate regression function:
*Figure: the noisy dataset together with a candidate regression line.*
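Such a plot can be reproduced with a few lines of Matplotlib (a sketch under the dataset assumptions above; the candidate line drawn here is an arbitrary choice):

```python
import matplotlib.pyplot as plt

# Scatter the noisy samples and overlay a candidate regression line
plt.scatter(X, Y, s=10, label='Noisy samples')
plt.plot(X, 2.0 + 1.0 * X, color='red', label='Candidate regressor')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
```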
As we're working on a plane, the regressor we're looking for is a function of only two parameters, the intercept α and the slope β:

$$\tilde{y} = \alpha + \beta x$$
In order to fit our model, we must find the best parameters; to do that, we choose an ordinary least squares approach. The loss function to minimize (where n is the number of samples) is:

$$L = \frac{1}{2}\sum_{i=1}^{n}\left(\alpha + \beta x_i - y_i\right)^2$$
With an analytic approach, in order to find the global minimum, we must impose:
$$\frac{\partial L}{\partial \alpha} = 0, \qquad \frac{\partial L}{\partial \beta} = 0$$
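Expanding the two partial derivatives makes the correspondence with the code below explicit:

$$\frac{\partial L}{\partial \alpha} = \sum_{i=1}^{n}\left(\alpha + \beta x_i - y_i\right), \qquad \frac{\partial L}{\partial \beta} = \sum_{i=1}^{n}\left(\alpha + \beta x_i - y_i\right)x_i$$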
So (for simplicity, the loss function accepts a single vector v containing both parameters, with v[0] = α and v[1] = β):
```python
import numpy as np

def loss(v):
    # Half of the sum of squared residuals over the whole dataset
    e = 0.0
    for i in range(nb_samples):
        e += np.square(v[0] + v[1]*X[i] - Y[i])
    return 0.5 * e
```
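As a side note, the same loss can be written in vectorized form, which avoids the explicit Python loop and is usually faster on large datasets (a sketch, equivalent to the function above):

```python
def loss_vectorized(v):
    # Half of the sum of squared residuals, computed with NumPy broadcasting
    return 0.5 * np.sum(np.square(v[0] + v[1] * X - Y))
```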
And the gradient can be defined as:
```python
def gradient(v):
    # Partial derivatives of the loss: g[0] w.r.t. alpha, g[1] w.r.t. beta
    g = np.zeros(shape=2)
    for i in range(nb_samples):
        g[0] += (v[0] + v[1]*X[i] - Y[i])
        g[1] += ((v[0] + v[1]*X[i] - Y[i]) * X[i])
    return g
```
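Before running the optimizer, it's good practice to verify the analytical gradient against a finite-difference approximation; SciPy offers check_grad for this purpose (a sketch, not part of the original text):

```python
from scipy.optimize import check_grad

# Norm of the difference between the analytical gradient and a
# finite-difference approximation; a value close to zero confirms correctness
print(check_grad(loss, gradient, np.array([0.0, 0.0])))
```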
The optimization problem can now be solved using SciPy:
```python
>>> from scipy.optimize import minimize
>>> minimize(fun=loss, x0=[0.0, 0.0], jac=gradient, method='L-BFGS-B')
      fun: 9.7283268345966025
 hess_inv: <2x2 LbfgsInvHessProduct with dtype=float64>
      jac: array([  7.28577538e-06,  -2.35647522e-05])
  message: 'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
     nfev: 8
      nit: 7
   status: 0
  success: True
        x: array([ 2.00497209,  1.00822552])
```
As expected, the regression denoised our dataset, recovering the original relation: with α ≈ 2.005 and β ≈ 1.008, the fitted line is very close to y = x + 2.
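As a quick cross-check (an addition, not part of the original text), the same estimates can be obtained with NumPy's built-in least-squares polynomial fit:

```python
# Degree-1 least-squares fit; np.polyfit returns the slope first
beta, alpha = np.polyfit(X, Y, deg=1)
print(alpha, beta)  # Both should be close to the true values 2 and 1
```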