Linear Regression, Part 5 - Multivariate Solution Implementation in Python

Mon January 3, 2022
machine-learning linear-regression gradient-descent python

In this post we will implement the multivariate linear regression model for 2 features in Python. We have already seen in the last post that, for this case, $a_0$, $a_1$ and $a_2$ can be solved by:

$$a_1 = \frac{ \sum_{i=1}^{n} X_{2i}^2 \sum_{i=1}^{n} X_{1i}y_i - \sum_{i=1}^{n} X_{1i}X_{2i} \sum_{i=1}^{n} X_{2i}y_i } {\sum_{i=1}^{n}X_{1i}^2 \sum_{i=1}^{n}X_{2i}^2 - (\sum_{i=1}^{n} X_{1i}X_{2i})^2}$$

$$a_2 = \frac{ \sum_{i=1}^{n} X_{1i}^2 \sum_{i=1}^{n} X_{2i}y_i - \sum_{i=1}^{n} X_{1i}X_{2i} \sum_{i=1}^{n} X_{1i}y_i } {\sum_{i=1}^{n}X_{1i}^2 \sum_{i=1}^{n}X_{2i}^2 - (\sum_{i=1}^{n} X_{1i}X_{2i})^2}$$

and

$$ a_0 = \bar{\textbf{Y}} - a_1 \bar{\textbf{X}}_1 - a_2 \bar{\textbf{X}}_2$$

where $$ \sum_{i=1}^{n} X_{1i}^2 = \sum_{i=1}^{n} x_{1i}^2 - \frac{\left(\sum_{i=1}^{n} x_{1i}\right)^2}{n}$$

$$ \sum_{i=1}^{n} X_{2i}^2 = \sum_{i=1}^{n} x_{2i}^2 - \frac{\left(\sum_{i=1}^{n} x_{2i}\right)^2}{n}$$

$$ \sum_{i=1}^{n} X_{1i}y_{i} = \sum_{i=1}^{n} x_{1i} y_{i} - \frac{\sum_{i=1}^{n} x_{1i} \sum_{i=1}^{n} y_{i}}{n}$$

$$ \sum_{i=1}^{n} X_{2i}y_{i} = \sum_{i=1}^{n} x_{2i} y_{i} - \frac{\sum_{i=1}^{n} x_{2i} \sum_{i=1}^{n} y_{i}}{n}$$

$$ \sum_{i=1}^{n} X_{1i}X_{2i} = \sum_{i=1}^{n} x_{1i} x_{2i} - \frac{\sum_{i=1}^{n} x_{1i} \sum_{i=1}^{n} x_{2i}}{n}$$
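These centered sums can be checked numerically: each one equals the corresponding sum of products of deviations from the means. A minimal sketch is shown below; the array values are arbitrary and only illustrate the identity, they are not part of the dataset used later in this post.

import numpy as np

# arbitrary example values, chosen only to illustrate the identity
x1 = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.0, 4.0, 3.0])
n = len(x1)

# centered form: sum of squared deviations from the mean
centered = np.sum((x1 - x1.mean()) ** 2)

# computational form used in the equations above
computational = np.sum(x1 ** 2) - (np.sum(x1) ** 2) / n
print(centered, computational)   # both print 5.0

# the same identity holds for the cross terms, e.g. sum(X1i * yi)
print(np.sum((x1 - x1.mean()) * (y - y.mean())),
      np.sum(x1 * y) - np.sum(x1) * np.sum(y) / n)   # both print 3.0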

We will start by generating a dataset with 2 features and a linear relationship between the features and the target. The equation used in this case is:

$$y = 12 + 5x_1 -3x_2$$

The dataset is generated by evaluating $y$ over a grid of $(x_1, x_2)$ points, adding Gaussian noise, and keeping a small random sample of the resulting rows. This is accomplished by the following code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(projection='3d')

x1_lower = -10
x1_higher = 10
x1_step = (x1_higher - x1_lower) / 100
x1 = np.arange(x1_lower, x1_higher, x1_step)

x2_lower = 0
x2_higher = 50
x2_step = (x2_higher - x2_lower) / 100
x2 = np.arange(x2_lower, x2_higher, x2_step)

# generate the plane
xx1, xx2 = np.meshgrid(x1, x2)
y = 12 + (5 * xx1) + (-3 * xx2)

# add Gaussian noise scaled by random_multiplier
random_multiplier = 5
e = np.random.randn(*xx1.shape) * random_multiplier
yy = y + e


df = pd.DataFrame(data=[xx1.ravel(), xx2.ravel(), yy.ravel()]).T
df = df.sample(frac=0.01)
df.columns = ['x1', 'x2', 'y']
df.to_csv("data.csv", header=True, index=False)

# plot the sampled data
ax.scatter(df['x1'], df['x2'], df['y'])
plt.show()
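Note that the noise comes from np.random.randn and a random 1% of the rows is then kept, so every run produces a slightly different data.csv. If a reproducible dataset is wanted, the random state can be fixed before the noise is generated; a minimal sketch (the seed value 42 is an arbitrary choice, not one used for the results below):

np.random.seed(42)                            # fix NumPy's global random state before generating the noise
df = df.sample(frac=0.01, random_state=42)    # pandas sampling can also be given its own seed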

A visualization of the data is shown in the figure below.

Dataset used in this example

The implementation of the regression algorithm for this case is as follows:

import pandas as pd

# import data from csv
data = pd.read_csv("data.csv")

data['x1_sq'] = data['x1']**2
data['x2_sq'] = data['x2']**2
data['x1y'] = data['x1']*data['y']
data['x2y'] = data['x2']*data['y']
data['x1x2'] = data['x1']*data['x2']

n = len(data)

sum_X1_sq = data['x1_sq'].sum() - (data['x1'].sum()**2)/n
print(f'sum_X1_sq: {sum_X1_sq}')

sum_X2_sq = data['x2_sq'].sum() - (data['x2'].sum()**2)/n
print(f'sum_X2_sq: {sum_X2_sq}')

sum_X1y = data['x1y'].sum() - (data['x1'].sum()*data['y'].sum())/n
print(f'sum_X1y: {sum_X1y}')

sum_X2y = data['x2y'].sum() - (data['x2'].sum()*data['y'].sum())/n
print(f'sum_X2y: {sum_X2y}')

sum_X1X2 = data['x1x2'].sum() - (data['x1'].sum()*data['x2'].sum())/n
print(f'sum_X1X2: {sum_X1X2}')

mean_y = data['y'].mean()
mean_x1 = data['x1'].mean()
mean_x2 = data['x2'].mean()

a1 = (sum_X2_sq*sum_X1y - sum_X1X2*sum_X2y)/(sum_X1_sq*sum_X2_sq - sum_X1X2**2)

a2 = (sum_X1_sq*sum_X2y - sum_X1X2*sum_X1y)/(sum_X1_sq*sum_X2_sq - sum_X1X2**2)

a0 = mean_y - a1*mean_x1 - a2*mean_x2

print(f'a0: {a0}, a1: {a1}, a2: {a2}')

The results obtained are $a_0 = 11.511$, $a_1 = 5.029$, $a_2 = -2.992$. These are close to the true coefficients ($a_0 = 12$, $a_1 = 5$, $a_2 = -3$), which is reasonable given the noise added to the data and the fact that only a very small sample of the points was used.
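As an independent cross-check (not part of the original workflow), the same coefficients can be recovered with NumPy's least-squares solver applied to the generated data.csv. A minimal sketch, assuming the file produced by the generation code above:

import numpy as np
import pandas as pd

data = pd.read_csv("data.csv")

# design matrix with an intercept column of ones: y = a0 + a1*x1 + a2*x2
A = np.column_stack([np.ones(len(data)), data['x1'], data['x2']])

# solve the least-squares problem A @ [a0, a1, a2] = y
coeffs, *_ = np.linalg.lstsq(A, data['y'], rcond=None)
print(coeffs)   # should be close to the a0, a1 and a2 computed above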

Conclusion

It is important to note how much the amount of calculation grew compared with the univariate case. Extending this elimination approach to more features quickly becomes prohibitively expensive, both to derive and to compute. An approximation method that gives reasonably close results will be discussed in a future post.



