# Linear Regression Program in Python from Scratch

Linear regression is a machine learning algorithm that models a scalar response as a linear function of one or more input variables. The fitted line minimizes the prediction error and can then be used for prediction tasks.

Linear regression works by finding the best values for the coefficients in the equation y = w0 + w1*x, where w0 is the intercept and w1 is the slope. This extends to any number of features, for example: y = w0 + w1*x1 + w2*x2 + … + wn*xn. The best values are often found using an algorithm known as gradient descent, but in this post we will use the covariance, variance, and mean to compute them directly.
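As a quick illustration of these closed-form formulas (using a made-up four-point dataset, not the one below): the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the mean point.

```python
# Illustrative sketch on a perfect line y = 2x + 1 (made-up data,
# not the post's dataset), fitting y = w0 + w1*x in closed form.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)

# Slope: covariance of x and y divided by variance of x.
w1 = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
# Intercept: make the line pass through the mean point (mx, my).
w0 = my - w1 * mx

print(w0, w1)  # recovers the true line: 1.0 2.0
```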

We will write the linear regression implementation in Python, using no machine learning libraries; Matplotlib and NumPy are used only for plotting.

### Input

We have the dataset given below. It consists of 2D points.

```
data = [
    [0, 1],
    [1, 2],
    [2, 4],
    [3, 6],
    [4, 7],
    [5, 8],
    [6, 9],
    [7, 9],
    [8, 10],
    [9, 12]
]

# Split each [x, y] point into separate coordinate lists.
x = [point[0] for point in data]
y = [point[1] for point in data]

import matplotlib.pyplot as plt
plt.scatter(x, y)
plt.show()
```

### Mean, covariance, and variance function

Now, we will define functions to calculate mean, covariance and variance.

```
def mean(values):
    return sum(values) / float(len(values))

def covariance(x, mx, y, my):
    # Sum of products of deviations from the means.
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mx) * (y[i] - my)
    return covar

def variance(values, mean):
    # Sum of squared deviations from the mean. The 1/n factor is
    # omitted because it cancels in the ratio covariance / variance.
    return sum([(v - mean) ** 2 for v in values])
```

### Algorithm

Now, we will find the coefficients using the following code:

```
x_mean, y_mean = mean(x), mean(y)
# Slope: covariance / variance; intercept: line through the mean point.
w1 = covariance(x, x_mean, y, y_mean) / variance(x, x_mean)
w0 = y_mean - w1 * x_mean
print(w0)
print(w1)
```

```
1.6181818181818173
1.1515151515151516
```
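As an optional sanity check (not part of the original walkthrough), NumPy's `polyfit` with degree 1 performs the same least-squares line fit, so it should reproduce these coefficients:

```python
import numpy as np

# Cross-check of the fitted line using np.polyfit (degree 1 = straight line).
# Same dataset as above; polyfit returns coefficients highest power first.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 2, 4, 6, 7, 8, 9, 9, 10, 12]

w1_np, w0_np = np.polyfit(x, y, 1)  # [slope, intercept]
print(w0_np, w1_np)
```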

### Plotting the line

After applying the algorithm, we will plot the predicted line.

```
import numpy as np
x = np.array(x)
plt.scatter(x, y)
plt.plot(x, w0 + w1*x, linestyle='solid')
plt.show()
```

### Calculating error

We will now compute the root mean squared error (RMSE) of the predicted line.

```
def rmse(actual, predicted):
    sum_error = 0.0
    for i in range(len(actual)):
        prediction_error = predicted[i] - actual[i]
        sum_error += prediction_error ** 2
    mean_error = sum_error / float(len(actual))
    return mean_error ** 0.5

yPred = [w0 + w1 * q for q in x]
print(rmse(y, yPred))
```
`0.6485414871895709`
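For comparison, the same RMSE can be computed in a vectorized way with NumPy (the fitted coefficients are hard-coded here so the snippet stands alone):

```python
import numpy as np

# Vectorized RMSE over the post's dataset, using the coefficients
# printed by the fitting step above (hard-coded for self-containment).
x = np.arange(10)
y = np.array([1, 2, 4, 6, 7, 8, 9, 9, 10, 12])
w0, w1 = 1.6181818181818173, 1.1515151515151516

y_pred = w0 + w1 * x
rmse = np.sqrt(np.mean((y - y_pred) ** 2))
print(rmse)
```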

### Predicting the input

Now, we can predict the y value for any input x.

```
print("Enter the x to classify")
test = [float(i) for i in input().split()]
# Apply the fitted line to each entered value.
for t in test:
    print(w0 + w1 * t)
```

```
Enter the x to classify
5
7.375757575757575
```
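If you prefer a non-interactive version, NumPy broadcasting lets you predict several x values at once (coefficients hard-coded from the fit above):

```python
import numpy as np

# Batch prediction via broadcasting: w0 + w1 * x applied elementwise.
# Coefficients are the ones fitted earlier in the post.
w0, w1 = 1.6181818181818173, 1.1515151515151516
x_new = np.array([5, 10, 20])
y_new = w0 + w1 * x_new
print(y_new)
```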

That’s it. This is the complete implementation of linear regression.

### Complete code

```
data = [
    [0, 1],
    [1, 2],
    [2, 4],
    [3, 6],
    [4, 7],
    [5, 8],
    [6, 9],
    [7, 9],
    [8, 10],
    [9, 12]
]

# Split each [x, y] point into separate coordinate lists.
x = [point[0] for point in data]
y = [point[1] for point in data]

import matplotlib.pyplot as plt
plt.scatter(x, y)
plt.show()

def mean(values):
    return sum(values) / float(len(values))

def covariance(x, mx, y, my):
    # Sum of products of deviations from the means.
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mx) * (y[i] - my)
    return covar

def variance(values, mean):
    # Sum of squared deviations; the 1/n factor cancels in the ratio below.
    return sum([(v - mean) ** 2 for v in values])

x_mean, y_mean = mean(x), mean(y)
w1 = covariance(x, x_mean, y, y_mean) / variance(x, x_mean)
w0 = y_mean - w1 * x_mean
print(w0)
print(w1)

import numpy as np
x = np.array(x)
plt.scatter(x, y)
plt.plot(x, w0 + w1*x, linestyle='solid')
plt.show()

def rmse(actual, predicted):
    sum_error = 0.0
    for i in range(len(actual)):
        prediction_error = predicted[i] - actual[i]
        sum_error += prediction_error ** 2
    mean_error = sum_error / float(len(actual))
    return mean_error ** 0.5

yPred = [w0 + w1 * q for q in x]
print(rmse(y, yPred))

print("Enter the x to classify")
test = [float(i) for i in input().split()]
for t in test:
    print(w0 + w1 * t)
```


Let us know in the comments if you have any questions regarding this machine learning algorithm.