Polynomial Linear Regression

10/20/2018

Use this model when you have to deal with variables that are non-linearly related to the target. The degree of the polynomial has to be provided when you create the model object, so if it is not known from the start, finding it is a matter of trial and error.

As you can see in the figure, the predictions get better as you increase the degree, up to a certain point. So it is nothing but Multiple Linear Regression with polynomial terms (powers of the input) added as features.
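
To make the comparison concrete, here are the two models written side by side. The coefficient names b_0 ... b_n are notation picked for this sketch, not something from the original figure. The first line is Multiple Linear Regression on n separate features, the second is Polynomial Regression of degree n on a single feature x:

```latex
y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n

y = b_0 + b_1 x + b_2 x^2 + \dots + b_n x^n
```

The only change is that the separate features x_1 ... x_n are replaced by powers of the same x.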

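Let's get down to implementing it. The data we are going to use is in Position_Salaries.csv; the code below takes the position level as the feature and the salary as the target.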

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values

# Fitting Linear Regression to the dataset
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X, y)

# Fitting Polynomial Regression to the dataset
from sklearn.preprocessing import PolynomialFeatures
poly_reg = PolynomialFeatures(degree = 4)
# Expand X into the columns [1, x, x^2, x^3, x^4]
X_poly = poly_reg.fit_transform(X)
# The regressor is still a plain LinearRegression, trained on the expanded columns
lin_reg_2 = LinearRegression()
lin_reg_2.fit(X_poly, y)

# Visualising the Linear Regression results
plt.scatter(X, y, color = 'red')
plt.plot(X, lin_reg.predict(X), color = 'blue')
plt.title('Truth or Bluff (Linear Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

# Visualising the Polynomial Regression results
plt.scatter(X, y, color = 'red')
# poly_reg is already fitted, so transform (not fit_transform) is enough here
plt.plot(X, lin_reg_2.predict(poly_reg.transform(X)), color = 'blue')
plt.title('Truth or Bluff (Polynomial Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

# Visualising the Polynomial Regression results (for higher resolution and smoother curve)
# X is a 2D array, so use its own min/max methods to get scalars for arange
X_grid = np.arange(X.min(), X.max(), 0.1)
X_grid = X_grid.reshape((len(X_grid), 1))
plt.scatter(X, y, color = 'red')
plt.plot(X_grid, lin_reg_2.predict(poly_reg.transform(X_grid)), color = 'blue')
plt.title('Truth or Bluff (Polynomial Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

# Predicting a new result with Linear Regression
# predict expects a 2D array of shape (n_samples, n_features), hence [[6.5]]
lin_reg.predict([[6.5]])

# Predicting a new result with Polynomial Regression
lin_reg_2.predict(poly_reg.transform([[6.5]]))

As you can see, Polynomial Regression uses Linear Regression to make its predictions; the only difference is that the data given to the model is transformed first. The change in the data after transforming X with degree 2 is shown below.

You will notice that there is a constant term, the original term and the squared term. So the model is the same; only the data is transformed. You can also see the difference in the graphs for different degrees of the polynomial.
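
If you want to see the three columns for yourself, here is a minimal sketch. The toy input X_demo is made up for illustration and is not taken from Position_Salaries.csv.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical toy input: three samples with one feature each
X_demo = np.array([[1.0], [2.0], [3.0]])

poly_demo = PolynomialFeatures(degree=2)
X_demo_poly = poly_demo.fit_transform(X_demo)

# Each row becomes [1, x, x^2]:
# [[1. 1. 1.]
#  [1. 2. 4.]
#  [1. 3. 9.]]
print(X_demo_poly)
```

The first column is the constant term, the second is the original value and the third is its square.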

Degree 1:

Degree 2:

Degree 3:

Degree 4:

Note: With increasing degree, the graph fits the points better and better, but the goal is to find out when to stop. With a higher degree than required, the curve can overfit the points, making the predictions on unseen data unreliable.
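
One common way to decide when to stop is to compare the error on held-out data for each degree instead of judging by eye. The sketch below is only an illustration under stated assumptions: the data is synthetic (a cubic trend plus noise, not the salary data), and 5-fold cross-validation with mean squared error is one reasonable choice among several.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration): cubic trend plus noise
rng = np.random.default_rng(0)
X_syn = np.linspace(-3, 3, 60).reshape(-1, 1)
y_syn = X_syn.ravel() ** 3 - 2 * X_syn.ravel() + rng.normal(scale=1.0, size=60)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_mse = {}
for degree in range(1, 9):
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    # scikit-learn reports negated MSE (higher is better), so flip the sign back
    scores = cross_val_score(model, X_syn, y_syn, cv=cv,
                             scoring="neg_mean_squared_error")
    cv_mse[degree] = -scores.mean()

best_degree = min(cv_mse, key=cv_mse.get)
for degree, mse in cv_mse.items():
    print(f"degree {degree}: CV MSE = {mse:.3f}")
print("lowest held-out error at degree", best_degree)
```

On data like this, the held-out error should drop sharply once the degree reaches the true shape of the trend and stop improving after that, which is the signal to stop adding degrees.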

Now that you have seen the graphs, you might be wondering why it is called Polynomial Linear Regression when the relationship to the feature is not linear. The reason is that the unknowns being solved for are the coefficients, not the features, and the model is linear in those coefficients: the prediction is just a weighted sum of the transformed features.
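
Here is a small sketch of what "linear in the coefficients" means in practice. Once the powers of x are computed, they are just fixed columns of numbers, so the coefficients can be found with ordinary least squares. The data is hypothetical and noise-free, generated from y = 2 + 3x + 0.5x^2, so both approaches should recover values close to 2, 3 and 0.5.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical noise-free data generated from y = 2 + 3x + 0.5x^2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_quad = 2 + 3 * x + 0.5 * x ** 2

# Columns [1, x, x^2]: nonlinear in x, but fixed numbers once x is known
A = PolynomialFeatures(degree=2).fit_transform(x.reshape(-1, 1))

# Ordinary least squares solves for the coefficients directly
coef_lstsq, *_ = np.linalg.lstsq(A, y_quad, rcond=None)

# LinearRegression on the same columns; fit_intercept=False because
# A already contains the constant column
coef_sklearn = LinearRegression(fit_intercept=False).fit(A, y_quad).coef_

print(coef_lstsq)
print(coef_sklearn)
```

Both lines solve the same linear problem; the polynomial part only changes which columns go into it.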

Once the graphs are ready, you can predict the results for unseen data either by reading them off the graph or, more precisely, by using the predict method.
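
If you would rather not transform the input by hand every time you call predict, one option is to bundle the two steps with scikit-learn's make_pipeline. This is a sketch of that alternative, not the approach used in the code above, and the data is a made-up stand-in (y = x^2 for levels 1 to 10), not the values from Position_Salaries.csv.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical stand-in data: y = x^2 for levels 1..10
X_toy = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_toy = X_toy.ravel() ** 2

# The pipeline applies the polynomial expansion, then the linear fit
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(X_toy, y_toy)

# predict takes raw inputs of shape (n_samples, n_features);
# the polynomial transform is applied inside the pipeline
prediction = poly_model.predict([[6.5]])
print(prediction)  # close to 42.25, since 6.5 ** 2 = 42.25
```

The benefit is that the transformer and the regressor can no longer get out of sync when you predict on new values.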

Hope you like this post and do tell if you find it useful. Everybody stay Awesome!
