
Ridge and Lasso Regression, Explained with Python

Ridge regression and lasso regression are two basic techniques for reducing model complexity and preventing the over-fitting that can arise with simple linear regression.

Ridge regression

Ridge regression is used when the independent variables are highly correlated with one another. Under such multicollinearity, the least squares estimates remain unbiased, but their variances are inflated, so they can land far from the true values. Ridge regression introduces a lambda parameter to handle this multicollinearity problem. Lambda is called a shrinkage parameter because it shrinks the regression coefficients, trading a small amount of bias for a large reduction in variance. Ridge regression is represented by the following function:
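\[
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta}\;\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda\sum_{j=1}^{p}\beta_j^2, \qquad \lambda \ge 0
\]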

In the formula above, the lower the value of lambda, the more closely the model resembles plain linear regression; at lambda = 0 it is identical to ordinary least squares. Lambda acts as a penalty on the regression coefficients, and tuning it lets the model be fine-tuned for better accuracy.

In the graph below, multiple regression lines are drawn for various values of alpha (scikit-learn's name for lambda). The best-fit line is then chosen based on accuracy, and that alpha value is used to penalize the coefficients; a small code sketch of this comparison follows the figure.



Graph showing multiple regression lines for different alpha values
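As a minimal sketch of that comparison, one could fit a ridge model per candidate alpha and score each on held-out data. This assumes the X_train, X_test, Y_train, Y_test split created in step 3 of the walkthrough below:

# Fit a ridge model for several candidate alphas and compare test-set R^2.
# Assumes X_train, X_test, Y_train, Y_test from step 3 of the walkthrough below.
from sklearn.linear_model import Ridge

for a in [0.01, 0.1, 1.0, 10.0]:
    model = Ridge(alpha=a).fit(X_train, Y_train)
    print(f'alpha={a}: R^2={model.score(X_test, Y_test):.3f}')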

Let’s predict Boston housing rates using ridge regression.

  1. Import the libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from tabulate import tabulate
import warnings
warnings.filterwarnings('ignore')
  2. Load the dataset

The sklearn library ships with a few small toy datasets. We will use the Boston Housing dataset to predict housing rates using ridge regression. Note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so the call below requires an older version.
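If you are on scikit-learn 1.2 or later, where load_boston is gone, the raw data can still be pulled from the original CMU StatLib source. This sketch mirrors the loading snippet suggested in scikit-learn's own deprecation notice:

# Fallback for scikit-learn >= 1.2, where load_boston has been removed.
# Mirrors the snippet suggested in scikit-learn's deprecation notice.
import numpy as np
import pandas as pd

data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep="\s+", skiprows=22, header=None)
data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])   # matches boston.data
target = raw_df.values[1::2, 2]                                      # matches boston.target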

# Load dataset
from sklearn.datasets import load_boston

boston = load_boston()
df_boston = pd.DataFrame(data=boston.data, columns=boston.feature_names)
df_boston['MEDV'] = boston.target  # append the target (median house value) as the last column
df_boston.head()

Output:



  3. Split dataset into train and test set

In this step, the features and the target values are separated and then divided into train and test datasets. Passing a fixed random_state to train_test_split would make the split (and the numbers below) reproducible.

# Slice the dataframe into features and target
df_boston_features = df_boston.iloc[:, :-1]
df_boston_target = df_boston.iloc[:, -1:]

# Splitting dataset into training and testing data
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(df_boston_features, df_boston_target, train_size=0.8)
  4. Building Ridge Regression model for alpha = 1.0

Here, the model is instantiated and then trained using the train set.

# Importing library
from sklearn.linear_model import Ridge

# Initialize the regressor with the default alpha value (alpha = 1.0)
regressor_ridge = Ridge(alpha=1.0)

# Fit the model to train set
regressor_ridge.fit(X_train, Y_train)
  5. Predicting the target values and comparing with actual values
# Predict values
ridge_pred = regressor_ridge.predict(X_test)
ridge_prediction_df = pd.DataFrame(data=ridge_pred, columns=['Predicted rates'])
ridge_prediction_df['Actual rates'] = Y_test.values.ravel()  # flatten the (n, 1) column to 1-D
ridge_prediction_df.head()
  6. Retrieving the intercept

# Printing intercept
print(regressor_ridge.intercept_)


  7. Retrieving the slopes

# Printing coefficients
print(regressor_ridge.coef_)




These coefficients correspond one-to-one to the features in the dataset; the sketch below pairs each coefficient with its feature name.
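A minimal sketch (np.ravel guards against coef_ being 2-D, which happens here because Y_train was passed as a one-column DataFrame):

# Pair each coefficient with its feature name.
import numpy as np

for feature, coef in zip(df_boston_features.columns, np.ravel(regressor_ridge.coef_)):
    print(f'{feature}: {coef:.4f}')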
  8. Visualizing the best fit line for alpha = 1.0
# Visualize the ridge regression on the testing dataset
plt.figure(figsize=(12, 6))
plt.scatter(Y_test, ridge_pred, color='r', alpha=0.5)
plt.plot(Y_test, Y_test, color='r')  # identity line: points on it are perfect predictions
plt.ylabel('Predicted House Rate')
plt.xlabel('Actual House Rate')
plt.show()

Output:



  9. Building Ridge Regression model for alpha = 0.5 and predicting the values
# Initialize the regressor with alpha = 0.5
reg_ridge = Ridge(alpha=0.5)

# Fit the model to train set
reg_ridge.fit(X_train, Y_train)

# Predict values
ridge_pred_alpha_mod = reg_ridge.predict(X_test)
ridge_prediction_df_alpha_mod = pd.DataFrame(data=ridge_pred_alpha_mod, columns=['Predicted rates'])
ridge_prediction_df_alpha_mod['Actual rates'] = Y_test.values.ravel()
ridge_prediction_df_alpha_mod.head()

Output:


  10. Visualizing the best fit line for alpha = 0.5
# Visualize the ridge regression on the testing dataset
plt.figure(figsize=(12, 6))
plt.scatter(Y_test, ridge_pred_alpha_mod, color='b', alpha=0.5)
plt.plot(Y_test, Y_test, color='b')  # identity line
plt.ylabel('Predicted House Rate')
plt.xlabel('Actual House Rate')
plt.show()

Output:



  11. Comparing the evaluation metrics for both ridge models
# Evaluating the prediction with metrics
# Importing the libraries for evaluating the metrics
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Metrics for ridge regression model with alpha = 1.0
MSE = mean_squared_error(Y_test, ridge_pred)
MAE = mean_absolute_error(Y_test, ridge_pred)
RMSE = mean_squared_error(Y_test, ridge_pred, squared=False)

# Metrics for ridge regression model with alpha = 0.5
MSE_alpha_mod = mean_squared_error(Y_test, ridge_pred_alpha_mod)
MAE_alpha_mod = mean_absolute_error(Y_test, ridge_pred_alpha_mod)
RMSE_alpha_mod = mean_squared_error(Y_test, ridge_pred_alpha_mod, squared=False)

# Tabulating the values of both the models
ridge_metrics = ['Ridge', regressor_ridge.alpha, MSE, MAE, RMSE]
ridge_metrics_alpha_mod = ['Ridge', reg_ridge.alpha, MSE_alpha_mod, MAE_alpha_mod, RMSE_alpha_mod]
ridge_table = [ridge_metrics, ridge_metrics_alpha_mod]
print(tabulate(ridge_table, headers=('Model', 'Alpha', 'MSE', 'MAE', 'RMSE')))

Output:



In the table above, the error values of the ridge model with alpha = 0.5 are lower than those of the model with alpha = 1.0; for this train/test split, the alpha = 0.5 model therefore predicts more accurately than the alpha = 1.0 model.
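Rather than comparing a couple of hand-picked alphas, scikit-learn's RidgeCV can select one by cross-validation. A minimal sketch; the candidate grid below is an arbitrary choice for illustration:

# Pick alpha by cross-validation instead of trying values by hand.
from sklearn.linear_model import RidgeCV

ridge_cv = RidgeCV(alphas=[0.1, 0.5, 1.0, 5.0, 10.0])
ridge_cv.fit(X_train, Y_train)
print(ridge_cv.alpha_)  # the alpha that scored best in cross-validation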

Lasso regression

Lasso regression is similar to ridge regression; the key difference is that the penalty uses the absolute values of the regression coefficients rather than their squares. As a consequence, lasso performs implicit feature selection: it shrinks the coefficients of the least useful features exactly to zero, keeping only the set of features needed to build the model. When variables are highly collinear, lasso tends to keep one of them and drive the others to zero. In other words, lasso combines regularization with feature selection. The equation for lasso regression is given as follows:
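\[
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}\;\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda\sum_{j=1}^{p}\lvert\beta_j\rvert, \qquad \lambda \ge 0
\]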

In lasso regression as well, numerous regression lines can be generated for varying values of alpha or lambda, after which the best-fit line is picked based on accuracy and the coefficients are penalized using that alpha value. Multiple regression lines for various alpha values are shown in the graph below, and a sketch of the coefficient paths follows the figure.


Graph showing multiple regression lines for different alpha values
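To see the feature-selection effect directly, one can trace how each coefficient shrinks toward zero as alpha grows. A minimal sketch, assuming the X_train and Y_train split created in step 3 of the walkthrough below:

# Trace lasso coefficient paths: as alpha grows, coefficients hit exactly zero.
# Assumes X_train, Y_train from step 3 of the walkthrough below.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso

alphas = np.logspace(-2, 1, 30)
coef_paths = [Lasso(alpha=a).fit(X_train, Y_train.values.ravel()).coef_ for a in alphas]
plt.plot(alphas, coef_paths)  # one line per feature
plt.xscale('log')
plt.xlabel('alpha')
plt.ylabel('coefficient value')
plt.show()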

Let’s predict Boston housing rates using lasso regression.

  1. Import the libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from tabulate import tabulate
import warnings
warnings.filterwarnings('ignore')
  2. Load the dataset

The sklearn library ships with a few small toy datasets. As in the ridge walkthrough, we will use the Boston Housing dataset, this time predicting the housing rates with lasso regression (see the note on load_boston in the ridge section).

# Load dataset
from sklearn.datasets import load_boston

boston = load_boston()
df_boston = pd.DataFrame(data=boston.data, columns=boston.feature_names)
df_boston['MEDV'] = boston.target  # append the target (median house value) as the last column
df_boston.head()

Output:



  3. Split dataset into train and test set

In this step, the features and the target values are separated and divided into train and test datasets.

# Slice the dataframe into features and target
df_boston_features = df_boston.iloc[:, :-1]
df_boston_target = df_boston.iloc[:, -1:]

# Splitting dataset into training and testing data
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(df_boston_features, df_boston_target, train_size=0.8)
  4. Building Lasso Regression model for alpha = 1.0

Here, the model is instantiated and then trained using the train set.

# Import library
from sklearn.linear_model import Lasso

# Initialize the regressor with alpha = 1.0
regressor_lasso = Lasso(alpha=1.0)

# Fit the regressor
regressor_lasso.fit(X_train, Y_train)
  5. Predicting the target values and comparing with actual values
# Predicting rates for testing data
lasso_pred = regressor_lasso.predict(X_test)
lasso_prediction_df = pd.DataFrame(data=lasso_pred, columns=['Predicted rates'])
lasso_prediction_df['Actual rates'] = Y_test.values.ravel()  # flatten the (n, 1) column to 1-D
lasso_prediction_df.head()



  6. Retrieving the intercept
# Printing intercept
print(regressor_lasso.intercept_)



  7. Retrieving the slopes
# Printing coefficients
print(regressor_lasso.coef_)

Output:



These coefficients correspond to each of the features in the dataset. Here we can observe that the model has reduced a few of the least important coefficients exactly to zero, selecting only the features that help the performance of the model; the sketch below lists which features were kept.
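A minimal sketch that splits the features into kept and zeroed groups (np.ravel again guards against coef_ coming back 2-D):

# List which features lasso kept and which it zeroed out.
import numpy as np

coefs = np.ravel(regressor_lasso.coef_)
kept = [f for f, c in zip(df_boston_features.columns, coefs) if c != 0]
zeroed = [f for f, c in zip(df_boston_features.columns, coefs) if c == 0]
print('Selected features:', kept)
print('Zeroed features:', zeroed)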
  8. Visualizing the best fit line for alpha = 1.0
# Visualize the lasso regression on the testing dataset
plt.figure(figsize=(12, 6))
plt.scatter(Y_test, lasso_pred, color='r', alpha=0.5)
plt.plot(Y_test, Y_test, color='r')  # identity line
plt.ylabel('Predicted House Rate')
plt.xlabel('Actual House Rate')
plt.show()

Output:



  9. Building Lasso Regression model for alpha = 0.5 and predicting the values
# Initialize the regressor with alpha = 0.5
regressor_lasso_alpha_mod = Lasso(alpha=0.5)

# Fit the regressor
regressor_lasso_alpha_mod.fit(X_train, Y_train)

# Predicting rates for testing data
lasso_pred_alpha_mod = regressor_lasso_alpha_mod.predict(X_test)
lasso_prediction_df_alpha_mod = pd.DataFrame(data=lasso_pred_alpha_mod, columns=['Predicted rates'])
lasso_prediction_df_alpha_mod['Actual rates'] = Y_test.values.ravel()
lasso_prediction_df_alpha_mod.head()



  10. Visualizing the best fit line for alpha = 0.5

The plot is produced exactly as before, now with the alpha = 0.5 predictions:

# Visualize the lasso regression on the testing dataset
plt.figure(figsize=(12, 6))
plt.scatter(Y_test, lasso_pred_alpha_mod, color='b', alpha=0.5)
plt.plot(Y_test, Y_test, color='b')  # identity line
plt.ylabel('Predicted House Rate')
plt.xlabel('Actual House Rate')
plt.show()
  11. Comparing the evaluation metrics for both lasso models
# Evaluating the prediction with metrics
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Metrics for lasso regression model with alpha = 1.0
lasso_MSE = mean_squared_error(Y_test, lasso_pred)
lasso_MAE = mean_absolute_error(Y_test, lasso_pred)
lasso_RMSE = mean_squared_error(Y_test, lasso_pred, squared=False)

# Metrics for lasso regression model with alpha = 0.5
lasso_MSE_alpha_mod = mean_squared_error(Y_test, lasso_pred_alpha_mod)
lasso_MAE_alpha_mod = mean_absolute_error(Y_test, lasso_pred_alpha_mod)
lasso_RMSE_alpha_mod = mean_squared_error(Y_test, lasso_pred_alpha_mod, squared=False)

# Tabulating the values of both the models
lasso_metrics = ['Lasso', regressor_lasso.alpha, lasso_MSE, lasso_MAE, lasso_RMSE]
lasso_metrics_alpha_mod = ['Lasso', regressor_lasso_alpha_mod.alpha, lasso_MSE_alpha_mod, lasso_MAE_alpha_mod, lasso_RMSE_alpha_mod]
lasso_table = [lasso_metrics, lasso_metrics_alpha_mod]
print(tabulate(lasso_table, headers=('Model', 'Alpha', 'MSE', 'MAE', 'RMSE')))

Output:


In the table above, the error values of the lasso model with alpha = 0.5 are lower than those of the model with alpha = 1.0; for this train/test split, the alpha = 0.5 model therefore predicts more accurately than the alpha = 1.0 model.
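As with ridge, scikit-learn's LassoCV can choose alpha by cross-validation instead of manual comparison. A minimal sketch; the grid is arbitrary, and LassoCV expects a 1-D target:

# Pick alpha for lasso by cross-validation; an arbitrary illustrative grid.
from sklearn.linear_model import LassoCV

lasso_cv = LassoCV(alphas=[0.1, 0.5, 1.0, 5.0], cv=5)
lasso_cv.fit(X_train, Y_train.values.ravel())  # LassoCV needs a 1-D target
print(lasso_cv.alpha_)  # the alpha that scored best in cross-validation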

Click here to understand the evaluation metrics used for ridge and lasso regression.

Click here to get access to the complete code.

Click here to view other topics that might excite you.
