Regression analysis is a predictive modelling technique that establishes the relationship between one or more independent variables and a dependent variable. The technique is used for forecasting or for finding cause-and-effect relationships between variables. The sign of a regression coefficient tells us whether there is a positive or negative correlation between each independent variable and the dependent variable. A positive coefficient indicates that as the value of the independent variable increases, the mean of the dependent variable also tends to increase.

There are various benefits of using Regression Analysis.

  1. It indicates significant relationships between the dependent variable and the independent variable(s).
  2. It indicates the strength of the impact of multiple independent variables on a dependent variable.

Various regression techniques are available for making predictions or for understanding cause-and-effect relationships.

Types of Regression Techniques

  • Linear Regression: In statistics, linear regression is a linear approach to modelling the relationship between a scalar response and one or more explanatory variables. The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. It is used extensively and is perhaps the first topic one should learn when studying predictive modelling techniques. In linear regression the dependent variable is continuous, while the independent variables can be continuous or discrete. The regression line is the line of best fit, chosen to minimize the error, or residual. The line is represented by the equation y = b0 + b1x1 + b2x2 + … + bnxn + e, where y is the dependent variable, x1, x2, …, xn are the independent variables, e is the error term (epsilon), b0 is the intercept and b1, b2, …, bn are the coefficients of the respective independent variables. A short sketch of fitting such a model follows this list.
  • Logistic Regression: In statistics, the logistic model is used to model the probability of a certain class or event occurring, such as pass/fail, win/lose, alive/dead or healthy/sick. It can be extended to model several classes of events. Here the dependent variable is categorical, and the independent variables can be categorical or continuous. The model predicts the probability of an event occurring by fitting the data to a logistic (sigmoid) function: p = e^y / (1 + e^y), where p is the probability and y = b0 + b1x1 + b2x2 + … + bnxn. The logistic regression curve takes the shape of a sigmoid: at the lower end it asymptotically approaches 0 and at the upper end it asymptotically approaches 1, but it never goes below 0 or above 1 (see the sketch after this list).
  • Polynomial Regression: Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an nth-degree polynomial in x. A regression equation is a polynomial regression equation if the power of the independent variable is greater than 1, as in y = b0 + b1x + b2x^2 + … + bnx^n. In polynomial regression the best fit is a curve that follows the data points (a sketch follows this list).
  • Stepwise Regression: This form of regression is used when we deal with multiple independent variables. The aim is to maximize predictive power using the minimum number of independent variables. In this technique, the independent variables are selected by an automated process. Commonly used stepwise methods are standard stepwise regression, forward selection and backward elimination (a sketch of automated selection follows this list).
  • Ridge Regression: Ridge regression is a model tuning method used to analyze data that suffers from multicollinearity. It performs L2 regularization. When multicollinearity occurs, the least-squares estimates remain unbiased but their variances are large, so predicted values can end up far from the actual values. Ridge regression addresses the multicollinearity problem through the shrinkage parameter λ (lambda). The model shrinks the coefficient values but never makes them exactly zero (compared with Lasso and Elastic Net in the sketch after this list).
  • Lasso Regression: Lasso (Least Absolute Shrinkage and Selection Operator) is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the resulting statistical model. Like ridge regression, lasso penalizes the size of the regression coefficients, and it can reduce the variability and improve the accuracy of linear regression models. Lasso differs from ridge regression in that its penalty uses the absolute values of the coefficients instead of their squares. This penalty causes some of the parameter estimates to be exactly zero: the larger the penalty, the further the estimates are shrunk towards zero. The result is variable selection among the given n variables.
  • ElasticNet Regression: Elastic Net is an extension of linear regression that adds both L1 and L2 regularization penalties to the loss function during training; it is a hybrid of the lasso and ridge techniques. Elastic Net is useful when there are multiple correlated features: lasso is likely to pick one of them at random, while Elastic Net is likely to keep both.
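To make the linear regression equation concrete, here is a minimal sketch using scikit-learn. The synthetic data and the coefficient values are illustrative assumptions, not from the article.

```python
# Minimal multiple linear regression sketch (illustrative synthetic data).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))  # two independent variables x1, x2
# True relationship: y = b0 + b1*x1 + b2*x2 + e, with b0=3.0, b1=1.5, b2=-2.0
y = 3.0 + 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X, y)
print("intercept b0:", model.intercept_)    # estimate of the intercept
print("coefficients b1, b2:", model.coef_)  # estimates of the slopes
```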
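As a hedged illustration of the sigmoid output, the following sketch fits scikit-learn's LogisticRegression to synthetic binary data; the data-generating coefficients are assumptions made for the example.

```python
# Logistic regression sketch: predicted probabilities come from the sigmoid
# p = e^y / (1 + e^y) and always stay strictly between 0 and 1.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
score = 0.5 + 2.0 * X[:, 0] - 1.0 * X[:, 1]  # y = b0 + b1*x1 + b2*x2
p = 1.0 / (1.0 + np.exp(-score))             # equivalent to e^y / (1 + e^y)
labels = (rng.uniform(size=200) < p).astype(int)  # binary outcome

clf = LogisticRegression().fit(X, labels)
print(clf.predict_proba(X[:3]))              # each row: [P(class 0), P(class 1)]
```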
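For polynomial regression, a common pattern is to expand x into polynomial features and then fit an ordinary linear model on them; this sketch assumes a quadratic relationship purely for illustration.

```python
# Polynomial regression sketch: build [1, x, x^2] with PolynomialFeatures,
# then fit a linear model on the expanded features.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(100, 1))
y = 1.0 - 2.0 * x[:, 0] + 0.5 * x[:, 0] ** 2 + rng.normal(scale=0.3, size=100)

model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(x, y)
print(model.predict([[1.0]]))  # prediction from the fitted curve at x = 1
```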
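One way to automate the forward-selection / backward-elimination process described above is scikit-learn's SequentialFeatureSelector; the dataset and the number of features to keep are assumptions for this sketch.

```python
# Automated stepwise-style feature selection sketch.
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic data: 10 candidate predictors, only 3 of which carry signal.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3, random_state=0)

selector = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=3,  # keep the minimum useful set of predictors
    direction="forward",     # use "backward" for backward elimination
).fit(X, y)
print("selected feature indices:", selector.get_support(indices=True))
```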
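Finally, a comparison sketch of the three penalized models on deliberately collinear synthetic data; the alpha values (scikit-learn's name for the λ shrinkage parameter) are illustrative assumptions. It tends to show ridge shrinking coefficients without zeroing them, lasso zeroing some of them, and Elastic Net keeping both correlated features.

```python
# Ridge vs Lasso vs Elastic Net on collinear data (alpha plays the role of lambda).
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)  # x2 nearly duplicates x1 (multicollinearity)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + 0.5 * x3 + rng.normal(scale=0.1, size=200)

for model in (Ridge(alpha=1.0), Lasso(alpha=0.1), ElasticNet(alpha=0.1, l1_ratio=0.5)):
    print(type(model).__name__, np.round(model.fit(X, y).coef_, 3))
```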
