Monday, December 18, 2017

Dealing with a simple linear regression question using Minitab

Today I am going to discuss the steps you need to follow to solve a standard linear regression question using MINITAB.
This should be a useful guide, as most elementary statistics classes teach linear regression, and with Minitab you can solve such questions quickly and save time. Here I am using MINITAB 17. I believe the menus are much the same in every version. If you have trouble finding something, please send me a quick message and I will help you. The question is as follows.

Ex: The accompanying table shows the total square footages (in billions) of retailing space at shopping centers, the numbers (in thousands) of shopping centers, and the sales (in billions of dollars) for shopping centers for eight years.







Identify the independent and dependent variables

This is the first step. Most questions specify which variables are independent and which one is dependent. If that is not the case, you have to identify them yourself. For example, in the question above you can clearly see that sales depend on the total square footage and the number of shopping centers.

Fitting the regression line

After identifying the dependent and independent variables, the next step is to fit the regression model. Since you are using computer software (in this case Minitab), this is relatively easy, but you should know the steps.

First, import the data into Minitab. You can do this very easily: if you are using a website like Pearson MyStatLab, you can copy the data straight into Minitab.

Then go to Stat > Regression > Fit Regression Model.

Then you will get a window like the one below.

Under Responses, drag and drop the dependent variable. Likewise, under Continuous predictors, put the independent variables, as in the window above.

Since you are fitting a basic model, you don't need to change any other settings (so for the questions in Pearson MyStatLab, nothing else is needed). Then press OK.

Then, in the MINITAB Session window, you will see an output like the one below.


This output contains everything you need to answer the questions. Let's analyze the output.


  • In the Analysis of Variance section you will see how the sums of squares are partitioned.
  • The overall F statistic for the regression model is 350.98, and it is significant because the p-value is 0.000 (that is, smaller than 0.0005 when shown to three decimals).
  • The regression sum of squares, SSR (773075, with 2 degrees of freedom), can be partitioned into two parts, one for each independent variable. The degrees of freedom of SSR equal the number of independent variables (in this case 2).
  • The error sum of squares, SSE, is 8810 with 8 degrees of freedom. The formula for the degrees of freedom of SSE is n-p, where n is the number of observations and p is the number of parameters.


        Number of parameters = Number of independent variables + 1

So in this example it is 3 (two slopes plus the intercept). With n = 11 observations, the degrees of freedom of SSE are 11-3 = 8.
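If you want to check the degrees of freedom and the F statistic by hand, the arithmetic can be sketched in a few lines of Python. This is only a rough check using the rounded values quoted above (SSR = 773075, SSE = 8810, n = 11); the variable names are my own and have nothing to do with Minitab.

```python
# Values read from the Analysis of Variance table discussed above.
ssr = 773075.0        # regression sum of squares
sse = 8810.0          # error sum of squares
n_observations = 11   # n, as used in 11-3 = 8
n_predictors = 2      # square footage and number of shopping centers

# Parameters = slopes plus the intercept.
n_parameters = n_predictors + 1

df_regression = n_predictors
df_error = n_observations - n_parameters

# Mean squares are sums of squares divided by their degrees of freedom.
msr = ssr / df_regression
mse = sse / df_error

# Overall F statistic for the regression model.
f_statistic = msr / mse

print("df regression:", df_regression)
print("df error:", df_error)
print("F statistic:", round(f_statistic, 2))
```

The F value computed this way comes out at about 351, which agrees with the 350.98 in the output. The small gap is there because the sums of squares shown in the table are rounded.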

  • The total sum of squares, SST, is 781885, which is SSR + SSE.
  • The S in the output is the standard error of the regression, which equals the square root of the MSE.
  • The R-squared and adjusted R-squared values of the model are 98.87% and 98.59% respectively. According to R-squared, about 98.87% of the total variation in y (sales) is explained by the fitted regression model.
  • Using the Coefficients table, the significance of each coefficient can be determined. The t-test for a coefficient is equivalent to the partial F-test for that variable in the Analysis of Variance table. Here we have to check the p-value of each variable. We can see that the p-value of the x2 variable is large, so that variable is not significant in the model.
  • The final model is,
        Sales = -224.0 + 50.5*Square Footage + 20.14*Shopping centers
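The summary values above can also be reproduced from the sums of squares, and the final model can be used for prediction. Below is a small Python sketch of that, again using the rounded numbers from the output. The function name `predict_sales` and the input values 6.0 and 90.0 are made up purely for illustration; they are not from the question's data.

```python
import math

# Values taken from the Minitab output discussed above.
ssr = 773075.0        # regression sum of squares
sse = 8810.0          # error sum of squares
n_observations = 11
n_parameters = 3      # two slopes plus the intercept

sst = ssr + sse                                   # total sum of squares
mse = sse / (n_observations - n_parameters)       # mean square error
s = math.sqrt(mse)                                # standard error of the regression

r_squared = ssr / sst
adj_r_squared = 1 - mse / (sst / (n_observations - 1))

print("SST:", sst)
print("S:", round(s, 2))
print("R-sq:", round(100 * r_squared, 2), "%")
print("R-sq(adj):", round(100 * adj_r_squared, 2), "%")


def predict_sales(square_footage, shopping_centers):
    """Predicted sales (billions of dollars) from the fitted model above."""
    return -224.0 + 50.5 * square_footage + 20.14 * shopping_centers


# Hypothetical inputs, chosen only to show how the equation is used.
print("Predicted sales:", round(predict_sales(6.0, 90.0), 1))
```

Keep in mind that a prediction is only reasonable when the inputs fall inside the range of the data used to fit the model.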

















