Saturday, May 20, 2017

Statistics Lesson 1 - One Sample t test using Minitab

Today we are going to discuss how to do a one sample t test using Minitab. To do a one sample t test, certain assumptions should be satisfied. They are as follows:

  • The population standard deviation is unknown.
  • The population from which the sample was taken is normally distributed.
  • If the population is not normally distributed, the sample size should be greater than 30.
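The third assumption is based on the central limit theorem: for a large sample, the sample mean is approximately normally distributed even when the population is not.
As the first step of the one sample t test, the correct hypotheses should be identified, where \(m_0\) is the hypothesized mean.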
\(H_0:\ \mu\ =\ m_0\)
\(H_1:\ \mu\ \ne\ m_0\)    (two-tailed)
\(H_1:\ \mu\ >\ m_0\)    (upper-tailed)
\(H_1:\ \mu\ <\ m_0\)    (lower-tailed)
To perform the t test in Minitab, first select STAT and then BASIC STATISTICS. In that menu select 1-SAMPLE T. The following window will appear.
If you have the raw sample data, first type the data into one of the columns. Then choose the option for one or more samples, each in a column, and select that column. If you only have the summarized data, select the summarized data option instead.
[Screenshot: 1-Sample t dialog; the summarized data option asks for the sample size, mean, and standard deviation]
Consider this example. A random sample of 22 fifth grade pupils has a mean grade point average of 5.0 in maths with a standard deviation of 0.452. The claim is that the average GPA is less than 5.5, and we are going to test this claim at the 5% significance level. To test this hypothesis, first enter the data in Minitab like this.
[Screenshot: summarized data fields filled in with the sample size, sample mean, standard deviation, and hypothesized mean]
Since the claim being tested is that the average GPA is less than 5.5, the value 5.5 is entered as the hypothesized mean. After typing the data, click the Options button.

[Screenshot: Options dialog showing the two-tailed and one-tailed alternative hypotheses]
Since the claim includes the phrase "less than", the alternative hypothesis is left-tailed. So under the alternative hypothesis, select Mean < hypothesized mean and press OK. The results appear as in the following output.

[Screenshot: Minitab output showing the sample mean, the t statistic, and the p value]
According to the above output, the test statistic of the t test is -5.19. The p value is displayed as zero (that is, it is smaller than the displayed precision), so it is less than the significance level (0.05). The null hypothesis is therefore rejected, and there is sufficient evidence to support the claim that the average GPA is less than 5.5.
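The Minitab result can be cross-checked by hand. The sketch below is only an illustration in Python, not Minitab's own procedure: it computes the test statistic t = (sample mean - hypothesized mean) / (standard deviation / √n) from the summarized data and approximates the left-tailed p value by numerically integrating the t density. The helper names `t_pdf` and `t_cdf` are made up for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=20000):
    """Approximate P(T <= x) with Simpson's rule on the interval [0, |x|]."""
    a = abs(x)
    h = a / steps                      # steps is even, as Simpson's rule requires
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

# Summarized data from the example
n, sample_mean, sd, m0 = 22, 5.0, 0.452, 5.5

se = sd / math.sqrt(n)                 # standard error of the mean
t_stat = (sample_mean - m0) / se       # test statistic
p_value = t_cdf(t_stat, n - 1)         # left-tailed p value, df = n - 1

print("t statistic:", round(t_stat, 2))          # -5.19
print("reject H0 at 5%:", p_value < 0.05)
```

The p value is tiny rather than exactly zero, which is consistent with a statistics package displaying it as 0.000.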

Friday, May 19, 2017

Statistics Homework Guide


Question 

It is believed that 4% of children have a gene that may be linked to juvenile diabetes. Researchers at a firm would like to test new monitoring equipment for diabetes. Hoping to have 18 children with the gene for their study, the researchers test 710 newborns for the presence of the gene linked to diabetes. What is the probability that they find enough subjects for their study?

Begin by checking that the randomization condition, 10% condition, and success/failure condition are all satisfied.

To satisfy the randomization condition, the sampling method must not be biased and the data must be representative of the population. This condition is satisfied because the sample of newborns is a random sample of the population of newborns.

To satisfy the 10% condition, the sample size, n, must be no larger than 10% of the population. This condition is satisfied because it is safe to assume that there are more than 7100 newborns available to be randomly tested.

To satisfy the success/failure condition, the sample size must be big enough so that both the number of "successes," np, and the number of "failures," nq, are expected to be at least 10. 

np = 28.4 and nq = 681.6


Since both np and nq are greater than 10, the success/failure condition is satisfied.
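The two numeric conditions can be checked in a few lines. This is a minimal Python sketch using n = 710 and p = 0.04, the values used in the worked solution; the variable names are only illustrative.

```python
n = 710        # sample size used in the worked solution
p = 0.04       # believed proportion of children with the gene
q = 1 - p      # proportion without the gene

# 10% condition: the population must be at least 10 times the sample size
min_population = 10 * n

# Success/failure condition: expected counts must both be at least 10
expected_successes = n * p
expected_failures = n * q
condition_met = expected_successes >= 10 and expected_failures >= 10

print("population needed:", min_population)        # 7100
print("np =", round(expected_successes, 1))        # 28.4
print("nq =", round(expected_failures, 1))         # 681.6
print("success/failure condition met:", condition_met)
```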

Since the conditions are satisfied, use a normal model for the sampling distribution of phat. Remember that the researchers are hoping to obtain at least 18 children for the experiment. To find the value that phat must equal or exceed for the researchers to find enough subjects, divide the number of desired subjects by the sample size.

phat = 18/710 = 0.0254

This means that the objective is to find P(phat > 0.0254).

The mean of phat is 0.04 and its standard deviation is √(0.04*0.96/710) = 0.0074.
So Z = (0.0254 - 0.04)/0.0074 = -1.97

Therefore, using a standard normal table, P(phat > 0.0254) = P(Z > -1.97) = 0.976. So there is about a 97.6% chance that the researchers find enough subjects for their study.
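The same calculation can be reproduced without a table. The sketch below is a rough Python check that uses `NormalDist` from the standard library, with n = 710 and 18 desired children as in the worked solution. It runs the steps twice: once at full precision and once with the rounded intermediate values from the hand calculation, since rounding shifts the last digit slightly.

```python
from math import sqrt
from statistics import NormalDist

n = 710        # newborns tested (value used in the worked solution)
p = 0.04       # believed proportion with the gene
needed = 18    # children the researchers hope to find

p_hat_cutoff = needed / n              # smallest acceptable sample proportion
sd = sqrt(p * (1 - p) / n)             # standard deviation of phat
z = (p_hat_cutoff - p) / sd            # z-score of the cutoff
prob = 1 - NormalDist().cdf(z)         # P(phat > cutoff) under the normal model

# Same steps with the rounded intermediate values from the hand calculation
z_rounded = round((0.0254 - 0.04) / 0.0074, 2)
prob_rounded = 1 - NormalDist().cdf(z_rounded)

print(f"full precision: z = {z:.2f}, probability = {prob:.3f}")
print(f"rounded steps:  z = {z_rounded:.2f}, probability = {prob_rounded:.3f}")
```

Carrying full precision gives z close to -1.99 and a probability near 0.977, while the rounded steps reproduce -1.97 and 0.976. Either way the conclusion is the same.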



Statistics Homework Guide


Question 1

The histogram shows the December charges (in $) for 5000 customers from one marketing segment from a credit card company. (Negative values indicate customers who received more credits than charges during the month.) Use the histogram to complete parts a) through c) below.



In this question, the range is calculated by subtracting the lowest value from the highest value. The lowest value is -500 and the highest value is 5000. Therefore the range is 5000 - (-500) = 5500.

The unusual feature is the inclusion of some negative values.


The shape of the histogram is positively skewed (right skewed), because the right tail is longer than the left tail. If you want to know more about skewness, please check the following figure.



[Figure: comparing mean, median, and mode under skewness]

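Since the graph is right skewed, the mean is greater than the median. As a hint, the median always sits in the middle of the data, with half of the values on each side. The mean is pulled toward the longer tail, so depending on the direction of the skew it can be less than or greater than the median.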
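A small numeric example makes the effect visible. The charges below are hypothetical values invented for illustration (they are not the data behind the histogram): most are small, a few are very large, and one is negative.

```python
from statistics import mean, median

# Hypothetical monthly charges in dollars (NOT the data behind the histogram)
charges = [-200, 50, 120, 180, 250, 300, 420, 600, 1500, 4800]

data_range = max(charges) - min(charges)   # highest value minus lowest value
avg = mean(charges)                        # pulled up by the few large charges
mid = median(charges)                      # middle of the sorted values

print("range :", data_range)               # 5000
print("mean  :", avg)                      # 802
print("median:", mid)                      # 275.0
print("mean > median:", avg > mid)         # True
```

The two largest charges drag the mean far above the median, which is the pattern expected for right skewed data.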

