When I first started tutoring, I'd explain that it depends on the problem and start rambling on about the central limit theorem until my students' eyes glazed over. Then I realized it's easier to understand if I just make a flowchart. So, here it is! When you're working on a statistics word problem, these are the things you need to look for. Even for relatively small samples, the distributions are virtually the same.
When do I use a z-score vs a t-score for confidence intervals?
Here, mu1 is the mean cholesterol of the male population and mu2 is the mean cholesterol of the female population. As per the problem statement above, the alternative hypothesis is that the mean cholesterol of the male population is less than the mean cholesterol of the female population. The significance level alpha is 0. It is very common to use a t-statistic to compare two means in statistics:
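For reference, a common form of the two-sample t-statistic is the unequal-variance (Welch) version, which is also the default in R's `t.test`. Assuming that is the form intended here, with sample means $\bar{x}_1, \bar{x}_2$, sample standard deviations $s_1, s_2$, and sample sizes $n_1, n_2$ (symbols chosen for this sketch), the statistic and its approximate degrees of freedom $\nu$ are:

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}
```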
The formula for the degrees of freedom may look scary. But do not worry: we will get the t-statistic and the degrees of freedom from the t. Calculate the t-statistic and the p-value. First, import the Heart dataset in RStudio. Now, use the t. As per the output above, the t-statistic is You will get a nearly identical result with a z-test.
Look, we got the same test statistic and a very close p-value as before. All the parameters passed in the z. Setting mu as 0 may be a bit confusing. It is zero because our null hypothesis is that the mean cholesterol levels of the male and female populations are equal. That means the difference between the two means is zero. When two means are compared, the mu parameter takes the hypothesized difference between them.
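To make the role of mu concrete, here is a minimal sketch of a two-sample z-test, written in Python with only the standard library rather than in R. The function name `two_sample_z`, its arguments, and the cholesterol readings are all made up for illustration, and the sample standard deviations stand in for the population values, which is an assumption that is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_sample_z(x, y, mu=0.0, alternative="less"):
    """z statistic and p-value for H0: mean(x) - mean(y) = mu.

    Assumption: sample standard deviations are used in place of the
    (unknown) population standard deviations.
    """
    se = sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    z = (mean(x) - mean(y) - mu) / se
    cdf = NormalDist().cdf(z)
    if alternative == "less":        # H1: mean(x) - mean(y) < mu
        p = cdf
    elif alternative == "greater":   # H1: mean(x) - mean(y) > mu
        p = 1 - cdf
    else:                            # two-sided
        p = 2 * min(cdf, 1 - cdf)
    return z, p


# Hypothetical readings, NOT taken from the Heart dataset.
male = [210, 225, 198, 240, 231, 205, 219, 228]
female = [245, 238, 260, 229, 251, 247, 236, 255]

# mu=0 encodes the null hypothesis "the two population means are equal".
z, p = two_sample_z(male, female, mu=0, alternative="less")
print(f"z = {z:.3f}, one-sided p-value = {p:.4f}")
```

Passing a nonzero `mu` would instead test whether the difference between the two means equals that value.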
Decision rule. If the p-value is less than alpha, we reject the null hypothesis; otherwise, we do not reject it. As the p-value came out to be smaller than the alpha level, we have enough evidence to reject the null hypothesis. So, the data support the conclusion that the mean cholesterol level in the male population is lower than the mean cholesterol level in the female population. In my next example, I will not go through the 5-step process because it is getting a bit repetitive.
This example shows how to perform a two-sided z-test of a mean and calculate a confidence interval using R. Example 4. Using the data from the Heart dataset, check if the population mean of the cholesterol level is and also construct a confidence interval around the mean cholesterol level of the population. Use a significance level of 0.
As we are checking if the cholesterol level is the null hypothesis is: Here in the problem, there is no mention of less than or greater than. We will only check if the mean cholesterol level is or not. So, the alternative hypothesis should be: We will use the z-test here as demonstrated in example 3. That will give us the z-statistic, the p-value, and the confidence interval, all in one simple line of code. Because we are not comparing two means here, we will pass only one data set, and the second one will be set as zero.
The mean can be greater or less, so the test is two-sided. Look, the p-value is 0. So, we do not have enough evidence to reject the null hypothesis here. So the mean cholesterol level of the population is From the output above, the confidence interval is It is common to use a z. We only dealt with problems of means in all the previous examples.
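The one-sample, two-sided z-test and its confidence interval can be sketched as follows, again in standard-library Python rather than R. The function name `one_sample_z`, the 95% confidence level, the hypothesized mean of 240, and the data are all assumptions for illustration, and the sample standard deviation is used in place of the population value.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_z(x, mu0, conf_level=0.95):
    """Two-sided z-test of H0: population mean = mu0, plus a z-interval.

    Assumption: the sample standard deviation approximates the
    population standard deviation (reasonable only for large samples).
    """
    n = len(x)
    xbar = mean(x)
    se = stdev(x) / sqrt(n)
    z = (xbar - mu0) / se
    # Two-sided: the mean may be either greater or less than mu0.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return z, p, (xbar - z_star * se, xbar + z_star * se)


# Hypothetical cholesterol readings, NOT taken from the Heart dataset.
chol = [233, 250, 204, 236, 354, 192, 294, 263, 199, 168, 229, 239]

z, p, ci = one_sample_z(chol, mu0=240)  # 240 is an illustrative value
print(f"z = {z:.3f}, p-value = {p:.4f}")
print(f"95% CI: ({ci[0]:.1f}, {ci[1]:.1f})")
```

With a two-sided test at level alpha, the null value is rejected exactly when it falls outside the matching (1 - alpha) confidence interval, which is why the test and the interval are usually reported together.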
Here we will work on population proportions in the next two examples. The concept of the test for a proportion is not too different from the tests for means. It tests how far a sample proportion is from the hypothesized proportion of the larger population.
Suppose we want to test the proportion of children who had some swimming lessons when they were less than 10 years old. We cannot go and ask all the children in the world if they had swimming lessons when they were less than 10. So we will take a sample of whatever size is affordable to us and infer the information about the large population from that sample.
The test for a proportion is only valid if the sample size is large enough. The formula for the z-statistic is: In this formula, p-hat is the sample proportion and p0 is the population proportion under the null hypothesis.
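Assuming the standard one-sample form of the test, with $\hat{p}$ the sample proportion, $p_0$ the proportion under the null hypothesis, and $n$ the sample size, the z-statistic can be written as:

```latex
z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0\,(1 - p_0)}{n}}}
```

The standard error in the denominator uses $p_0$ rather than $\hat{p}$ because the statistic is computed under the assumption that the null hypothesis is true.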
The equation for the confidence interval is: The standard error is calculated as: I wanted to show all the formulas briefly but, as I mentioned in the beginning, this article will focus on working in R.
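As a rough illustration of these pieces, the sketch below computes a one-proportion z-test and the usual large-sample (Wald) confidence interval, p-hat plus or minus z* times sqrt(p-hat(1 - p-hat)/n). It is written in standard-library Python rather than R. The function names, the 95% level, and the counts (130 of 200 children, null proportion 0.6) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: population proportion = p0."""
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)        # standard error under H0
    z = (p_hat - p0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_value


def proportion_ci(successes, n, conf_level=0.95):
    """Large-sample (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)   # standard error from the sample
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return p_hat - z_star * se, p_hat + z_star * se


# Hypothetical survey: 130 of 200 sampled children had swimming lessons.
p_hat, z, p_value = proportion_z_test(130, 200, p0=0.6)
low, high = proportion_ci(130, 200)
print(f"p-hat = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Note that the test uses the null proportion in its standard error while the interval uses the sample proportion, so the two can occasionally disagree near the boundary.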
So, I will work on two examples in R. Example Set the hypothesis and alpha level first. You can read more about different ways to write intervals here: Three ways to write a confidence interval. The much more realistic scenario is using a t-interval to estimate an unknown population mean. This interval relies on our sample standard deviation in calculating the margin of error. All this means for us is that the formula will be very similar, but the critical value will no longer come from the normal distribution.
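In symbols, and assuming the usual one-sample setup with sample mean $\bar{x}$, sample standard deviation $s$, and sample size $n$, the t-interval has the same shape as the z-interval, with the critical value $t^{*}$ taken from a t-distribution with $n - 1$ degrees of freedom:

```latex
\bar{x} \;\pm\; t^{*}_{\,n-1}\,\frac{s}{\sqrt{n}}
```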
Suppose that a sample of 38 employees at a large company was surveyed and asked how many hours a week they thought the company wasted on unnecessary meetings. The mean number of hours these employees stated was As before, since we are estimating a mean with a confidence interval, we know it will either be a t-interval or a z-interval. The standard deviation of 5. Before we can do that, however, we need to look up the critical value.