STA 6166 UNIT 2 Section 1 Answers
You can choose to work some or all of the problems listed below. We recommend that you at least work the problems listed in your major area of interest.
General Questions.

For students in agriculture and environmental fields.
We begin by putting down all that we know using the shorthand notation of the book.

H0: μ = 10 l/h
HA: μ < 10 l/h
Test statistic: average fuel consumption based on 15 replications.

We will convert the average consumption to a z-score using the prior knowledge that the true population standard deviation is σ = 1.5 l/h. That is, z = (mean - μ)/(σ/√n). Since this is a one-tailed (left-tailed alternative) test, the critical value that defines the rejection region is z0.05 = -1.645.

Power is 1 minus the probability of a Type II error. The probability of a Type II error for a specified alternative is given by the equation on page 216. This equation is a little confusing since it is not clear what you plug in for zα, and the description on pages 214 and 215 is a little vague as well. It turns out that we always use the right-tail value (the positive value, in this case 1.645), because we take the absolute value of the difference of the means in the right-hand part of the equation. With this understanding we plug in zα = 1.645, Δ = |μ0 - μa| = {0.5, 1.0, 1.5}, σ = 1.5, and n = 15.

Answer β(0.5) = P(Z < 1.645 - 1.291) = 0.6383 -> Power = 0.3617
Answer β(1.0) = P(Z < 1.645 - 2.582) = 0.1749 -> Power = 0.8251
Answer β(1.5) = P(Z < 1.645 - 3.873) = 0.0129 -> Power = 0.9871

From this we conclude that at this level of replication and with this statistical test, we have only about a 36% chance of rejecting the null hypothesis that average fuel consumption is 10 l/h when it is actually 9.5 l/h on average. We have about an 83% chance of showing as significant a reduction of 1 l/h, and we are almost certain to do so for a reduction of 1.5 l/h or more.

    Replication   Consumption
         1            8.99
         2            9.16
         3           10.20
         4            9.82
         5           11.54
         6            9.48
         7           10.65
         8            9.71
         9           11.43
        10            9.97
        11            9.11
        12            8.24
        13            8.44
        14            8.50
        15           11.34

A SAS program that reads in these data and outputs basic statistics is given here (SAS PROGRAM and OUTPUT). All we really need from this output is the sample mean, so running the program is overkill here.
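The Type II error probabilities above can be cross-checked outside SAS. The following is a minimal sketch in Python (standard library only), assuming the formula takes the form used in the worked answers, β = P(Z < zα - Δ√n/σ). The function name `type2_error` is made up for this illustration and does not come from the book or the SAS program.

```python
from math import sqrt
from statistics import NormalDist


def type2_error(delta, sigma, n, alpha=0.05):
    """beta = P(Z < z_alpha - |delta| * sqrt(n) / sigma) for a one-tailed z test."""
    # Right-tail critical value (about 1.645 when alpha = 0.05)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(z_alpha - abs(delta) * sqrt(n) / sigma)


# Fuel-consumption setting from the text: sigma = 1.5 l/h, n = 15
for delta in (0.5, 1.0, 1.5):
    beta = type2_error(delta, sigma=1.5, n=15)
    print(f"delta={delta:.1f}  beta={beta:.4f}  power={1 - beta:.4f}")
```

Because the function uses the unrounded critical value rather than 1.645 and a table lookup, its results may differ from hand calculations in the third or fourth decimal place.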
The test statistic is computed as z = (9.772 - 10)/(1.5/√15) = -0.5887. The rejection region is Z < z0.05 = -1.645. Since -0.5887 is not less than -1.645, we do not reject the null hypothesis and conclude that the consumption with the new fuel is not really any different from that with the standard fuel. Just looking at the numbers, would you conclude that consumption with the new fuel is any worse? No.

I have given you all the information you need to use the equation on page 221 to figure out the needed sample size. Simply plug in the values to get the required n.

The sign test is given on page 246. For this test we only need the number of observations greater than the median stated in the null hypothesis. We will take the target median M0 to be 10 l/h. In this case, B = 5 observations are greater than the specified median, and the rejection region for the left-tailed test is: Reject H0 if B ≤ C0.05,15 from Table 4. Since B = 5 is greater than C = 3 from the table, we do not reject the null hypothesis that the population median is 10 l/h and conclude that the new fuel is not significantly different from the standard fuel.

Note that the SAS output provides three tests for central tendency, but here they test that the mean or median is equal to zero, hence all are significant.
    Tests for Location: Mu0=0

    Test           -Statistic-      -----p Value------
    Student's t    t   34.77598     Pr > |t|     <.0001
    Sign           M   7.5          Pr >= |M|    <.0001
    Signed Rank    S   60           Pr >= |S|    <.0001

To have SAS automatically test for a specific central tendency value (mean or median), create a new variable which is the old variable minus the null hypothesis mean and rerun the Univariate procedure. The test results below suggest that neither the t-test nor the Sign test would reject the null hypothesis.

    Tests for Location: Mu0=0

    Test           -Statistic-      -----p Value------
    Student's t    t   -0.81139     Pr > |t|     0.4307
    Sign           M   -2.5         Pr >= |M|    0.3018
    Signed Rank    S   -17          Pr >= |S|    0.3591
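The z statistic and the sign test for the fuel data can also be reproduced by hand. The sketch below uses Python's standard library only and assumes the sign test is based on B ~ Binomial(n, 1/2) under the null hypothesis. The variable names are made up for this illustration and are not taken from the SAS program.

```python
from math import comb, sqrt

consumption = [8.99, 9.16, 10.20, 9.82, 11.54, 9.48, 10.65, 9.71,
               11.43, 9.97, 9.11, 8.24, 8.44, 8.50, 11.34]
mu0, sigma = 10.0, 1.5   # null value and known population sd from the text
n = len(consumption)

# z statistic with known sigma
mean = sum(consumption) / n
z = (mean - mu0) / (sigma / sqrt(n))

# Sign test: B = number of observations above the hypothesized median
b = sum(x > mu0 for x in consumption)

# Exact left-tail probability P(B <= b) under Binomial(n, 1/2)
p_left = sum(comb(n, k) for k in range(b + 1)) / 2 ** n

print(f"mean={mean:.3f}  z={z:.4f}  B={b}  one-sided p={p_left:.4f}  "
      f"two-sided p={2 * p_left:.4f}")
```

Doubling the left-tail probability gives the two-sided value that SAS reports for the Sign test on the centered variable.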
For students in engineering fields.
We begin by putting down all that we know using the shorthand notation of the book.

H0: μ = 50 t/d
HA: μ > 50 t/d
Test statistic: average production based on 15 replications.

We will convert the average production to a z-score using the prior knowledge that the true population standard deviation is σ = 25 t/d. That is, z = (mean - μ)/(σ/√n). Since this is a one-tailed (right-tailed alternative) test, the critical value that defines the rejection region is z0.95 = 1.645.

Power is 1 minus the probability of a Type II error. The probability of a Type II error for a specified alternative is given by the equation on page 216. This equation is a little confusing since it is not clear what you plug in for zα, and the description on pages 214 and 215 is a little vague as well. It turns out that we always use the right-tail value (the positive value, in this case 1.645), because we take the absolute value of the difference of the means in the right-hand part of the equation. With this understanding we plug in zα = 1.645, Δ = |μ0 - μa| = {10, 20, 40} t/d, σ = 25, and n = 15.

Answer β(10) = P(Z < 1.645 - 1.549) = 0.5359 -> Power = 0.4641
Answer β(20) = P(Z < 1.645 - 3.098) = 0.073 -> Power = 0.927
Answer β(40) = P(Z < 1.645 - 6.197) = 0.000003 -> Power > 0.9999

From this we conclude that at this level of replication and with this statistical test, we only have a 46.4% chance of rejecting the null hypothesis that average production is 50 t/d when it is actually 60 t/d on average. We are very likely (about 93%) to show as significant an increase in production to 70 t/d or more. So this test is not very powerful for showing only a 10 t/d increase, but it would be adequate to show an increase of 20 t/d or more.

    Day   Yield
      1    57.8
      2    58.3
      3    50.3
      4    38.5
      5    47.9
      6   157.0
      7    38.6
      8   140.2
      9    39.3
     10   138.7
     11    49.2
     12   139.7
     13    48.3
     14    59.2
     15    49.7

A SAS program that reads in these data and outputs basic statistics is given here (SAS PROGRAM and OUTPUT).
All we really need from this output is the sample mean, so running the program is overkill here. The test statistic is computed as z = (74.18 - 50)/(25/√15) = 3.746. The rejection region is Z > z0.95 = 1.645. Since 3.746 is greater than 1.645, we reject the null hypothesis and conclude that production significantly increases with the new equipment.

I have given you all the information you need to use the equation on page 221 to figure out the needed sample size. Simply plugging in the values we get: n = 40^2 (1.645 + 0.84)^2 / 20^2 = 24.7, so take n = 25.

The sign test is given on page 246. For this test we only need the number of observations greater than the median stated in the null hypothesis. We will take the target median M0 to be 50 t/d. In this case, B = 8 observations are greater than the specified median, and the rejection region for the right-tailed test is: Reject H0 if B ≥ n - C0.05,15 from Table 4. Since B = 8 is less than n - C = 15 - 3 = 12 from the table, we do not reject the null hypothesis that the population median is 50 t/d and conclude that the new equipment does not significantly increase production.

Thus the formal normal score test rejects the null hypothesis and the nonparametric Sign test does not. Why do you think this result occurs? One clue could be the fact that in the normal test we assumed that the population standard deviation was 25, whereas the sample standard deviation was s = 44.

Notice that the SAS output for the production provides three tests for central location. These three tests all test that the true mean is zero, hence all are quite significant.

    Tests for Location: Mu0=0

    Test           -Statistic-      -----p Value------
    Student's t    t   6.502692     Pr > |t|     <.0001
    Sign           M   7.5          Pr >= |M|    <.0001
    Signed Rank    S   60           Pr >= |S|    <.0001

To test that the true population is centered around 50, we create a new variable by subtracting 50 from production and re-run these tests for location.
    Tests for Location: Mu0=0

    Test           -Statistic-      -----p Value------
    Student's t    t   2.119643     Pr > |t|     0.0524
    Sign           M   0.5          Pr >= |M|    1.0000
    Signed Rank    S   16.5         Pr >= |S|    0.3666

Now you find that the two-sided p-values reported for the Student's t-test and the Sign test lead to the same conclusion at the 0.05 level: do not reject the null hypothesis.
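The sample-size arithmetic and the contrast between the z-test (σ assumed to be 25) and the t-test (σ estimated from the data) can be checked with a short sketch in Python's standard library. It assumes the page 221 equation has the form shown in the worked answer, n = σ²(zα + zβ)²/Δ², and that the 0.84 in that answer corresponds to a target power of 0.80. The function name `sample_size` is made up for this illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, stdev


def sample_size(sigma, delta, alpha=0.05, power=0.80):
    """n = sigma^2 * (z_alpha + z_beta)^2 / delta^2 for a one-tailed z test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # about 1.645
    z_beta = NormalDist().inv_cdf(power)        # about 0.84 when power = 0.80
    return sigma ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2


# Values used in the worked answer: sigma = 40, delta = 20
n_raw = sample_size(sigma=40, delta=20)
print(f"n = {n_raw:.1f}, round up to {ceil(n_raw)}")

yields = [57.8, 58.3, 50.3, 38.5, 47.9, 157.0, 38.6, 140.2,
          39.3, 138.7, 49.2, 139.7, 48.3, 59.2, 49.7]
m, s, n = mean(yields), stdev(yields), len(yields)

z = (m - 50) / (25 / sqrt(n))   # uses the assumed population sd of 25
t = (m - 50) / (s / sqrt(n))    # uses the sample sd instead
print(f"mean={m:.2f}  sample sd={s:.1f}  z={z:.3f}  t={t:.3f}")
```

The only difference between the two statistics is the standard deviation in the denominator, which is why the larger sample value gives a much smaller test statistic.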
For students in toxicology and health science fields.
We begin by putting down all that we know using the shorthand notation of the book.

H0: μ = 14 mg
HA: μ > 14 mg
Test statistic: average nicotine content based on 20 replications.

We will convert the average nicotine content to a z-score using the prior knowledge that the true population standard deviation is σ = 0.8 mg. That is, z = (mean - μ)/(σ/√n). Since this is a one-tailed (right-tailed alternative) test, the critical value that defines the rejection region is z0.95 = 1.645.

Power is 1 minus the probability of a Type II error. The probability of a Type II error for a specified alternative is given by the equation on page 216. This equation is a little confusing since it is not clear what you plug in for zα, and the description on pages 214 and 215 is a little vague as well. It turns out that we always use the right-tail value (the positive value, in this case 1.645), because we take the absolute value of the difference of the means in the right-hand part of the equation. With this understanding we plug in zα = 1.645, Δ = |μ0 - μa| = {0.2, 0.4, 0.75} mg, σ = 0.8, and n = 20.

Answer β(0.2) = P(Z < 1.645 - 1.118) = 0.7019 -> Power = 0.298
Answer β(0.4) = P(Z < 1.645 - 2.236) = 0.2776 -> Power = 0.7224
Answer β(0.75) = P(Z < 1.645 - 4.193) = 0.0054 -> Power = 0.9946

From this we conclude that at this level of replication and with this statistical test, we only have a 29.8% chance of rejecting the null hypothesis that average nicotine is 14 mg when it is actually 14.2 mg on average. We have a better chance of showing an increase of 0.4 mg, and the statistical test is almost certain to declare a difference of 0.75 mg as significant.

A SAS program that reads in these data and outputs basic statistics is given here (SAS PROGRAM and OUTPUT). All we really need from this output is the sample mean, so running the program is overkill here. The test statistic is computed as z = (mean - μ0)/(σ/√n) = 3.242.
The rejection region is Z > z0.95 = 1.645. Since 3.242 is greater than 1.645, we reject the null hypothesis and conclude that the nicotine content is significantly greater than the target value.

I have given you all the information you need to use the equation on page 221 to figure out the needed sample size. Simply plug in the values to get the required n.

The sign test is given on page 246. For this test we only need the number of observations greater than the median stated in the null hypothesis. We will take the target median M0 to be 14 mg. In this case, B = 18 observations are greater than the specified median, and the rejection region for the right-tailed test is: Reject H0 if B ≥ n - C0.05,20 from Table 4. Since B = 18 is greater than n - C = 20 - 5 = 15 from the table, we reject the null hypothesis that the population median is 14 mg and conclude that nicotine levels are greater than the target.

Note that in the SAS output, we have three tests for location reported. The results reported for the Nicotine variable test whether this value is truly zero. All tests conclude that Nicotine is significantly different from zero.

    Tests for Location: Mu0=0

    Test           -Statistic-      -----p Value------
    Student's t    t   77.02162     Pr > |t|     <.0001
    Sign           M   12.5         Pr >= |M|    <.0001
    Signed Rank    S   162.5        Pr >= |S|    <.0001

But this does not test whether the mean or median is equal to 14. The same tests applied to a new variable, Nicotine less 14, give us the p-values for the two-sided tests. If we divide each p-value by two, we get the comparable p-value for the one-sided test. Note that in this case, all tests have p-values less than the nominal 0.05 set for the test.
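The critical values read from Table 4 can be reproduced from the binomial distribution. The sketch below (Python standard library only) assumes that Table 4 lists the largest C with P(B ≤ C) ≤ α when B ~ Binomial(n, 1/2). That definition is an assumption about the book's table, although it does give C = 3 for n = 15 and C = 5 for n = 20, the values quoted in these answers. The function names are made up for this illustration.

```python
from math import comb


def lower_tail(n, c):
    """P(B <= c) when B ~ Binomial(n, 1/2); zero when c is negative."""
    return sum(comb(n, k) for k in range(c + 1)) / 2 ** n


def sign_test_critical(n, alpha=0.05):
    """Largest C with P(B <= C) <= alpha; returns -1 if no such C exists."""
    c = -1
    while lower_tail(n, c + 1) <= alpha:
        c += 1
    return c


# Nicotine example from the text: n = 20 observations, B = 18 above 14 mg
n, b = 20, 18
c = sign_test_critical(n)

# By symmetry of Binomial(n, 1/2), P(B >= b) = P(B <= n - b)
p_right = lower_tail(n, n - b)
print(f"C = {c}; reject H0 if B >= {n - c}; P(B >= {b}) = {p_right:.6f}")
```

For a right-tailed test the same table value is used through the symmetry of the distribution, which is where the n - C rejection bound comes from.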
For students in community development, education and social services fields.
We begin by putting down all that we know using the shorthand notation of the book.

H0: μ = $500
HA: μ > $500
Test statistic: average costs based on 40 replications.

We will convert the average costs to a z-score using the prior knowledge that the true population standard deviation is σ = $150. That is, z = (mean - μ)/(σ/√n). Since this is a one-tailed (right-tailed alternative) test, the critical value that defines the rejection region is z0.95 = 1.645.

Power is 1 minus the probability of a Type II error. The probability of a Type II error for a specified alternative is given by the equation on page 216. This equation is a little confusing since it is not clear what you plug in for zα, and the description on pages 214 and 215 is a little vague as well. It turns out that we always use the right-tail value (the positive value, in this case 1.645), because we take the absolute value of the difference of the means in the right-hand part of the equation. With this understanding we plug in zα = 1.645, Δ = |μ0 - μa| = {$50, $75, $100}, σ = $150, and n = 45 (the values below correspond to the sample of 45 students named in the conclusion).

Answer β(50) = P(Z < 1.645 - 2.236) = 0.2776 -> Power = 0.7224
Answer β(75) = P(Z < 1.645 - 3.354) = 0.0437 -> Power = 0.9563
Answer β(100) = P(Z < 1.645 - 4.472) = 0.0023 -> Power = 0.9977

From this we conclude that at this level of replication and with this statistical test, we have a 72% chance of rejecting the null hypothesis if the average cost is $50 more than currently believed. We are almost certain to find significant differences of $75 or more above the expected value with a sample of 45 students.

A SAS program that reads in these data and outputs basic statistics is given here (SAS PROGRAM and OUTPUT). All we really need from this output is the sample mean, so running the program is overkill here. The test statistic is computed as z = (mean - μ0)/(σ/√n) = 0.888. The rejection region is Z > z0.95 = 1.645. Since 0.888 is less than 1.645, we do not reject the null hypothesis and conclude that average costs are about what we expected them to be.
I have given you all the information you need to use the equation on page 221 to figure out the needed sample size. Simply plug in the values to get the required n.

The sign test is given on page 246. For this test we only need the number of observations greater than the median stated in the null hypothesis. We will take the target median M0 to be $500. In this case, B = 24 observations are greater than the specified median, and the rejection region for the right-tailed test is: Reject H0 if B ≥ n - C0.05,40 from Table 4. Since B = 24 is less than n - C = 40 - 14 = 26 from the table, we do not reject the null hypothesis that the population median is $500 and conclude that costs are not greater than the target.

Note that in the SAS output, we have three tests for location reported. The results reported for the variable Expenditure consider whether the center of the distribution for this value is truly zero. All tests conclude that Expenditure is significantly different from zero.

    Tests for Location: Mu0=0

    Test           -Statistic-      -----p Value------
    Student's t    t   28.74492     Pr > |t|     <.0001
    Sign           M   20           Pr >= |M|    <.0001
    Signed Rank    S   410          Pr >= |S|    <.0001

But this does not test whether the mean or median is equal to $500. The same tests applied to a new variable, Expenditure less $500, give us the desired tests. Note that now all tests lead us to conclude that the null hypothesis should not be rejected. But also notice that each test is a two-sided test and hence is not directly applicable here. The p-value for the appropriate one-sided test would be half the p-value reported here. This still suggests that we do not reject the null hypothesis that the average cost is $500, since these p-values will still be greater than the nominal 0.05 for the tests.
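The halving rule for two-sided p-values only applies when the observed statistic falls on the side named in the alternative hypothesis. The sketch below (plain Python) spells this out for a symmetric test statistic. The function `one_sided_p` is made up for this illustration. The example numbers are the centered fuel-consumption t-test from the agriculture section (t = -0.81139, two-sided p = 0.4307), since the centered Expenditure output is not reproduced on this page.

```python
def one_sided_p(two_sided_p, statistic, alternative):
    """Convert a two-sided p-value from a symmetric test to a one-sided one.

    alternative is "greater" (HA: center > null value) or
    "less" (HA: center < null value).
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    # Does the sign of the statistic point in the direction of HA?
    agrees = statistic > 0 if alternative == "greater" else statistic < 0
    half = two_sided_p / 2
    return half if agrees else 1 - half


# Left-tailed fuel example: statistic is negative, so the p-value is halved
print(one_sided_p(0.4307, -0.81139, "less"))

# Same output read against a right-tailed alternative: no evidence for HA
print(one_sided_p(0.4307, -0.81139, "greater"))
```

When the statistic points away from the alternative, the one-sided p-value is larger than 0.5, so halving the reported value in that case would overstate the evidence.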
Final Comments
For this assignment I used the SAS system to perform the analysis. Notice how easy it is to pass on to you the SAS program (code) that I used to perform the analysis. You can easily copy my code and re-run it for yourself. In addition, the SAS output here was simple text, and hence I could send you a copy of the text quite easily. Note also that SAS has a new output delivery system (ODS) tool that lets me output results to HTML web pages. For the last exercise, I could have used the following code to produce HTML output.

    ods html file="c:/site/sta6166/images/univ.htm";
    proc univariate;
      var Expenditure cExp;
      title " Insurance Problem 2 Unit 2 Section 1 ";
    run;
    ods html close;

The resulting univ.htm file created by this program can be viewed here.