STA 6166 UNIT 2 Section 1 Answers

You can choose to work some or all of the problems listed below. We recommend that you at least work the problems listed in your major area of interest.
General Questions. 

For students in agriculture and environmental fields. 
We begin by putting down all that we know using the shorthand notation of the book.

$H_0: \mu = 10$ l/h
$H_A: \mu < 10$ l/h

Test statistic: the average fuel consumption based on 15 replications. We convert the average consumption to a z-score using the prior knowledge that the true population standard deviation is $\sigma = 1.5$ l/h. That is, $z = (\bar{y} - \mu_0)/(\sigma/\sqrt{n})$. Since this is a one-tailed (left-tailed alternative) test, the critical value that defines the rejection region is $z_{0.05} = -1.645$.

Power is 1 minus the probability of a Type II error. The probability of a Type II error for a specified alternative is given by the equation on page 216. This equation is a little confusing since it is not clear what you plug in for $z_\alpha$, and the description on pages 214 and 215 is a little vague as well. It turns out that we always use the right-tail value (the positive value, in this case 1.645). This has to do with our taking the absolute value of the difference of the means in the right-hand part of the equation. With this understanding we plug in $z_\alpha = 1.645$, $\Delta = |\mu_0 - \mu_a| \in \{0.5, 1.0, 1.5\}$, $\sigma = 1.5$, and $n = 15$ into $\beta(\Delta) = P(Z < z_\alpha - \Delta\sqrt{n}/\sigma)$.

Answer: $\beta(0.5) = P(Z < 1.645 - 1.291) = 0.6383$, so Power = 0.3617
Answer: $\beta(1.0) = P(Z < 1.645 - 2.58) = 0.1749$, so Power = 0.8251
Answer: $\beta(1.5) = P(Z < 1.645 - 3.873) = 0.0129$, so Power = 0.9871

From this we conclude that at this level of replication and with this statistical test, we have only about a 36% chance of rejecting the null hypothesis that average fuel consumption is 10 l/h when it is actually 9.5 l/h on average. The chance rises to about 83% for a true reduction of 1 l/h, and we are almost certain to show as significant a reduction in fuel consumption of 1.5 l/h or more.

Replication   Consumption (l/h)
 1             8.99
 2             9.16
 3            10.20
 4             9.82
 5            11.54
 6             9.48
 7            10.65
 8             9.71
 9            11.43
10             9.97
11             9.11
12             8.24
13             8.44
14             8.50
15            11.34

A SAS program that reads in these data and outputs basic statistics is given here (SAS PROGRAM and OUTPUT). All we really need from this output is the sample mean, so running the program is overkill here.
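The Type II error calculation for the fuel-consumption problem can be checked numerically. The sketch below is written in Python rather than SAS and assumes the formula has the form $\beta = P(Z < z_\alpha - |\Delta|\sqrt{n}/\sigma)$, which is what the arithmetic in this answer implies; the function name `type2_error` is a hypothetical helper, not something from the book or the SAS program.

```python
from math import sqrt
from statistics import NormalDist


def type2_error(delta, sigma, n, z_alpha=1.645):
    """Type II error of a one-tailed z-test (assumed form of the page-216 formula).

    delta   : difference between the null mean and the alternative mean
    sigma   : known population standard deviation
    n       : number of replications
    z_alpha : right-tail critical value (always the positive value)
    """
    shift = abs(delta) * sqrt(n) / sigma
    return NormalDist().cdf(z_alpha - shift)


# Fuel-consumption problem: sigma = 1.5 l/h, n = 15
for delta in (0.5, 1.0, 1.5):
    beta = type2_error(delta, sigma=1.5, n=15)
    print(f"delta = {delta}: beta = {beta:.4f}, power = {1 - beta:.4f}")
```

Values computed this way can differ in the third or fourth decimal from values read off a printed normal table, because the table forces the z argument to be rounded to two decimals.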
The test statistic is computed as:

$z = (\bar{y} - \mu_0)/(\sigma/\sqrt{n}) = (9.772 - 10)/(1.5/\sqrt{15}) = -0.5887$

The rejection region is $Z < z_{0.05} = -1.645$. Since -0.5887 is not less than -1.645, we do not reject the null hypothesis and conclude that consumption with the new fuel is not really any different from that with the standard fuel. Just looking at the numbers, would you conclude that consumption with the new fuel is any worse? No.

I have given you all the information you need to use the equation on page 221 to figure out the needed sample size. Simply plug the values into that equation.

The sign test is given on page 246. For this test we only need the number of observations greater than the median stated in the null hypothesis. We will take the target median $M_0$ to be 10 l/h. In this case, B = 5 observations are greater than the specified median, and the rejection region for the left-tailed test is: reject $H_0$ if $B \le C_{0.05,15}$ from Table 4. Since B = 5 is greater than C = 3 from the table, we do not reject the null hypothesis that the population median is 10 l/h and conclude that the new fuel is not significantly different from the standard fuel.

Note that the SAS output provides three tests for central tendency, but here they test whether the mean or median is equal to zero, hence all are significant.
Tests for Location: Mu0=0

Test           Statistic        p Value
Student's t    t   34.77598     Pr > |t|     <.0001
Sign           M   7.5          Pr >= |M|    <.0001
Signed Rank    S   60           Pr >= |S|    <.0001

To have SAS automatically test for a specific central tendency value (mean or median), create a new variable which is the old variable minus the null hypothesis mean and rerun the Univariate procedure. The test results below suggest that neither the t-test nor the sign test would reject the null hypothesis.

Tests for Location: Mu0=0

Test           Statistic        p Value
Student's t    t   -0.81139     Pr > |t|     0.4307
Sign           M   -2.5         Pr >= |M|    0.3018
Signed Rank    S   -17          Pr >= |S|    0.3591
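The recentering trick described above (subtract the null value, then test against zero) can be sketched outside SAS as well. The Python below is only an illustration: it assumes the sign statistic is M = (n+ - n-)/2 and that the two-sided sign-test p-value is twice the smaller binomial tail, which is consistent with the numbers in the output but is not taken from the SAS documentation. All variable names are hypothetical.

```python
from math import comb, sqrt
from statistics import mean, stdev

# Fuel consumption data (l/h) from the table in this answer
consumption = [8.99, 9.16, 10.20, 9.82, 11.54, 9.48, 10.65, 9.71,
               11.43, 9.97, 9.11, 8.24, 8.44, 8.50, 11.34]
mu0 = 10.0

# The "new variable": old variable minus the null hypothesis value
diffs = [y - mu0 for y in consumption]
n = len(diffs)

# Student's t statistic for H0: mean of diffs = 0
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

# Sign statistic M = (n_plus - n_minus) / 2, ignoring exact zeros
n_pos = sum(d > 0 for d in diffs)
n_neg = sum(d < 0 for d in diffs)
m_stat = (n_pos - n_neg) / 2

# Two-sided sign-test p-value: twice the smaller Binomial(n, 1/2) tail
n_nonzero = n_pos + n_neg
tail = sum(comb(n_nonzero, k) for k in range(min(n_pos, n_neg) + 1)) / 2 ** n_nonzero
sign_p = min(1.0, 2 * tail)

print(f"t = {t_stat:.5f}, M = {m_stat}, two-sided sign p = {sign_p:.4f}")
```

Here B = `n_pos` is the count used in the hand-worked sign test, so the same script also gives the quantity that is compared with the Table 4 cutoff.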

For students in engineering fields. 
We begin by putting down all that we know using the shorthand notation of the book.

$H_0: \mu = 50$ t/d
$H_A: \mu > 50$ t/d

Test statistic: the average production based on 15 replications. We convert the average production to a z-score using the prior knowledge that the true population standard deviation is $\sigma = 25$ t/d. That is, $z = (\bar{y} - \mu_0)/(\sigma/\sqrt{n})$. Since this is a one-tailed (right-tailed alternative) test, the critical value that defines the rejection region is $z_{0.95} = 1.645$.

Power is 1 minus the probability of a Type II error. The probability of a Type II error for a specified alternative is given by the equation on page 216. This equation is a little confusing since it is not clear what you plug in for $z_\alpha$, and the description on pages 214 and 215 is a little vague as well. It turns out that we always use the right-tail value (the positive value, in this case 1.645). This has to do with our taking the absolute value of the difference of the means in the right-hand part of the equation. With this understanding we plug in $z_\alpha = 1.645$, $\Delta = |\mu_0 - \mu_a| \in \{10, 20, 40\}$ t/d, $\sigma = 25$, and $n = 15$ into $\beta(\Delta) = P(Z < z_\alpha - \Delta\sqrt{n}/\sigma)$.

Answer: $\beta(10) = P(Z < 1.645 - 1.549) = 0.5359$, so Power = 0.4641
Answer: $\beta(20) = P(Z < 1.645 - 3.098) = 0.073$, so Power = 0.927
Answer: $\beta(40) = P(Z < 1.645 - 6.197) = 0.000003$, so Power > 0.9999

From this we conclude that at this level of replication and with this statistical test, we only have a 46.4% chance of rejecting the null hypothesis that average production is 50 t/d when it is actually 60 t/d on average. We are almost certain to show as significant an increase in production to 70 t/d or more. So this test is not very powerful for showing an increase of only 10 t/d, but it would be adequate to show an increase of 20 t/d or more.

Day   Yield (t/d)
 1     57.8
 2     58.3
 3     50.3
 4     38.5
 5     47.9
 6    157.0
 7     38.6
 8    140.2
 9     39.3
10    138.7
11     49.2
12    139.7
13     48.3
14     59.2
15     49.7

A SAS program that reads in these data and outputs basic statistics is given here (SAS PROGRAM and OUTPUT).
All we really need from this output is the sample mean, so running the program is overkill here. The test statistic is computed as:

$z = (\bar{y} - \mu_0)/(\sigma/\sqrt{n}) = (74.18 - 50)/(25/\sqrt{15}) = 3.746$

The rejection region is $Z > z_{0.95} = 1.645$. Since 3.746 is greater than 1.645, we reject the null hypothesis and conclude that production significantly increases with the new equipment.

I have given you all the information you need to use the equation on page 221 to figure out the needed sample size. Simply plugging in the values we get:

$n = \sigma^2 (z_\alpha + z_\beta)^2 / \Delta^2 = 40^2 (1.645 + 0.84)^2 / 20^2 = 24.7$, so take $n = 25$.

The sign test is given on page 246. For this test we only need the number of observations greater than the median stated in the null hypothesis. We will take the target median $M_0$ to be 50 t/d. In this case, B = 8 observations are greater than the specified median, and the rejection region for the right-tailed test is: reject $H_0$ if $B \ge n - C_{0.05,15}$ from Table 4. Since B = 8 is less than $n - C = 15 - 3 = 12$ from the table, we do not reject the null hypothesis that the population median is 50 t/d and conclude that the new equipment does not significantly increase production.

Thus the formal normal score test rejects the null hypothesis and the nonparametric sign test does not reject the null hypothesis. Why do you think this result occurs? One clue could be the fact that in the normal test we assumed that the population standard deviation was 25, whereas the sample standard deviation was s = 44.

Notice that the SAS output for production provides three tests for central location. These three tests all test that the true mean is zero, hence all are quite significant.

Tests for Location: Mu0=0

Test           Statistic        p Value
Student's t    t   6.502692     Pr > |t|     <.0001
Sign           M   7.5          Pr >= |M|    <.0001
Signed Rank    S   60           Pr >= |S|    <.0001

To test that the true population is centered around 50, we create a new variable by subtracting 50 from production and rerun these tests for location.
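Tests for Location: Mu0=0

Test           Statistic        p Value
Student's t    t   2.119643     Pr > |t|     0.0524
Sign           M   0.5          Pr >= |M|    1.0000
Signed Rank    S   16.5         Pr >= |S|    0.3666

Now you find that, using the two-sided p-values reported here, both the Student's t-test and the sign test lead to similar conclusions: do not reject the null hypothesis at the 0.05 level. Keep in mind that the p-value for the one-sided t-test is half the value reported, 0.0524/2 = 0.0262.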
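The clue mentioned above, that the assumed population standard deviation (25) is much smaller than the sample standard deviation (about 44), can be made concrete by computing the standardized statistic both ways. This Python sketch is an illustration only; the variable names are hypothetical and the data are the yields listed in this answer.

```python
from math import sqrt
from statistics import mean, stdev

# Daily yields (t/d) from the table in this answer
yields = [57.8, 58.3, 50.3, 38.5, 47.9, 157.0, 38.6, 140.2,
          39.3, 138.7, 49.2, 139.7, 48.3, 59.2, 49.7]
mu0 = 50.0            # null hypothesis mean
sigma_assumed = 25.0  # "known" population standard deviation used in the z-test

n = len(yields)
ybar = mean(yields)
s = stdev(yields)     # sample standard deviation (n - 1 denominator)

# Same numerator, different standard errors
z_known_sigma = (ybar - mu0) / (sigma_assumed / sqrt(n))
t_sample_sd = (ybar - mu0) / (s / sqrt(n))

# Count used by the sign test: observations above the null median
b = sum(y > mu0 for y in yields)

print(f"mean = {ybar:.2f}, s = {s:.2f}")
print(f"z (sigma = 25) = {z_known_sigma:.3f}, t (sample s) = {t_sample_sd:.3f}, B = {b}")
```

Four days with yields near 140 to 157 t/d pull the mean well above 50 while leaving the count above the median close to half, which is why the mean-based z statistic is large and the sign test statistic is not.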
For students in toxicology and health science fields. 
We begin by putting down all that we know using the shorthand notation of the book.

$H_0: \mu = 14$ mg
$H_A: \mu > 14$ mg

Test statistic: the average nicotine content based on 20 replications. We convert the average nicotine content to a z-score using the prior knowledge that the true population standard deviation is $\sigma = 0.8$ mg. That is, $z = (\bar{y} - \mu_0)/(\sigma/\sqrt{n})$. Since this is a one-tailed (right-tailed alternative) test, the critical value that defines the rejection region is $z_{0.95} = 1.645$.

Power is 1 minus the probability of a Type II error. The probability of a Type II error for a specified alternative is given by the equation on page 216. This equation is a little confusing since it is not clear what you plug in for $z_\alpha$, and the description on pages 214 and 215 is a little vague as well. It turns out that we always use the right-tail value (the positive value, in this case 1.645). This has to do with our taking the absolute value of the difference of the means in the right-hand part of the equation. With this understanding we plug in $z_\alpha = 1.645$, $\Delta = |\mu_0 - \mu_a| \in \{0.2, 0.4, 0.75\}$ mg, $\sigma = 0.8$, and $n = 20$ into $\beta(\Delta) = P(Z < z_\alpha - \Delta\sqrt{n}/\sigma)$.

Answer: $\beta(0.2) = P(Z < 1.645 - 1.118) = 0.7019$, so Power = 0.298
Answer: $\beta(0.4) = P(Z < 1.645 - 2.236) = 0.2776$, so Power = 0.7224
Answer: $\beta(0.75) = P(Z < 1.645 - 4.193) = 0.0054$, so Power = 0.9946

From this we conclude that at this level of replication and with this statistical test, we only have a 29.8% chance of rejecting the null hypothesis that average nicotine is 14 mg when it is actually 14.2 mg on average. We have a better chance of showing an increase of 0.4 mg, and the statistical test is almost certain to declare a difference of 0.75 mg as significant.

A SAS program that reads in these data and outputs basic statistics is given here (SAS PROGRAM and OUTPUT). All we really need from this output is the sample mean, so running the program is overkill here. The test statistic is computed as $z = (\bar{y} - 14)/(0.8/\sqrt{20})$.
The rejection region is $Z > z_{0.95} = 1.645$. Since z = 3.242 is greater than 1.645, we reject the null hypothesis and conclude that the nicotine content is significantly greater than the target value.

I have given you all the information you need to use the equation on page 221 to figure out the needed sample size. Simply plug the values into that equation.

The sign test is given on page 246. For this test we only need the number of observations greater than the median stated in the null hypothesis. We will take the target median $M_0$ to be 14 mg. In this case, B = 18 observations are greater than the specified median, and the rejection region for the right-tailed test is: reject $H_0$ if $B \ge n - C_{0.05,20}$ from Table 4. Since B = 18 is greater than $n - C = 20 - 5 = 15$ from the table, we reject the null hypothesis that the population median is 14 mg and conclude that nicotine levels are greater than the target.

Note that in the SAS output, we have three tests for location reported. The results reported for the Nicotine variable test whether this value is truly zero. All tests conclude that Nicotine is significantly different from zero.

Tests for Location: Mu0=0

Test           Statistic        p Value
Student's t    t   77.02162     Pr > |t|     <.0001
Sign           M   12.5         Pr >= |M|    <.0001
Signed Rank    S   162.5        Pr >= |S|    <.0001

But this does not test whether the mean or median is equal to 14. The same tests applied to a new variable, Nicotine minus 14, give us the p-values for the two-sided tests. If we divide each p-value by two, we get the comparable p-value for the one-sided test. Note that in this case, all tests have p-values less than the nominal 0.05 set for the test.
Tests for Location: Mu0=0

Test           Statistic        p Value
Student's t    t   3.070047     Pr > |t|     0.0053
Sign           M   5.5          Pr >= |M|    0.0433
Signed Rank    S   99.5         Pr >= |S|    0.0048
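The step of halving the two-sided p-values can be written out explicitly. The Python sketch below assumes the usual rule for symmetric tests: halve the two-sided p-value when the statistic points in the direction of the alternative, and otherwise use one minus half of it (exact for the t-test, only approximate for the discrete sign and signed-rank tests). The helper name `one_sided_p` is hypothetical.

```python
def one_sided_p(two_sided_p, statistic, alternative="greater"):
    """Convert a two-sided p-value from a symmetric test to a one-sided p-value.

    Assumes the null distribution is symmetric about zero. For discrete tests
    (sign, signed rank) the 'wrong direction' branch is only an approximation.
    """
    if alternative == "greater":
        in_direction = statistic > 0
    else:
        in_direction = statistic < 0
    return two_sided_p / 2 if in_direction else 1 - two_sided_p / 2


# Statistics and two-sided p-values from the SAS output for Nicotine minus 14
sas_results = {
    "Student's t": (3.070047, 0.0053),
    "Sign": (5.5, 0.0433),
    "Signed Rank": (99.5, 0.0048),
}

for test_name, (stat, p_two) in sas_results.items():
    p_one = one_sided_p(p_two, stat, alternative="greater")
    print(f"{test_name}: two-sided p = {p_two}, one-sided p = {p_one:.5f}")
```

All three statistics are positive, which is the direction of the alternative $\mu > 14$, so each one-sided p-value is simply half the reported value.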
For students in community development, education and social services fields. 
We begin by putting down all that we know using the shorthand notation of the book.

$H_0: \mu = \$500$
$H_A: \mu > \$500$

Test statistic: the average cost based on 40 replications. We convert the average cost to a z-score using the prior knowledge that the true population standard deviation is $\sigma = \$150$. That is, $z = (\bar{y} - \mu_0)/(\sigma/\sqrt{n})$. Since this is a one-tailed (right-tailed alternative) test, the critical value that defines the rejection region is $z_{0.95} = 1.645$.

Power is 1 minus the probability of a Type II error. The probability of a Type II error for a specified alternative is given by the equation on page 216. This equation is a little confusing since it is not clear what you plug in for $z_\alpha$, and the description on pages 214 and 215 is a little vague as well. It turns out that we always use the right-tail value (the positive value, in this case 1.645). This has to do with our taking the absolute value of the difference of the means in the right-hand part of the equation. With this understanding we plug in $z_\alpha = 1.645$, $\Delta = |\mu_0 - \mu_a| \in \{50, 75, 100\}$ dollars, $\sigma = 150$, and $n = 45$ into $\beta(\Delta) = P(Z < z_\alpha - \Delta\sqrt{n}/\sigma)$.

Answer: $\beta(50) = P(Z < 1.645 - 2.236) = 0.2776$, so Power = 0.7224
Answer: $\beta(75) = P(Z < 1.645 - 3.354) = 0.0437$, so Power = 0.9563
Answer: $\beta(100) = P(Z < 1.645 - 4.472) = 0.0023$, so Power = 0.9977

From this we conclude that at this level of replication and with this statistical test, we have a 72% chance of rejecting the null hypothesis if the average cost is $50 more than currently believed. We are almost certain to find significant differences of $75 or more above the expected cost with a sample of 45 students.

A SAS program that reads in these data and outputs basic statistics is given here (SAS PROGRAM and OUTPUT). All we really need from this output is the sample mean, so running the program is overkill here. The test statistic is computed as $z = (\bar{y} - 500)/(150/\sqrt{40})$. The rejection region is $Z > z_{0.95} = 1.645$.
Since z = 0.888 is less than 1.645, we do not reject the null hypothesis and conclude that average costs are about what we expected them to be.

I have given you all the information you need to use the equation on page 221 to figure out the needed sample size. Simply plug the values into that equation.

The sign test is given on page 246. For this test we only need the number of observations greater than the median stated in the null hypothesis. We will take the target median $M_0$ to be $500. In this case, B = 24 observations are greater than the specified median, and the rejection region for the right-tailed test is: reject $H_0$ if $B \ge n - C_{0.05,40}$ from Table 4. Since B = 24 is less than $n - C = 40 - 14 = 26$ from the table, we do not reject the null hypothesis that the population median is $500 and conclude that average costs are not different from the target.

Note that in the SAS output, we have three tests for location reported. The results reported for the variable Expenditure consider whether the center of the distribution for this value is truly zero. All tests conclude that Expenditure is significantly different from zero.

Tests for Location: Mu0=0

Test           Statistic        p Value
Student's t    t   28.74492     Pr > |t|     <.0001
Sign           M   20           Pr >= |M|    <.0001
Signed Rank    S   410          Pr >= |S|    <.0001

But this does not test whether the mean or median is equal to $500. The same tests applied to a new variable, Expenditure minus $500, give us the desired tests. Note that now all tests lead us to conclude that the null hypothesis should not be rejected. But also notice that each test is a two-sided test and hence is not directly applicable here. The p-value for the appropriate one-sided test would be half the p-value reported here. This still suggests that we do not reject the null hypothesis that the average cost is $500, since these p-values will still be greater than the nominal 0.05 for the tests.
Tests for Location: Mu0=0

Test           Statistic        p Value
Student's t    t   1.162595     Pr > |t|     0.2521
Sign           M   4.5          Pr >= |M|    0.1996
Signed Rank    S   102.5        Pr >= |S|    0.1551
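Returning to the power calculation at the start of this answer: the shift values 2.236, 3.354, and 4.472 correspond to a sample of 45 students, the sample size named in the conclusion, whereas the data analysed contain 40 observations. The Python sketch below shows how much the power changes between the two sample sizes. It assumes the same one-tailed z-test power formula used throughout this page, and `ztest_power` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def ztest_power(delta, sigma, n, z_alpha=1.645):
    """Power of a one-tailed z-test: 1 - P(Z < z_alpha - |delta| * sqrt(n) / sigma)."""
    shift = abs(delta) * sqrt(n) / sigma
    return 1 - NormalDist().cdf(z_alpha - shift)


sigma = 150.0  # known population standard deviation, in dollars

for n in (40, 45):
    for delta in (50, 75, 100):
        shift = delta * sqrt(n) / sigma
        print(f"n = {n}, delta = ${delta}: shift = {shift:.3f}, "
              f"power = {ztest_power(delta, sigma, n):.4f}")
```

With 40 observations the power to detect a $50 increase is a few points lower than with 45, so the qualitative conclusion (moderate power at $50, high power at $75 and above) does not depend on which of the two sample sizes is used.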
Final Comments 
For this assignment I used the SAS system to perform the analysis. Notice how easy it is to pass on to you the SAS program (code) that I used to perform the analysis. You can easily copy my code and rerun it for yourself. In addition, the SAS output here was simple text, and hence I could send you a copy of the text quite easily. Note also that SAS has a new output delivery system (ODS) tool that lets me output results to html web pages. For the last exercise I could have used the following code to produce html output.

    /* route the procedure output to an html file */
    ods html file="c:/site/sta6166/images/univ.htm";
    proc univariate;
       var Expenditure cExp;
       title " Insurance Problem 2 Unit 2 Section 1 ";
    run;
    ods html close;

The resulting univ.htm file created by this program can be viewed here.