# STAT 5302 Assignment 2

An Ecological Risk Assessment Example

Mercury (Hg) and its compounds have no known metabolic function in biota but their presence in cells of living organisms represents contamination from natural or anthropogenic sources. Mercury is used in a number of industrial processes such as paper manufacturing and as an agricultural fungicide. As part of an ecological risk assessment of an old industrial site suspected of having soils contaminated with mercury, the study team has decided to compare the average levels of mercury in the fur of raccoons (Procyon lotor) living on the study site to those living in a far removed area (background). The assumption is that if the average level in the study site raccoon fur is elevated above that of the background site, we will conclude that there is increased risk of ecological damage from the study site.

Obtaining and processing fur samples from raccoons is neither easy nor cheap. Costs for each sample can run into the thousands of dollars. To minimize the total cost of the project, you decide to first estimate the number of raccoons you will need from each site. Published data indicate that the average level of mercury in raccoon fur is about 1.12 on a natural log scale (e.g. about 3.8 mg/kg concentration) with a standard deviation of about 0.4 on a natural log scale (or 1.38 mg/kg concentration). Because the natural log of concentration data is typically closer to normally distributed than the raw untransformed data, we will do all computations and measurements using the natural log transformed values (that is, use 1.12 and 0.4 as the suspected population mean and standard deviation, respectively). We would conclude important ecological effects if we found that the average log concentrations at the two sites differed by Δ = 0.56 on the log scale (this is the value of Δ in the sample size calculation). We want to be fairly sure that if the true difference is at least this large we will conclude that a difference exists (i.e. accept HA when HA is true). We also do not want to initiate cleanup and legal proceedings unless we are sure the difference is real (i.e. we want a low chance of rejecting H0 when H0 is true). For these two reasons, we decide to set the Type I error probability at α = 0.01 and the Type II error probability at β = 0.05 when computing an estimate of the sample size that meets these criteria.
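
As a rough illustration of how the design values above enter a sample size calculation, the following sketch uses the common normal-approximation formula n = 2σ²(z_α + z_β)²/Δ² for comparing two means with equal group sizes. This is an assumption about the method, not necessarily the formula or table used in the course; a t-based calculation typically gives a slightly larger n. The function name `normal_approx_n` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def normal_approx_n(delta, sigma, alpha, beta, one_sided=True):
    """Per-group sample size for a two-sample comparison of means.

    Uses the normal approximation n = 2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2.
    Returns the unrounded value and the value rounded up to a whole animal.
    """
    z = NormalDist()  # standard normal distribution
    tail = alpha if one_sided else alpha / 2
    z_alpha = z.inv_cdf(1 - tail)  # critical value for the Type I error rate
    z_beta = z.inv_cdf(1 - beta)   # quantile matching the desired power 1 - beta
    n_exact = 2 * sigma**2 * (z_alpha + z_beta) ** 2 / delta**2
    return n_exact, ceil(n_exact)


# Design values taken from the text: Delta = 0.56, sigma = 0.4,
# alpha = 0.01, beta = 0.05. One-sided because the primary question is
# whether the study site is HIGHER than background (an assumption about
# which alternative the sample size should target).
n_exact, n = normal_approx_n(delta=0.56, sigma=0.4, alpha=0.01, beta=0.05)
print(f"unrounded n = {n_exact:.2f}, use n = {n} per site")
```

Passing `one_sided=False` shows how much the requirement grows if the design is instead based on a two-sided alternative.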

The primary research task of interest here is to test whether the study area has a higher average log Hg concentration than the background area. As a side analysis, you will also want to determine whether the two sites differ in the variability of these concentration levels. Finally, analysis of mercury in fur samples is difficult. You would like to be certain that the results coming back from your primary lab are reliable (log) concentration values. To test this, you will send 3 samples from each site to a second lab for a more accurate and more expensive analysis, so your sample size must be at least 3. You will want to compare the results from the first lab to the results from the second lab.
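
Because each of the 6 samples sent to the second lab is also measured by the first lab, the lab comparison works on within-sample differences. The sketch below computes a paired t statistic from such differences; the function name `paired_t_statistic` and the data values are hypothetical placeholders, not the assignment data, and the critical value or p-value would still come from a t table or statistical software.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired measurements.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the pairwise differences.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)   # average difference between the two labs
    s_d = stdev(diffs)    # sample standard deviation of the differences
    return d_bar / (s_d / sqrt(n)), n - 1


# Hypothetical log concentrations for the same 6 fur samples (made up
# purely for illustration; replace with the real Lab 1 / Lab 2 pairs).
lab1 = [1.10, 1.25, 0.98, 1.60, 1.72, 1.55]
lab2 = [1.05, 1.30, 1.00, 1.58, 1.80, 1.50]

t_stat, df = paired_t_statistic(lab1, lab2)
print(f"paired t = {t_stat:.3f} on {df} degrees of freedom")
```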

To answer the above questions perform all 4 tasks below. List the assumptions needed for each inference, comment on their validity, and decide if an alternative analysis is more appropriate in each case.

1. Determine the sample size, n, needed to meet the design criteria for the hypothesis test of task 3 below. [Hint: The sample size must be greater than 3 and less than 20.]
2. Taking the first n data values from Lab 1, test the alternative hypothesis that the two sites have different variances. [Use a Type I error rate of 0.01.]
3. Based on the results of the variances test, perform a two sample t-test to determine if the average (log) Hg concentration in the study area is higher than that in the background area. [Again take only the first n data values from Lab 1; use a Type I error rate of 0.01.]
4. Using the 6 pairs of observations from Labs 1 and 2, and disregarding the location information, perform an appropriate t-test to determine whether the two Labs provided similar results. [Use a Type I error rate of 0.05.]
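
The test statistics behind tasks 2 and 3 can be sketched as follows, assuming the usual F ratio of sample variances and either the pooled (equal-variance) or Welch (unequal-variance) form of the two-sample t statistic. The function names and the data are hypothetical placeholders; the code stops at the statistics and degrees of freedom, leaving critical values and p-values to tables or statistical software.

```python
from math import sqrt
from statistics import mean, variance


def variance_ratio(x, y):
    """F statistic with the larger sample variance in the numerator.

    Returns (F, numerator df, denominator df).
    """
    vx, vy = variance(x), variance(y)
    if vx >= vy:
        return vx / vy, len(x) - 1, len(y) - 1
    return vy / vx, len(y) - 1, len(x) - 1


def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances. Returns (t, df)."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
    return t, nx + ny - 2


def welch_t(x, y):
    """Two-sample t statistic without the equal-variance assumption.

    Degrees of freedom use the Welch-Satterthwaite approximation.
    """
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx**2 / (nx - 1) + vy**2 / (ny - 1))
    return t, df


# Hypothetical log concentrations (made up for illustration only).
study = [1.62, 1.48, 1.90, 1.75, 1.33]
background = [1.10, 1.25, 0.98, 1.40, 1.05]

f_stat, df_num, df_den = variance_ratio(study, background)
t_stat, df = pooled_t(study, background) if f_stat < 3 else welch_t(study, background)
print(f"F = {f_stat:.2f} on ({df_num}, {df_den}) df; t = {t_stat:.2f} on {df:.1f} df")
```

The `f_stat < 3` switch is only a stand-in for the real decision rule in task 3, which should compare F against the appropriate critical value at the 0.01 level.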

Instructions:

• Here is the data (log concentration). Use only the first n observations from each location, where n is the sample size you calculated in task 1.
• Typeset your results as a report, using this template. Give a brief INTRODUCTION. In the STATISTICAL METHODS section, describe all statistical computations (including sample size calculations) and analyses performed. Describe your findings in the RESULTS AND CONCLUSIONS section. Place all numeric findings into appropriate tables and all graphical material produced by the analysis into figures, and refer to both in the RESULTS AND CONCLUSIONS section. Also include in this section your conclusions regarding the analyses (i.e. interpret the statistical tests).
• Use complete sentences and well-structured paragraphs.
• Do not include findings in the METHODS section. Past tense is OK here (e.g. "An F-test for equality of variances was run to ...").
• Include all findings, e.g. tables, figures and statistical test conclusions, only in the RESULTS AND CONCLUSIONS write-up. Try to write this section in the present tense ("Results are ..." rather than "Results were ...").
• You don't have to restrict your approach to just tools learned in Unit 2. There are some things from Unit 1 that can be used here (e.g. paired boxplots).
• Try to minimize the inclusion of raw computer output in your report.
• This analysis will take some time. Don't put it off too late. Use the computer whenever possible to reduce computation time. You should be spending more time interpreting results than doing computations.

NOTE: This example is totally synthetic. Any similarity to real cases or situations is completely incidental. No claim is made as to the risk from mercury. No raccoons were harmed as part of this make-believe study.