# STAT 5303 Assignment 6

## Plant Species Diversity in the Galapagos Islands

The Galapagos Islands off the coast of Ecuador provide an excellent
laboratory for studying factors that influence the development and
survival of different species. The data below give the number of
species of plants and related geographic variables for 30 different islands. Counts are
given both for the total number of species and for the number of species that
occur only on that specific island (the endemics). The variables, from
left to right, are as follows.

- Island: island name
- S: number of species
- E: number of endemics
- A1: area (km^2)
- El: highest elevation (m)
- D1: distance from nearest island (km)
- D2: distance from Santa Cruz (km)
- A2: area of adjacent island (km^2)

On close inspection of the data, you will notice that the numbers of
species and endemics on the island of Daphne Minor were not recorded (the
periods denote missing observations). Build a simple linear
regression model to **predict** the number of species (S) on Daphne Minor
based on the *single* best geographic variable among A1, El, D1, D2, and A2,
and state your reason for choosing that variable. (Ignore the
endemics in this question.) Specifically:

- Build a simple linear regression model for S. Write down the regression
equation of your fitted model.
- Assess the fit of your model.
- Comment on your reason(s) for the choice of model.
- Use your model to predict the (missing) number of species on Daphne Minor,
and give an appropriate (confidence or prediction)
**95% interval** for the prediction.
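
The steps above can be sketched in code. The following is a minimal illustration in plain Python, not a required method or the solution: the function names `fit_slr` and `interval_at` are hypothetical, the toy numbers are made up and are **not** the Galapagos data, and the critical value `t_crit` is assumed to be looked up from a t-table with n - 2 degrees of freedom (2.776 for the 4 degrees of freedom in the toy example). It shows the least-squares fit, R^2 as one measure of fit, and both kinds of 95% interval at a new predictor value.

```python
import math

def fit_slr(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns estimates and summaries."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return {
        "n": n, "xbar": xbar, "sxx": sxx, "b0": b0, "b1": b1,
        "s": math.sqrt(sse / (n - 2)),   # residual standard error
        "r2": 1.0 - sse / sst,           # coefficient of determination
    }

def interval_at(fit, x0, t_crit, new_observation):
    """Fitted value and interval at x0.

    new_observation=True  -> prediction interval for a single new response
    new_observation=False -> confidence interval for the mean response
    """
    yhat = fit["b0"] + fit["b1"] * x0
    extra = 1.0 if new_observation else 0.0
    se = fit["s"] * math.sqrt(
        extra + 1.0 / fit["n"] + (x0 - fit["xbar"]) ** 2 / fit["sxx"]
    )
    return yhat, yhat - t_crit * se, yhat + t_crit * se

# Toy data (made up for illustration only).
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
toy_fit = fit_slr(toy_x, toy_y)

T_CRIT = 2.776  # 97.5th percentile of t with n - 2 = 4 degrees of freedom
ci = interval_at(toy_fit, 3.5, T_CRIT, new_observation=False)
pi = interval_at(toy_fit, 3.5, T_CRIT, new_observation=True)

print("fitted equation: S_hat = %.3f + %.3f * x" % (toy_fit["b0"], toy_fit["b1"]))
print("R^2 = %.4f" % toy_fit["r2"])
print("confidence interval for the mean response:", ci)
print("prediction interval for a new observation:", pi)
```

The two intervals share the same center and differ only in the extra variance term for a single new response, so the prediction interval is always the wider of the two. Deciding which one answers the question about Daphne Minor is part of the assignment.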

**Instructions:**

- Here is the data as an Excel file.
- Typeset your results as a report, using the same
format, template, and general instructions as for Assignment 2.