STA 6166 Assignment 6
Predicting the Percentage of Body Fat From Simple Body Measurements
A variety of popular health books suggest that readers assess their health, at least in part, by estimating their percentage of body fat. Bailey (1991, The New Fit or Fat, Boston: Houghton-Mifflin, p.18) suggests that "15 percent fat for men and 22 percent fat for women are maximums for good health". In a recent study, age, weight, height, and 10 body circumference measurements were recorded for 252 men. Each man's percentage of body fat was accurately estimated by an underwater weighing technique. A description of the variables appearing in the dataset along with abbreviated names is as follows.
- fat: Percent body fat using Brozek's equation (density determined from underwater weighing)
- age: Age (years)
- weight: Weight (lbs)
- height: Height (inches)
- neck: Neck circumference (cm)
- chest: Chest circumference (cm)
- abd: Abdomen 2 circumference (cm)
- hip: Hip circumference (cm)
- thigh: Thigh circumference (cm)
- knee: Knee circumference (cm)
- ankle: Ankle circumference (cm)
- biceps: Biceps (extended) circumference (cm)
- forearm: Forearm circumference (cm)
- wrist: Wrist circumference (cm)
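As a minimal sketch, a whitespace-delimited file with the fourteen columns above, in which a lone period marks a missing value, could be read as follows. The file layout, the column order, the function name `read_bodyfat`, and the two sample rows are assumptions made for illustration; they are not taken from the assignment data.

```python
import io

# Column order follows the variable list above; the actual file layout is an assumption.
COLUMNS = ["fat", "age", "weight", "height", "neck", "chest", "abd",
           "hip", "thigh", "knee", "ankle", "biceps", "forearm", "wrist"]


def read_bodyfat(handle):
    """Parse whitespace-delimited rows into dicts; a lone '.' becomes None."""
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        values = [None if f == "." else float(f) for f in fields]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


# Two made-up rows for illustration only (NOT values from the real dataset).
sample = io.StringIO(
    "12.0 30 170.0 70.0 37.0 95.0 85.0 95.0 58.0 37.0 22.0 31.0 28.0 17.0\n"
    ".    45 200.0 71.0 39.0 105.0 98.0 102.0 61.0 39.0 23.0 33.0 29.0 18.5\n"
)
rows = read_bodyfat(sample)
complete = [r for r in rows if r["fat"] is not None]  # rows usable for fitting
to_predict = [r for r in rows if r["fat"] is None]    # rows whose response is missing
```

Keeping the row with the missing response separate from the rows used for fitting makes it easy to return to it later for prediction.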
Note that the body fat measurement for the last man was not recorded (the period denotes a missing observation). Using these data, build a multiple regression model to predict percent body fat from the remaining variables, and summarize your findings. In your quest for a suitable model, you should:
- Plot the data in meaningful ways.
- Examine the need for transforming variables.
- Carry out model selection using the techniques discussed in class (Stepwise, Cp, Adj. R^2, etc.), and decide on one (or a few) candidate model(s).
- Check if there are any violations of the regression assumptions,
including potential problems like influential observations and
- Based on all the preceding results, decide on a single model for predicting body fat, and report your findings.
- Using your chosen model, predict the percent body fat for the last man in the dataset giving an appropriate interval for your prediction, and comment on it. Is this man unusual in any way?
- You can analyze a dataset of your own instead of the above, but you must discuss this with me and receive my approval beforehand. Your alternative dataset must have a quantitative response and at least 5 potential predictor variables. Assess the plausibility of a regression analysis by examining some scatterplots before approaching me.
- Typeset your results as a report, using the same format, template, and general instructions as for previous Assignments, except this time your report should approach the quality of a publishable paper. Therefore, no raw computer output should be included in the main write-up (although you may add an appendix). The report should be organized as follows:
- Abstract (a summary of the paper).
- Statistical Methods.
- Results and Conclusions.
- References (optional).
- Appendix (optional).
This article gives further guidance on the art of communicating statistical results. And here is an example of a paper in the Journal of Agronomy to give an idea of what a good finished product might look like. In addition to the usual 20 points for accuracy and completeness of the statistical analysis, this assignment will have 5 additional points for quality of presentation.
- The length of your report must NOT exceed 7 sheets of paper (14 pages)! This means you will have to make some hard decisions about which plots to include; typically only extremely compelling ones belong in the main write-up. (You may, however, add an appendix, and this can be as long as you wish.)
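For reference, two of the model selection criteria named in the list above, adjusted R^2 and Mallows' Cp, can be computed for every subset of predictors roughly as sketched below. This is an illustration under stated assumptions, not the procedure prescribed in class: the helper names `sse_of_fit` and `all_subsets`, the predictor names `x1` to `x4`, and the simulated data are all hypothetical. With p parameters (including the intercept), n observations, and MSE taken from the full model, Cp = SSE_p / MSE_full - (n - 2p).

```python
from itertools import combinations

import numpy as np


def sse_of_fit(X, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def all_subsets(X, y, names):
    """Return (predictor names, adjusted R^2, Mallows' Cp) for every subset."""
    n, k = X.shape
    sst = float(((y - y.mean()) ** 2).sum())
    mse_full = sse_of_fit(X, y) / (n - k - 1)  # error variance estimate, full model
    results = []
    for size in range(1, k + 1):
        for idx in combinations(range(k), size):
            sse = sse_of_fit(X[:, list(idx)], y)
            p = size + 1  # number of parameters, intercept included
            adj_r2 = 1 - (sse / (n - p)) / (sst / (n - 1))
            cp = sse / mse_full - (n - 2 * p)
            results.append(([names[i] for i in idx], adj_r2, cp))
    return results


# Simulated data for illustration only: y truly depends on x1 and x3.
rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 4))
y = 2 + 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.5, size=n)
names = ["x1", "x2", "x3", "x4"]  # hypothetical predictor names

results = all_subsets(X, y, names)
best_by_cp = min(results, key=lambda r: r[2])
best_by_adj_r2 = max(results, key=lambda r: r[1])
```

A subset whose Cp is close to its own p shows little evidence of bias from omitted predictors; by construction the full model always has Cp equal to p. With the 13 predictors in the body fat data there are 8191 non-empty subsets, so an exhaustive search of this kind remains feasible.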