Section 1 Linear Regression and Correlation

Readings	Ott and Longnecker, Chapter 11, pages 531-616.
Instructor Guidance	We begin this unit with a discussion of what it means to have a linear relationship between two variables. The scatter plot is typically used to visualize the relationship between two variables. If the points appear to fall along a line, a linear model will very likely explain the relationship between the two variables adequately. Some tools, such as smoothers and spline fits, help us visualize the degree of nonlinearity in a scatter plot. Transformations of the response or predictor can be useful in changing what initially looks like a nonlinear relationship into something more linear. The lecture notes attempt to provide more guidance on using transformations than is available in the book. The core methodology of regression model fitting is given in sections 2 and 3. Here we find out how to estimate the slope and intercept terms that define the best-fitting regression line. Then, using sampling distribution concepts, we develop tests and confidence limits for these parameter estimates. Statistics are developed to test the significance of the fitted regression line, and the computations are organized into an Analysis of Variance table. Once we have an estimated regression line, we want to use it to make predictions, that is, to determine what the response is expected to be when the predictor takes a specific value. Associated with this prediction are confidence bands and prediction bands. You will be expected to know the difference between the two. In fitting a linear regression we make some assumptions. Our next task in the regression analysis is to examine these assumptions. Sometimes this is called assessing goodness of fit (as in our book), but I prefer to refer to it as assumption assessment. The lecture notes have some discussion of what to do when the assumptions are not supported. We will not cover the section on inverse regression, which engineers and other scientists may know as calibration. Read it if you wish; it is not required. Finally, we discuss single-number indices for the strength of a relationship. The Pearson product-moment correlation coefficient is one such index, and a test for whether this value is significantly different from zero (no relationship) is given on page 596. Another measure, the coefficient of determination or r-square statistic, is also discussed. This statistic measures the fraction of total variability in the response that is explained by a regression on the predictor variable. Note: This is a long section and may require more than one week to digest. Plan to take the time for it, because it is very important that you understand these fundamentals of regression before you proceed to multiple regression.
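The quantities named in the guidance above (slope, intercept, the test of the slope, the ANOVA F statistic, the correlation coefficient and r-square, and the standard errors behind confidence and prediction bands) can be illustrated with a minimal sketch. This is not taken from the text or the lecture notes: the function names `fit_line` and `standard_errors_at` and the five data points are hypothetical, chosen only for illustration, and the usual simple linear regression assumptions are taken for granted.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Corrected sums of squares and cross-products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # ANOVA decomposition: total = regression + error.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = syy - sse
    mse = sse / (n - 2)  # error mean square on n - 2 degrees of freedom

    se_slope = math.sqrt(mse / sxx)
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx, "mse": mse,
        "slope": slope, "intercept": intercept,
        "se_slope": se_slope,
        "t_slope": slope / se_slope,      # t statistic for H0: slope = 0
        "f_stat": ssr / mse,              # ANOVA F statistic (1, n - 2 df)
        "r": sxy / math.sqrt(sxx * syy),  # Pearson correlation coefficient
        "r_squared": ssr / syy,           # coefficient of determination
    }


def standard_errors_at(fit, x0):
    """Standard errors at x0 for the mean response and for a new observation."""
    leverage = 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    se_mean = math.sqrt(fit["mse"] * leverage)       # drives the confidence band
    se_new = math.sqrt(fit["mse"] * (1 + leverage))  # drives the prediction band
    return se_mean, se_new


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = fit_line(x, y)
se_mean, se_new = standard_errors_at(fit, 4.5)
print(fit["slope"], fit["intercept"], fit["r_squared"])
print(se_mean, se_new)
```

Multiplying each standard error by the appropriate t critical value on n - 2 degrees of freedom gives the half-width of the corresponding band at `x0`. The prediction band is always wider than the confidence band because it includes the variability of an individual new observation in addition to the uncertainty in the estimated mean response. In simple linear regression the square of the slope's t statistic equals the ANOVA F statistic, and the square of the correlation coefficient equals r-square.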
PPT Lecture	Linear Regression-Basics (PowerPoint and PDF); Linear Regression-Prediction (PowerPoint and PDF); Linear Regression-Lack of Fit and Transformations (PowerPoint and PDF)
Optional Activities	None
Exercises	To check your understanding of the readings and practice these concepts and methods, go to Unit 4 Section 1 Exercises, complete the exercises, and then check your answers against the page provided. After this, continue on to Unit 4 Section 2.