STAT 5371 -- Regression Analysis -- Spring 2023
Basic Information
Course instructor: Dr. Alex Trindade, 233 Mathematics & Statistics Building.
E-mail: alex.trindade"at"ttu.edu.
Course Meets: 11:00-12:20 TR, face-to-face in Math 115.
Office Hours: TWR 1:00-2:00, or by appointment.
Required Books
Linear Models with R, by Julian Faraway, 2nd ed., 2014, CRC Press. ISBN-13: 978-1439887332. (I will abbreviate this book "LMR".)
Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models, by Julian Faraway, 2005, CRC Press.
Useful Books
Generalized Linear Models, by McCullagh and Nelder, 2nd ed., 1989, CRC Press.
An Introduction to Generalized Linear Models, by Dobson and Barnett, 3rd ed., 2008, CRC Press.
Course Objectives and Syllabus
The course will cover theory and methods for linear regression and generalized linear models (GLMs), with some coverage of nonlinear regression. A full treatment of the linear regression model is given, focusing on results from mathematical statistics that make use of matrix algebra. Computational methods and software will be used to analyze datasets, using both "canned routines" and a matrix language. Prerequisite: STAT 5329 (Math-Stat). List of topics:
- Introduction (Ch 1 of LMR).
- Estimation (Ch 2 of LMR).
- Inference (Ch 3 of LMR).
- Prediction (Ch 4 of LMR).
- Explanation (Ch 5 of LMR).
- Diagnostics, problems with predictors and errors, transformations to correct (Chs 6-9 of LMR).
- Model selection (Ch 10 of LMR).
- Intro to nonlinear and nonparametric regression (LM Course Notes).
- Intro to Generalized Linear Models, esp. Logistic Regression (GLM Course Notes).
Note that Chs 1-10 of LMR are essentially covered by the LM Course Notes, which we will follow for the first half of the course. The 2nd half of the course will follow the GLM Course Notes. (See "Notes and Handouts" below.)
Expected Student Learning Outcomes
By the end of the course students will be familiar with the theory (theorems and formulas) and practical aspects (ability to use a statistical package for implementation) of regression modeling. Given a dataset, they will be able to fit a suitable model by considering an appropriate subset of predictors using model selection techniques. This may include searching for appropriate transformations of the variables so as to linearize relationships, diagnosing collinearity, and assessing lack of fit and other potential problems with the model. They will be able to write down a matrix representation for the solution of the least squares criterion used to yield parameter estimates and their standard errors under the normal assumption. They will be able to construct confidence intervals and test hypotheses about linear combinations of model parameters (contrasts), and be able to interpret and summarize their findings. They will be able to embed these methods and results in the framework of the generalized linear model, which will allow extensions to a much larger class of linear models with a variety of response distributions and form for the link function connecting expected response to linear predictor. Finally, they will become proficient in communicating these results in the form of a scientific paper.
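As a sketch of the matrix results referred to above, in their standard textbook form (not copied from the course notes): assume the design matrix X is n x p with full column rank, where p counts every column of X including the intercept, and that the errors are independent normal with common variance. The last line shows the GLM extension, with g denoting the link function.

```latex
\begin{align*}
y &= X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n), \\
\hat\beta &= \arg\min_{\beta} \|y - X\beta\|^2 = (X^\top X)^{-1} X^\top y, \\
\widehat{\operatorname{Var}}(\hat\beta) &= \hat\sigma^2 (X^\top X)^{-1},
  \qquad \hat\sigma^2 = \frac{\|y - X\hat\beta\|^2}{n - p}, \\
c^\top \hat\beta &\pm t_{n-p,\,1-\alpha/2}\; \hat\sigma \sqrt{c^\top (X^\top X)^{-1} c}
  \qquad \text{(confidence interval for } c^\top \beta\text{)}, \\
g\bigl(\mathrm{E}[y_i]\bigr) &= x_i^\top \beta
  \qquad \text{(GLM: link } g \text{ connects the mean response to the linear predictor)}.
\end{align*}
```

The standard errors of the individual estimates are the square roots of the diagonal entries of the estimated variance matrix.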
Methods of Assessing the Expected Learning Outcomes
The expected learning outcomes for the course will be assessed through a mix of homework assignments (35%), a midterm test (25%), a data analysis project (10%), and a comprehensive final exam (30%). The traditional grading scale will be used:
- A: 90-100%.
- B: 80-89%.
- C: 70-79%.
- D: 60-69%.
- F: 0-59%.
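As a rough illustration of how the weights and grading scale above combine, here is a minimal Python sketch. The weights and letter cutoffs come from this syllabus; the function names and the example scores are hypothetical, and the sketch assumes each component is expressed as a percentage and that the cutoffs apply to the unrounded total (the syllabus does not say how fractional totals are handled).

```python
# Weights are taken from the syllabus; everything else is illustrative.
WEIGHTS = {"homework": 0.35, "midterm": 0.25, "project": 0.10, "final": 0.30}


def course_percentage(scores):
    """Weighted average of component scores, each given on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Map a percentage to a letter (assumption: cutoffs use the unrounded total)."""
    for cutoff, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if percentage >= cutoff:
            return letter
    return "F"


# Hypothetical student; the project is scored out of 20 and rescaled to a percentage.
scores = {
    "homework": 92.0,
    "midterm": 78.0,
    "project": 16 / 20 * 100,
    "final": 85.0,
}
total = course_percentage(scores)
print(f"{total:.1f}% -> {letter_grade(total)}")
```

For these made-up scores the weighted total is 0.35(92) + 0.25(78) + 0.10(80) + 0.30(85) = 85.2%, which falls in the B range.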
The test schedule is as follows:
- Midterm: Thursday March 9.
- Final Exam: Take-home.
Homework Assignments
There will be weekly Assignment Sets. All work is to be uploaded to Blackboard. No late submissions will be accepted.
- Set 0 (due Jan 14): Read Ch 1 of LMR and do Exercise 1.1. (Not graded.)
- The remaining Hwk Sets are on Blackboard. (Hwk 1 is due Sat Jan 21.)
Data Analysis Project
Search the web to find suitable datasets to analyze via linear regression or GLM. You may also have some ideas from your own research. BUT: the data must not already have been analyzed by someone (as far as you know). Make sure there is a sufficiently large pool of predictors (at least 10), so that model selection will have to be used. Likewise, and since we are focusing on classical methods, the corresponding sample size (n) must not be smaller than the number of predictors (p), and should in fact be much larger. Send me a 1-page proposal of what you intend to do, basically describing the data and its source, and what your goal is. Once I OK it, you can proceed. Project due date: Saturday April 15. Project grade (out of 20) will be based on validity (10), quality (5), and clarity (5). Some ideas on finding suitable datasets.
Notes and Handouts
R Demos from old course
Software
I will use R as the primary software tool. SAS is also recommended. Some assignments will require extensive use of a software package of your choice. While we will focus on the theory, the applied data modeling aspect is an important complement that greatly helps in understanding the methodology. For details on R see my statistical computing page, and especially the section on "Linear Models & GLMs".
Policies
- What to do in case of an emergency. If a student encounters a personal problem that affects their ability to attend class or complete their work on time, they should first consult their instructor. In exceptional circumstances, students may be directed to the Dean of Students office, phone (806-742-2984), email (deanofstudents@ttu.edu). The Dean of Students can help with emergencies including COVID, car accidents, death of a family member, inability to afford food, health issues, and more. (In exceptional circumstances, the Dean of Students can authorize exceptions to class policies.) In addition, Title IX reporting and support resources are available here.
- Class Attendance. Your attendance alone will not impact your grade, but missing exams and assignments will.
- Make-up Exams: These may be granted in exceptional circumstances after you have followed the above protocol on What to do in case of emergency.
- Absence for observance of a religious holy day: See this link.
- Absence due to officially approved trips: The Texas Tech University Catalog states that the department chairpersons, directors, or others responsible for a student representing the university on officially approved trips should notify the student's instructors of the departure and return schedules in advance of the trip. The instructor so notified must not penalize the student, although the student is responsible for material missed. Students absent because of university business must be given the same privileges as other students.
- ADA accommodations, Academic Integrity, COVID-19. See this link.
- Civility in the Classroom. It is expected that everyone will behave in a manner that is conducive to learning. One common disruption is cell phones. Please turn these off in class.
- Electronic Devices in Tests. In the spirit of keeping costs down, I will permit the usage of apps on smart devices (phones, tablets, laptops, etc.), but any kind of communication or accessing of the web via these devices is forbidden.
- Collaboration. My policies on this are as follows.
  - Homework: Discussion with peers regarding material/concepts covered in the course is permitted, and is encouraged since it usually leads to greater comprehension. However, each person must write up their own solution to a particular problem, and not simply have someone else do it for them.
  - Tests: Any form of collaboration on tests, including e-device communication or trying to see what the person next to you is writing, is strictly forbidden and will not be tolerated.