Name:
Andrew ID:
Collaborated with:

This lab is to be done in class (and completed outside of class time if need be). You can collaborate with your classmates, but you must identify their names above, and you must submit your own lab as a knitted PDF file on Gradescope by Friday 9pm this week.

This week’s agenda: exploratory data analysis, cleaning data, fitting linear/logistic models, and using associated utility functions.

Prostate cancer data set

Below we read in the prostate cancer data set that we looked at in previous labs.

pros.df = 
  read.table("https://www.stat.cmu.edu/~arinaldo/Teaching/36350/F22/data/pros.dat")
dim(pros.df)
## [1] 97  9
head(pros.df, 3)
##       lcavol  lweight age      lbph svi       lcp gleason
## 1 -0.5798185 2.769459  50 -1.386294   0 -1.386294       6
## 2 -0.9942523 3.319626  58 -1.386294   0 -1.386294       6
## 3 -0.5108256 2.691243  74 -1.386294   0 -1.386294       7
##   pgg45       lpsa
## 1     0 -0.4307829
## 2     0 -0.1625189
## 3    20 -0.1625189
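
As a warm-up for the utility functions on this week's agenda, here is a minimal sketch of fitting a linear model and querying it. It does not answer any of the questions below. To stay self-contained it uses a hypothetical data frame, pros.mini, built from three columns of the three rows printed by head() above; the names pros.mini and pros.lm are assumptions made for illustration, and a three-row fit is only useful for seeing what each function returns.

```r
# Hypothetical mini data frame: three columns of the three rows shown by
# head(pros.df, 3) above (not the full 97-row data set)
pros.mini = data.frame(
  lcavol = c(-0.5798185, -0.9942523, -0.5108256),
  age    = c(50, 58, 74),
  lpsa   = c(-0.4307829, -0.1625189, -0.1625189)
)

# Basic exploration: dimensions and a numeric summary of one column
dim(pros.mini)
summary(pros.mini$age)

# Fit a simple linear regression of lpsa on lcavol
pros.lm = lm(lpsa ~ lcavol, data = pros.mini)

# Utility functions that extract pieces of the fitted model
coef(pros.lm)       # intercept and slope
fitted(pros.lm)     # fitted values at the observed lcavol
residuals(pros.lm)  # observed lpsa minus fitted values

# Prediction at a new value of the predictor
predict(pros.lm, newdata = data.frame(lcavol = 0))
```

The same calls should work unchanged on the full pros.df, with data = pros.df in place of data = pros.mini.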

Q1. Simple exploration and linear modeling

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE

Q2. Reading in, exploring wage data

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE

Q3. Wage linear regression modeling

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE

Q4. Wage logistic regression modeling (optional)

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE

Q5. Wage generalized additive modeling (optional)

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE