Name:
Andrew ID:
Collaborated with:

This lab is to be done in class (completed outside of class time if need be). You can collaborate with your classmates, but you must identify their names above, and you must submit your own lab as an knitted PDF file on Gradescope, by Friday 9pm, this week.

This week’s agenda: basic indexing, with a focus on matrices; some more basic plotting; vectorization; using for() loops.

Q1. Back to some R basics

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
x.list = list(rnorm(6), letters, sample(c(TRUE,FALSE),size=4,replace=TRUE))
# YOUR CODE GOES HERE

Prostate cancer data set

We’re going to look at a data set on 97 men who have prostate cancer (from the book The Elements of Statistical Learning). There are 9 variables measured on these 97 men:

  1. lpsa: log PSA score
  2. lcavol: log cancer volume
  3. lweight: log prostate weight
  4. age: age of patient
  5. lbph: log of the amount of benign prostatic hyperplasia
  6. svi: seminal vesicle invasion
  7. lcp: log of capsular penetration
  8. gleason: Gleason score
  9. pgg45: percent of Gleason scores 4 or 5

To load this prostate cancer data set into your R session, and store it as a matrix pros.dat:

pros.dat =
  as.matrix(read.table("http://www.stat.cmu.edu/~arinaldo/Teaching/36350/F22/data/pros.dat"))

Q2. Basic indexing and calculations

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE

Q3. Exploratory data analysis with plots

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE

Q4. A bit of Boolean indexing never hurt anyone

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE

Q5. Computing standard deviations using iteration

pros.dat.svi.sd = vector(length=ncol(pros.dat))
i = 1
# YOUR CODE GOES HERE
pros.dat.no.svi.sd = vector(length=ncol(pros.dat))
i = 1
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
pros.dat.svi.sd.master = apply(pros.dat.svi, 2, sd)
pros.dat.no.svi.sd.master = apply(pros.dat.no.svi, 2, sd)
all.equal(pros.dat.svi.sd, pros.dat.svi.sd.master, check.names=FALSE)
all.equal(pros.dat.no.svi.sd, pros.dat.no.svi.sd.master, check.names=FALSE)

Q6. Computing t-tests using vectorization

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE
# YOUR CODE GOES HERE

Q7. My plot is at your command (optional)

# YOUR CODE GOES HERE
# YOUR CODE GOES HERE