Alessandro (Ale) Rinaldo - Fall, 2024

SDS 387 is an intermediate graduate course in theoretical statistics for PhD students, covering two separate but interrelated topics: (i) stochastic convergence and (ii) linear regression modeling. The material and style of the course will skew towards the mathematical and theoretical aspects of common models and methods, in order to provide a foundation for those who wish to pursue research in statistical methods and theory.
**This is not an applied regression analysis course.**

Syllabus: Syllabus

Lectures: Tuesday and Thursday, 9:00am - 10:30am, PMA 5.112

TA: Khai Nguyen, khainb@utexas.edu - Office hours: Thursday, 1:30pm - 2:30pm, GDC 7.418 (Poisson Bowl)

Ale's Office hours: by appointment

Homework submission and solutions: use Canvas

| Assignment | Due date |
| --- | --- |
| Homework 1 | September 17 |
| Homework 2 | October 3 |
| Final project proposal | October 12 |
| Homework 3 | October 17 |
| Homework 4 | November 14 |

Lecture 1: Introduction and course logistics. Deterministic convergence and convergence with probability one.

Lecture 2: Lim sup and lim inf of events. Borel-Cantelli Lemmas. Convergence in probability and comparison with convergence with probability one. Law of large numbers. Glivenko-Cantelli Lemma.

References:

See Ferguson's book, chapters 1, 2 and 4.

For a proof of the Glivenko-Cantelli Lemma, see Theorem 19.1 of van der Vaart's book.

A nice webpage summarizing the different modes of stochastic convergence and providing some good examples to illustrate their differences.

Lecture 3: Glivenko-Cantelli Theorem, First Borel-Cantelli Lemma, more on convergence in probability. For the Glivenko-Cantelli Theorem, see Theorem 19.1 in van der Vaart's book.
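As an informal illustration of the Glivenko-Cantelli Theorem (not taken from the lecture notes), the sketch below computes the sup-distance between the empirical c.d.f. and the true c.d.f. for Uniform(0, 1) samples. The function name `sup_distance_uniform`, the seed and the sample sizes are arbitrary choices made for this example.

```python
import random


def sup_distance_uniform(sample):
    """Return sup_x |F_n(x) - F(x)| when F(x) = x is the Uniform(0, 1) c.d.f."""
    xs = sorted(sample)
    n = len(xs)
    # F_n is a step function that jumps at the order statistics, so the
    # supremum is attained at a jump or just before one.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


# Hypothetical experiment: arbitrary seed and sample sizes.
rng = random.Random(387)
distances = {
    n: sup_distance_uniform([rng.random() for _ in range(n)])
    for n in (100, 10_000)
}
print(distances)
```

The theorem says this distance converges to zero with probability one as n grows; the printed values should be small for the larger sample size.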

Lecture 4: Lp convergence, Minkowski, Hölder and Jensen inequalities. Relations between Lp convergence and convergence in probability and with probability one. C.d.f.'s in multivariate settings.

Lecture 5: Convergence in distribution. Relation with other forms of convergence. Marginal vs joint convergence in distribution. Portmanteau theorem. For the proof of the claim that convergence in probability implies convergence in distribution, see page 330 of Billingsley's book *Probability and Measure.*

Lecture 6: Portmanteau Theorem, Continuous Mapping Theorem, characteristic functions and Continuity Theorem, Cramér-Wold device. I suggest reading Chapter 3 of Ferguson's book (in particular, Theorem 3(e) has a neat proof).

Lecture 7: Slutsky's theorem, more on convergence in distribution. Big-oh and little-oh notation.

Lecture 8: More on big-oh and little-oh notation. CLT for i.i.d. variables using characteristic functions. Triangular arrays, Lindeberg-Feller and Lyapunov conditions.

Lecture 9: Lindeberg-Feller, examples and multivariate extension. Berry-Esseen bounds. A good reference for this lecture and the last is the book *Sums of Independent Random Variables*, by V.V. Petrov, Springer, 1975. Another classic and good reference is *Approximation Theorems of Mathematical Statistics* by Serfling, Wiley, 1980.

Lecture 10: Kolmogorov-Smirnov, total variation and Wasserstein distances. Theorem 1.1 about Lindeberg approximations for 3-times continuously differentiable functions.

Lecture 11: Review of linear algebra. See references in the class notes.

Lecture 12: Spectral properties of matrices. Eigendecomposition and singular value decomposition.

Lecture 13: Projections. Vector and matrix norms.

Lecture 14: Projection of a random variable onto a vector space of random variables. Introduction to linear regression modeling. For the next few lectures, I will be closely following the book *Learning Theory from First Principles* by Francis Bach.

Lecture 15: Inference and prediction in linear regression modeling. Projection parameter, prediction risk decomposition.

Lecture 16: Geometric interpretation of the OLS estimator. Gradient descent convergence guarantee for the OLS.
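For illustration only, here is a minimal sketch of gradient descent on the least-squares objective (1/2n)·||y − Xb||². It is not the analysis from the lecture: the function name `ols_gradient_descent`, the simulated noiseless data and the conservative step size 1/trace(XᵀX/n) are all assumptions made for this example.

```python
import random


def ols_gradient_descent(X, y, n_iter=500):
    """Gradient descent on (1/2n) * ||y - X b||^2, started from b = 0."""
    n, d = len(X), len(X[0])
    # The Hessian is H = X^T X / n. Its trace upper-bounds its largest
    # eigenvalue, so 1 / trace(H) is a safe (conservative) step size.
    trace = sum(X[i][j] ** 2 for i in range(n) for j in range(d)) / n
    step = 1.0 / trace
    b = [0.0] * d
    for _ in range(n_iter):
        # Residuals X b - y and gradient X^T (X b - y) / n.
        resid = [sum(X[i][j] * b[j] for j in range(d)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(d)]
        b = [b[j] - step * grad[j] for j in range(d)]
    return b


# Hypothetical noiseless data, so the OLS solution equals beta_star.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
beta_star = [1.0, -2.0, 0.5]
y = [sum(row[j] * beta_star[j] for j in range(3)) for row in X]
beta_hat = ols_gradient_descent(X, y)
print(beta_hat)
```

Because the objective is a convex quadratic, any step size no larger than the reciprocal of the largest eigenvalue of the Hessian yields geometric convergence when X has full column rank; the trace bound used here simply avoids an eigenvalue computation.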

Lecture 17: Pseudo inverse. Risk decomposition for the estimator of the linear regression parameters for fixed design.

Lecture 18: Gauss-Markov Theorem. Ridge regression.
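As a small worked example of ridge shrinkage (a sketch, not the general matrix formulation), consider a single covariate and the penalized objective (1/n)·Σ(yᵢ − xᵢb)² + λb², whose minimizer is b = Σxᵢyᵢ / (Σxᵢ² + nλ). The function name `ridge_1d`, the scaling of the penalty and the toy data are assumptions made for this illustration.

```python
def ridge_1d(x, y, lam):
    """Minimizer of (1/n) * sum((y_i - x_i * b)^2) + lam * b^2 over scalar b."""
    n = len(x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + n * lam)


# Hypothetical toy data lying exactly on the line y = 2 x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
estimates = [ridge_1d(x, y, lam) for lam in (0.0, 0.5, 5.0)]
print(estimates)  # approximately [2.0, 1.875, 1.2]
```

With λ = 0 the estimate coincides with OLS, and increasing λ shrinks the estimate towards zero.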

Lecture 19: Optimal tuning for ridge regression and minimax lower bound for OLS.

Lecture 20: Minimax lower bound for OLS. Consistency of the OLS.