Syllabus

Author

Jiaye Xu

Published

February 17, 2024

Course Information

  • Scientific Computing and Programming - Spring 2024

    • Class ID: 13112325001

    • Course name: Scientific Computing and Programming (科学计算与编程)

  • Instructor: Dr. Jiaye Xu (徐嘉烨)

  • Time: Wednesdays, class periods 7-9, Weeks 1-12.

  • Location: Main Building 340 (主楼340).

  • Office Hours: Wednesdays, class period 10, by appointment.

Textbooks

No required textbooks. The course material is self-contained.

Computing

We will use the open-source statistical software R, available at http://www.r-project.org, and the open-source integrated development environment (IDE) RStudio, which can be downloaded from https://www.rstudio.com/. (The company RStudio is now Posit, and its website has moved to https://posit.co/.)

Assessment

Participation (60%) + Final Project (40%)

Participation (60%): three in-class presentations, worth 20% each. Slides are required for each presentation.

Final Project (40%): a technical report on an algorithm, covering its motivation, applications, theory, the main algorithm, and its implementation in R using simulated or real data.

Outline of Lectures

| Week | Topic | Notes |
|------|-------|-------|
| 1 | Introduction | A review of probability and statistics |
| 2 | Simulation I | |
| 3 | Simulation II: Monte Carlo | |
| 4 | Participation – Part I (20%) | R basic skills, visualization, R packages for EDA and modeling, Rmd, Quarto, etc. |
| 5 | Bayesian Inference; MCMC I | |
| 6 | MCMC II | |
| 7 | Applications of MCMC | Inference for Dynamic Linear Models |
| 8 | Participation – Part II (20%) | Trailer of your final project presentation |
| 9 | Optimization Methods | Review of optimization algorithms, e.g., Newton-Raphson |
| 10 | Optimization – EM | Optimization algorithms continued: Gradient Descent (GD), Stochastic Gradient Descent (SGD), etc. |
| 11 | Bootstrap | R & Python tips: rstanarm; neural network fitting. May 1 (Wed) is Labor Day; a make-up lecture will be held on 2024/04/28 (Sun). |
| 12 | Participation – Part III: Presentation of Final Project (20%) | Final Report (40%) due on 2024/05/17 |

Some Interesting Topics on Computation

  1. Simulation

    • Exact Simulation: Standard Distributions; Quantile transform method (Inverse CDF); Rejection Sampling, etc.

    • Monte Carlo Simulation

    • Markov Chain Monte Carlo (MCMC)

      • Metropolis-Hastings Algorithm

      • Random Walk Chains

      • Gibbs Sampling

  2. Bootstrapping

    • Non-parametric Bootstrap

    • Parametric Bootstrap

    • Bootstrap Confidence Intervals

  3. Cross-Validation

  4. Optimization

    • Gradient Descent

    • Newton’s Method (Newton-Raphson)

    • Newton-Like Methods: Quasi-Newton; Gauss–Newton; Nelder-Mead Algorithm, etc.

    • Penalized Optimization: Ridge Regression, Lasso, Smoothing Splines

    • EM Optimization

  5. Density Estimation and Smoothing

    • Smoothers: Kernel Smoother; Local Regression Smoothing; Spline Smoother, etc.

    • Generalized Additive Models (GAM)

    • Tree-Based Methods: classification and regression trees (CART); Random Forests
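
As a small taste of topic 1, the quantile transform (inverse CDF) method can be sketched in R as follows. This is only an illustrative sketch, not course material: the function name `r_exp_inverse` and the choice of the exponential distribution are assumptions made for the example. It relies on the fact that if U is Uniform(0, 1), then -log(1 - U) / rate follows an Exponential(rate) distribution.

```r
# Quantile transform (inverse CDF) sampling, illustrated for Exponential(rate).
# The CDF is F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -log(1 - u) / rate.
# `r_exp_inverse` is a hypothetical helper name used only for this sketch.
r_exp_inverse <- function(n, rate = 1) {
  u <- runif(n)          # draws from Uniform(0, 1)
  -log(1 - u) / rate     # apply the inverse CDF to each draw
}

set.seed(2024)
x <- r_exp_inverse(1e5, rate = 2)

# The sample mean should be close to the theoretical mean 1 / rate = 0.5.
mean(x)
```

The same pattern applies to any distribution whose quantile function is available in closed form; when it is not, methods such as rejection sampling are the usual alternative.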

References

Part I

Quick-R

R Cookbook

R for Data Science

R Cheatsheets

R Task Views

Linear Models in Ten Lectures using R

Julian J. Faraway (2014). Linear Models with R.

Julian J. Faraway (2004). Extending the Linear Model with R. Chapman and Hall/CRC eBooks.

An Introduction to Statistical Learning with Applications in R

Bilibili (B站): Stanford University, Statistical Learning (full series, 720p, English subtitles)

Part II

The Elements of Statistical Learning: Data Mining, Inference, and Prediction.

Probabilistic Machine Learning

Computer Age Statistical Inference: Algorithms, Evidence and Data Science

Giovanni Petris, Sonia Petrone, & Patrizia Campagnoli (2009). Dynamic Linear Models with R.

Peter D. Hoff (2016). A First Course in Bayesian Statistical Methods. Springer texts in statistics.

Bayesian Data Analysis, 3rd Edition (BDA3)

Sheldon M. Ross (2022). Simulation, 6th Edition. Academic Press.

Geof H. Givens, Jennifer A. Hoeting (2012). Computational Statistics, 2nd Edition. Wiley.