1 Introduction

1.1 Module outline

MTH3045, Statistical Computing, is designed to introduce you to some important, advanced topics that, when taken into account during calculations, can improve how we use computers to fit statistical models to data. This can allow us, for example, to use more data, or to fit a model more quickly and/or more reliably.

1.2 Lectures / practical classes

In-person lectures will be held at the following times:

  • Tuesday 12.35 – 13.25. Laver LT3
  • Thursday 11.35 – 12.25. Old Library 134
  • Friday 13.35 – 14.25.
    • Forum Exploration Lab 1 (weeks 1,2,4,5,6,8,9,11)
    • Streatham Court 0.93 (weeks 3,7,10)

The first lecture takes place on Tuesday 16th January 2024. Lectures will typically involve a mix of me presenting a few slides to introduce a method and then you engaging in hands-on programming, so that you experience and confirm your understanding of what’s been introduced. To engage with MTH3045 you should attend lectures in person.

I currently plan to use week 6 as a reading week, so there will be no MTH3045 lectures that week.

1.3 Office hours

I will hold office hours on Tuesdays, 13.30 – 14.30, in my office, Laver 817; alternatively, you can schedule an appointment to meet me online during that hour.

1.4 Resources

All material that will be examined for MTH3045 can be found in the lecture notes and exercises provided. A complete set of lecture notes and exercises with solutions can be found in pdf format on the module’s ELE page

https://ele.exeter.ac.uk/course/view.php?id=13566

There is also an identical web-based version of the lecture notes available at

https://byoungman.github.io/MTH3045/

and a web-based version of the exercises available at

https://byoungman.github.io/MTH3045/exercises

Note that you might find the web-based versions easier to study from.

During lectures various challenges will be set. These can be found with skeleton solutions at

https://byoungman.github.io/MTH3045/challenges

However, sometimes you may find it useful to get a different perspective on things. If so, you may want to consult the following books. Specific parts of these books useful for given topics will be highlighted throughout the lecture notes.

  • Banerjee, S. and A. Roy (2014). Linear Algebra and Matrix Analysis for Statistics. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis.
  • Eddelbuettel, D. (2013). Seamless R and C++ Integration with Rcpp. Use R! Springer New York.
  • Gillespie, C. and R. Lovelace (2016). Efficient R Programming: A Practical Guide to Smarter Programming. O’Reilly Media. https://csgillespie.github.io/efficientR/.
  • Monahan, J. F. (2011). Numerical Methods of Statistics (2 ed.). Cambridge University Press.
  • Nocedal, J. and S. Wright (2006). Numerical Optimization (2 ed.). Springer Series in Operations Research and Financial Engineering. Springer New York.
  • Petersen, K. B. and M. S. Pedersen (2012). The Matrix Cookbook. https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf
  • Press, W., S. Teukolsky, W. Vetterling, and B. Flannery (2007). Numerical Recipes: The Art of Scientific Computing (3 ed.). Cambridge University Press.
  • Venables, W. N., D. M. Smith, and the R Core Team (2021). An Introduction to R (4.1.0 ed.). https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf.
  • Wickham, H. (2019). Advanced R (2 ed.). Chapman & Hall/CRC the R series. CRC Press/Taylor & Francis Group. https://adv-r.hadley.nz/.
  • Wood, S. N. (2015). Core Statistics. Institute of Mathematical Statistics Textbooks. Cambridge University Press. https://www.maths.ed.ac.uk/~swood34/core-statistics.pdf.

1.5 Acknowledgements

Much of the information used to form these notes came from the resources above. Simon Wood’s APTS notes and Charles Geyer’s Statistics 3701 notes have also proved incredibly useful sources of additional information.

1.6 Assessment

Assessment for MTH3045 will comprise one piece of coursework (worth 50% of the total module mark) and an online practical exam (worth 50% of the total module mark). Your coursework must be your own work.

1.7 Motivating example

Suppose that we want to find \(\mathbf{z}\), where \(\mathbf{z} = \mathbf{ABy}\). Let’s try this in R. We can use system.time() to measure how long a command takes to execute. We’ll use \(3000 \times 3000\) matrices A and B for \(\mathbf{A}\) and \(\mathbf{B}\), respectively, which we’ll fill with arbitrary (here Normal(0, 1), henceforth \(N(0, 1)\)) variates, and we’ll fill the \(n\)-vector y, for \(\mathbf{y}\), similarly.

> n <- 3e3
> A <- matrix(rnorm(n * n), n, n)
> B <- matrix(rnorm(n * n), n, n)
> y <- rnorm(n)
> system.time(z1 <- A %*% B %*% y)
##    user  system elapsed 
##   1.605   1.826   0.255

We’re interested in elapsed, the total (wall-clock) time, which is 0.255 seconds. However, the next line gives exactly the same answer

> system.time(z2 <- A %*% (B %*% y))
##    user  system elapsed 
##   0.079   0.211   0.020

which we can check with all.equal()

> all.equal(z1, z2)
## [1] TRUE

yet is about 13 times faster (0.255/0.020 ≈ 13). The point of this example is that, if we understand some of the basics behind the computations we need when fitting statistical models, we can make our fitting much more efficient. For a given computational cost, improved efficiency can allow us, for example, to fit models to more data, from which we should hope to draw more reliable inferences.

What’s happened above? In the first approach R has essentially calculated \(\mathbf{C} = \mathbf{AB}\), say, and then \(\mathbf{Cy}\), i.e. multiplied two \(n \times n\) matrices together, and then an \(n \times n\) matrix by an \(n\)-vector. In the second approach, however, R has essentially calculated \(\mathbf{x} = \mathbf{By}\) and then \(\mathbf{z} = \mathbf{Ax}\), say, i.e. performed two multiplications of an \(n \times n\) matrix by an \(n\)-vector. Regarding the \(n\)-vector as an \(n \times 1\) matrix, we should expect multiplying two \(n \times n\) matrices together to require roughly a factor of \(n\) more calculations than multiplying an \(n \times n\) matrix by an \(n\)-vector. This is a trivial, but illuminating, example, worth bearing in mind when writing matrix-based code.
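
To make that factor of \(n\) explicit, here is a rough count of the multiplications involved, a back-of-the-envelope tally that ignores additions and any optimisations in the underlying linear algebra routines: \[ (\mathbf{AB})\mathbf{y}: \quad \underbrace{n^2 \times n}_{\mathbf{AB}} + \underbrace{n \times n}_{(\mathbf{AB})\mathbf{y}} = n^3 + n^2, \qquad \mathbf{A}(\mathbf{By}): \quad \underbrace{n \times n}_{\mathbf{By}} + \underbrace{n \times n}_{\mathbf{A}(\mathbf{By})} = 2n^2, \] so for \(n = 3000\) the first approach involves roughly \(n/2 = 1500\) times as many multiplications as the second. The measured speed-up is much smaller than this crude count suggests, since such short timings are noisy and matrix products are highly optimised, but the ranking of the two approaches is clear.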

1.8 Exploratory and refresher exercises

  1. Generate a sample of \(n = 20\) \(N(1, 3^2)\) random variates, and without using mean(), var() or sd() write R functions to calculate the sample mean, \(\bar x\), sample variance, \(s^2\), and sample standard deviation, \(s\), where \[ \bar x = \dfrac{1}{n} \sum_{i = 1}^n x_i \text{ and } s^2 = \dfrac{1}{n - 1} \sum_{i = 1}^n (x_i - \bar x)^2. \] Note that sum() may be used.
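
If you want to check your approach, the following is one possible sketch; the function names my_mean(), my_var() and my_sd() are arbitrary choices, not part of the exercise.

> n <- 20
> x <- rnorm(n, mean = 1, sd = 3)  # N(1, 3^2) variates
> my_mean <- function(x) sum(x) / length(x)
> my_var <- function(x) sum((x - my_mean(x))^2) / (length(x) - 1)
> my_sd <- function(x) sqrt(my_var(x))
> c(my_mean(x), my_var(x), my_sd(x))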

  2. Consider computing \(\text{Pr}(Z \geq z) = 1 - \Phi(z)\) where \(Z \sim \text{Normal}(0, 1)\), or, for short, \(Z \sim N(0, 1)\). For \(z = 0, 0.5, 1, 1.5, 2, \ldots\) compute this in R in three different ways using the following three commands

> pnorm(z, lower.tail = FALSE)
> 1 - pnorm(z)
> pnorm(-z)

and find the lowest value of \(z\) for which the three don’t give the same answer.
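
As a starting point, one possible sketch follows; the grid of \(z\) values below is an arbitrary choice, which you may need to extend or refine.

> z <- seq(0, 10, by = 0.5)
> p1 <- pnorm(z, lower.tail = FALSE)
> p2 <- 1 - pnorm(z)
> p3 <- pnorm(-z)
> # smallest z on this grid at which the three disagree, if any
> z[which(p1 != p2 | p1 != p3)[1]]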

  3. The formula \(\text{Var}(Y) = \text{E}(Y^2) - [\text{E}(Y)]^2\) is sometimes called the ‘short-cut’ variance formula, i.e. a short-cut for \(\text{Var}(Y) = \text{E}\{[Y - \text{E}(Y)]^2\}\). Compare computing \(\text{Var}(Y)\) using the two formulae above for the samples y1 and y2 below.

> y1 <- 1:10
> y2 <- y1 + 1e9
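
One possible sketch for the comparison follows; the helper names var_short() and var_direct() are arbitrary, and both divide by \(n\) to match the expectations in the formulae above.

> # 'short-cut' formula: E(Y^2) - [E(Y)]^2
> var_short <- function(y) mean(y^2) - mean(y)^2
> # direct formula: E{[Y - E(Y)]^2}
> var_direct <- function(y) mean((y - mean(y))^2)
> c(var_short(y1), var_direct(y1))
> c(var_short(y2), var_direct(y2))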