Chapter 2

Week 1 lecture 2

Consider 101 as a base 2, 3, 4, or 5 representation. What does it represent in base 10?

Solution

# bookwork
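
As a brief worked sketch of the bookwork, reading 101 in base \(B\) with digits \(a_2 = 1\), \(a_1 = 0\) and \(a_0 = 1\) gives \[ 1 \times B^2 + 0 \times B^1 + 1 \times B^0 = B^2 + 1, \] so the base 10 values are 5, 10, 17 and 26 for \(B = 2, 3, 4\) and \(5\), respectively.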

Repeat the calculation above using R. [Hint: consider forming a vector for \((a_0, a_1, a_2)^\text{T}\) and another for \((B^0, B^1, B^2)\)]

Solution

conv2dec <- function(x, B) {
  a <- as.numeric(strsplit(x, '')[[1]])  # digits, most significant first
  pows <- (length(a) - 1):0              # matching powers of B
  sum(a * B^pows)
}

x <- '101'
conv2dec(x, 2)
conv2dec(x, 3)
conv2dec(x, 4)
conv2dec(x, 5)
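
The hint can also be followed literally. The sketch below is one possible reading of the hint, not the official solution: it stores the digits as a vector ordered as \((a_0, a_1, a_2)\) and multiplies it elementwise by the vector of powers \((B^0, B^1, B^2)\). The helper name digits_to_dec is hypothetical.

```r
# Digits of 101 ordered as (a_0, a_1, a_2), i.e. least significant first
a <- c(1, 0, 1)

# Hypothetical helper: base-10 value of the digits a in base B
digits_to_dec <- function(a, B) {
  pows <- B^(seq_along(a) - 1)  # (B^0, B^1, B^2)
  sum(a * pows)
}

res <- sapply(2:5, function(B) digits_to_dec(a, B))
res
```

Because 101 is a palindrome, the ordering of the digits does not change the answer here, but it would for a string such as 110.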

Repeat the above for 101.101

Solution

conv2dec2 <- function(x, B) {
  a0 <- strsplit(x, '[.]')[[1]]                    # split at the radix point
  a_left <- as.numeric(strsplit(a0[1], '')[[1]])   # integer-part digits
  a_right <- as.numeric(strsplit(a0[2], '')[[1]])  # fractional-part digits
  a <- c(a_left, a_right)
  pows <- (length(a_left) - 1):(-length(a_right))  # powers, highest to lowest
  sum(a * B^pows)
}

x2 <- '101.101'
conv2dec2(x2, 2)
conv2dec2(x2, 3)
conv2dec2(x2, 4)
conv2dec2(x2, 5)

Week 1 lecture 3

Challenges I

Find the largest value of \(E\) that can be represented with single-precision arithmetic, given \(E = \sum_{i = 1}^8 b_{i + 1} 2^{8 - i}\) (i.e., find \(E\) when \(b_2 = b_3 = \ldots = b_9 = 1\))

Solution

sum(2^c(0:7))

Find the smallest value of \(F\) greater than zero that can be represented with single-precision arithmetic, given \(F = \sum_{i = 1}^{23} b_{i + 9} 2^{-i}\)

Solution

2^(-23)

Challenges II

Find the largest value of \(E\) that can be represented with double-precision arithmetic, given \(E = \sum_{i = 1}^{11} b_{i + 1} 2^{11 - i}\)

Solution

sum(2^c(0:10))

Find the smallest value of \(F\) greater than zero that can be represented with double-precision arithmetic, given \(F = \sum_{i = 1}^{52} b_{i + 12} 2^{-i}\)

Solution

2^(-52)
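
Both sums of powers of two have closed forms, which gives a quick check on the answers above. The sketch below assumes a platform with IEEE 754 doubles, on which R's built-in .Machine$double.eps equals \(2^{-52}\); the names E_single, E_double and F_double are introduced here for illustration.

```r
# Largest exponent values: all k bits set gives 2^k - 1
E_single <- sum(2^(0:7))    # 8 exponent bits
E_double <- sum(2^(0:10))   # 11 exponent bits
c(E_single, 2^8 - 1)
c(E_double, 2^11 - 1)

# Smallest positive F in double precision, compared with machine epsilon
F_double <- 2^(-52)
F_double == .Machine$double.eps
```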

Week 2 lecture 1

Challenges I

Find to the nearest order of magnitude the smallest number that R can represent.

[Hint: the following code

a <- .1
for (i in 1:10) {
  a <- .1 * a
  print(a)
}

reduces a by a factor of 10, starting from 0.1, ten times.]

Solution

a <- .1
for (i in 1:330) {
  a <- .1 * a
  print(a)
}
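
Printing 330 lines is one way to spot where a reaches zero. As an alternative sketch, not part of the original solution, a while() loop can stop at the last non-zero value. Halving is used in place of multiplying by 0.1 so that each step is exact. On platforms with IEEE 754 doubles the value found lies below .Machine$double.xmin, which is the smallest normalised double and not the smallest representable one.

```r
a <- 1
while (a / 2 > 0) {
  a <- a / 2   # keep halving while the result is still positive
}
a                         # smallest positive number reached by halving
a < .Machine$double.xmin  # compare with the smallest normalised double
```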

Challenges II

Consider

x <- 1:12

Use the help files for functions %% and %/% together with x above to deduce what both operators do.

Solution

x <- 1:12
# %% 3 divides by three and then gives the remainder
x %% 3
# %/% 3 divides by three and then gives the integer part
x %/% 3
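
The two operators are linked: multiplying the integer part by the divisor and adding the remainder recovers the original value. A short check, using the same x (the names q and r are introduced here):

```r
x <- 1:12
q <- x %/% 3   # integer part of the division
r <- x %% 3    # remainder
all(q * 3 + r == x)
```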

Consider the function range() and the vector

x <- c(1, 2, NA, Inf)

Use its help file to ascertain the difference between the following three calls.

range(x) 
range(x, na.rm = TRUE)
range(x, finite = TRUE)

Solution

x <- c(1, 2, NA, Inf)
range(x)                 # gives NA because x includes NA
range(x, na.rm = TRUE)   # ignores the NA but keeps Inf, which becomes the maximum
range(x, finite = TRUE)  # ignores the NA and Inf, i.e. anything non-finite

Consider the help file for function mean(). What does its argument trim do? Generate a small sample of data and use this to confirm that it does what you expect.

Solution

x <- c(1, 1:10)
mean(x)
mean(x, trim = .1) # drops 10% of the data from each end, i.e. uses the middle 80%
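
To confirm what trim does, the trimmed mean can be reproduced by hand. The sketch below assumes the behaviour described in ?mean, namely that a fraction trim of the observations is dropped from each end of the sorted data; with 11 observations and trim = .1 that is one observation per end. The names k, x_trimmed and by_hand are introduced here.

```r
x <- c(1, 1:10)
k <- floor(length(x) * .1)                     # observations dropped per end
x_trimmed <- sort(x)[(k + 1):(length(x) - k)]  # middle of the sorted data
by_hand <- mean(x_trimmed)
all.equal(mean(x, trim = .1), by_hand)
```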

Week 2 lecture 2

Challenges I

Compute the following to understand how R prioritises different symbols and operators

1:3^2
c(1:3)^2

1+2^3
(1 + 2)^3

1 + 2 * 3
(1 + 2) * 3

Solution

1:3^2
(1:3)^2

1 + 2^3
(1 + 2)^3

1 + 2*3
(1 + 2)*3
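
These results show that ^ is applied before both : and +, and that * is applied before +. A short sketch that checks each equivalence explicitly:

```r
identical(1:3^2, 1:(3^2))          # ^ is applied before :
identical(1 + 2^3, 1 + (2^3))      # ^ is applied before +
identical(1 + 2 * 3, 1 + (2 * 3))  # * is applied before +
(1:3)^2                            # brackets force : to be applied first
```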

Calculate \(\mathbf{Ba}\) and \(\mathbf{cB}\) where \[ \mathbf{a} = \left(\begin{array}{c}2\\ 4\\ 6\end{array}\right),~ \mathbf{B} = \left(\begin{array}{ccc} 2 & 3 & 1\\ 4 & 3 & 1\\ 2 & 2 & 3 \end{array}\right) \text{ and } \mathbf{c} = \left(\begin{array}{ccc} 2 & 4 & 6 \end{array}\right). \]

Solution

a <- c(2, 4, 6)
B <- matrix(c(2, 4, 2, 3, 3, 2, 1, 1, 3), 3, 3)  # filled column by column
c <- matrix(c(2, 4, 6), 1)

B %*% a     # Ba
c %*% B     # cB
c <- t(a)   # creates c as the transpose of a
t(a) %*% B  # avoids creating c
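
As a cross-check, the matrix can also be entered row by row with byrow = TRUE, so that the code reads in the same order as the displayed matrix. In the sketch below the names c_row, Ba and cB are hypothetical; c_row is used to avoid masking the function c().

```r
a <- c(2, 4, 6)
B <- matrix(c(2, 3, 1,
              4, 3, 1,
              2, 2, 3), nrow = 3, byrow = TRUE)
c_row <- matrix(c(2, 4, 6), nrow = 1)

Ba <- B %*% a      # 3 x 1 matrix
cB <- c_row %*% B  # 1 x 3 matrix
Ba
cB
```

Working the first entry of \(\mathbf{Ba}\) by hand gives \(2 \times 2 + 3 \times 4 + 1 \times 6 = 22\), which should match the first element printed.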

Form a list in which the first element is a \(3 \times 6 \times 2\) array, the second is a 4-vector, and the third is a \(4 \times 3\) matrix, each of which is filled with random Uniform([-1, 1]) variates.

Solution

x <- list(
  array(runif(36, -1, 1), c(3, 6, 2)),
  runif(4, -1, 1),
  matrix(runif(12, -1, 1), 4, 3)
)
x

Challenges II

Form a \(10 \times 20\) matrix comprising N(0, 1) variates and find the median over each row.

Solution

x <- matrix(rnorm(200), 10, 20)
apply(x, 1, median)

Complete the following

x <- lapply(1:4, function(i) ...)

to form a four-element list in which each element comprises a vector of variates generated according to \[ Y_1, \ldots, Y_N \text{ with } N - 1 \sim \text{Poisson}(2)\text{ and }Y_i \sim N(3, 2^2),~i = 1, \ldots, N \]

Solution

x <- lapply(1:4, function(i) rnorm(1 + rpois(1, 2), 3, 2))
lapply(1 + rpois(4, 2), function(x) rnorm(x, 3, 2))

Find the length of each vector in the list.

Solution

sapply(x, length)
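
Base R also provides lengths(), which returns the length of each element of a list directly. A minimal sketch, regenerating a list in the same way as above:

```r
x <- lapply(1:4, function(i) rnorm(1 + rpois(1, 2), 3, 2))
lengths(x)                                # length of each list element
identical(lengths(x), sapply(x, length))  # same result as sapply()
```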

Challenges III

Consider the following list

lst2 <- list(list(1:3, 4:6), list(7:9, 10:12))

What’s the difference between the following?

unlist(lst2)
unlist(lst2, recursive = FALSE)

Solution

# The first returns a single vector, i.e. it flattens every level
# and drops the structure of what's in each element of the list.
# The second removes only the outer level and returns a list of four vectors.
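
The difference can be checked by inspecting the structure of each result. In this sketch the names u1 and u2 are introduced for illustration.

```r
lst2 <- list(list(1:3, 4:6), list(7:9, 10:12))

u1 <- unlist(lst2)                     # flattens every level
u2 <- unlist(lst2, recursive = FALSE)  # removes only the outer level

str(u1)
str(u2)
```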

Week 2 lecture 3

Challenges I

On the website www.mathsgear.co.uk you can buy a crooked die.
Let \(X \in \{1, 2, 3, 4, 5, 6\}\) denote the number rolled on a die, where \[ \text{Pr}(X = x) = \left\{\begin{array}{cl} 0.10 & \text{if }x = 1\\ 0.05 & \text{if }x \in \{2, 3\}\\ 0.20 & \text{if }x \in \{4, 5\}\\ 0.40 & \text{if }x = 6\\ \end{array}\right. \]

Simulate 15 rolls of the die in R (perhaps using sample())

Solution

n <- 15

# as a loop
x <- integer(n)
for (i in 1:n) {
  x[i] <- sample(1:6, 1, prob = c(.1, .05, .05, .2, .2, .4))
}
x

# more tidily with sample()
x <- sample(1:6, n, prob = c(.1, .05, .05, .2, .2, .4), replace = TRUE)
x
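
With only 15 rolls it is hard to see whether the simulated die is crooked in the intended way. As a rough check, the sketch below uses a much larger number of rolls and compares the observed proportions with the stated probabilities. The value of n_big and the seed are arbitrary choices made here, not part of the exercise.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
p <- c(.1, .05, .05, .2, .2, .4)
n_big <- 1e5
rolls <- sample(1:6, n_big, prob = p, replace = TRUE)

# observed proportion of each face, keeping faces that never occur
obs <- as.vector(table(factor(rolls, levels = 1:6))) / n_big
round(cbind(face = 1:6, prob = p, observed = obs), 3)
```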

Challenges II

Now simulate in R the rolling of the crooked die so that it generates two ones.

Solution

x <- integer(0)
while(sum(x == 1) < 2) {
  x <- c(x, sample(1:6, 1, prob = c(.1, .05, .05, .2, .2, .4)))
}
x

Week 3 lecture 1

Challenges I

Consider computing the rolling mean of a vector \((y_1, \ldots, y_n)^\text{T}\). Let’s suppose it’s a seven-day rolling mean, as often used to convey Covid-19 data, so that \[ z_i = \dfrac{1}{7} \sum_{j = 1}^7 y_{i + j - 4}, \text{ for } i = 4, 5, \ldots, n - 3. \] Generate a vector of length n \(= 35\) comprising Gamma(2, 3) random variates, and then compute its rolling mean. Your result should contain NAs where the rolling mean cannot be computed, i.e., \(i = 1, 2, 3, n - 2, n - 1, n\).

Solution

n <- 35
y <- rgamma(n, 2, 3)  # Gamma(2, 3), taken here as shape 2 and rate 3
z <- rep(NA, n)
for (i in 4:(n - 3)) z[i] <- mean(y[i + seq(-3, 3)])
matplot(seq_len(n), cbind(y, z))

The function filter() can do this in a vectorised way. Obtain the same result using filter(). [Hint: the examples for filter() may help illustrate how its arguments work.]

Solution

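z2 <- filter(y, rep(1/7, 7))  # stats::filter(); returns a time series (ts) object
all.equal(z, z2)              # reports differences because z2 carries ts attributes

z3 <- as.vector(z2)           # drop the ts attributes

all.equal(z, z3)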
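
Another vectorised route, offered here as a sketch and not as part of the original solution, uses embed(), which places each window of seven consecutive values in a row of a matrix. rowMeans() then gives the rolling means, which are padded with NAs at both ends. The name z4 is introduced here, and shape 2 with rate 3 is an assumed reading of Gamma(2, 3).

```r
n <- 35
y <- rgamma(n, 2, 3)   # shape 2 and rate 3 assumed for Gamma(2, 3)

# loop version, for comparison
z <- rep(NA_real_, n)
for (i in 4:(n - 3)) z[i] <- mean(y[i + seq(-3, 3)])

# embed(y, 7) has n - 6 rows, one per complete seven-value window
z4 <- c(rep(NA, 3), rowMeans(embed(y, 7)), rep(NA, 3))
all.equal(z, z4)
```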

Familiarise yourself with the functions any() and all().

Solution

all(runif(10) > .5)
any(runif(10) > .5)

Challenges II

Create the following function

browser_test <- function(x) {
  x1 <- 1
  x2 <- 2
  x3 <- 3
  x4 <- 4
  x5 <- 5
  x6 <- 6
  browser()
  x7 <- 7
  x8 <- log(x)
  c(x1, x2, x3, x4, x5, x6, x7, x8)
}

and then run browser_test(-1).

Then type x1, x2, x3, x4, x5 and x6.
What happens if you type x7?
What happens if you hit Enter twice and then type x7?
What happens if you hit Enter again?

Solution

# No sketch solutions for Q2-5 given.

To exit browser() mode type and execute Q.

Week 3 lecture 2

Challenges I

Confirm that the following give the same answer, and then benchmark which is quicker.

n <- 1e3
y <- runif(n)
sort(y)[1]
min(y)

Solution

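library(microbenchmark)

# row sums (A is not defined in the original; a 100 x 100 matrix is assumed here)
A <- matrix(runif(1e4), 100, 100)
microbenchmark(
  apply(A, 1, sum),
  rowSums(A)
)

# minima
n <- 1e3
y <- runif(n)
all.equal(sort(y)[1], min(y))  # confirm that both give the same answer
microbenchmark(
  sort(y)[1],
  min(y)
)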

Complete the function based on a for() loop so that it computes \(s = \sum_{i = 1}^n a_i b_i\), given vectors \(\mathbf{a} =\) a and \(\mathbf{b} =\) b of length \(n\).

n <- 1e3
a <- rnorm(n) # using n above
b <- rnorm(n)
ab <- function(a, b) {
  s <- 0
  for (i in 1:n)
    ... # insert one line of code here
  s
}

Solution

a <- rnorm(n)
b <- rnorm(n)
ab <- function(a, b) {
  s <- 0
  for (i in seq_along(a))  # also safe if a has length zero
    s <- s + a[i] * b[i]
  s
}
ab(a, b)

Then propose a vectorised alternative, and benchmark whether it’s better.

Solution

sum(a * b)
microbenchmark(
  ab(a, b),
  sum(a * b)
)
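
The inner product can also be written with crossprod(), which returns a \(1 \times 1\) matrix; drop() reduces it to a scalar. This is an additional sketch, with the names s1 and s2 introduced here, and whether it is faster than sum(a * b) would need benchmarking.

```r
n <- 1e3
a <- rnorm(n)
b <- rnorm(n)
s1 <- sum(a * b)
s2 <- drop(crossprod(a, b))  # t(a) %*% b, reduced to a scalar
all.equal(s1, s2)
```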