2 Chapter 2 exercises

    1. Compute \(\pi + \text{e}\), where \(\pi = 3.1415927 \times 10^0\) and \(\text{e} = 2.7182818 \times 10^0\), using floating point addition in base 10, working to five decimal places.


      Solution

      As \(\pi\) and \(\text{e}\) have a common exponent, we sum their mantissas, i.e. \[\begin{align*} \pi + \text{e} &= (3.14159 \times 10^0) + (2.71828 \times 10^0)\\ &= 5.85987 \times 10^0 \end{align*}\]
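As a check, the same five-decimal-place (i.e. six significant figure) arithmetic can be mimicked in R using signif(); this is only a stand-in for genuine base-10 floating point arithmetic, but it reproduces the hand calculation:

```r
# round each mantissa to 6 significant figures (5 decimal places), then sum
p <- signif(pi, 6)      # 3.14159
e <- signif(exp(1), 6)  # 2.71828
signif(p + e, 6)
## [1] 5.85987
```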


    2. Now compute \(10^6\pi + \text{e}\), using floating point addition in base 10, now working to seven decimal places.


      Solution

      We first need to put the numbers on a common exponent, which is that of \(10^6 \pi\). \[\begin{align*} 10^6 \pi &= 3.1415927 \times 10^6\\ \text{e} & = 2.7182818 \times 10^0 = 0.0000027 \times 10^6. \end{align*}\] Then summing their mantissas gives \[\begin{align*} 10^6 \pi + \text{e} &= (3.1415927 \times 10^6) + (2.7182818 \times 10^0)\\ &= (3.1415927 \times 10^6) + (0.0000027 \times 10^6)\\ &= (3.1415927 + 0.0000027) \times 10^6\\ &= 3.1415954 \times 10^6. \end{align*}\]


    3. What would happen if we computed \(10^6\pi + \text{e}\) using a base 10 floating point representation, but only worked with five decimal places?


      Solution

      If we put \(2.7182818 \times 10^0\) onto the exponent \(10^6\) we get \(0.00000 \times 10^6\) to five decimal places, and so \(\text{e}\) becomes negligible alongside \(10^6 \pi\) if we only work to five decimal places.
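Mimicking this in R with signif() (again, only a stand-in for genuine base-10 floating point arithmetic) shows \(\text{e}\) being swamped entirely:

```r
# at six significant figures, e vanishes alongside 10^6 * pi
big   <- signif(1e6 * pi, 6)  # 3141590
small <- signif(exp(1), 6)    # 2.71828
signif(big + small, 6) == big
## [1] TRUE
```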


  1. Write \(2\pi\) in binary form using single- and double-precision arithmetic. [Hint: I recommend you consider the binary forms for \(\pi\) given in the lecture notes, and you might also use bit2decimal() from the lecture notes to check your answer.]


    Solution

    The key here is to note that we want to calculate \(\pi \times 2\). As the number is stored in the form \(S \times (1 + F) \times 2^{E - e}\), multiplying by two simply raises \(E\) by one. For both single and double precision, this corresponds to changing the final bit of \(E\) from a zero to a one.

    bit2decimal <- function(x, e, dp = 20) {
      # function to convert bits to decimal form
      # x: the bits as a character string, with appropriate spaces
      # e: the excess
      # dp: the decimal places to report the answer to
      bl <- strsplit(x, ' ')[[1]] # split x into S, E and F components by spaces
      # and then into a list of three integer vectors, each element one bit
      bl <- lapply(bl, function(z) as.integer(strsplit(z, '')[[1]]))
      names(bl) <- c('S', 'E', 'F') # give names, to simplify next few lines
      S <- (-1)^bl$S # calculate sign, S
      E <- sum(bl$E * 2^((length(bl$E) - 1):0)) # ditto for exponent, E
      F <- sum(bl$F * 2^(-(1:length(bl$F)))) # and ditto for fraction, F
      z <- S * 2^(E - e) * (1 + F) # calculate z
      out <- format(z, nsmall = dp) # use format() for specific dp
      # add (S, E, F) as attributes, for reference
      attr(out, '(S,E,F)') <- c(S = S, E = E, F = F)
      out
    }
    bit2decimal('0 10000001 10010010000111111011011', 127)
    ## [1] "6.28318548202514648438"
    ## attr(,"(S,E,F)")
    ##           S           E           F 
    ##   1.0000000 129.0000000   0.5707964
    bit2decimal('0 10000000001 1001001000011111101101010100010001000010110100011000', 1023)
    ## [1] "6.28318530717958623200"
    ## attr(,"(S,E,F)")
    ##            S            E            F 
    ##    1.0000000 1025.0000000    0.5707963
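As a sanity check (not part of the question), the single-precision value should agree with R's double-precision \(2\pi\) to within half a unit in the last place; for numbers in \([4, 8)\) the single-precision spacing is \(2^{2 - 23}\), so half an ulp is \(2^{-22}\):

```r
# single-precision 2 * pi, as returned by bit2decimal() above
single_2pi <- 6.28318548202514648438
abs(single_2pi - 2 * pi)          # about 1.7e-07
abs(single_2pi - 2 * pi) < 2^-22  # within half a single-precision ulp
## [1] TRUE
```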

  2. Find the calculation error in R of \(b - a\) where \(a = 10^{16}\) and \(b = 10^{16} + \exp(0.5)\).


    Solution
    a <- 1e16
    b <- 1e16 + exp(.5)
    b - a
    ## [1] 2
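The exact difference is \(\exp(0.5) \approx 1.6487\), so the computed answer of 2 is out by roughly 0.35. The reason is the spacing of doubles: \(10^{16}\) lies between \(2^{53}\) and \(2^{54}\), where adjacent doubles are 2 apart, so \(10^{16} + \exp(0.5)\) rounds to the nearest representable number, \(10^{16} + 2\):

```r
# 1e16 sits in [2^53, 2^54), where the gap between adjacent doubles is 2
2^53 < 1e16 & 1e16 < 2^54
## [1] TRUE
exp(0.5)                   # the exact difference, about 1.648721
(1e16 + exp(0.5)) - 1e16   # rounds up to the next representable double
## [1] 2
```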

  3. Create the following: the row vector \[ \mathbf{a} = (2, 4, 6), \] the \(2 \times 3\) matrix \[ \mathbf{B} = \left(\begin{array}{ccc} 6 & 5 & 4\\ 3 & 2 & 1\end{array}\right), \] a list with first element \(\mathbf{a}\) and second element \(\mathbf{B}\), and an arbitrary (i.e. with whatever values you like) \(5 \times 3 \times 2\) array with a 'name' attribute of 'array1'.
    For each of the above, consider whether your code could be simpler.


    Solution
    a <- t(c(2, 4, 6))
    B <- matrix(6:1, 2, byrow = TRUE)
    l <- list(a, B)
    arr <- array(rnorm(30), c(5, 3, 2))
    attr(arr, 'name') <- 'array1'
    arr
    ## , , 1
    ## 
    ##             [,1]       [,2]        [,3]
    ## [1,]  0.44038977 -0.3312339 -0.97983935
    ## [2,] -0.48545631  0.2669886 -1.11288162
    ## [3,] -0.99019210 -0.2847259 -0.20078256
    ## [4,]  0.09061379 -2.5644729  0.29632056
    ## [5,]  0.95160000 -1.3384981 -0.02875533
    ## 
    ## , , 2
    ## 
    ##            [,1]       [,2]       [,3]
    ## [1,] -0.7632920 -0.4434407 -0.9684879
    ## [2,] -0.7025302 -1.5994440  0.1417252
    ## [3,] -0.4329849 -0.2125288 -0.1967019
    ## [4,] -0.1383790  0.1897634 -0.5693949
    ## [5,] -0.1575441  1.8284851 -0.5403014
    ## 
    ## attr(,"name")
    ## [1] "array1"

  4. Produce a \(3 \times 4 \times 4 \times 2\) array containing Uniform(0, 1) random variates and compute the mean over its second and third margins using apply(..., ..., mean) and rowMeans() or colMeans().


    Solution
    arr <- array(NA, c(3, 4, 4, 2))
    arr[] <- runif(prod(dim(arr))) # filling in place saves having to get the number right
    apply(arr, 2:3, mean)
    ##           [,1]      [,2]      [,3]      [,4]
    ## [1,] 0.4136282 0.7557304 0.6668092 0.5000309
    ## [2,] 0.4546962 0.5128590 0.6562822 0.6986061
    ## [3,] 0.5331555 0.5549074 0.4516922 0.4725370
    ## [4,] 0.4290135 0.4056826 0.4759882 0.4009538
    rowMeans(aperm(arr, c(2, 3, 1, 4)), dims = 2)
    ##           [,1]      [,2]      [,3]      [,4]
    ## [1,] 0.4136282 0.7557304 0.6668092 0.5000309
    ## [2,] 0.4546962 0.5128590 0.6562822 0.6986061
    ## [3,] 0.5331555 0.5549074 0.4516922 0.4725370
    ## [4,] 0.4290135 0.4056826 0.4759882 0.4009538
    colMeans(aperm(arr, c(1, 4, 2, 3)), dims = 2)
    ##           [,1]      [,2]      [,3]      [,4]
    ## [1,] 0.4136282 0.7557304 0.6668092 0.5000309
    ## [2,] 0.4546962 0.5128590 0.6562822 0.6986061
    ## [3,] 0.5331555 0.5549074 0.4516922 0.4725370
    ## [4,] 0.4290135 0.4056826 0.4759882 0.4009538

    1. Create a 3-element list of vectors comprising Uniform(0, 1) variates of length 3, 5 and 7, respectively.


      Solution
      lst <- list(runif(3), runif(5), runif(7))

    2. Create another list in which the vectors above are sorted into descending order.


      Solution
      lst2 <- lapply(lst, sort, decreasing = TRUE)

    3. Then create a vector comprising the minimum of each vector in the list, and another stating which element is the minimum. [Hint: for the latter you may want to consult the ‘See Also’ part of the min() function’s help file.]


      Solution
      sapply(lst, min)
      ## [1] 0.13135663 0.10704882 0.04943936
      sapply(lst, which.min)
      ## [1] 3 5 2

  5. Use a for() loop to produce the following. [Hint: the function paste() might be useful.]

    ## [1] "iteration 1"
    ## [1] "iteration 2"
    ## [1] "iteration 3"
    ## [1] "iteration 4"
    ## [1] "iteration 5"
    ## [1] "iteration 6"
    ## [1] "iteration 7"
    ## [1] "iteration 8"
    ## [1] "iteration 9"
    ## [1] "iteration 10"

    Solution
    for (i in 1:10) print(paste('iteration', i))
    ## [1] "iteration 1"
    ## [1] "iteration 2"
    ## [1] "iteration 3"
    ## [1] "iteration 4"
    ## [1] "iteration 5"
    ## [1] "iteration 6"
    ## [1] "iteration 7"
    ## [1] "iteration 8"
    ## [1] "iteration 9"
    ## [1] "iteration 10"

  6. Consider the following two infinite series that represent \(\pi\).\[\begin{align*} \pi &= 4 \left[1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \frac{1}{13} -\cdots\right]\\ \pi &= 3 + \frac{4}{2 \times 3 \times 4} - \frac{4}{4 \times 5 \times 6} + \frac{4}{6 \times 7 \times 8} - \frac{4}{8 \times 9 \times 10} + \cdots \end{align*}\] Use a while() loop to find which converges to \(\pi\) in R to within \(\epsilon_m^{1/3}\) using the fewest terms, where \(\epsilon_m\) is R's machine epsilon, .Machine$double.eps.


    Solution

    Here’s the first approximation…

    quarter_pi <- 1
    terms1 <- 1
    denom <- 3
    multiplier <- -1
    my_pi <- 4 * quarter_pi
    while(abs(my_pi - pi) > .Machine$double.eps^(1/3)) {
      terms1 <- terms1 + 1
      quarter_pi <- quarter_pi + multiplier / denom
      my_pi <- 4 * quarter_pi
      denom <- denom + 2
      multiplier <- -1 * multiplier
    }
    terms1
    ## [1] 165141

    …and here’s the second approximation…

    my_pi <- 3
    terms2 <- 2
    denoms <- c(2, 3, 4)
    multiplier <- 1
    while(abs(my_pi - pi) > .Machine$double.eps^(1/3)) {
      my_pi <- my_pi + multiplier * 4 / prod(denoms)
      denoms <- denoms + 2
      multiplier <- -1 * multiplier
      terms2 <- terms2 + 1
    }
    terms2
    ## [1] 36
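The gap in performance is expected: the terms of the first (Leibniz) series decay like \(1/n\), while those of the second (Nilakantha-type) series decay like \(1/n^3\). Comparing the size of, say, the 36th term of each (an extra check, not asked for in the question):

```r
n <- 36
4 / (2 * n - 1)                            # first series: about 0.056
4 / ((2 * n) * (2 * n + 1) * (2 * n + 2))  # second series: about 1e-05
```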

    1. Write a function based on a for() loop to calculate the cumulative sum of a vector \(\bf x\), i.e. returning the vector \(\bf y\) whose \(i\)th value is \[y_i = \sum_{j = 1}^i x_j.\]


      Solution

      Either of the following two functions is an option (although others exist).

      my_cumsum <- function(x) {
        # function 1 to calculate cumulative sum of a vector
        # x is a vector
        # returns a vector of length length(x)
        out <- numeric(length(x))
        for (i in seq_along(x)) {
          out[i] <- sum(x[1:i])
        }
        out
      }
      
      my_cumsum2 <- function(x) {
        # function 2 to calculate cumulative sum of a vector
        # x is a vector
        # returns a vector of length length(x)
        out <- x
        for (i in seq_along(x)[-1]) { # seq_along() avoids 2:1 when length(x) == 1
          out[i] <- out[i] + out[i - 1]
        }
        out
      }

      We see that both perform the same calculation.

      x <- runif(10)
      cumsum(x)
      ##  [1] 0.7730106 0.8196643 0.9074696 1.8174672 2.1947269 2.6669396 3.0328056
      ##  [8] 3.4399834 4.4189238 5.4120759
      my_cumsum(x)
      ##  [1] 0.7730106 0.8196643 0.9074696 1.8174672 2.1947269 2.6669396 3.0328056
      ##  [8] 3.4399834 4.4189238 5.4120759
      my_cumsum2(x)
      ##  [1] 0.7730106 0.8196643 0.9074696 1.8174672 2.1947269 2.6669396 3.0328056
      ##  [8] 3.4399834 4.4189238 5.4120759

    2. Then benchmark its execution time against R’s vectorised function cumsum() for summing 1000 iid \(Uniform[0, 1]\) random variables.


      Solution
      x <- runif(1e3)
      microbenchmark::microbenchmark(
        cumsum(x),
        my_cumsum(x),
        my_cumsum2(x)
      )
      ## Unit: nanoseconds

    1. To start a board game, a player must throw three sixes using a conventional die. Write a function to simulate this, which returns the total number of throws that the player has taken.


      Solution
      sixes1 <- function() {
        # function to simulate the number of throws needed
        # to accumulate three sixes
        # takes no arguments
        # returns the total number of throws taken
        out <- sample(1:6, 3, replace = TRUE)
        while (sum(out == 6) < 3) {
          out <- c(out, sample(1:6, 1))
        }
        return(length(out))
      }

    2. Then use 1000 simulations to empirically estimate the distribution of the number of throws needed.


      Solution
      samp1 <- replicate(1e3, sixes1())
      table(samp1)
      ## samp1
      ##  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 
      ##  4 11 22 30 30 33 37 49 46 47 44 51 48 47 41 38 43 45 26 41 32 16 31 23 18 10 
      ## 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 49 50 51 53 55 57 60 
      ## 20 14 14 10 10 10  9  3  5  5  8  3  3  5  1  2  1  3  2  1  2  1  1  1  1  1 
      ## 62 
      ##  1
      hist(samp1, xlab = 'Number of throws', main = 'Histogram of number of throws')

      Figure 2.1: Histogram of empirical distribution of number of throws for starting method 1.
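We can also compare against theory (an extra check, not part of the question): sixes arrive with probability 1/6 per throw, so the number of throws needed to accumulate three of them has expectation \(3 / (1/6) = 18\), consistent with the sample above.

```r
# expected throws to accumulate three sixes: three successes at rate 1/6
3 / (1 / 6)
## [1] 18
```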


    3. I think you’ll agree that this would be a rather dull board game! It is therefore proposed that a player should instead throw two consecutive sixes. Write a function to simulate this new criterion, and estimate its distribution empirically.


      Solution
      sixes2 <- function() {
        # function to simulate the number of throws needed
        # to obtain two consecutive sixes
        # takes no arguments
        # returns the total number of throws taken
        out <- sample(1:6, 2, replace = TRUE)
        # only the most recent pair of throws can newly give two consecutive sixes
        while (!all(tail(out, 2) == 6)) {
          out <- c(out, sample(1:6, 1))
        }
        return(length(out))
      }
      sixes2()
      ## [1] 10

    4. Then use 1000 simulations to empirically estimate the distribution of the number of throws needed.


      Solution
      samp2 <- replicate(1e3, sixes2())
      table(samp2)
      ## samp2
      ##   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20 
      ##  31  28  23  21  27  33  17  18  17  18  19  13  15  17  14  17  15  12  18 
      ##  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39 
      ##  23  27  14  12  10   9   7  18   9  16  10  10  14  13   7  15  10  13  12 
      ##  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58 
      ##  11   5   3   8  15   7   2   5   8   6   6   8   7   8   4   3   3  10   6 
      ##  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77 
      ##   8   2   4   3   5   4   2   5   4   7   5   5  11   6   3   4   4   4   4 
      ##  78  79  80  81  82  83  84  85  87  88  89  90  91  93  94  96  97  98  99 
      ##   3   4   2   2   2   6   8   5   5   7   2   2   4   3   1   4   3   2   3 
      ## 100 101 102 103 104 105 106 107 108 110 112 114 115 116 118 119 120 121 122 
      ##   1   2   3   4   2   2   2   4   2   1   2   1   2   2   4   3   1   4   2 
      ## 124 126 128 129 130 135 136 139 141 142 147 152 153 154 157 159 161 164 166 
      ##   1   2   1   2   1   1   1   1   1   1   1   2   2   1   2   1   1   1   1 
      ## 169 170 178 180 181 185 188 190 191 195 203 205 209 218 219 225 236 245 255 
      ##   2   1   2   1   1   1   2   1   3   1   1   1   1   1   1   1   1   1   1 
      ## 260 263 322 
      ##   1   1   1
      hist(samp2, xlab = 'Number of throws', main = 'Histogram of number of throws')

      Figure 2.2: Histogram of empirical distribution of number of throws for starting method 2.


    5. By comparing sample means of the numbers of throws, which starting criterion should get a player into the game quickest?


      Solution
      mean(samp1) - mean(samp2)
      ## [1] -24.813

      So, comparing sample means, the first criterion gets a player into the game more quickly: the second takes around 25 more throws on average before the game can begin.
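A theoretical check (my addition): for two consecutive sixes with per-throw success probability \(p = 1/6\), the standard waiting-time result gives an expected \(1/p + 1/p^2 = 6 + 36 = 42\) throws, which squares with sample means of roughly 18 for the first criterion and roughly 43 for the second.

```r
# expected throws until two consecutive sixes, with p = 1/6
p <- 1 / 6
1 / p + 1 / p^2
## [1] 42
```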


  7. Consider the following two functions for calculating \[d_i = x_{i + 1} - x_i, \hspace{2cm} i = 1, \ldots, n - 1\] where \({\bf x}' = (x_1, \ldots, x_n)\).

    diff1 <- function(x) {
      # function to calculate differences of a vector
      # based on a for loop
      # x is a vector
      # returns a vector of length (length(x) - 1)
      out <- numeric(length(x) - 1)
      for (i in 1:(length(x) - 1)) {
        out[i] <- x[i + 1] - x[i]
      }
      out
    }
    diff2 <- function(x) {
      # function to calculate differences of a vector
      # based on vectorisation
      # x is a vector
      # returns a vector of length (length(x) - 1)
      id <- 1:(length(x) - 1)
      x[id + 1] - x[id]
    }

    The first, diff1(), uses a straightforward for() loop to calculate \(d_i\), for \(i = 1, \ldots, n - 1\), whereas diff2() is a vectorised alternative. Benchmark the two for a vector of \(n = 1000\) iid \(N(0, 1)\) random variates by comparing their median execution times.


    Solution
    x <- rnorm(1000)
    microbenchmark::microbenchmark(
      diff1(x),
      diff2(x)
      )
    ## Unit: microseconds
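It's also worth noting that base R provides diff() for exactly this calculation, which uses vectorised subtraction internally; a quick check (using a local copy of diff2()) that the results match:

```r
# base R's diff() agrees with the vectorised version
diff2 <- function(x) {
  id <- 1:(length(x) - 1)
  x[id + 1] - x[id]
}
x <- rnorm(1000)
all(diff2(x) == diff(x))
## [1] TRUE
```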

  8. The following function assesses whether all elements of a logical vector are TRUE.

    all2 <- function(x) {
      # function to calculate whether all elements are TRUE
      # x is a logical vector
      # returns a scalar
      sum(x) == length(x)
    }
    1. Run the following, and benchmark all2() against R's built-in function all(), which does the same.

      n <- 1e4
      x1 <- !logical(n)

      Solution
      all2(x1)
      ## [1] TRUE
      all(x1)
      ## [1] TRUE
      microbenchmark::microbenchmark(
        all2(x1),
        all(x1)
      )
      ## Unit: microseconds

      We see that both take a similar amount of time.


    2. Now swap the first element of x1 so that it’s FALSE and repeat the benchmarking.


      Solution
      x1[1] <- FALSE
      all2(x1)
      ## [1] FALSE
      all(x1)
      ## [1] FALSE
      microbenchmark::microbenchmark(
        all2(x1),
        all(x1)
      )
      ## Unit: nanoseconds

      Now all2() is much slower. This is because it’s performed a calculation on the entire x1 vector, whereas all() has stopped as soon as it’s found a FALSE.


    3. Evaluate the function any2() below against R’s built-in function any() similarly.

      any2 <- function(x) {
        # function to calculate whether any elements are TRUE
        # x is a logical vector
        # returns a scalar
        sum(x) > 0
      }

      Solution
      x2 <- logical(n)
      any2(x2)
      ## [1] FALSE
      any(x2)
      ## [1] FALSE
      microbenchmark::microbenchmark(
        any2(x2),
        any(x2)
      )
      ## Unit: microseconds

      We again see that both take a similar amount of time.


    4. Now swap the first element of x2 so that it's TRUE and repeat the benchmarking.


      Solution
      x2[1] <- TRUE
      microbenchmark::microbenchmark(  
        any2(x2),
        any(x2)
      )
      ## Unit: nanoseconds

      This time any2() is much slower, for similar reasoning: any2() performs a calculation on the entire x2 vector, whereas any() stops as soon as it reaches a TRUE.