Chapter 5

Week 10 lecture 1

Challenges I

  1. Consider modelling the sample \(y_1, \ldots, y_n\) independent as Exp(\(\lambda\)) random variables. Given that their log likelihood is \[ \log f(\mathbf{y} \mid \lambda) = n \log \lambda - \lambda \sum_{i = 1}^n y_i \] and then that \[ \dfrac{\partial \log f(\mathbf{y} \mid \lambda)}{\partial \lambda} = \dfrac{n}{\lambda} - \sum_{i = 1}^n y_i \quad \text{and} \quad \dfrac{\partial^2 \log f(\mathbf{y} \mid \lambda)}{\partial \lambda^2} = -\dfrac{n}{\lambda^2} \] write functions called nd1(lambda, y) and nd2(lambda, y) that evaluate the first and second derivatives of the negative log-likelihood, i.e. of \(-\log f(\mathbf{y} \mid \lambda)\).
Solution
nd1 <- function(lambda, y) {
  # Function to evaluate first derivative of exponential
  # negative log-likelihood w.r.t. lambda
  # lambda is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  sum(y) - n / lambda
}

nd2 <- function(lambda, y) {
  # Function to evaluate second derivative of exponential
  # negative log-likelihood w.r.t. lambda
  # lambda is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  n / lambda^2
}
  1. Find the first Newton step given a starting value of \(\lambda_0 = 1\) and the sample \[ 0.2, 0.5, 0.6, 1.0, 1.8, 2.5 \]
Solution
y <- c(0.2, 0.5, 0.6, 1.0, 1.8, 2.5)
-nd1(1, y) / nd2(1, y)

Week 10 lecture 2

Challenges I

  1. Consider modelling the sample \(y_1, \ldots, y_n\) independent as Exp(\(\lambda\)) random variables. Given that their log likelihood is \[ \log f(\mathbf{y} \mid \lambda) = n \log \lambda - \lambda \sum_{i = 1}^n y_i, \] and that \(\lambda > 0\), write a function called nd0(lambda, y) that evaluates the negative log-likelihood, i.e. \(-\log f(\mathbf{y} \mid \lambda)\).
Solution
nd0 <- function(lambda, y) {
  # Function to evaluate exponential negative log-likelihood
  # lambda is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  -n * log(lambda) + lambda * sum(y)
}
  1. Use nlminb() with nd0(), nd1() and nd2() to find \(\hat \lambda\), the maximum likelihood estimate, given independent sample \[ 0.2, 0.5, 0.6, 1.0, 1.8, 2.5 \] and a starting value of \(\lambda_0 = 1\). [Note that you will need to ensure that nd2() returns a matrix, albeit a trivial \(1 \times 1\) one. Using as.matrix() is probably easiest.]
Solution
nd1 <- function(lambda, y) {
  # Function to evaluate first derivative of exponential
  # negative log-likelihood w.r.t. lambda
  # lambda is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  sum(y) - n / lambda
}

nd2 <- function(lambda, y) {
  # Function to evaluate second derivative of exponential
  # negative log-likelihood w.r.t. lambda
  # lambda is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  matrix(n / lambda^2, 1, 1)
}

y <- c(.2, .5, .6, 1, 1.8, 2.5)
lambda_0 <- 1
nlminb(lambda_0, nd0, nd1, nd2, y = y)
  1. Experiment with different starting values \(\lambda_0\) in order to investigate the sensitivity of Newton’s method to starting values (on this occasion)
Solution
lambda0_seq <- 10^seq(-4, 4)
sapply(lambda0_seq, function(x) nlminb(x, nd0, nd1, nd2, y = y)$par)

Week 10 lecture 3

Challenges I

  1. Consider modelling the sample \(y_1, \ldots, y_n\) independent as Exp(\(\text{e}^\tau\)) random variables. Given that their log likelihood is \[ \log f(\mathbf{y} \mid \tau) = n \tau - \text{e}^\tau \sum_{i = 1}^n y_i, \] use nlminb() by supplying the negative log-likelihood and its first and second derivatives w.r.t. \(\tau\) to find \(\hat \tau\), its maximum likelihood estimate, given independent sample \[ 0.2, 0.5, 0.6, 1.0, 1.8, 2.5 \] and a starting value of \(\tau_0 = 1\).
Solution
nd0_tau <- function(tau, y) {
  # function to evaluate Exp(exp(tau)) neg. log lik.
  # tau is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  - n * tau + exp(tau) * sum(y)
}

y0 <- c(.2, .5, .6, 1, 1.8, 2.5)
tau0 <- 0
nd0_tau(tau0, y0)

nd1_tau <- function(tau, y) {
  # function to 1st deriv. w.r.t. to tau
  # of Exp(exp(tau)) neg. log lik.
  # tau is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  - n + exp(tau) * sum(y)
}

nd2_tau <- function(tau, y) {
  # function to 2nd deriv. w.r.t. to tau
  # of Exp(exp(tau)) neg. log lik.
  # tau is a scalar
  # y is a vector
  # returns a scalar
  as.matrix(exp(tau) * sum(y))
}

nd1_tau(tau0, y0)
nd2_tau(tau0, y0)

fit <- nlminb(tau0, nd0_tau, nd1_tau, nd2_tau, y = y0)
tau_hat <- fit$par
exp(tau_hat)
  1. Confirm the result with optimize().
Solution
optimize(nd0_tau, c(-1, 1), y = y0)

Week 11 lecture 1

Challenges I

  1. Consider modelling the sample \(y_1, \ldots, y_n\) independent as Exp(\(\text{e}^\tau\)) random variables. Given that their log likelihood is \[ \log f(\mathbf{y} \mid \tau) = n \tau - \text{e}^\tau \sum_{i = 1}^n y_i, \] using nd1_tau(tau, y), which was previously written to evaluate the first derivative of the negative log-likelihood w.r.t. \(\tau\), use uniroot() to find \(\hat \tau\), the m.l.e. of \(\tau\), as this satisfies \(\partial \log f(\mathbf{y} \mid \tau) / \partial \tau = 0\). You may assume \(-1 < \hat \tau < 1\).
Solution

Challenges II

  1. Consider the sample \(y_1, \ldots, y_n\) given by \[ 1.8, 2.1, 2.3, 2.4, 2.5, 2.7, 3.1, 4.3, 4.3, 4.4 \] as \(n = 10\) independent realisations from the Normal(\(\mu\), \(\sigma^2\)) with \(\mu\) unknown and \(\sigma^2 = 1\), i.e. with pdf \[ f(y \mid \mu, \sigma) = \frac{1}{\sqrt{2 \pi \sigma^2}} \text{e}^{-(y - \mu)^2/2\sigma^2}. \] Suppose that \(\sigma^2 = 1\). Find \(\log f(\mathbf{y} \mid \mu = 1, \sigma^2 = 1) = \sum_{i = 1}^n \log f(y_i \mid \mu = 1, \sigma^2 = 1)\).
Solution
  1. Write a function in R, negloglik(mu, y), that evaluates \(-\log f(\mathbf{y} \mid \mu, \sigma^2 = 1)\) for mu \(= \mu\) and y \(=y\).
Solution
  1. Find \(-\partial \log f(\mathbf{y} \mid \mu, \sigma^2 = 1) / \partial \mu\) and write a function in R, negloglik_d1(mu, y), that evaluates \(-\partial \log f(\mathbf{y} \mid \mu, \sigma^2 = 1) / \partial \mu\) for mu \(= \mu\) and y \(=y\).
Solution
  1. Use optim() in R to find \(\hat \mu\), the maximum likelihood estimate of \(\mu\), using the BFGS method. You should supply negloglik_d1(mu, y) to evaluate the first derivative w.r.t. \(\mu\), and use \(\mu_0 = 1\) as a starting value.
Solution
  1. Evaluate \(-\partial \log f(\mathbf{y} \mid \mu, \sigma^2 = 1) / \partial \mu\) at \(\hat \mu\) to verify that \(\hat \mu\) is the m.l.e.
Solution

Week 11 lecture 2

Challenges I

  1. Consider the alternative parameterisation of the Weibull pdf in terms of parameters \((\log \lambda, \log k)\) given by \[ f(y \mid \tilde \lambda, \tilde k) = \frac{\exp(\tilde k)}{\exp(\tilde \lambda)}\left(\frac{y}{\exp(\tilde \lambda)}\right)^{\exp(\tilde k)-1}e^{-(y/\exp(\tilde \lambda))^{\exp(\tilde k)}} \quad y > 0, \] where \(-\infty < \tilde \lambda, \tilde k < \infty\). Write a function in R, weib2_d0(pars, y, mult = 1), that evaluates the scaled log-likelihood of the model, i.e. \(m \log f(\mathbf{y} \mid \tilde \lambda, \tilde k)\), where pars is a 2-vector such that pars[1] \(= \tilde \lambda\) and pars[2] \(= \tilde k\), y is the data \(\mathbf{y}\) and mult \(= m\) is a multiplier as for weib_d0().
Solution
  1. Find the maximum likelihood estimates of \((\tilde \lambda, \tilde k)\) using the Nelder-Mead method, starting at \((\tilde \lambda_0, \tilde k_0) = \big(\log(1.6), \log(0.6)\big)\).
Solution
  1. Confirm that these are related to those of the original parameterisation through the logarithm.
Solution

Week 11 lecture 3

Challenges I

  • Recall the Weibull parameterisation in terms \((\log \lambda, \log k)\) \[ f(y \mid \tilde \lambda, \tilde k) = \frac{\exp(\tilde k)}{\exp(\tilde \lambda)}\left(\frac{y}{\exp(\tilde \lambda)}\right)^{\exp(\tilde k)-1}e^{-(y/\exp(\tilde \lambda))^{\exp(\tilde k)}} \quad y > 0, \] where \(-\infty < \tilde \lambda, \tilde k < \infty\).
  1. Use simulated annealing to find the maximum likelihood estimates of \((\tilde \lambda, \tilde k)\) using a starting temperature of \(T = 10\) and \(N = 10^3\) iterations, i.e. control = list(temp = 10, maxit = 1e3), for the wind speed data of Example 5.5.
Solution
  1. Repeat the above optimisation 20 times, and comment on how the resulting parameter estimates vary.
Solution
  1. Change \(T\) to \(T = 0.1\) and then repeat Question 2.
Solution
  1. Comment on what you observe between Question 2 and Question 3.
Solution