Chapter 5
Week 9 lecture 3
Challenges I
- Consider modelling the sample \(y_1, \ldots, y_n\) as independent Exp(\(\lambda\)) random variables. Given that their log-likelihood is
\[
\log f(\mathbf{y} \mid \lambda) = n \log \lambda - \lambda \sum_{i = 1}^n y_i
\]
and then that
\[
\dfrac{\partial \log f(\mathbf{y} \mid \lambda)}{\partial \lambda} = \dfrac{n}{\lambda} - \sum_{i = 1}^n y_i \quad \text{and} \quad \dfrac{\partial^2 \log f(\mathbf{y} \mid \lambda)}{\partial \lambda^2} = -\dfrac{n}{\lambda^2}
\]
write functions called nd1(lambda, y) and nd2(lambda, y) that evaluate the first and second derivatives of the negative log-likelihood, i.e. of \(-\log f(\mathbf{y} \mid \lambda)\).
Solution
nd1 <- function(lambda, y) {
  # Function to evaluate first derivative of exponential
  # negative log-likelihood w.r.t. lambda
  # lambda is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  sum(y) - n / lambda
}
nd2 <- function(lambda, y) {
  # Function to evaluate second derivative of exponential
  # negative log-likelihood w.r.t. lambda
  # lambda is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  n / lambda^2
}
- Find the first Newton step given a starting value of \(\lambda_0 = 1\) and the sample \[ 0.2, 0.5, 0.6, 1.0, 1.8, 2.5 \]
 
Solution
y <- c(0.2, 0.5, 0.6, 1.0, 1.8, 2.5)
-nd1(1, y) / nd2(1, y)
Week 10 lecture 1
Challenges I
- Consider modelling the sample \(y_1, \ldots, y_n\) as independent Exp(\(\lambda\)) random variables. Given that their log-likelihood is
\[
\log f(\mathbf{y} \mid \lambda) = n \log \lambda - \lambda \sum_{i = 1}^n y_i,
\]
and that \(\lambda > 0\), write a function called nd0(lambda, y) that evaluates the negative log-likelihood, i.e. \(-\log f(\mathbf{y} \mid \lambda)\).
Solution
nd0 <- function(lambda, y) {
  # Function to evaluate exponential negative log-likelihood
  # lambda is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  -n * log(lambda) + lambda * sum(y)
}
- Use nlminb() with nd0(), nd1() and nd2() to find \(\hat \lambda\), the maximum likelihood estimate, given the independent sample \[ 0.2, 0.5, 0.6, 1.0, 1.8, 2.5 \] and a starting value of \(\lambda_0 = 1\). [Note that you will need to ensure that nd2() returns a matrix, albeit a trivial \(1 \times 1\) one. Using as.matrix() is probably easiest.]
Solution
nd1 <- function(lambda, y) {
  # Function to evaluate first derivative of exponential
  # negative log-likelihood w.r.t. lambda
  # lambda is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  sum(y) - n / lambda
}
nd2 <- function(lambda, y) {
  # Function to evaluate second derivative of exponential
  # negative log-likelihood w.r.t. lambda
  # lambda is a scalar
  # y is a vector
  # returns a 1 x 1 matrix
  n <- length(y)
  matrix(n / lambda^2, 1, 1)
}
y <- c(.2, .5, .6, 1, 1.8, 2.5)
lambda_0 <- 1
nlminb(lambda_0, nd0, nd1, nd2, y = y)
- Experiment with different starting values \(\lambda_0\) in order to investigate the sensitivity of Newton’s method to starting values (on this occasion).
 
Solution
lambda0_seq <- 10^seq(-4, 4)
sapply(lambda0_seq, function(x) nlminb(x, nd0, nd1, nd2, y = y)$par)
Challenges II
- Consider modelling the sample \(y_1, \ldots, y_n\) as independent Exp(\(\text{e}^\tau\)) random variables. Given that their log-likelihood is
\[
\log f(\mathbf{y} \mid \tau) = n \tau - \text{e}^\tau \sum_{i = 1}^n y_i,
\]
use nlminb() by supplying the negative log-likelihood and its first and second derivatives w.r.t. \(\tau\) to find \(\hat \tau\), its maximum likelihood estimate, given the independent sample \[ 0.2, 0.5, 0.6, 1.0, 1.8, 2.5 \] and a starting value of \(\tau_0 = 0\).
Solution
nd0_tau <- function(tau, y) {
  # function to evaluate Exp(exp(tau)) neg. log lik.
  # tau is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  - n * tau + exp(tau) * sum(y)
}
y0 <- c(.2, .5, .6, 1, 1.8, 2.5)
tau0 <- 0
nd0_tau(tau0, y0)
nd1_tau <- function(tau, y) {
  # Function to evaluate 1st derivative w.r.t. tau
  # of Exp(exp(tau)) neg. log lik.
  # tau is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  - n + exp(tau) * sum(y)
}
nd2_tau <- function(tau, y) {
  # Function to evaluate 2nd derivative w.r.t. tau
  # of Exp(exp(tau)) neg. log lik.
  # tau is a scalar
  # y is a vector
  # returns a 1 x 1 matrix
  as.matrix(exp(tau) * sum(y))
}
nd1_tau(tau0, y0)
nd2_tau(tau0, y0)
fit <- nlminb(tau0, nd0_tau, nd1_tau, nd2_tau, y = y0)
tau_hat <- fit$par
exp(tau_hat)
Week 10 lecture 2
Challenges I
- Consider the sample \(y_1, \ldots, y_n\) given by \[ 1.8, 2.1, 2.3, 2.4, 2.5, 2.7, 3.1, 4.3, 4.3, 4.4 \] as \(n = 10\) independent realisations from the Normal(\(\mu\), \(\sigma^2\)) with \(\mu\) unknown and \(\sigma^2 = 1\), i.e. with pdf \[ f(y \mid \mu, \sigma) = \frac{1}{\sqrt{2 \pi \sigma^2}} \text{e}^{-(y - \mu)^2/2\sigma^2}. \] Find \(\log f(\mathbf{y} \mid \mu = 1, \sigma^2 = 1) = \sum_{i = 1}^n \log f(y_i \mid \mu = 1, \sigma^2 = 1)\).
 
Solution
y <- c(1.8, 2.1, 2.3, 2.4, 2.5, 2.7, 3.1, 4.3, 4.3, 4.4)
sum(dnorm(y, 1, 1, log = TRUE))
- Write a function in R, negloglik(mu, y), that evaluates \(-\log f(\mathbf{y} \mid \mu, \sigma^2 = 1)\) for mu \(= \mu\) and y \(= y\).
Solution
negloglik <- function(mu, y) {
  # Function to evaluate negative log-likelihood
  # mu is a scalar
  # y is a vector
  # returns a scalar
  n <- length(y)
  n * log(2 * pi) / 2 + sum((y - mu)^2) / 2
}
negloglik(1, y)
- Find \(-\partial \log f(\mathbf{y} \mid \mu, \sigma^2 = 1) / \partial \mu\) and write a function in R, negloglik_d1(mu, y), that evaluates it for mu \(= \mu\) and y \(= y\).
Solution
negloglik_d1 <- function(mu, y) {
  # Function to evaluate derivative of 
  # negative log-likelihood w.r.t. mu
  # mu is a scalar
  # y is a vector
  # returns a scalar
  -sum(y - mu)
}
negloglik_d1(1, y)- Use 
optim()inRto find \(\hat \mu\), the maximum likelihood estimate of \(\mu\), using the BFGS method. You should supplynegloglik_d1(mu, y)to evaluate the first derivative w.r.t. \(\mu\), and use \(\mu_0 = 1\) as a starting value. 
Solution
fit_bfgs <- optim(1, negloglik, negloglik_d1, y = y, method = 'BFGS')
mu_hat <- fit_bfgs$par
- Evaluate \(-\partial \log f(\mathbf{y} \mid \mu, \sigma^2 = 1) / \partial \mu\) at \(\hat \mu\) to verify that \(\hat \mu\) is the m.l.e.
 
Solution
negloglik_d1(mu_hat, y)
mean(y)
Challenges II
Consider the sample \(y_1, \ldots, y_n\) given by \[ 1.8, 2.1, 2.3, 2.4, 2.5, 2.7, 3.1, 4.3, 4.3, 4.4 \] as \(n = 10\) independent realisations from the Normal(\(\mu\), \(\sigma^2\)) with \(\mu\) and \(\sigma^2\) unknown, i.e. with pdf \[ f(y \mid \mu, \sigma) = \frac{1}{\sqrt{2 \pi \sigma^2}} \text{e}^{-(y - \mu)^2/2\sigma^2}. \]
- Write a function in R, negloglik(pars, y), that evaluates \(-\log f(\mathbf{y} \mid \mu, \sigma^2)\) for pars[1] \(= \mu\), pars[2] \(= \sigma\) and y \(= y\).
Solution
negloglik <- function(pars, y) {
  # Function to evaluate negative log-likelihood
  # pars is a 2-vector, where
  # pars[1] is mu and
  # pars[2] is sigma
  # y is a vector
  # returns a scalar
  mu <- pars[1]
  sigma <- pars[2]
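  # sigma must be positive: return a large penalty for invalid values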
  if (sigma <= 0)
    return(1e6)
  n <- length(y)
  n * log(2 * pi) / 2 + n * log(sigma) + sum((y - mu)^2) / 2 / sigma^2
}
y <- c(1.8, 2.1, 2.3, 2.4, 2.5, 2.7, 3.1, 4.3, 4.3, 4.4)
negloglik(c(1.5, 1.5), y)
-sum(dnorm(y, 1.5, 1.5, log = TRUE))
- Find \(-\nabla \log f(\mathbf{y} \mid \mu, \sigma)\), i.e. the 2-vector comprising \(-\partial \log f(\mathbf{y} \mid \mu, \sigma) / \partial \mu\) and \(-\partial \log f(\mathbf{y} \mid \mu, \sigma) / \partial \sigma\), and then write a function in R that evaluates \(-\nabla \log f(\mathbf{y} \mid \mu, \sigma)\).
Solution
negloglik_d1 <- function(pars, y) {
  # Function to evaluate derivative of 
  # negative log-likelihood w.r.t. mu and sigma
  # pars is a 2-vector, where
  # pars[1] is mu and
  # pars[2] is sigma
  # y is a vector
  # returns a 2-vector
  mu <- pars[1]
  sigma <- pars[2]
  n <- length(y)
  out <- numeric(2)
  res <- y - mu
  out[1] <- -sum(res) / sigma^2
  out[2] <- n / sigma - sum(res^2) / sigma^3
  out
}
negloglik_d1(c(1.5, 1.5), y)
- Use the BFGS method with optim() in R to find \(\hat \mu\) and \(\hat \sigma\), the maximum likelihood estimates of \(\mu\) and \(\sigma\). Use \(\mu_0 = 1.5\) and \(\sigma_0 = 1.5\) as starting values.
Solution
fit_bfgs <- optim(c(1.5, 1.5), negloglik, negloglik_d1, y = y, method = 'BFGS')
mles <- fit_bfgs$par
- Use \(-\nabla \log f(\mathbf{y} \mid \mu, \sigma)\) evaluated at \(\hat \mu\) and \(\hat \sigma\) to verify that the m.l.e.s have been found.
 
Solution
negloglik_d1(mles, y)
mean(y)
sqrt(var(y) * (length(y) - 1) / length(y))
Week 10 lecture 3
Challenges I
- Consider the alternative parameterisation of the Weibull pdf in terms of the parameters \((\tilde \lambda, \tilde k) = (\log \lambda, \log k)\), given by
\[
f(y \mid \tilde \lambda, \tilde k) =
\frac{\exp(\tilde k)}{\exp(\tilde \lambda)}\left(\frac{y}{\exp(\tilde \lambda)}\right)^{\exp(\tilde k)-1}e^{-(y/\exp(\tilde \lambda))^{\exp(\tilde k)}} \quad y > 0,
\]
where \(-\infty < \tilde \lambda, \tilde k < \infty\). Write a function in R, weib2_d0(pars, y, mult = 1), that evaluates the scaled log-likelihood of the model, i.e. \(m \log f(\mathbf{y} \mid \tilde \lambda, \tilde k)\), where pars is a 2-vector such that pars[1] \(= \tilde \lambda\) and pars[2] \(= \tilde k\), y is the data \(\mathbf{y}\) and mult \(= m\) is a multiplier as for weib_d0().
Solution
weib2_d0 <- function(pars, y, mult = 1) {
  # Function to evaluate re-parameterised Weibull log-likelihood
  # pars is a vector
  # pars[1] is $\tilde \lambda$
  # pars[2] is $\tilde k$
  # y can be scalar or vector
  # mult is a scalar defaulting to 1; so -1 returns neg. log likelihood
  # returns a scalar
  n <- length(y)
  epars <- exp(pars)
  out <- n * (pars[2] - epars[2] * pars[1]) + 
    (epars[2] - 1) * sum(log(y)) - sum((y / epars[1])^epars[2])
  mult * out
}
- Find the maximum likelihood estimates of \((\tilde \lambda, \tilde k)\) using the Nelder-Mead method, starting at \((\tilde \lambda_0, \tilde k_0) = \big(\log(1.6), \log(0.6)\big)\).
 
Solution
fit_nm2 <- optim(log(c(1.6, 0.6)), weib2_d0, y = y0, mult = -1)
fit_nm2$par
- Confirm that these are related to those of the original parameterisation through the logarithm.
 
Solution
fit_nm$par
exp(fit_nm2$par)
Week 11 lecture 1
Challenges I
- Recall the Weibull parameterisation in terms of \((\tilde \lambda, \tilde k) = (\log \lambda, \log k)\): \[ f(y \mid \tilde \lambda, \tilde k) = \frac{\exp(\tilde k)}{\exp(\tilde \lambda)}\left(\frac{y}{\exp(\tilde \lambda)}\right)^{\exp(\tilde k)-1}e^{-(y/\exp(\tilde \lambda))^{\exp(\tilde k)}} \quad y > 0, \] where \(-\infty < \tilde \lambda, \tilde k < \infty\).
 
- Use simulated annealing to find the maximum likelihood estimates of \((\tilde \lambda, \tilde k)\) using a starting temperature of \(T = 10\) and \(N = 10^3\) iterations, i.e. control = list(temp = 10, maxit = 1e3), for the wind speed data of Example 5.5.
Solution
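A minimal sketch, reusing weib2_d0() from above and assuming the Example 5.5 wind speed data are stored in a vector called wind (an illustrative name); optim()'s 'SANN' method implements simulated annealing, and mult = -1 turns the scaled log-likelihood into a negative log-likelihood for minimisation.
fit_sann <- optim(log(c(1.6, 0.6)), weib2_d0, y = wind, mult = -1,
  method = 'SANN', control = list(temp = 10, maxit = 1e3))
fit_sann$par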
- Repeat the above optimisation 20 times, and comment on how the resulting parameter estimates vary.
 
Solution
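Under the same assumptions, one way is to wrap the optimisation in replicate() and inspect the spread of the 20 estimates; because simulated annealing proposes moves at random, each run can end at a different point.
fits <- replicate(20, optim(log(c(1.6, 0.6)), weib2_d0, y = wind, mult = -1,
  method = 'SANN', control = list(temp = 10, maxit = 1e3))$par)
fits # a 2 x 20 matrix; each column is one run's (lambda-tilde, k-tilde) estimate
apply(fits, 1, range) # range of each parameter estimate across runs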
- Change \(T\) to \(T = 0.1\) and then repeat Question 2.
 
Solution
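The same sketch with the starting temperature lowered to \(T = 0.1\):
fits_cool <- replicate(20, optim(log(c(1.6, 0.6)), weib2_d0, y = wind, mult = -1,
  method = 'SANN', control = list(temp = 0.1, maxit = 1e3))$par)
apply(fits_cool, 1, range)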
- Comment on what you observe between Question 2 and Question 3.
 
Solution
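As a qualitative expectation rather than a computed result: with the high starting temperature \(T = 10\), simulated annealing accepts many uphill (worse) proposals, so the 20 runs tend to wander and the estimates vary noticeably between runs; with \(T = 0.1\), uphill moves are rarely accepted, the algorithm behaves more like a greedy local search, and the estimates typically vary less but are more at risk of stalling near the starting value.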
Challenges II
- Consider modelling the sample \(y_1, \ldots, y_n\) as independent Exp(\(\lambda\)) random variables. Given that their log-likelihood is
\[
\log f(\mathbf{y} \mid \lambda) = n \log \lambda - \lambda \sum_{i = 1}^n y_i,
\]
use uniroot() with nd1(lambda, y), which was previously written to evaluate the first derivative of the negative log-likelihood w.r.t. \(\lambda\), to find \(\hat \lambda\), the m.l.e. of \(\lambda\), which satisfies \(\partial \log f(\mathbf{y} \mid \lambda) / \partial \lambda = 0\). You may assume \(0.1 < \hat \lambda < 2\).
Solution
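A minimal sketch, assuming the same sample as in the earlier exponential challenges and the nd1() written previously; uniroot() searches the supplied interval, here motivated by \(0.1 < \hat \lambda < 2\), for a sign change of nd1().
y <- c(0.2, 0.5, 0.6, 1.0, 1.8, 2.5)
uniroot(nd1, interval = c(0.1, 2), y = y)$root
length(y) / sum(y) # closed-form m.l.e., n / sum(y), for comparison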
Week 11 lecture 2
Challenges I
- Consider a random variable \(Y\) with pdf
\[
f(y \mid \theta) = \dfrac{\Gamma(2\theta) [y(1 - y)]^{\theta - 1}}{[\Gamma(\theta)]^2} \quad 0 \leq y \leq 1
\]
and for \(\theta > 0\). The log-likelihood for an independent sample, \(y_1, \ldots, y_n\), is then given by
\[
\log f(\mathbf{y} \mid \theta) = n\big[\log\big(\Gamma(2\theta)\big) - 2\log\big(\Gamma(\theta)\big)\big] + (\theta - 1) \sum_{i = 1}^n \left\{\log y_i + \log(1 - y_i)\right\}.
\]
Write a function, negloglik(theta, y), to evaluate the negative of the log-likelihood above, and then for the data \[ 0.16, 0.35, 0.37, 0.46, 0.67, 0.69, 0.79, 0.8, 0.91, 0.92 \] find the maximum likelihood estimate \(\hat \theta\) using optimize(), assuming that \(1 \leq \hat \theta \leq 3\). [Recall that lgamma(x) evaluates \(\log\big(\Gamma(x)\big)\) in R.]
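A sketch of one possible solution, using lgamma() as suggested and applying optimize(), which minimises by default, to the negative log-likelihood:
negloglik <- function(theta, y) {
  # Function to evaluate negative log-likelihood of the
  # pdf above (a Beta(theta, theta) density)
  # theta is a scalar
  # y is a vector with values in [0, 1]
  # returns a scalar
  n <- length(y)
  -n * (lgamma(2 * theta) - 2 * lgamma(theta)) -
    (theta - 1) * sum(log(y) + log(1 - y))
}
y <- c(0.16, 0.35, 0.37, 0.46, 0.67, 0.69, 0.79, 0.8, 0.91, 0.92)
optimize(negloglik, interval = c(1, 3), y = y)$minimum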