Chapter 5
Week 9 lecture 2
Challenges I
- Consider a random variable \(Y\) with pdf
\[
f(y \mid \theta) = \dfrac{\Gamma(2\theta) [y(1 - y)]^{\theta - 1}}{[\Gamma(\theta)]^2}, \quad 0 \leq y \leq 1,
\]
and \(\theta > 0\). The log-likelihood for an independent sample, \(y_1, \ldots, y_n\), is then given by
\[
\log f(\mathbf{y} \mid \theta) = n\big[\log\big(\Gamma(2\theta)\big) - 2\log\big(\Gamma(\theta)\big)\big] + (\theta - 1) \sum_{i = 1}^n \left\{\log y_i + \log(1 - y_i)\right\}.
\]
Write a function, negloglik(theta, y), to evaluate the negative of the log-likelihood above, and then for the data \[ 0.16, 0.35, 0.37, 0.46, 0.67, 0.69, 0.79, 0.8, 0.91, 0.92 \] find the maximum likelihood estimate \(\hat \theta\) using optimize(), assuming that \(1 \leq \hat \theta \leq 3\). [Recall that lgamma(x) evaluates \(\log\big(\Gamma(x)\big)\) in R.]
Solution
negloglik <- function(theta, y) {
# Function to evaluate negative log-likelihood
# theta is a scalar
# y is a vector
# returns a scalar
n <- length(y)
loglik <- n * (lgamma(2 * theta) - 2 * lgamma(theta))
loglik <- loglik + (theta - 1) * sum(log(y) + log(1 - y))
-loglik
}
y <- c(.16, .35, .37, .46, .67, .69, .79, .8, .91, .92)
negloglik(2, y)
optimize(negloglik, c(1, 3), y = y)
Week 9 lecture 3
Challenges I
- Consider modelling the sample \(y_1, \ldots, y_n\) as independent Exp(\(\lambda\)) random variables. Given that their log-likelihood is
\[
\log f(\mathbf{y} \mid \lambda) = n \log \lambda - \lambda \sum_{i = 1}^n y_i
\]
and then that
\[
\dfrac{\partial \log f(\mathbf{y} \mid \lambda)}{\partial \lambda} = \dfrac{n}{\lambda} - \sum_{i = 1}^n y_i \quad \text{and} \quad \dfrac{\partial^2 \log f(\mathbf{y} \mid \lambda)}{\partial \lambda^2} = -\dfrac{n}{\lambda^2}
\]
write functions called nd1(lambda, y) and nd2(lambda, y) that evaluate the first and second derivatives of the negative log-likelihood, i.e. of \(-\log f(\mathbf{y} \mid \lambda)\).
Solution
nd1 <- function(lambda, y) {
# Function to evaluate first derivative of exponential
# negative log-likelihood w.r.t. lambda
# lambda is a scalar
# y is a vector
# returns a scalar
n <- length(y)
sum(y) - n / lambda
}
nd2 <- function(lambda, y) {
# Function to evaluate second derivative of exponential
# negative log-likelihood w.r.t. lambda
# lambda is a scalar
# y is a vector
# returns a scalar
n <- length(y)
n / lambda^2
}
- Find the first Newton step given a starting value of \(\lambda_0 = 1\) and the sample \[ 0.2, 0.5, 0.6, 1.0, 1.8, 2.5 \]
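A minimal sketch of the step, using nd1() and nd2() above; Newton's update is \(\lambda_1 = \lambda_0 - \text{nd1}(\lambda_0, \mathbf{y}) / \text{nd2}(\lambda_0, \mathbf{y})\):
y <- c(.2, .5, .6, 1, 1.8, 2.5)
lambda_0 <- 1
# Newton update: subtract the ratio of first to second derivative
lambda_1 <- lambda_0 - nd1(lambda_0, y) / nd2(lambda_0, y)
lambda_1 # 1 - 0.6 / 6 = 0.9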
Week 10 lecture 1
Challenges I
- Consider modelling the sample \(y_1, \ldots, y_n\) as independent Exp(\(\lambda\)) random variables. Given that their log-likelihood is
\[
\log f(\mathbf{y} \mid \lambda) = n \log \lambda - \lambda \sum_{i = 1}^n y_i,
\]
write a function called nd0(lambda, y) that evaluates the negative log-likelihood, i.e. \(-\log f(\mathbf{y} \mid \lambda)\).
Solution
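A possible implementation, mirroring the comment style used for nd1() and nd2():
nd0 <- function(lambda, y) {
# Function to evaluate exponential negative log-likelihood
# lambda is a scalar
# y is a vector
# returns a scalar
n <- length(y)
lambda * sum(y) - n * log(lambda)
}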
- Use nlminb() with nd0(), nd1() and nd2() to find \(\hat \lambda\), the maximum likelihood estimate, given the independent sample \[ 0.2, 0.5, 0.6, 1.0, 1.8, 2.5 \] and a starting value of \(\lambda_0 = 1\). [Note that you will need to ensure that nd2() returns a matrix, albeit a trivial \(1 \times 1\) one. Using as.matrix() is probably easiest.]
Solution
nd1 <- function(lambda, y) {
# Function to evaluate first derivative of exponential
# negative log-likelihood w.r.t. lambda
# lambda is a scalar
# y is a vector
# returns a scalar
n <- length(y)
sum(y) - n / lambda
}
nd2 <- function(lambda, y) {
# Function to evaluate second derivative of exponential
# negative log-likelihood w.r.t. lambda
# lambda is a scalar
# y is a vector
# returns a 1 x 1 matrix
n <- length(y)
matrix(n / lambda^2, 1, 1)
}
y <- c(.2, .5, .6, 1, 1.8, 2.5)
lambda_0 <- 1
nlminb(lambda_0, nd0, nd1, nd2, y = y)
- Confirm your result with optimize() and optim(..., method = 'Brent').
Solution
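A sketch of both checks; the interval \((0.1, 10)\) is an arbitrary choice assumed to bracket \(\hat \lambda\), and the Brent method requires explicit lower and upper bounds:
optimize(nd0, c(0.1, 10), y = y)
optim(lambda_0, nd0, y = y, method = 'Brent', lower = 0.1, upper = 10)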
- Experiment with different starting values \(\lambda_0\) in order to investigate the sensitivity of Newton's method to starting values (on this occasion).
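A possible sketch; the starting values below are illustrative choices, and the lower bound simply keeps \(\lambda\) positive:
starts <- c(0.01, 0.1, 1, 10, 100)
# fit from each starting value and collect the estimates
sapply(starts, function(l0) nlminb(l0, nd0, nd1, nd2, y = y, lower = 1e-8)$par)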
Week 10 lecture 2
Challenges I
- Consider modelling the sample \(y_1, \ldots, y_n\) as independent Exp(\(\text{e}^\tau\)) random variables. Given that their log-likelihood is
\[
\log f(\mathbf{y} \mid \tau) = n \tau - \text{e}^\tau \sum_{i = 1}^n y_i,
\]
use nlminb() by supplying the negative log-likelihood and its first and second derivatives w.r.t. \(\tau\) to find \(\hat \tau\), its maximum likelihood estimate, given the independent sample \[ 0.2, 0.5, 0.6, 1.0, 1.8, 2.5 \] and a starting value of \(\tau_0 = 0\).
Solution
nd0_tau <- function(tau, y) {
# function to evaluate Exp(exp(tau)) neg. log lik.
# tau is a scalar
# y is a vector
# returns a scalar
n <- length(y)
-n * tau + exp(tau) * sum(y)
}
y0 <- c(.2, .5, .6, 1, 1.8, 2.5)
tau0 <- 0
nd0_tau(tau0, y0)
nd1_tau <- function(tau, y) {
# Function to evaluate 1st deriv. w.r.t. tau
# of Exp(exp(tau)) neg. log lik.
# tau is a scalar
# y is a vector
# returns a scalar
n <- length(y)
-n + exp(tau) * sum(y)
}
nd2_tau <- function(tau, y) {
# Function to evaluate 2nd deriv. w.r.t. tau
# of Exp(exp(tau)) neg. log lik.
# tau is a scalar
# y is a vector
# returns a 1 x 1 matrix
as.matrix(exp(tau) * sum(y))
}
nd1_tau(tau0, y0)
nd2_tau(tau0, y0)
fit <- nlminb(tau0, nd0_tau, nd1_tau, nd2_tau, y = y0)
tau_hat <- fit$par
exp(tau_hat)
- Confirm the result with optimize().
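A minimal check, assuming the search interval \((-2, 2)\) contains \(\hat \tau\):
optimize(nd0_tau, c(-2, 2), y = y0)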
Week 10 lecture 3
Challenges I
- Consider the sample \(y_1, \ldots, y_n\) given by \[ 1.8, 2.1, 2.3, 2.4, 2.5, 2.7, 3.1, 4.3, 4.3, 4.4 \] as \(n = 10\) independent realisations from the Normal(\(\mu\), \(\sigma^2\)) with \(\mu\) unknown and \(\sigma^2 = 1\), i.e. with pdf \[ f(y \mid \mu, \sigma) = \frac{1}{\sqrt{2 \pi \sigma^2}} \text{e}^{-(y - \mu)^2/2\sigma^2}. \] Find \(\log f(\mathbf{y} \mid \mu = 1, \sigma^2 = 1) = \sum_{i = 1}^n \log f(y_i \mid \mu = 1, \sigma^2 = 1)\).
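This can be evaluated directly in R with dnorm():
y <- c(1.8, 2.1, 2.3, 2.4, 2.5, 2.7, 3.1, 4.3, 4.3, 4.4)
# log-likelihood at mu = 1, sigma = 1
sum(dnorm(y, mean = 1, sd = 1, log = TRUE))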
- Write a function in R, negloglik(mu, y), that evaluates \(-\log f(\mathbf{y} \mid \mu, \sigma^2 = 1)\) for mu \(= \mu\) and y \(= \mathbf{y}\).
Solution
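A possible implementation, retaining the \(n \log(2 \pi) / 2\) constant so that the result matches -sum(dnorm(y, mu, 1, log = TRUE)):
negloglik <- function(mu, y) {
# Function to evaluate Normal(mu, 1) negative log-likelihood
# mu is a scalar
# y is a vector
# returns a scalar
n <- length(y)
n * log(2 * pi) / 2 + sum((y - mu)^2) / 2
}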
- Find \(-\partial \log f(\mathbf{y} \mid \mu, \sigma^2 = 1) / \partial \mu\) and write a function in R, negloglik_d1(mu, y), that evaluates it for mu \(= \mu\) and y \(= \mathbf{y}\).
Solution
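Differentiating gives \(-\partial \log f(\mathbf{y} \mid \mu, \sigma^2 = 1) / \partial \mu = -\sum_{i = 1}^n (y_i - \mu)\), which leads to:
negloglik_d1 <- function(mu, y) {
# Function to evaluate first derivative of Normal(mu, 1)
# negative log-likelihood w.r.t. mu
# mu is a scalar
# y is a vector
# returns a scalar
-sum(y - mu)
}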
- Use optim() in R to find \(\hat \mu\), the maximum likelihood estimate of \(\mu\), using the BFGS method. You should supply negloglik_d1(mu, y) to evaluate the first derivative w.r.t. \(\mu\), and use \(\mu_0 = 1\) as a starting value.
Solution
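A minimal sketch, supplying the gradient to BFGS:
fit <- optim(1, negloglik, negloglik_d1, y = y, method = 'BFGS')
fit$par # mu_hat, which coincides with the sample mean here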
- Evaluate \(-\partial \log f(\mathbf{y} \mid \mu, \sigma^2 = 1) / \partial \mu\) at \(\hat \mu\) to verify that \(\hat \mu\) is the m.l.e.
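At the m.l.e. the first derivative of the negative log-likelihood should be (numerically) zero:
negloglik_d1(fit$par, y)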
Week 11 lecture 1
Extra example
Consider the sample \(y_1, \ldots, y_n\) given by \[ 1.8, 2.1, 2.3, 2.4, 2.5, 2.7, 3.1, 4.3, 4.3, 4.4 \] as \(n = 10\) independent realisations from the Normal(\(\mu\), \(\sigma^2\)) with \(\mu\) and \(\sigma^2\) unknown, i.e. with pdf \[ f(y \mid \mu, \sigma) = \frac{1}{\sqrt{2 \pi \sigma^2}} \text{e}^{-(y - \mu)^2/2\sigma^2}. \]
- Write a function in R, negloglik(pars, y), that evaluates \(-\log f(\mathbf{y} \mid \mu, \sigma^2)\) for pars[1] \(= \mu\), pars[2] \(= \sigma\) and y \(= \mathbf{y}\).
Solution
negloglik <- function(pars, y) {
# Function to evaluate negative log-likelihood
# pars is a 2-vector, where
# pars[1] is mu and
# pars[2] is sigma
# y is a vector
# returns a scalar
mu <- pars[1]
sigma <- pars[2]
if (sigma <= 0) # return a large value if sigma is infeasible
return(1e6)
n <- length(y)
n * log(2 * pi) / 2 + n * log(sigma) + sum((y - mu)^2) / 2 / sigma^2
}
y <- c(1.8, 2.1, 2.3, 2.4, 2.5, 2.7, 3.1, 4.3, 4.3, 4.4)
negloglik(c(1.5, 1.5), y)
-sum(dnorm(y, 1.5, 1.5, log = TRUE))
- Find \(-\nabla \log f(\mathbf{y} \mid \mu, \sigma)\), i.e. the 2-vector comprising \(-\partial \log f(\mathbf{y} \mid \mu, \sigma) / \partial \mu\) and \(-\partial \log f(\mathbf{y} \mid \mu, \sigma) / \partial \sigma\), and then write a function in R, negloglik_d1(pars, y), that evaluates \(-\nabla \log f(\mathbf{y} \mid \mu, \sigma)\).
Solution
negloglik_d1 <- function(pars, y) {
# Function to evaluate derivative of
# negative log-likelihood w.r.t. mu and sigma
# pars is a 2-vector, where
# pars[1] is mu and
# pars[2] is sigma
# y is a vector
# returns a 2-vector
mu <- pars[1]
sigma <- pars[2]
n <- length(y)
out <- numeric(2)
res <- y - mu
out[1] <- -sum(res) / sigma^2
out[2] <- n / sigma - sum(res^2) / sigma^3
out
}
negloglik_d1(c(1.5, 1.5), y)
- Use the BFGS method with optim() in R to find \(\hat \mu\) and \(\hat \sigma\), the maximum likelihood estimates of \(\mu\) and \(\sigma\). Use \(\mu_0 = 1.5\) and \(\sigma_0 = 1.5\) as starting values.
Solution
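A minimal sketch using the two functions above:
fit <- optim(c(1.5, 1.5), negloglik, negloglik_d1, y = y, method = 'BFGS')
fit$par # (mu_hat, sigma_hat)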
- Evaluate \(-\nabla \log f(\mathbf{y} \mid \mu, \sigma)\) at \(\hat \mu\) and \(\hat \sigma\) to verify that the m.l.e.s have been found.
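Both components of the gradient should be (numerically) zero at the m.l.e.s:
negloglik_d1(fit$par, y)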
Challenges I
- Consider the alternative parameterisation of the Weibull pdf in terms of parameters \((\log \lambda, \log k)\) given by
\[
f(y \mid \tilde \lambda, \tilde k) =
\frac{\exp(\tilde k)}{\exp(\tilde \lambda)}\left(\frac{y}{\exp(\tilde \lambda)}\right)^{\exp(\tilde k)-1}e^{-(y/\exp(\tilde \lambda))^{\exp(\tilde k)}} \quad y > 0,
\]
where \(-\infty < \tilde \lambda, \tilde k < \infty\). Write a function in R, weib2_d0(pars, y, mult = 1), that evaluates the scaled log-likelihood of the model, i.e. \(m \log f(\mathbf{y} \mid \tilde \lambda, \tilde k)\), where pars is a 2-vector such that pars[1] \(= \tilde \lambda\) and pars[2] \(= \tilde k\), y is the data \(\mathbf{y}\), and mult \(= m\) is a multiplier as for weib_d0().
Solution
weib2_d0 <- function(pars, y, mult = 1) {
# Function to evaluate re-parameterised Weibull log-likelihood
# pars is a vector
# pars[1] is $\tilde \lambda$
# pars[2] is $\tilde k$
# y can be scalar or vector
# mult is a scalar defaulting to 1; so -1 returns neg. log likelihood
# returns a scalar
n <- length(y)
epars <- exp(pars)
out <- n * (pars[2] - epars[2] * pars[1]) +
(epars[2] - 1) * sum(log(y)) - sum((y / epars[1])^epars[2])
mult * out
}
- Find the maximum likelihood estimates of \((\tilde \lambda, \tilde k)\) using the Nelder-Mead method, starting at \((\tilde \lambda_0, \tilde k_0) = \big(\log(1.6), \log(0.6)\big)\).
- Confirm that these are related to those of the original parameterisation through the logarithm, as in the sketch below.
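A sketch covering both items, taking y0 to be the wind speed data of Example 5.5 (as used in the next lecture):
y0 <- c(3.52, 1.95, 0.62, 0.02, 5.13, 0.02, 0.01, 0.34, 0.43, 15.5,
4.99, 6.01, 0.28, 1.83, 0.14, 0.97, 0.22, 0.02, 1.87, 0.13, 0.01,
4.81, 0.37, 8.61, 3.48, 1.81, 37.21, 1.85, 0.04, 2.32, 1.06)
fit_nm <- optim(c(log(1.6), log(0.6)), weib2_d0, y = y0, mult = -1,
method = 'Nelder-Mead')
fit_nm$par # (lambda_tilde_hat, k_tilde_hat)
exp(fit_nm$par) # the estimates of (lambda, k) under the original parameterisation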
Week 11 lecture 2
Challenges I
- Recall the Weibull parameterisation in terms of \((\log \lambda, \log k)\): \[ f(y \mid \tilde \lambda, \tilde k) = \frac{\exp(\tilde k)}{\exp(\tilde \lambda)}\left(\frac{y}{\exp(\tilde \lambda)}\right)^{\exp(\tilde k)-1}e^{-(y/\exp(\tilde \lambda))^{\exp(\tilde k)}} \quad y > 0, \] where \(-\infty < \tilde \lambda, \tilde k < \infty\).
- Use simulated annealing to find the maximum likelihood estimates of \((\tilde \lambda, \tilde k)\) for the wind speed data of Example 5.5, using a starting temperature of \(T = 10\) and \(N = 10^3\) iterations, i.e. control = list(temp = 10, maxit = 1e3).
Solution
weib2_d0 <- function(pars, y, mult = 1) {
# Function to evaluate re-parameterised Weibull log-likelihood
# pars is a vector
# pars[1] is $\tilde \lambda$
# pars[2] is $\tilde k$
# y can be scalar or vector
# mult is a scalar defaulting to 1; so -1 returns neg. log likelihood
# returns a scalar
n <- length(y)
epars <- exp(pars)
out <- n * (pars[2] - epars[2] * pars[1]) +
(epars[2] - 1) * sum(log(y)) - sum((y / epars[1])^epars[2])
mult * out
}
y0 <- c(3.52, 1.95, 0.62, 0.02, 5.13, 0.02, 0.01, 0.34, 0.43, 15.5,
4.99, 6.01, 0.28, 1.83, 0.14, 0.97, 0.22, 0.02, 1.87, 0.13, 0.01,
4.81, 0.37, 8.61, 3.48, 1.81, 37.21, 1.85, 0.04, 2.32, 1.06)
optim(c(log(1.6), log(0.6)), weib2_d0, y = y0, mult = -1, method = 'SANN',
control = list(maxit = 1e3, temp = 10))
- Repeat the above optimisation 20 times, and comment on how the resulting parameter estimates vary.
Solution
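A possible sketch: simulated annealing is stochastic, so repeated runs give different estimates; replicate() collects all 20 sets of estimates so their spread can be inspected.
ests <- replicate(20,
optim(c(log(1.6), log(0.6)), weib2_d0, y = y0, mult = -1, method = 'SANN',
control = list(maxit = 1e3, temp = 10))$par)
apply(ests, 1, range) # range of each parameter estimate across the 20 runs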
- Change \(T\) to \(T = 0.1\) and then repeat Question 2.
Solution
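The same sketch with the starting temperature lowered to \(T = 0.1\):
ests_cool <- replicate(20,
optim(c(log(1.6), log(0.6)), weib2_d0, y = y0, mult = -1, method = 'SANN',
control = list(maxit = 1e3, temp = 0.1))$par)
apply(ests_cool, 1, range)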
- Comment on what you observe between Question 2 and Question 3.