Coverage Probability

Last edited on January 18, 2018 at 08:16:08 AM

The coverage probability of a confidence interval procedure for estimating \(\pi\) at a fixed value of \(\pi\) is

\[C_n(\pi) = \sum_{k=0}^nI(k, \pi)\binom{n}{k}\pi^k(1 - \pi)^{n-k}\]

where \(I(k, \pi)\) equals 1 if the interval contains \(\pi\) when \(X = k\) and equals 0 if it does not contain \(\pi\).

Coverage Probability (concept)

The coverage probability of a confidence interval procedure is the long-run proportion of intervals, computed from repeated samples at a fixed \(\pi\), that contain \(\pi\).
Confidence intervals are constructed at a given confidence level \((1 - \alpha)\), which is referred to as the nominal coverage probability or the nominal confidence level.
In an ideal setting, the nominal confidence level will equal the coverage probability; however, when assumptions used to derive a confidence interval are not satisfied, the actual coverage probability can be either less than or greater than the nominal confidence level.
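
The long-run interpretation can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it draws B = 100000 samples from \(Bin(n = 25, \pi = 0.70)\), builds the 95% interval \(p \pm z_{1-\alpha/2}\sqrt{p(1-p)/n}\) from each sample, and records the proportion of intervals that contain \(\pi\). The value of B, the seed, and the variable names are arbitrary choices, and the simulated proportion only approximates the coverage probability.

```r
set.seed(1)                  # arbitrary seed, used only for reproducibility
n <- 25                      # number of Bernoulli trials
PI <- 0.70                   # true P(Success)
alpha <- 0.05                # alpha level
B <- 1e5                     # number of simulated samples (assumed value)
x_sim <- rbinom(B, n, PI)    # simulated numbers of successes
p_sim <- x_sim/n             # simulated sample proportions
ME_sim <- qnorm(1 - alpha/2)*sqrt(p_sim*(1 - p_sim)/n)  # margins of error
covered <- (PI >= p_sim - ME_sim) & (PI <= p_sim + ME_sim)
sim_cov <- mean(covered)     # proportion of intervals containing PI
sim_cov
```

Increasing B shrinks the Monte Carlo error of sim_cov, at the cost of run time.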

Example 8.24

Consider a random variable \(X \sim Bin(n = 25, \pi)\), and define \(P = \frac{X}{n}\).

Compute the coverage probability for a 95% Wald (asymptotic) confidence interval if \(\pi = 0.70\).
The Wald (asymptotic) confidence interval for \(\pi\) is given below.

\[CI_{1 - \alpha}(\pi) = \left[ p - z_{1 - \alpha/2} \sqrt{\frac{p(1 - p)}{n}}, p + z_{1 - \alpha/2} \sqrt{\frac{p(1 - p)}{n}}\right]\]

Solution

To compute \(C_{n = 25}(\pi = 0.70)\), one must consider all the possible outcomes for \(X\) when \(n = 25\). The random variable \(X\) can assume values \(0, 1, 2, \ldots,25\), and for each value of \(X\) a different value of \(p\) (the sample proportion of successes) results, which one uses with the Wald (asymptotic) confidence interval to compute a 95% confidence interval.

Code

n <- 25            # number of Bernoulli trials
alpha <- 0.05      # alpha level
x <- 0:n           # vector containing values RV can assume
p <- x/n           # vector of possible p values
z <- qnorm(1 - alpha/2)     # critical value
ME <- z*sqrt(p*(1 - p)/n)   # margin of error
lcl <- p - ME      # lower confidence limit  
ucl <- p + ME      # upper confidence limit  
PI <- 0.70         # PI = P(Success)
BP <- dbinom(x, n, PI)      # Binomial probability
cover <- (PI >= lcl) & (PI <= ucl)  # Logical vector

Code (continued)

RES <- cbind(x, p, lcl, ucl, BP, cover) # cover is coerced to 0/1
DT::datatable(round(RES, 4), options = list(pageLength = 5, 
                                            autoWidth = TRUE))

Computing the Coverage Probability

Recall that \(C_n(\pi) = \sum_{k=0}^nI(k, \pi)\binom{n}{k}\pi^k(1 - \pi)^{n-k}\).
To evaluate the sum, programmatically add the binomial probabilities (BP) for every value of \(X\) whose Wald interval contains \(\pi\).

x[cover]

[1] 13 14 15 16 17 18 19 20 21

In this problem, \[C_{n = 25}(\pi = 0.70) = P(X = 13) + \cdots + P(X = 21).\]

Final Code

dbinom(x[cover], n, PI)

[1] 0.02677676 0.05355351 0.09163601 0.13363585 0.16507958 0.17119364
[7] 0.14716646 0.10301652 0.05723140

sum(dbinom(x[cover], n, PI))

[1] 0.9492897

binom::binom.coverage(p = 0.70, n = 25, 
                      conf.level = 0.95, method = "asymptotic")

      method   p  n  coverage
1 asymptotic 0.7 25 0.9492897

\(C_{n = 25}(\pi = 0.70) = 0.9492897\).
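
The steps above can be collected into a single function so that the calculation is easy to repeat for other values of \(n\) and \(\pi\). This is a minimal sketch; wald_coverage() is a hypothetical helper name and is not part of the binom package.

```r
# Hypothetical helper: exact coverage probability of the Wald interval
wald_coverage <- function(n, PI, alpha = 0.05) {
  x <- 0:n                                    # values the RV can assume
  p <- x/n                                    # possible sample proportions
  ME <- qnorm(1 - alpha/2)*sqrt(p*(1 - p)/n)  # margin of error for each p
  cover <- (PI >= p - ME) & (PI <= p + ME)    # does the interval contain PI?
  sum(dbinom(x[cover], n, PI))                # add P(X = k) over covering k
}
wald_coverage(n = 25, PI = 0.70)
```

The call should agree with the value 0.9492897 obtained above.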

Example 8.24 (continued)

Compute and graph the coverage probability for the Wald (asymptotic) confidence interval, using a confidence level of 95% with 2000 equally spaced values of \(\pi\).
Previously, we computed the coverage probability when \(\pi\) was 0.70. In this problem, we will need to compute 2000 coverage probability values and graph those against the 2000 values of \(\pi\).

R Code

n <- 25            # number of Bernoulli trials
alpha <- 0.05      # alpha level
CL <-  1 - alpha   # Confidence level
x <- 0:n           # vector containing values RV can assume
p <- x/n           # vector of possible p values
z <- qnorm(1 - alpha/2)     # critical value
ME <- z*sqrt(p*(1 - p)/n)   # margin of error
lcl <- p - ME      # lower confidence limit  
ucl <- p + ME      # upper confidence limit  
m <- 2000          # number of pi values
PI <- seq(1/m, 1 - 1/m, length.out = m)   # PI = P(Success)
P_cov <- numeric(m) # allocating storage space
for(i in seq_len(m)){
  cover <- (PI[i] >= lcl) & (PI[i] <= ucl)  # Logical vector
  P_cov[i] <- sum(dbinom(x[cover], n, PI[i]))
}
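
The same calculation can be written without an explicit loop. The sketch below is one possible alternative, not a replacement for the code above: it uses vapply() over the grid of \(\pi\) values and then summarizes the curve numerically. The names P_cov_v and prob are introduced here for illustration.

```r
n <- 25                      # number of Bernoulli trials
alpha <- 0.05                # alpha level
CL <- 1 - alpha              # confidence level
x <- 0:n
p <- x/n
ME <- qnorm(1 - alpha/2)*sqrt(p*(1 - p)/n)
lcl <- p - ME
ucl <- p + ME
m <- 2000
PI <- seq(1/m, 1 - 1/m, length.out = m)
# Coverage probability at each grid value of pi
P_cov_v <- vapply(PI, function(prob) {
  cover <- (prob >= lcl) & (prob <= ucl)
  sum(dbinom(x[cover], n, prob))
}, numeric(1))
range(P_cov_v)               # smallest and largest coverage on the grid
mean(P_cov_v < CL)           # share of grid points below the nominal level
```

The two summaries give a quick numerical check on how far the coverage strays from the nominal level before any graph is drawn.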

Final Graph Code

plot(PI, P_cov, type = "l", xlab = expression(pi), 
     ylab = "Coverage Probability", ylim = c(0.0, 1.05))
lines(c(1/m, 1 - 1/m), c(CL, CL), col = "red", 
      lty = "dotted")
text(0.5, CL + 0.05, paste("Targeted Confidence Level =", CL))

Final Graph

`ggplot2` code

DF <- data.frame(PI, P_cov)
library(ggplot2)
ggplot(data = DF, aes(x = PI, y = P_cov)) + 
  geom_line() + 
  theme_bw() + 
  labs(x = expression(pi), y = "Coverage Probability") + 
  geom_hline(yintercept = CL, color = "red", lty = "dashed") + 
  annotate("text", x = 0.5, y = CL + 0.05, 
           label = paste("Targeted Confidence Level =", CL))

`ggplot2` Graph

Using `binom`

library(binom)
binom.plot(n = 25, method = binom.asymp, np = 2000)

Better Confidence Intervals for \(\pi\)

Wilson confidence interval
Agresti-Coull confidence interval
Clopper-Pearson confidence interval

Wilson Confidence Interval

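The Wilson interval is obtained by solving the inequalities inside the probability statement

\[ \mathbb{P}\left(P-z_{1-\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}}\leq\pi\leq P + z_{1-\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}}\,\right) \approx 1-\alpha \]

for \(\pi\). The solution is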

\[ CI_{1 - \alpha}(\pi) = [lcl, ucl], \] where \(lcl = \dfrac{p+\frac{z^2_{1-\alpha/2}}{2n}-z_{1-\alpha/2}\sqrt{\frac{p(1-p)}{n}+\frac{z^2_{1-\alpha/2}}{4n^2}}}{\left(1+\frac{z^2_{1-\alpha/2}}{n} \right)}\), and \(ucl = \dfrac{p+\frac{z^2_{1-\alpha/2}}{2n}+z_{1-\alpha/2}\sqrt{\frac{p(1-p)}{n}+\frac{z^2_{1-\alpha/2}}{4n^2}}}{\left(1+\frac{z^2_{1-\alpha/2}}{n} \right)}\).

Computing Options for Wilson (score) Confidence Interval

Use prop.test() with correct = FALSE (no continuity correction)
Use binom.confint() from binom with methods = "wilson"

prop.test(x = 26, n = 40, correct = FALSE, conf.level = 0.90)$conf

[1] 0.5200677 0.7609263
attr(,"conf.level")
[1] 0.9

library(binom)
binom.confint(x = 26, n = 40, conf.level = 0.90, methods = "wilson")

  method  x  n mean     lower     upper
1 wilson 26 40 0.65 0.5200677 0.7609263
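
The closed-form limits can also be coded directly, which makes it easy to check the formula against the output above. This is a minimal sketch; wilson_ci() is a hypothetical helper name, not a function from base R or binom.

```r
# Hypothetical helper: Wilson (score) interval from the closed-form limits
wilson_ci <- function(x, n, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level)/2)          # critical value
  p <- x/n                                    # sample proportion
  center <- p + z^2/(2*n)                     # numerator term shared by both limits
  half <- z*sqrt(p*(1 - p)/n + z^2/(4*n^2))   # plus/minus term in the numerator
  c(lower = center - half, upper = center + half)/(1 + z^2/n)
}
wilson_ci(x = 26, n = 40, conf.level = 0.90)
```

The result should match the limits 0.5200677 and 0.7609263 reported by prop.test() and binom.confint().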

Agresti-Coull Confidence Interval for \(\pi\)

\[ CI_{1-\alpha}(\pi)=\left[\tilde{p}-z_{1-\alpha/2} \sqrt{\frac{\tilde{p}(1-\tilde{p})}{\tilde{n}}},\: \tilde{p}+z_{1-\alpha/2} \sqrt{\frac{\tilde{p}(1-\tilde{p})}{\tilde{n}}} \right] \]

where \(X\) denotes the number of successes in a sample of size \(n\),

\(\tilde{n} = n + z^2_{1 - \alpha/2}\), and
\(\tilde{p} = \frac{1}{\tilde{n}}\left(X + \frac{1}{2}z^2_{1 - \alpha/2} \right)\).
Compute with binom.confint() using methods = "ac"

binom.confint(x = 26, n = 40, conf.level = 0.90, methods = "ac")

         method  x  n mean    lower    upper
1 agresti-coull 26 40 0.65 0.519717 0.761277

Clopper-Pearson Confidence Interval for \(\pi\)

The Clopper-Pearson interval is often referred to as an "exact" confidence interval for \(\pi\). It is

\[ CI_{1-\alpha}(\pi)=\left[\beta_{\alpha/2, x, n - x + 1}, \beta_{1 - \alpha/2, x + 1, n - x} \right] \]

where \(x\) is the observed number of successes in \(n\) trials, and \(\beta_{\alpha/2, x, n - x + 1}\) and \(\beta_{1 - \alpha/2, x + 1, n - x}\) are the \(\alpha/2\) and \(1-\alpha/2\) quantiles of beta distributions with shape parameters \((x, n - x + 1)\) and \((x + 1, n - x)\), respectively. The function binom.confint() from the binom package will return a Clopper-Pearson confidence interval when the user provides the argument methods = "exact".

Computing Clopper-Pearson Confidence Interval

alpha <- 0.10
n <- 40
x <- 26
CI <- c(qbeta(alpha/2, x, n - x + 1), qbeta(1 - alpha/2, x + 1, n - x))
CI

[1] 0.5080545 0.7744675

binom.confint(x = x, n = n, conf.level = 1 - alpha, methods = "exact")

  method  x  n mean     lower     upper
1  exact 26 40 0.65 0.5080545 0.7744675
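
Because the Clopper-Pearson construction inverts exact binomial tests, its coverage probability is never below the nominal level. The sketch below checks this at one point, reusing the enumeration approach from Example 8.24 with \(n = 25\) and \(\pi = 0.70\); cp_coverage() is a hypothetical helper name. It also cross-checks the qbeta() limits against binom.test() from base R, which reports a Clopper-Pearson interval.

```r
# Hypothetical helper: exact coverage probability of the Clopper-Pearson interval
cp_coverage <- function(n, PI, alpha = 0.05) {
  x <- 0:n
  lcl <- qbeta(alpha/2, x, n - x + 1)         # lower limits
  ucl <- qbeta(1 - alpha/2, x + 1, n - x)     # upper limits
  lcl[x == 0] <- 0                            # boundary convention when x = 0
  ucl[x == n] <- 1                            # boundary convention when x = n
  cover <- (PI >= lcl) & (PI <= ucl)
  sum(dbinom(x[cover], n, PI))
}
cp_coverage(n = 25, PI = 0.70)                # should be at least 0.95

# Base R cross-check of the interval computed above for x = 26, n = 40
binom.test(x = 26, n = 40, conf.level = 0.90)$conf.int
```

The price of guaranteed coverage is a wider interval, which is visible in the limits above when they are compared with the Wilson and Agresti-Coull limits for the same data.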

Which One?

Expected Width of 95% Confidence Intervals when \(n = 20\)
