Some authors have shown that the bootstrap distribution can serve as the confidence distribution because it meets the definition of a consonance distribution.1,3 The bootstrap distribution and the asymptotic consonance distribution are defined as:

\[H_{n}(\theta)=1-P\left(\hat{\theta}-\hat{\theta}^{*} \leq \hat{\theta}-\theta | \mathbf{x}\right)=P\left(\hat{\theta}^{*} \leq \theta | \mathbf{x}\right)\]

Certain bootstrap methods, such as the BCa and the studentized (t) bootstrap, also yield second-order accurate consonance distributions, for example:

\[H_{n}(\theta)=1-P\left(\frac{\hat{\theta}^{*}-\hat{\theta}}{\widehat{S E}^{*}\left(\hat{\theta}^{*}\right)} \leq \frac{\hat{\theta}-\theta}{\widehat{S E}(\hat{\theta})} | \mathbf{x}\right)\]
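
Since H_n(θ) is simply the probability that a bootstrap replicate falls at or below θ, the consonance distribution can be estimated directly from the empirical distribution of the replicates. The following is a minimal sketch of that idea on a toy example; it is not how concurve computes its curves internally.

# Minimal sketch (toy example): estimate H_n(theta) as the empirical CDF of
# the percentile-bootstrap replicates and turn it into a two-sided p-value curve.
set.seed(1031)
x <- rnorm(50, mean = 1)                                   # toy sample; the parameter is the mean
theta_star <- replicate(2000, mean(sample(x, replace = TRUE)))
H_n <- ecdf(theta_star)                                    # H_n(theta) = P(theta* <= theta | x)
theta_grid <- seq(min(theta_star), max(theta_star), length.out = 500)
p_value <- 2 * pmin(H_n(theta_grid), 1 - H_n(theta_grid))  # consonance (p-value) curve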

Here, I demonstrate how to use these particular bootstrap methods to arrive at consonance curves and densities.

We’ll use the iris dataset and construct a function that returns our parameter of interest, the Pearson correlation between the first two columns (sepal length and sepal width).

The Nonparametric Bootstrap

iris <- datasets::iris

# Statistic to bootstrap: the Pearson correlation between the first two
# columns (Sepal.Length and Sepal.Width), computed on the resampled rows.
foo <- function(data, indices) {
  dt <- data[indices, ]
  cor(dt[, 1], dt[, 2], method = "pearson")
}
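
As a quick sanity check, calling the function with every row index returns the full-sample correlation, which should match the point estimate reported in the bootstrap output further below (about −0.118):

foo(iris, seq_len(nrow(iris)))  # full-sample Pearson correlation, about -0.118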

We can now use curve_boot() to construct the consonance function. The default approach for this function is the BCa method provided by the bcaboot package.2

I will suppress the output of the call because it is unnecessarily long, but all of the estimates are placed into a list object called y.
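
The call itself looks roughly like the following; the argument names mirror the t-bootstrap call shown later, and the replicate and step counts are only illustrative:

y <- curve_boot(data = iris, func = foo, replicates = 2000, steps = 1000)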

The first item in the list will be the consonance distribution constructed by typical means, while the third item will be the bootstrap approximation to the consonance distribution.

ggcurve(data = y[[1]], nullvalue = TRUE)

ggcurve(data = y[[3]], nullvalue = TRUE)

We can also print out a table of the consonance intervals for use in TeX documents:

(gg <- curve_table(data = y[[1]], format = "image"))

Lower Limit  Upper Limit  Interval Width  Interval Level (%)    CDF  P-value  S-value (bits)
     -0.141       -0.094           0.048                  25  0.625     0.75           0.415
     -0.168       -0.067           0.101                  50  0.750     0.50           1.000
     -0.204       -0.031           0.172                  75  0.875     0.25           2.000
     -0.213       -0.022           0.192                  80  0.900     0.20           2.322
     -0.264        0.029           0.293                  95  0.975     0.05           4.322
     -0.310        0.075           0.386                  99  0.995     0.01           6.644
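
The last three columns are deterministic transformations of the interval level: the P-value is one minus the level expressed as a proportion, the CDF value is 1 − P/2, and the S-value is the binary surprisal −log2(P). For example, for the 95% row:

level <- 0.95
p     <- 1 - level   # 0.05
cdf   <- 1 - p / 2   # 0.975
s     <- -log2(p)    # 4.322 bits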

More bootstrap replications will lead to a smoother function, but for now we can compare these two functions to see how similar they are.

plot_compare(y[[1]], y[[3]])

If we wanted to look at the bootstrap standard errors, we could do so by examining the fifth item in the list:

knitr::kable(y[[5]])

           theta     sdboot         z0          a    sdjack
est   -0.1175698  0.0748308  0.0313380  0.0304863  0.075694
jsd    0.0000000  0.0008222  0.0190043  0.0000000  0.000000

In the top row, theta is the point estimate, sdboot is the bootstrap estimate of the standard error, and sdjack is the jackknife estimate of the standard error; z0 is the bias-correction value and a is the acceleration constant.

The values in the second row are essentially the internal standard errors of the estimates in the top row.
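
For intuition, here is the standard BCa percentile-level adjustment (the textbook formula, not a call into concurve or bcaboot), which shows how z0 and a shift the nominal quantile levels that get read off the bootstrap distribution:

# Standard BCa adjustment: maps a nominal level alpha to the adjusted level
# actually read off the bootstrap distribution, given z0 and a.
bca_level <- function(alpha, z0, a) {
  z <- qnorm(alpha)
  pnorm(z0 + (z0 + z) / (1 - a * (z0 + z)))
}
bca_level(c(0.025, 0.975), z0 = 0.0313380, a = 0.0304863)  # slightly shifted levels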


The densities can also be calculated accurately using the t-bootstrap method. Here we use a different dataset, the Atlanta commute data from the Lock5Data package, to show this:

library(Lock5Data)
data(CommuteAtlanta)

# Statistic to bootstrap: the difference in means between the first two
# columns of the data, computed on the resampled rows.
func <- function(data, index) {
  x <- as.numeric(unlist(data[1]))
  y <- as.numeric(unlist(data[2]))
  return(mean(x[index]) - mean(y[index]))
}
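
As before, a quick check: supplying every row index gives the observed mean difference, which should sit near the centres of the intervals in the table further below (roughly −39):

func(CommuteAtlanta, seq_len(nrow(CommuteAtlanta)))  # observed mean difference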

Our function is a simple mean difference. This time, we’ll set the method to “t” for the t-bootstrap method:

z <- curve_boot(data = CommuteAtlanta, func = func, method = "t", replicates = 2000, steps = 1000)
#> Warning in norm.inter(t, alpha): extreme order statistics used as endpoints
ggcurve(data = z[[1]], nullvalue = FALSE)

ggcurve(data = z[[2]], type = "cd", nullvalue = FALSE)

The consonance curve and density are nearly identical. With more bootstrap replications, they are very likely to converge.

(zz <- curve_table(data = z[[1]], format = "image"))

Lower Limit  Upper Limit  Interval Width  Interval Level (%)    CDF  P-value  S-value (bits)
    -39.391      -39.060           0.331                25.0  0.625    0.750           0.415
    -39.594      -38.861           0.733                50.0  0.750    0.500           1.000
    -39.870      -38.612           1.258                75.0  0.875    0.250           2.000
    -39.928      -38.552           1.376                80.0  0.900    0.200           2.322
    -40.006      -38.474           1.532                85.0  0.925    0.150           2.737
    -40.112      -38.340           1.772                90.0  0.950    0.100           3.322
    -40.302      -38.152           2.149                95.0  0.975    0.050           4.322
    -40.402      -37.968           2.434                97.5  0.988    0.025           5.322
    -40.676      -37.792           2.884                99.0  0.995    0.010           6.644

The Parametric Bootstrap

For the examples above, we mainly used nonparametric bootstrap methods. Here I show an example using the parametric BCa bootstrap and the results it yields.

First, we’ll load our data again and set our function.

data(diabetes, package = "bcaboot")
X <- diabetes$x
y <- scale(diabetes$y, center = TRUE, scale = FALSE)
lm.model <- lm(y ~ X - 1)
mu.hat <- lm.model$fitted.values
sigma.hat <- stats::sd(lm.model$residuals)
t0 <- summary(lm.model)$adj.r.squared  # observed statistic: adjusted R-squared
y.star <- sapply(mu.hat, rnorm, n = 1000, sd = sigma.hat)  # 1000 simulated response vectors
tt <- apply(y.star, 1, function(y) summary(lm(y ~ X - 1))$adj.r.squared)  # bootstrap replicates of t0
b.star <- y.star %*% X  # matrix passed to the parametric BCa routine (one row per resample)
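
For reference, y.star holds one simulated response vector per row (1000 × n), tt is the vector of 1000 bootstrap replicates of the adjusted R-squared, and b.star has one row per resample (1000 × p). A quick way to confirm this, purely for illustration:

dim(y.star)   # 1000 x n
length(tt)    # 1000
dim(b.star)   # 1000 x p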

Now we’ll call curve_boot() again, but set the method to “bcapar” for the parametric BCa method.

df <- curve_boot(method = "bcapar", t0 = t0, tt = tt, bb = b.star)

Now we can look at our outputs.

ggcurve(df[[1]], nullvalue = FALSE)

ggcurve(df[[3]], nullvalue = FALSE)

We can compare the functions to see how well the bootstrap approximations match up:

plot_compare(df[[1]], df[[3]])

We can also look at the density function:

ggcurve(df[[5]], type = "cd", nullvalue = FALSE)

That concludes our demonstration of the bootstrap method to approximate consonance functions.

Session info

#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Catalina 10.15.6
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] Lock5Data_2.8  concurve_2.7.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyr_1.1.1           splines_4.0.2         carData_3.0-4        
#>  [4] ProfileLikelihood_1.1 assertthat_0.2.1      highr_0.8            
#>  [7] metafor_2.4-0         cellranger_1.1.0      yaml_2.2.1           
#> [10] bcaboot_0.2-1         gdtools_0.2.2         pillar_1.4.6         
#> [13] backports_1.1.8       lattice_0.20-41       glue_1.4.1           
#> [16] uuid_0.1-4            digest_0.6.25         ggsignif_0.6.0       
#> [19] colorspace_1.4-1      htmltools_0.5.0       Matrix_1.2-18        
#> [22] pkgconfig_2.0.3       broom_0.7.0           haven_2.3.1          
#> [25] xtable_1.8-4          purrr_0.3.4           scales_1.1.1         
#> [28] km.ci_0.5-2           openxlsx_4.1.5        officer_0.3.12       
#> [31] rio_0.5.16            KMsurv_0.1-5          tibble_3.0.3         
#> [34] farver_2.0.3          generics_0.0.2        car_3.0-8            
#> [37] ggplot2_3.3.2         ellipsis_0.3.1        ggpubr_0.4.0         
#> [40] survival_3.2-3        magrittr_1.5          crayon_1.3.4         
#> [43] readxl_1.3.1          memoise_1.1.0         evaluate_0.14        
#> [46] fs_1.5.0              nlme_3.1-148          MASS_7.3-51.6        
#> [49] rstatix_0.6.0         forcats_0.5.0         xml2_1.3.2           
#> [52] foreign_0.8-80        tools_4.0.2           data.table_1.13.0    
#> [55] hms_0.5.3             lifecycle_0.2.0       stringr_1.4.0        
#> [58] flextable_0.5.10      munsell_0.5.0         zip_2.0.4            
#> [61] compiler_4.0.2        pkgdown_1.5.1         survminer_0.4.8      
#> [64] pbmcapply_1.5.0       systemfonts_0.2.3     rlang_0.4.7          
#> [67] grid_4.0.2            rstudioapi_0.11       labeling_0.3         
#> [70] base64enc_0.1-3       rmarkdown_2.3         boot_1.3-25          
#> [73] gtable_0.3.0          abind_1.4-5           curl_4.3             
#> [76] R6_2.4.1              zoo_1.8-8             gridExtra_2.3        
#> [79] knitr_1.29            dplyr_1.0.1           survMisc_0.5.5       
#> [82] rprojroot_1.3-2       desc_1.2.0            stringi_1.4.6        
#> [85] parallel_4.0.2        Rcpp_1.0.5            vctrs_0.3.2          
#> [88] tidyselect_1.1.0      xfun_0.16

References


1. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. CRC Press; 1994.

2. Efron B, Narasimhan B. The automatic construction of bootstrap confidence intervals. October 2018:17.

3. Xie M-g, Singh K. Confidence Distribution, the Frequentist Distribution Estimator of a Parameter: A Review. International Statistical Review. 2013;81(1):3-39. doi:10.1111/insr.12000