# MixedModelsPermutations.jl API

## Nonparametric Bootstrap

### MixedModelsPermutations.nonparametricbootstrap — Function

```julia
nonparametricbootstrap([rng::AbstractRNG,] nsamp::Integer, m::LinearMixedModel;
                       use_threads=false, blup_method=ranef, β=coef(m))
```

Perform `nsamp` nonparametric bootstrap replication fits of `m`, returning a `MixedModelBootstrap`.

The default random number generator is `Random.GLOBAL_RNG`.

`GeneralizedLinearMixedModel` is currently unsupported.
#### Named Arguments

`use_threads` determines whether or not to use thread-based parallelism.

Note that `use_threads=true` may not offer a performance boost and may even decrease performance if multithreaded linear algebra (BLAS) routines are available. In this case, threads at the level of the linear algebra may already occupy all processors/processor cores. There are plans to provide better support for coordinating Julia- and BLAS-level threads in the future.

The PRNG shared between threads is locked using `Threads.SpinLock`, which should not be used recursively. Do not wrap `nonparametricbootstrap` in an outer `SpinLock`.

`hide_progress` can be used to disable the progress bar. Note that the progress bar is automatically disabled in non-interactive (i.e. logging) contexts.
`blup_method` provides options for how/which group-level effects are passed for resampling. The default, `ranef`, uses the shrunken conditional modes / BLUPs. Unshrunken estimates from ordinary least squares (OLS) can be used instead with `olsranef`. There is no shrinkage of the group-level estimates with this approach, which means singular estimates can be avoided. However, if the design matrix for the random effects is rank deficient (e.g., through the use of `MixedModels.fulldummy` or missing cells in the data), then this method will fail. See `olsranef` and `MixedModels.ranef` for more information.
#### Method

The method implemented here is based on the approach given in Section 3.2 of: Carpenter, J.R., Goldstein, H. and Rasbash, J. (2003). A novel bootstrap procedure for assessing the relationship between class size and achievement. Journal of the Royal Statistical Society: Series C (Applied Statistics), 52: 431-443. https://doi.org/10.1111/1467-9876.00415

Note that this method is not exported, to match `permute!`.
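As a minimal usage sketch (assuming the `sleepstudy` dataset shipped with MixedModels.jl; `shortestcovint` is the standard MixedModels.jl summary for bootstrap results):

```julia
# Sketch: fit a model to the sleepstudy dataset, then draw bootstrap
# replicates. The qualified call works whether or not the function is
# exported.
using MixedModels, MixedModelsPermutations, Random

m = fit(MixedModel,
        @formula(reaction ~ 1 + days + (1 + days | subj)),
        MixedModels.dataset(:sleepstudy))

rng = MersenneTwister(42)   # pass an explicit RNG for reproducibility
boot = MixedModelsPermutations.nonparametricbootstrap(rng, 1000, m)

ci = shortestcovint(boot)   # shortest coverage intervals from the replicates
```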
### MixedModelsPermutations.resample! — Function

```julia
resample!([rng::AbstractRNG,] model::LinearMixedModel;
          β=coef(model), blups=ranef(model), resids=residuals(model, blups),
          scalings=inflation_factor(model))
```

Simulate and install a new response using resampling at the observational and group level.

At both levels, resampling is done with replacement. At the observational level, this is resampling of the residuals, i.e. comparable to the corresponding step in the classical nonparametric bootstrap for OLS regression. At the group level, samples are drawn jointly for all terms ("random intercept and random slopes", conditional modes) associated with a particular level of a blocking factor. For example, the predicted slope and intercept for a single participant or item are kept together. This clumping in the resampling procedure is necessary to preserve the original correlation structure of the slopes and intercepts.

In addition to this resampling step, there is also an inflation step. Due to the shrinkage associated with the random effects in a mixed model, the variance of the conditional modes / BLUPs / random intercepts and slopes is less than the variance estimated by the model and displayed in the model summary or via `MixedModels.VarCorr`. This shrinkage also impacts the observational-level residuals. To compensate for this, the resampled residuals and group-level effects are scale-inflated so that their standard deviation matches that of the estimates in the original model. The default inflation factor is computed with `inflation_factor` on the model.

See also `nonparametricbootstrap` and `MixedModels.simulate!`.
This method has serious limitations for singular models because resampling from a distribution with many zeros (e.g. the random effects components with zero variance) will often generate new data with even less variance.
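A single bootstrap replicate can be sketched by hand; this is roughly the step `nonparametricbootstrap` performs internally (a sketch, assuming the `sleepstudy` model as above and the usual `simulate!`/`refit!` pattern from MixedModels.jl):

```julia
# Sketch: resample! installs a new response in (a copy of) the model,
# which is then refit to the freshly simulated response.
using MixedModels, MixedModelsPermutations, Random

m = fit(MixedModel,
        @formula(reaction ~ 1 + days + (1 + days | subj)),
        MixedModels.dataset(:sleepstudy))
rng = MersenneTwister(1)

blups = ranef(m)                                         # shrunken conditional modes
scalings = MixedModelsPermutations.inflation_factor(m)   # default scale inflation

m2 = deepcopy(m)   # keep the original fit intact
MixedModelsPermutations.resample!(rng, m2; β=coef(m), blups=blups,
                                  scalings=scalings)
refit!(m2)         # fit the model to the resampled response
```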
#### Reference
The method implemented here is based on the approach given in Section 3.2 of: Carpenter, J.R., Goldstein, H. and Rasbash, J. (2003). A novel bootstrap procedure for assessing the relationship between class size and achievement. Journal of the Royal Statistical Society: Series C (Applied Statistics), 52: 431-443. https://doi.org/10.1111/1467-9876.00415
## Permutation Testing

### MixedModelsPermutations.permutation — Function

```julia
permutation([rng::AbstractRNG,] nsamp::Integer, m::LinearMixedModel;
            use_threads::Bool=false,
            β=zeros(length(coef(m))),
            residual_permutation=:signflip,
            blup_method=ranef,
            residual_method=residuals)
```

Perform `nsamp` permutation fits of `m`, returning a `MixedModelPermutation`.

The default random number generator is `Random.GLOBAL_RNG`.

`GeneralizedLinearMixedModel` is currently unsupported.
#### Named Arguments

`use_threads` determines whether or not to use thread-based parallelism.

Note that `use_threads=true` may not offer a performance boost and may even decrease performance if multithreaded linear algebra (BLAS) routines are available. In this case, threads at the level of the linear algebra may already occupy all processors/processor cores. There are plans to provide better support for coordinating Julia- and BLAS-level threads in the future.

The PRNG shared between threads is locked using `Threads.SpinLock`, which should not be used recursively. Do not wrap `permutation` in an outer `SpinLock`.

`hide_progress` can be used to disable the progress bar. Note that the progress bar is automatically disabled in non-interactive (i.e. logging) contexts.
Permutation at the level of residuals can be accomplished either via sign flipping (`residual_permutation=:signflip`) or via classical permutation/shuffling (`residual_permutation=:shuffle`).
`blup_method` provides options for how/which group-level effects are passed for permutation. The default, `ranef`, uses the shrunken conditional modes / BLUPs. Unshrunken estimates from ordinary least squares (OLS) can be used instead with `olsranef`. There is no shrinkage of the group-level estimates with this approach, which means singular estimates can be avoided. However, if the design matrix for the random effects is rank deficient (e.g., through the use of `MixedModels.fulldummy` or missing cells in the data), then this method will fail. See `olsranef` and `MixedModels.ranef` for more information.
`residual_method` provides options for how observation-level residuals are passed for permutation. This should be a two-argument function, taking the model and the BLUPs (as computed with `blup_method`) as arguments. If you wish to ignore the BLUPs computed with `blup_method`, your function still needs to accept the second argument, but it can simply not use it.

`inflation_method` is a three-argument function (the model, the BLUPs as computed by `blup_method`, and the residuals as computed by `residual_method`) for computing the inflation factor passed on to `permute!`.
Generally, permutations are used to test a particular (null) hypothesis. This hypothesis is specified by setting the `β` argument to match the hypothesis. For example, the null hypothesis that the first coefficient is zero would be expressed as

```julia
julia> hypothesis = coef(model);

julia> hypothesis[1] = 0.0;
```
The permutation (test) generates samples from H0, from which it is possible to compute p-values. The bootstrap typically generates samples from H1, which are convenient for generating coverage/confidence intervals. Of course, confidence intervals and p-values are duals of each other, so it is possible to convert from one to the other.
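The snippet above can be completed into a self-contained sketch (assuming the `sleepstudy` model, where the second coefficient is the `days` slope):

```julia
# Sketch: draw permutation samples under H0 that the `days` slope is zero,
# keeping the remaining coefficients at their estimated values.
using MixedModels, MixedModelsPermutations, Random

m = fit(MixedModel,
        @formula(reaction ~ 1 + days + (1 + days | subj)),
        MixedModels.dataset(:sleepstudy))

hypothesis = coef(m)
hypothesis[2] = 0.0   # null hypothesis for the second coefficient (days)

perm = permutation(MersenneTwister(42), 1000, m; β=hypothesis)
```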
#### Method
The method implemented here is based on the approach given in:
ter Braak C.J.F. (1992) Permutation Versus Bootstrap Significance Tests in Multiple Regression and Anova. In: Jöckel KH., Rothe G., Sendler W. (eds) Bootstrapping and Related Techniques. Lecture Notes in Economics and Mathematical Systems, vol 376. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-48850-4_10
and
Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., & Nichols, T. E. (2014). Permutation inference for the general linear model. NeuroImage, 92, 381–397. https://doi.org/10.1016/j.neuroimage.2014.01.060
This method has serious limitations for singular models because sign-flipping a zero is not an effective randomization technique.
### MixedModelsPermutations.olsranef — Function

```julia
olsranef(model::LinearMixedModel, method=:simultaneous)
```

Compute the group-level estimates using ordinary least squares.

This is somewhat similar to computing the conditional modes / BLUPs without shrinkage.

There is no shrinkage of the group-level estimates with this approach, which means singular estimates can be avoided. However, this also means the random-effects design matrix must not be singular.

Two methods are provided:

- (default) OLS estimates computed for all strata (blocking variables) simultaneously with `method=:simultaneous`. This pools the variance across estimates but does not shrink the estimates. Note that this method internally reparameterizes the random-effects matrix `Z` to use effects coding and a single intercept shared across all grouping variables.
- OLS estimates computed within each stratum with `method=:stratum`. This is equivalent, for example, to computing each subject-level and each item-level regression separately.

For fully balanced designs with a single blocking variable, these methods will give the same results.

If the design matrix for the random effects is rank deficient (e.g., through the use of `MixedModels.fulldummy` or missing cells in the data), then these methods will fail because no shrinkage/regularization is applied.
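The two methods can be compared as follows (a sketch using the `sleepstudy` model, which has a single blocking variable and a fully balanced design, so the two results should coincide):

```julia
# Sketch: unshrunken group-level estimates via OLS, with both methods.
using MixedModels, MixedModelsPermutations

m = fit(MixedModel,
        @formula(reaction ~ 1 + days + (1 + days | subj)),
        MixedModels.dataset(:sleepstudy))

fe_simultaneous = olsranef(m)            # default: method=:simultaneous
fe_stratum      = olsranef(m, :stratum)  # separate OLS fit within each stratum
```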
### MixedModelsPermutations.permute! — Function

Note that this method is not exported, to avoid a name collision with `Base.permute!`.

```julia
permute!([rng::AbstractRNG,] model::LinearMixedModel;
         β=zeros(length(coef(model))),
         blups=ranef(model),
         resids=residuals(model, blups),
         residual_permutation=:signflip,
         scalings=inflation_factor(model))
```
Simulate and install a new response via permutation of the residuals at the observational level and sign-flipping of the conditional modes at the group level.

Generally, permutations are used to test a particular (null) hypothesis. This hypothesis is specified by setting the `β` argument to match the hypothesis. For example, the null hypothesis that the first coefficient is zero would be expressed as

```julia
julia> hypothesis = coef(model);

julia> hypothesis[1] = 0.0;
```
Permutation at the level of residuals can be accomplished either via sign flipping (`residual_permutation=:signflip`) or via classical permutation/shuffling (`residual_permutation=:shuffle`).

Sign-flipped permutation of the residuals is similar to permuting the (fixed-effects) design matrix; shuffling the residuals is the same as permuting the rows of the (fixed-effects) design matrix. Sign-flipping the random effects preserves the correlation structure of the random effects, while also being equivalent to permutation via swapped labels for categorical variables.
This method has serious limitations for singular models because sign-flipping a zero is not an effective randomization technique.
Optionally, instead of using the shrunken random effects from `ranef`, within-group OLS estimates can be computed and used instead with `olsranef`. There is no shrinkage of the group-level estimates with this approach, which means singular estimates can be avoided. However, if the design matrix for the random effects is rank deficient (e.g., through the use of `MixedModels.fulldummy` or missing cells in the data), then this method will fail.

In addition to the permutation step, there is also an inflation step. Due to the shrinkage associated with the random effects in a mixed model, the variance of the conditional modes / BLUPs / random intercepts and slopes is less than the variance estimated by the model and displayed in the model summary or via `MixedModels.VarCorr`. This shrinkage also impacts the observational-level residuals. To compensate for this, the permuted residuals and group-level effects are scale-inflated so that their standard deviation matches that of the estimates in the original model. The default inflation factor is computed with `inflation_factor` on the model.
See also `permutation`, `nonparametricbootstrap` and `resample!`.

The functions `coef` and `coefnames` from `MixedModels` may also be useful.
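A single permutation replicate can be generated by hand in the same spirit as `resample!` (a sketch, assuming the `sleepstudy` model; here with residual shuffling rather than the default sign flipping):

```julia
# Sketch: install one permuted response under the global null (default β = 0)
# and refit. permute! is unexported, so the qualified name is used.
using MixedModels, MixedModelsPermutations, Random

m = fit(MixedModel,
        @formula(reaction ~ 1 + days + (1 + days | subj)),
        MixedModels.dataset(:sleepstudy))

m2 = deepcopy(m)   # keep the original fit intact
MixedModelsPermutations.permute!(MersenneTwister(1), m2;
                                 residual_permutation=:shuffle)
refit!(m2)         # fit the model to the permuted response
```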
#### Reference
The method implemented here is based on the approach given in:
ter Braak C.J.F. (1992) Permutation Versus Bootstrap Significance Tests in Multiple Regression and Anova. In: Jöckel KH., Rothe G., Sendler W. (eds) Bootstrapping and Related Techniques. Lecture Notes in Economics and Mathematical Systems, vol 376. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-48850-4_10
and
Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., & Nichols, T. E. (2014). Permutation inference for the general linear model. NeuroImage, 92, 381–397. https://doi.org/10.1016/j.neuroimage.2014.01.060
### MixedModelsPermutations.permutationtest — Function

```julia
permutationtest(perm::MixedModelPermutation, model, type=:greater)
```

Perform a permutation test using the already computed permutations and the observed values from the given model.

The `type` parameter specifies use of a two-sided test (`:twosided`) or the directionality of a one-sided test (either `:lesser` or `:greater`, depending on the hypothesized difference from the null hypothesis).

See also `permutation`.
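A sketch of the full test workflow (assuming the `sleepstudy` model; `permutation` here uses its default null hypothesis of all-zero coefficients):

```julia
# Sketch: two-sided p-values from an already computed set of permutations.
using MixedModels, MixedModelsPermutations, Random

m = fit(MixedModel,
        @formula(reaction ~ 1 + days + (1 + days | subj)),
        MixedModels.dataset(:sleepstudy))

perm = permutation(MersenneTwister(42), 1000, m)
pvalues = permutationtest(perm, m, :twosided)   # one p-value per coefficient
```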
## Scale Inflation

### MixedModelsPermutations.inflation_factor — Function

```julia
inflation_factor(m::LinearMixedModel, blups=ranef(m), resids=residuals(m))
```

Compute how much the standard deviation of the BLUPs/conditional modes and residuals needs to be inflated in order to match the (restricted) maximum-likelihood estimate.

Due to the shrinkage associated with the random effects in a mixed model, the variance of the conditional modes / BLUPs / random intercepts and slopes is less than the variance estimated by the model and displayed in the model summary or via `MixedModels.VarCorr`. This shrinkage also impacts the observational-level residuals. To compensate for this, the resampled residuals and group-level effects need to be scale-inflated so that their standard deviation matches that of the estimates in the original model.

The scale inflation is returned as a `Vector` of inflation factors, one for each of the random-effect strata (blocking variables) and one for the observation-level variability. The factors are on the standard deviation scale (the lower Cholesky factor in the case of vector-valued random effects).
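For example (a sketch with the `sleepstudy` model, which has one blocking variable, so the returned vector should have two elements: a lower Cholesky factor for the vector-valued subject-level effects and a scalar for the residuals):

```julia
# Sketch: inspect the default scale-inflation factors for a fitted model.
using MixedModels, MixedModelsPermutations

m = fit(MixedModel,
        @formula(reaction ~ 1 + days + (1 + days | subj)),
        MixedModels.dataset(:sleepstudy))

factors = MixedModelsPermutations.inflation_factor(m)
# One element per blocking variable (here: subj) plus one for the
# observation-level residuals, on the standard deviation scale.
```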