Shrinkage and the correlation parameter

Author

Phillip M. Alday

Published

September 8, 2022

What does omitting the correlation parameter do?

Because the number of correlations grows quadratically with the number of random slopes, adding just one additional slope (whether main effect or interaction) can greatly increase the number of free parameters in the model. We can omit them from the model, using the || in lme4, splitting elements into separate (x + ... | grp) terms in lme4 or MixedModels.jl or using zerocorr in MixedModels.jl.

In terms of philosophy, this is a bit like omitting higher order interactions from the fixed effects: there is change in the bias-variance tradeoff. However, practice suggests that the tradeoff is often worthwhile, although it makes shrinkage less efficient. John Kruschke has a nice worked example on his blog.

For our sleep study example, we we can see that there is very little impact because there is almost no correlation between the random intercept and random slope.

using CairoMakie
using MixedModels
using MixedModelsMakie

sleepstudy = MixedModels.dataset(:sleepstudy)
# REML=false by default in Julia
m2 = fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days | subj)), sleepstudy)
┌ Warning: ProgressMeter by default refresh meters with additional information in IJulia via `IJulia.clear_output`, which clears all outputs in the cell. 
│  - To prevent this behaviour, do `ProgressMeter.ijulia_behavior(:append)`. 
│  - To disable this warning message, do `ProgressMeter.ijulia_behavior(:clear)`.
└ @ ProgressMeter /home/phillip/.julia/packages/ProgressMeter/sN2xr/src/ProgressMeter.jl:618
Minimizing 57    Time: 0:00:00 ( 7.41 ms/it)
Est. SE z p σ_subj
(Intercept) 251.4051 6.6323 37.91 <1e-99 23.7805
days 10.4673 1.5022 6.97 <1e-11 5.7168
Residual 25.5918

We can see this with a shrinkage plot, which show the by-group (here: by-subject) offsets from the grand mean for each random effect. The red dots correspond to the esimtates you would get from classical linear regression within subjects, while the blue dots correspond to the shrunken “estimates” (technically predictions) you get for each subject from the mixed model.

shrinkageplot(m2)

MixedModels.PCA(m2)[:subj]

Principal components based on correlation matrix
 (Intercept)  1.0    .
 days         0.08  1.0

Normalized cumulative variances:
[0.5407, 1.0]

Component loadings
                PC1    PC2
 (Intercept)  -0.71  -0.71
 days         -0.71   0.71
m2_zerocorr = fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + days | subj)), sleepstudy)
Est. SE z p σ_subj
(Intercept) 251.4051 6.7077 37.48 <1e-99 24.1714
days 10.4673 1.5193 6.89 <1e-11 5.7994
Residual 25.5561
shrinkageplot(m2_zerocorr)

MixedModels.PCA(m2)[:subj]

Principal components based on correlation matrix
 (Intercept)  1.0    .
 days         0.08  1.0

Normalized cumulative variances:
[0.5407, 1.0]

Component loadings
                PC1    PC2
 (Intercept)  -0.71  -0.71
 days         -0.71   0.71

If we consider a more complex model, then the change can be much more dramatic:

kb07 = MixedModels.dataset(:kb07)
contrasts = Dict(:subj => Grouping(),
                 :item => Grouping(),
                 :spkr => EffectsCoding(),
                 :prec => EffectsCoding(),
                 :load => EffectsCoding() )
m_kb07 = fit(MixedModel, @formula(rt_raw ~ 1 + spkr * prec * load + (1 + spkr + prec + load|subj) + (1 + spkr + prec + load|item)), kb07; contrasts)
┌ Warning: ProgressMeter by default refresh meters with additional information in IJulia via `IJulia.clear_output`, which clears all outputs in the cell. 
│  - To prevent this behaviour, do `ProgressMeter.ijulia_behavior(:append)`. 
│  - To disable this warning message, do `ProgressMeter.ijulia_behavior(:clear)`.
└ @ ProgressMeter /home/phillip/.julia/packages/ProgressMeter/sN2xr/src/ProgressMeter.jl:618
Minimizing 601   Time: 0:00:00 ( 1.55 ms/it)
  objective:  29652.62866326455
Est. SE z p σ_subj σ_item
(Intercept) 2225.5754 84.9527 26.20 <1e-99 328.0674 393.4597
spkr: old 74.0912 23.5388 3.15 0.0016 43.1209 45.1421
prec: maintain -377.3128 59.7116 -6.32 <1e-09 118.9270 302.2964
load: yes 102.0965 24.6618 4.14 <1e-04 49.7595 58.4598
spkr: old & prec: maintain -28.1656 21.3820 -1.32 0.1878
spkr: old & load: yes 26.9646 21.3821 1.26 0.2073
prec: maintain & load: yes -18.5510 21.3821 -0.87 0.3856
spkr: old & prec: maintain & load: yes 15.6366 21.3820 0.73 0.4646
Residual 904.3472
MixedModels.PCA(m_kb07)[:subj]

Principal components based on correlation matrix
 (Intercept)      1.0    .     .     .
 spkr: old        1.0   1.0    .     .
 prec: maintain  -1.0  -1.0   1.0    .
 load: yes        1.0   1.0  -1.0   1.0

Normalized cumulative variances:
[1.0, 1.0, 1.0, 1.0]

Component loadings
                   PC1    PC2    PC3    PC4
 (Intercept)     -0.5   -0.38  -0.33  -0.71
 spkr: old       -0.5   -0.38  -0.33   0.71
 prec: maintain   0.5    0.09  -0.86   0.0
 load: yes       -0.5    0.84  -0.2    0.0

The effective dimensionality can be seen in the way that the random effects collapse into lines (i.e.. a 1-D object) within the majority of the panels (each representing a 2-D plane).

shrinkageplot(m_kb07, :subj)

m_kb07zc = fit(MixedModel, @formula(rt_raw ~ 1 + spkr * prec * load + zerocorr(1 + spkr + prec + load | subj) + zerocorr(1 + spkr + prec + load | item)), kb07; contrasts)
Est. SE z p σ_subj σ_item
(Intercept) 2225.6609 85.7868 25.94 <1e-99 330.9087 397.6761
spkr: old 74.1345 21.4927 3.45 0.0006 0.0000 0.0000
prec: maintain -377.2695 59.6939 -6.32 <1e-09 113.3176 303.1632
load: yes 102.0110 22.9634 4.44 <1e-05 46.3311 29.4211
spkr: old & prec: maintain -28.0802 21.4927 -1.31 0.1914
spkr: old & load: yes 26.9213 21.4927 1.25 0.2104
prec: maintain & load: yes -18.5943 21.4927 -0.87 0.3870
spkr: old & prec: maintain & load: yes 15.5512 21.4927 0.72 0.4693
Residual 909.0068
MixedModels.PCA(m_kb07zc)[:subj]

Principal components based on correlation matrix
 (Intercept)     1.0  .    .    .
 spkr: old       0.0  0.0  .    .
 prec: maintain  0.0  0.0  1.0  .
 load: yes       0.0  0.0  0.0  1.0

Normalized cumulative variances:
[0.3333, 0.6667, 1.0, 1.0]

Component loadings
                  PC1   PC2   PC3     PC4
 (Intercept)     1.0   0.0   0.0     0.0
 spkr: old       0.0   0.0   0.0   NaN
 prec: maintain  0.0   0.0   1.0     0.0
 load: yes       0.0   1.0   0.0     0.0

When we force the correlations to be zero, we can no longer get diagonal lines – we we can only get horizontal or vertical lines within each panel. Diagonal lines correspond to non zeor correlations between two variance components.

shrinkageplot(m_kb07zc, :subj)