Shrinkage and the correlation parameter

Author

Phillip M. Alday

Published

September 8, 2022

What does omitting the correlation parameter do?

Because the number of correlations grows quadratically with the number of random slopes, adding just one additional slope (whether main effect or interaction) can greatly increase the number of free parameters in the model. We can omit them from the model, using the || in lme4, splitting elements into separate (x + ... | grp) terms in lme4 or MixedModels.jl or using zerocorr in MixedModels.jl.

In terms of philosophy, this is a bit like omitting higher order interactions from the fixed effects: there is change in the bias-variance tradeoff. However, practice suggests that the tradeoff is often worthwhile, although it makes shrinkage less efficient. John Kruschke has a nice worked example on his blog.

For our sleep study example, we we can see that there is very little impact because there is almost no correlation between the random intercept and random slope.

using CairoMakie
using MixedModels
using MixedModelsMakie

sleepstudy = MixedModels.dataset(:sleepstudy)
# REML=false by default in Julia
m2 = fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days | subj)), sleepstudy)

┌ Warning: ProgressMeter by default refresh meters with additional information in IJulia via `IJulia.clear_output`, which clears all outputs in the cell. 
│  - To prevent this behaviour, do `ProgressMeter.ijulia_behavior(:append)`. 
│  - To disable this warning message, do `ProgressMeter.ijulia_behavior(:clear)`.
└ @ ProgressMeter /home/phillip/.julia/packages/ProgressMeter/sN2xr/src/ProgressMeter.jl:618
Minimizing 57    Time: 0:00:00 ( 7.41 ms/it)

	Est.	SE	z	p	σ_subj
(Intercept)	251.4051	6.6323	37.91	<1e-99	23.7805
days	10.4673	1.5022	6.97	<1e-11	5.7168
Residual	25.5918

We can see this with a shrinkage plot, which show the by-group (here: by-subject) offsets from the grand mean for each random effect. The red dots correspond to the esimtates you would get from classical linear regression within subjects, while the blue dots correspond to the shrunken “estimates” (technically predictions) you get for each subject from the mixed model.

shrinkageplot(m2)

MixedModels.PCA(m2)[:subj]


Principal components based on correlation matrix
 (Intercept)  1.0    .
 days         0.08  1.0

Normalized cumulative variances:
[0.5407, 1.0]

Component loadings
                PC1    PC2
 (Intercept)  -0.71  -0.71
 days         -0.71   0.71

m2_zerocorr = fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + days | subj)), sleepstudy)

	Est.	SE	z	p	σ_subj
(Intercept)	251.4051	6.7077	37.48	<1e-99	24.1714
days	10.4673	1.5193	6.89	<1e-11	5.7994
Residual	25.5561

shrinkageplot(m2_zerocorr)

MixedModels.PCA(m2)[:subj]


Principal components based on correlation matrix
 (Intercept)  1.0    .
 days         0.08  1.0

Normalized cumulative variances:
[0.5407, 1.0]

Component loadings
                PC1    PC2
 (Intercept)  -0.71  -0.71
 days         -0.71   0.71

If we consider a more complex model, then the change can be much more dramatic:

kb07 = MixedModels.dataset(:kb07)
contrasts = Dict(:subj => Grouping(),
                 :item => Grouping(),
                 :spkr => EffectsCoding(),
                 :prec => EffectsCoding(),
                 :load => EffectsCoding() )
m_kb07 = fit(MixedModel, @formula(rt_raw ~ 1 + spkr * prec * load + (1 + spkr + prec + load|subj) + (1 + spkr + prec + load|item)), kb07; contrasts)

┌ Warning: ProgressMeter by default refresh meters with additional information in IJulia via `IJulia.clear_output`, which clears all outputs in the cell. 
│  - To prevent this behaviour, do `ProgressMeter.ijulia_behavior(:append)`. 
│  - To disable this warning message, do `ProgressMeter.ijulia_behavior(:clear)`.
└ @ ProgressMeter /home/phillip/.julia/packages/ProgressMeter/sN2xr/src/ProgressMeter.jl:618
Minimizing 601   Time: 0:00:00 ( 1.55 ms/it)
  objective:  29652.62866326455

	Est.	SE	z	p	σ_subj	σ_item
(Intercept)	2225.5754	84.9527	26.20	<1e-99	328.0674	393.4597
spkr: old	74.0912	23.5388	3.15	0.0016	43.1209	45.1421
prec: maintain	-377.3128	59.7116	-6.32	<1e-09	118.9270	302.2964
load: yes	102.0965	24.6618	4.14	<1e-04	49.7595	58.4598
spkr: old & prec: maintain	-28.1656	21.3820	-1.32	0.1878
spkr: old & load: yes	26.9646	21.3821	1.26	0.2073
prec: maintain & load: yes	-18.5510	21.3821	-0.87	0.3856
spkr: old & prec: maintain & load: yes	15.6366	21.3820	0.73	0.4646
Residual	904.3472

MixedModels.PCA(m_kb07)[:subj]


Principal components based on correlation matrix
 (Intercept)      1.0    .     .     .
 spkr: old        1.0   1.0    .     .
 prec: maintain  -1.0  -1.0   1.0    .
 load: yes        1.0   1.0  -1.0   1.0

Normalized cumulative variances:
[1.0, 1.0, 1.0, 1.0]

Component loadings
                   PC1    PC2    PC3    PC4
 (Intercept)     -0.5   -0.38  -0.33  -0.71
 spkr: old       -0.5   -0.38  -0.33   0.71
 prec: maintain   0.5    0.09  -0.86   0.0
 load: yes       -0.5    0.84  -0.2    0.0

The effective dimensionality can be seen in the way that the random effects collapse into lines (i.e.. a 1-D object) within the majority of the panels (each representing a 2-D plane).

shrinkageplot(m_kb07, :subj)

m_kb07zc = fit(MixedModel, @formula(rt_raw ~ 1 + spkr * prec * load + zerocorr(1 + spkr + prec + load | subj) + zerocorr(1 + spkr + prec + load | item)), kb07; contrasts)

	Est.	SE	z	p	σ_subj	σ_item
(Intercept)	2225.6609	85.7868	25.94	<1e-99	330.9087	397.6761
spkr: old	74.1345	21.4927	3.45	0.0006	0.0000	0.0000
prec: maintain	-377.2695	59.6939	-6.32	<1e-09	113.3176	303.1632
load: yes	102.0110	22.9634	4.44	<1e-05	46.3311	29.4211
spkr: old & prec: maintain	-28.0802	21.4927	-1.31	0.1914
spkr: old & load: yes	26.9213	21.4927	1.25	0.2104
prec: maintain & load: yes	-18.5943	21.4927	-0.87	0.3870
spkr: old & prec: maintain & load: yes	15.5512	21.4927	0.72	0.4693
Residual	909.0068

MixedModels.PCA(m_kb07zc)[:subj]


Principal components based on correlation matrix
 (Intercept)     1.0  .    .    .
 spkr: old       0.0  0.0  .    .
 prec: maintain  0.0  0.0  1.0  .
 load: yes       0.0  0.0  0.0  1.0

Normalized cumulative variances:
[0.3333, 0.6667, 1.0, 1.0]

Component loadings
                  PC1   PC2   PC3     PC4
 (Intercept)     1.0   0.0   0.0     0.0
 spkr: old       0.0   0.0   0.0   NaN
 prec: maintain  0.0   0.0   1.0     0.0
 load: yes       0.0   1.0   0.0     0.0

When we force the correlations to be zero, we can no longer get diagonal lines – we we can only get horizontal or vertical lines within each panel. Diagonal lines correspond to non zeor correlations between two variance components.

shrinkageplot(m_kb07zc, :subj)