Using your mixed models to the fullest: moving beyond ANOVA-style thinking

Phillip Alday

The Dirty Secret

ANOVA is regression.

More precisely, it’s a computational trick to efficiently compute a particular type of nested model comparison under additional assumptions.

It wasn’t a bad idea back in the days of slide rules and mechanical adding machines. But the extra assumptions (e.g. sphericity) and shortcuts aren’t worth it with today’s computing hardware.
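To make the "ANOVA is regression" claim concrete, here is a minimal sketch on made-up data (the three groups and their values are hypothetical). It computes the classical one-way ANOVA F statistic from sums of squares, then computes the F statistic for a nested comparison of two least-squares regressions: intercept only versus intercept plus dummy-coded group. Exact fractions are used so the two results can be compared without rounding.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square linear system A x = b by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [row[n] for row in M]

def ols_rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Hypothetical data: three groups with three observations each
groups = {"A": [4, 5, 6], "B": [6, 7, 8], "C": [9, 10, 11]}
y = [v for vals in groups.values() for v in vals]
labels = [g for g, vals in groups.items() for _ in vals]
N, k = len(y), len(groups)

# Classical one-way ANOVA: between-group vs within-group sums of squares
grand_mean = Fraction(sum(y), N)
group_means = {g: Fraction(sum(vals), len(vals)) for g, vals in groups.items()}
ss_between = sum(len(vals) * (group_means[g] - grand_mean) ** 2 for g, vals in groups.items())
ss_within = sum((v - group_means[g]) ** 2 for g, vals in groups.items() for v in vals)
f_anova = (ss_between / (k - 1)) / (ss_within / (N - k))

# Nested regression comparison: intercept only vs intercept + dummy-coded group
X_null = [[1] for _ in y]
X_full = [[1, int(g == "B"), int(g == "C")] for g in labels]
rss_null, rss_full = ols_rss(X_null, y), ols_rss(X_full, y)
f_regression = ((rss_null - rss_full) / (k - 1)) / (rss_full / (N - k))

print(f_anova, f_regression)
```

Both routes give the same F statistic; the ANOVA formulas simply avoid fitting the two regressions explicitly.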

We’ve moved forward in computation but not thinking

We stop learning statistics after we finish our doctorates.

Instead, we use statistics the way our mentors did, and they learned ANOVA. So even when we advance the methods, we don’t advance the thinking or the explaining.

Leaving ANOVA’s legacy behind

Think in terms of effect sizes and not p-values

Mixed models give you an explicit estimate of effect size: the coefficient estimate!
If you want standardized effect sizes, then you can also standardize your coefficient.
Confidence intervals give strictly more information than p-values and can also be used for significance testing.
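As a small illustration of reading an effect size and a significance test off the same interval, the sketch below builds a Wald 95% confidence interval for the "days" coefficient using the estimate and standard error from the model fit table shown later in this document. The normal approximation is an assumption made here for simplicity; other interval types (profile, bootstrap) are possible.

```python
from statistics import NormalDist

# Estimate and standard error for "days", copied from the model fit table in this document
estimate = 10.4673
std_error = 1.5022

# Wald interval under a normal approximation (an assumption of this sketch)
level = 0.95
z_crit = NormalDist().inv_cdf(0.5 + level / 2)
lower = estimate - z_crit * std_error
upper = estimate + z_crit * std_error

# The interval doubles as a test: the effect is significant at this level
# if and only if the interval excludes zero
excludes_zero = lower > 0 or upper < 0

print(f"{level:.0%} CI for days: [{lower:.2f}, {upper:.2f}], excludes zero: {excludes_zero}")
```

The interval reports the plausible range of the effect in the units of the response, which a bare p-value does not.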

Think in terms of specific hypotheses and comparisons

ANOVA encourages thinking in a weird omnibus + post hoc test framework:
- “There was a significant difference somewhere between these 3 groups”
- “Let’s follow that up with t-tests amongst all possible pairs”
Instead, we should be thinking of specific comparisons that we care about.
Moreover, we can do all of this in a single step instead of in two stages.

Learn to love and leverage contrast coding

Got a hypothesis that A > B > C? Test that hypothesis directly!
Got a hypothesis that A != B and A != C? Test that!
Make the intercept interesting again!
Too many options to discuss here, but contrast coding, along with centering and scaling of covariates, should be part of your design. Not specifying them makes your analysis uninterpretable (Brehm & Alday, 2022).
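The sketch below illustrates how the coding scheme determines what each coefficient, including the intercept, means. It uses hypothetical cell means for three groups A, B and C and three common schemes: treatment coding, sum coding, and successive-difference coding. Each row of a hypothesis matrix states which combination of cell means one coefficient estimates; the matching design rows are then used to check that the coefficients reproduce the cell means. The numbers and variable names are illustrative assumptions, not taken from any analysis in this talk.

```python
from fractions import Fraction as F

# Hypothetical cell means for groups A, B, C
cell_means = [F(10), F(7), F(4)]

def matvec(M, v):
    """Multiply matrix M by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

third = F(1, 3)

# Hypothesis matrices: each row is the combination of cell means a coefficient estimates
hypotheses = {
    # intercept = mean of A; then B - A and C - A
    "treatment": [[1, 0, 0], [-1, 1, 0], [-1, 0, 1]],
    # intercept = grand mean; then A - grand mean and B - grand mean
    "sum": [[third, third, third], [2 * third, -third, -third], [-third, 2 * third, -third]],
    # intercept = grand mean; then B - A and C - B (direct test of an ordering A > B > C)
    "successive": [[third, third, third], [-1, 1, 0], [0, -1, 1]],
}

# Design-matrix rows (intercept column plus contrast columns) for A, B, C
designs = {
    "treatment": [[1, 0, 0], [1, 1, 0], [1, 0, 1]],
    "sum": [[1, 1, 0], [1, 0, 1], [1, -1, -1]],
    "successive": [[1, -2 * third, -third], [1, third, -third], [1, third, 2 * third]],
}

coefficients = {name: matvec(H, cell_means) for name, H in hypotheses.items()}

for name, beta in coefficients.items():
    # Every scheme describes the same cell means, just with different coefficients
    assert matvec(designs[name], beta) == cell_means
    print(name, [float(b) for b in beta])
```

The fit is identical under all three schemes; only the questions answered by the individual coefficients change, which is why the coding has to be reported.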

Treat population-/group-level and individual analyses holistically

In addition to the fixed effects, i.e. “population-level” estimates, mixed models also provide predictions for the individual levels of each grouping variable (e.g. each subject)
These can be interpreted like individual estimates, but are technically predictions
They are sometimes called conditional modes, conditional means, or best linear unbiased predictions (BLUPs)
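One way to see why these are predictions rather than plain per-subject estimates is the shrinkage they exhibit. The sketch below uses the textbook formula for a random-intercept-only model with known variance components, which is a simplification assumed here (the model fitted in this talk also has random slopes). All numbers and the function name are hypothetical.

```python
def shrunken_offset(group_mean, grand_mean, n_obs, var_between, var_within):
    """Predicted group offset in a random-intercept model with known variances.

    The raw offset (group mean minus grand mean) is multiplied by a weight
    between 0 and 1 that grows with the number of observations in the group.
    """
    weight = var_between / (var_between + var_within / n_obs)
    return weight * (group_mean - grand_mean)

# Hypothetical values: the same raw offset of 50, observed with little vs much data
grand_mean, group_mean = 250.0, 300.0
var_between, var_within = 400.0, 900.0

offset_few = shrunken_offset(group_mean, grand_mean, n_obs=2,
                             var_between=var_between, var_within=var_within)
offset_many = shrunken_offset(group_mean, grand_mean, n_obs=20,
                              var_between=var_between, var_within=var_within)

print(f"raw offset: {group_mean - grand_mean:.1f}")
print(f"predicted offset with 2 observations:  {offset_few:.1f}")
print(f"predicted offset with 20 observations: {offset_many:.1f}")
```

With little data the prediction is pulled strongly toward the population-level value; with more data it approaches the group's own mean.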

General model fit

	Est.	SE	z	p	σ_subj
(Intercept)	251.4051	6.6323	37.91	<1e-99	23.7805
days	10.4673	1.5022	6.97	<1e-11	5.7168
Residual	25.5918

BLUPs

18×3 DataFrame

Row	subj	(Intercept)	days
	String	Float64	Float64
1	S308	2.81582	9.07551
2	S309	-40.0484	-8.64408
3	S310	-38.4331	-5.5134
4	S330	22.8321	-4.65872
5	S331	21.5498	-2.94449
6	S332	8.81554	-0.235201
7	S333	16.4419	-0.158809
8	S334	-6.99667	1.03273
9	S335	-1.03759	-10.5994
10	S337	34.6663	8.63238
11	S349	-24.558	1.06438
12	S350	-12.3345	6.47168
13	S351	4.274	-2.95533
14	S352	20.6222	3.56171
15	S369	3.25854	0.871711
16	S370	-24.7101	4.6597
17	S371	0.723262	-0.971053
18	S372	12.1189	1.3107
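The BLUPs above are offsets from the population-level estimates, so a subject's own intercept and slope are obtained by adding them to the fixed effects. The sketch below does this for the first two subjects, using values copied from the two tables above; the helper name `predict` is an illustrative assumption.

```python
# Fixed-effect estimates from the model fit table
fixed = {"(Intercept)": 251.4051, "days": 10.4673}

# BLUPs (offsets from the fixed effects) for the first two subjects in the table
blups = {
    "S308": {"(Intercept)": 2.81582, "days": 9.07551},
    "S309": {"(Intercept)": -40.0484, "days": -8.64408},
}

# Subject-level coefficients = fixed effect + subject-specific offset
subject_coefs = {
    subj: {term: fixed[term] + offset for term, offset in offsets.items()}
    for subj, offsets in blups.items()
}

def predict(subj, days):
    """Predicted response for one subject at a given value of days."""
    coefs = subject_coefs[subj]
    return coefs["(Intercept)"] + coefs["days"] * days

for subj, coefs in subject_coefs.items():
    print(subj, round(coefs["(Intercept)"], 2), round(coefs["days"], 2),
          round(predict(subj, 3), 2))
```

S308 ends up with a much steeper slope than the population estimate, while S309 is nearly flat, which is the kind of individual difference the subject-level predictions are meant to expose.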

Subject-level predictions

Comparison of individual regression vs mixed model

Embrace continuous covariates

ANOVA forces discrete thinking, both in terms of significance and in terms of predictors
Mixed models can handle continuous measures
This allows you to control for additional potential confounders:
- screen brightness
- reaction time
- presentation sequence
You don’t have to assume a constant a priori impact of baseline (Alday, 2019)
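One reading of the baseline point: subtracting the baseline from the response is arithmetically the same as fixing the baseline's coefficient at 1, whereas entering it as a covariate lets the data determine the coefficient. The sketch below uses made-up, noise-free data in which the true coefficient is 0.5; the data and the helper `ols_line` are assumptions for illustration only.

```python
def ols_line(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: the baseline carries over into the response with weight 0.5
baseline = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
response = [2.0 + 0.5 * b for b in baseline]

# Baseline as a covariate: the coefficient is estimated from the data
_, slope_estimated = ols_line(baseline, response)

# Traditional correction: subtract the baseline, i.e. assume a coefficient of exactly 1
corrected = [r - b for r, b in zip(response, baseline)]
_, slope_after_subtraction = ols_line(baseline, corrected)

print(f"estimated baseline coefficient: {slope_estimated:.2f}")
print(f"baseline dependence left after subtraction: {slope_after_subtraction:.2f}")
```

Because the assumed coefficient of 1 is wrong for these data, the "corrected" response still depends on the baseline, now with the opposite sign.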

(fin*)

Thank you for your attention

Any questions?

References

https://embraceuncertaintybook.com/

Alday, P. M. (2019). How much baseline correction do we need in ERP research? Extended GLM model can replace baseline correction while lifting its limits. Psychophysiology, 56(12), e13451. https://doi.org/10.1111/psyp.13451

Brehm, L., & Alday, P. M. (2022). Contrast coding choices in a decade of mixed models. Journal of Memory and Language, 125, 104334. https://doi.org/10.1016/j.jml.2022.104334