37Multiple comparisons in One-way ANOVA, pt. 2: planned contrasts
Author
Tehran Davis
Published
November 14, 2023
37.1 TLDR; Steps for running an a-priori contrasts:
This walkthrough requires the following to be installed / loaded in R
Be sure that the column that contains your IV is indeed being treated as a factor. If it is levels(IV) will list your levels, also the column containing the IV will contain <fct>. If not mutate() as column with your IV as a factor, ex: newCol = factor(oldCol)
check the order of your IV using levels(). If you need to reorder, use fct_relevel()
create your contrast matrix according to the levels obtained in Step 1
build an ANOVA model
run your contrasts using emmeans(), inputting
your ANOVA model from Step 3
your contrasts from Step 2
your familywise correction (if any)
Example (run line-by-line):
# Preliminaries## load in datadataset <-read_table("https://www.uvm.edu/~statdhtx/methods8/DataFiles/Tab12-1.dat")
── Column specification ────────────────────────────────────────────────────────
cols(
ID = col_character(),
Group = col_double(),
Time = col_double()
)
## Group is numerically coded. Fixing this and turning to a factordataset <- dataset %>%mutate(Group_fct =fct_recode(factor(Group), MS ="1",MM ="2",SS ="3",SM ="4",McM ="5") )# Step 1: Check levelslevels(dataset$Group_fct)
[1] "MS" "MM" "SS" "SM" "McM"
# Step 2: build contrasts## MS v. MMcontrast1 <-c(1,-1,0,0,0) ## MS + MM v. SS + SM + McMcontrast2 <-c(1/2,1/2,-1/3,-1/3,-1/3)my_contrasts <-list("MS v. MM"= contrast1,"MS + MM v. SS + SM + McM"= contrast2 )# Step 3morphine_mdl <-aov(Time~Group_fct, data = dataset) # Step 4emmeans(object = morphine_mdl,specs =~Group_fct, adjust="bonf") %>%contrast(method = my_contrasts)
contrast estimate SE df t.ratio p.value
MS v. MM -6.0 2.83 35 -2.121 0.0411
MS + MM v. SS + SM + McM -14.3 1.83 35 -7.851 <.0001
37.2 A few considerations regarding planned contrasts
If you have reasons to predict differences between particular sets of group means that are theory-driven, then you may perform a priori or planned contrasts. Logically, planned contrasts are similar to post-hoc tests in that we are comparing against two means, but there are some differences that make planned contrasts more powerful.
Since your predictions are made prior to collecting data you, technically do not need to get a significant result on the omnibus ANOVA to run your contrasts. However, when doing post-hoc tests, if the omnibus ANOVA fails to reject the null, convention is that you are not permitted to run follow-up post hoc tests.
Since you are making predictions prior to seeing the outcome of your observed data, then you are safe to make a limited number of comparisons without the charge of cherry-picking. For example, if you have predicted-ahead that there would be differences between groups “MS” and “McM”, or groups “MM” and “MS” then you are free to run those comparisons and those comparisons-only. This is especially useful, since by limiting the number of comparisons you can keep the required corrections relatively minimal. This allows you to effectively reduce the problem of Type I error inflation while limiting the possibility of Type II error. For example, recall that using a Bonferonni correction on this data in the post-hoc case mandates that I use an adjusted \(\alpha\) of .005 (.05/10 comparisons), even if I was only really interested in these two comparisons. Here, I am allowed to only perform these two, so my adjusted \(p\) is .025.
Depending on the number of comparisons, you may be justified in not performing any \(p\) correction at all. For example some recommend no need for correction if the number of contrasts is low or when the comparisons are complementary (e.g. orthogonal). See here, here, and here for discussion of this issue.
We can easily perform a variety of comparisons using planned contrasts. For example, say we are interested in whether MS is different from the mean of the remaining groups (McM+MM+SM+SS), or that MM+MS is different from McM+SM+SS. We can test this using planned contrasts.
37.3 The logic of planned contrasts
37.4 Performing planned contrasts in R
As outlined in both Winter and Field, planned contrasts begin by creating contrast weights. The idea with contrast weights is that the groups that are being compared should sum to equal -1 and +1 respectively, resulting in a null test against 0 (i.e. the two weighted means are equal). Any groups that are not being included in the comparison should be assigned weights of 0. You’ve already seen contrasts before, when looking at the aov() ANOVA output. We remarked that this output allows you to make a comparison of each group against the control group.
First let’s load in the data and recode the factors:
## load in datadataset <-read_table("https://www.uvm.edu/~statdhtx/methods8/DataFiles/Tab12-1.dat")
── Column specification ────────────────────────────────────────────────────────
cols(
ID = col_character(),
Group = col_double(),
Time = col_double()
)
## Group is numerically coded. Fixing this and turning to a factordataset <- dataset %>%mutate(Group_fct =fct_recode(factor(Group), MS ="1",MM ="2",SS ="3",SM ="4",McM ="5") )
Looking at Siegel’s data we see that MS defaults as the control:
# Step 1: Check levelslevels(dataset$Group_fct)
[1] "MS" "MM" "SS" "SM" "McM"
We can check which contrasts any model we run will default to, by invoking contrasts():
contrasts(dataset$Group_fct)
MM SS SM McM
MS 0 0 0 0
MM 1 0 0 0
SS 0 1 0 0
SM 0 0 1 0
McM 0 0 0 1
These sorts of contrasts are known as treatment contrasts… measuring the effect of each treatment against a control. While there are several types of contrasts built into R we can create custom contrasts as well.
For example assume we are comparing “MS” to “MM” and were not concerned with the remaining groups. Then our contrast weights would be:
MS = +1
MM = -1
McM = SM = SS = 0
How about we want to compare McM + MM against the remaining three groups?
MS + MM = 1
McM + SM + SS = -1
From there we distribute the weight equally between the number of groups on each side of the contrast, so
MS = +1/2; MM = +1/2
McM = -1/3; SM = -1/3; SS = -1/3
in R we can construct each of these contrasts like so:
# checking the order of groups:levels(dataset$Group_fct)
[1] "MS" "MM" "SS" "SM" "McM"
# building contrasts# MS v. MMcontrast1 <-c(1,-1,0,0,0) # MS + MM v. SS + SM + McMcontrast2 <-c(1/2,1/2,-1/3,-1/3,-1/3)
When performing planned contrasts in R I recommend using the emmeans package rather than the base package examples in the Field text. For those that are interested as to why, in my experience the base method has a hard time dealing with non-orthogonal contrasts.
To perform the contrast requires 3 steps:
run your omnibus ANOVA aov() model and save it as an object
input your aov() model into the emmeans() function, specifying the name of your IV and save it to an object
submit your emmeans object to the contrasts() function indicating what your contrasts are.
estimate: the difference in means between my contrasted groups
SE: a measure of variability
t.ratio: obtained from the t-test between groups
p.value: the resulting p-value
Based upon these results I conclude that “MS” and “MM” are significantly different from one another. I make no claims about the other three conditions.
Recall that contrast2 was testing [MS + MM] v. [SS + SM + McM]. If this was my only contrast I could run it independently as above. However, we can also test multiple contrasts simultaneously with a single test. We just add additional contrasts to the list in the method. So, for both contrast1 and contrast2:
One thing you may have noticed above is that your contrasts aren’t labeled very transparently. Looking at the output you would have to remember what was being contrasted in contrast1 and contrast2. You can save yourself a little headache if you label your contrasts while constructing the matrix. Let’s say I wanted to run these four contrasts based upon my levels:
From here you can modify the previous code to assign names to the rows in your contrast matrix. For the sake of ease, I like to create an object, my_contrasts that includes the list of all contrasts. FWIW, on of the annoying things about the contrast function is that it requires the method argument of contrasts to be in a list(), even if its just one contrast.
my_contrasts <-list("MM v. McM"= contrast1,"MS v. SS"= contrast2,"MM v. SS"= contrast3,"MS+MM+SS v. SM+McM"= contrast4 )my_contrasts
$`MM v. McM`
[1] 0 -1 0 0 1
$`MS v. SS`
[1] -1 0 1 0 0
$`MM v. SS`
[1] 0 1 -1 0 0
$`MS+MM+SS v. SM+McM`
[1] -0.3333333 -0.3333333 -0.3333333 0.5000000 0.5000000
Now if I run my contrasts it includes my descriptions of what the contrasts are:
contrast(emm_mdl, method = my_contrasts)
contrast estimate SE df t.ratio p.value
MM v. McM 19.0 2.83 35 6.718 <.0001
MS v. SS 7.0 2.83 35 2.475 0.0183
MM v. SS -1.0 2.83 35 -0.354 0.7258
MS+MM+SS v. SM+McM 18.2 1.83 35 9.950 <.0001
Note that this defaults to the holm correction. Can we think of anyway to run a “bonferroni” adjustment to the same comparisons?
contrast estimate SE df t.ratio p.value
MM v. McM 19.0 2.83 35 6.718 <.0001
MS v. SS 7.0 2.83 35 2.475 0.0733
MM v. SS -1.0 2.83 35 -0.354 1.0000
MS+MM+SS v. SM+McM 18.2 1.83 35 9.950 <.0001
P value adjustment: bonferroni method for 4 tests
37.4.2 Effect size for planned contrasts
This is some text that’s recycled from the post-hoc test walkthrough. I’m going to shout again one more time with feeling. Typically when reporting the effect size off the difference between two means we use Cohen’s \(d\). However, calculating Cohen’s $d$ in a planned contrast is slightly more involved than the method used for a regular t-test. This is because with a regular t-test you only have 2 means from 2 samples that you have collected. In the case of Planned Contrasts in ANOVA, while you are only comparing two means, those means are nested within a larger group (e.g., comparing MS and MM, we still need to account for the fact that we also collected samples from SS, SM, and McM) or may be derived from multiple samples (e.g., contrasting the mean of MS + MM against the mean of SS + SM + McM). Simply put, in our calculations we need to account for the influence of all of our collected groups. This is done by placing the contrasted difference in the context of the Root Mean Square Error, or the square root of the Mean Square Error of the residuals in our ANOVA model.
Simply put, in our calculations we need to account for the influence of all of our collected groups. This is done by placing the contrasted difference in the context of the Root Mean Square Error, or the square root of the Mean Square Error of the residuals in our ANOVA model. Recall that typically Cohen’s \(d\) is the difference between the two means divided by their pooled standard deviation. Here, \(d\) is the difference between the two means divided by sigma, or the estimated standard deviation of the errors of the linear model.
To do this we’re going to need two things from the original model. Let’s take a look at the model summary:
morphine_mdl %>%summary()
Df Sum Sq Mean Sq F value Pr(>F)
Group_fct 4 3498 874.4 27.32 2.44e-10 ***
Residuals 35 1120 32.0
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The values line that is important to us is the Residual standard error ___ on __ degrees of freedom—in this case the values 5.657 and 35 respectively.
Assuming you have run your comparisons using emmeans() you can calculate your effect sizes for each comparison using emmeans::eff_size().
eff_size() takes three arguments:
the emmeans object
the estimated standard deviation of the errors of the linear model, sigma
the residual degrees of freedom from the model, df.residual
a vector containing our contrasts— e.g., c(-1/2,-1/2,1/3,1/3,1/3)
The functions sigma() and df.residual() allow us to get the relevant information directly from the model. A typical call would look something like this
# create and emmeans object specifying your factoremm_mdl <-emmeans(emm_mdl, specs =~Group_fct)# specify your contrasts, this goes in the method callmy_contrasts <-list("MM v. McM"= contrast1,"MS v. SS"= contrast2,"MM v. SS"= contrast3,"MS+MM+SS v. SM+McM"= contrast4 )# run your contrastsplanned_contrasts <-contrast(emm_mdl, # emm_mdl from abovemethod = my_contrasts,adjust ="bonferroni" )# use the saved object in the following functioneff_size(emm_mdl,sigma =sigma(morphine_mdl), df.residual(morphine_mdl),method=my_contrasts )
contrast effect.size SE df lower.CL upper.CL
MM v. McM 3.359 0.641 35 2.057 4.660
MS v. SS 1.237 0.521 35 0.179 2.296
MM v. SS -0.177 0.500 35 -1.193 0.839
MS+MM+SS v. SM+McM 3.211 0.501 35 2.193 4.230
sigma used for effect sizes: 5.657
Confidence level used: 0.95
37.5 Reporting your results
When reporting your results, we typically report:
the omnibus ANOVA
a statement on what multiple comparisons were run and how corrected
note which comparisons were significant.
Since our comparisons, if corrected, rely on adjusted p.values its kosher to simply state that \(p<\alpha\). If you elect to use the actual p values, then you need to note that they are corrected.
For planned comparisons, we might say:
Our prevailing hypothesis predicted that response latency for groups SS, MM, and MS would be significantly less than groups SM and McM. To test this hypothesis we performed a planned contrast (no corrections). Our results revealed statistically significant difference between these groups, \(t\)(35) = 9.95, \(p\) < .001, \(d\) = 3.21, where the response latency in MS, MM, and SS was lower than SM and McM.
Note that the \(df\) in the t-test are the error \(df\) from the omnibus model (35).