37  Multiple comparisons in One-way ANOVA, pt. 2: planned contrasts

Author

Tehran Davis

Published

November 14, 2023

37.1 TLDR; Steps for running a priori contrasts:

This walkthrough requires the tidyverse and emmeans packages to be installed and loaded in R.
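A minimal setup chunk (assuming these two packages cover everything used below: read_table(), mutate(), and fct_recode() come from the tidyverse; emmeans() and contrast() come from emmeans):

library(tidyverse)
library(emmeans)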

  1. Be sure that the column containing your IV is indeed being treated as a factor. If it is, levels(IV) will list its levels, and the column will display <fct> in a tibble. If not, use mutate() to create a column with your IV as a factor, e.g., newCol = factor(oldCol)
  2. check the order of your IV using levels(). If you need to reorder, use fct_relevel() (a short sketch follows this list)
  3. create your contrast matrix according to the level order obtained in Step 2
  4. build an ANOVA model
  5. run your contrasts using emmeans(), inputting
  • your ANOVA model from Step 4
  • your contrasts from Step 3
  • your familywise correction (if any)
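For reference, a minimal sketch of the reordering in Step 2, assuming a factor column named Group_fct and that you want "MM" listed first (both assumptions purely for illustration):

## move "MM" to the front of the level order; remaining levels keep their order
dataset <- dataset %>% 
  mutate(Group_fct = fct_relevel(Group_fct, "MM"))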

Example (run line-by-line):

# Preliminaries
## load in data
dataset <- read_table("https://www.uvm.edu/~statdhtx/methods8/DataFiles/Tab12-1.dat")

── Column specification ────────────────────────────────────────────────────────
cols(
  ID = col_character(),
  Group = col_double(),
  Time = col_double()
)
## check data, identify columns
dataset
# A tibble: 40 × 3
   ID    Group  Time
   <chr> <dbl> <dbl>
 1 01        1     3
 2 02        1     5
 3 03        1     1
 4 04        1     8
 5 05        1     1
 6 06        1     1
 7 07        1     4
 8 08        1     9
 9 09        2     2
10 10        2    12
# ℹ 30 more rows
## Group is numerically coded. Fixing this by converting it to a factor
dataset <- 
  dataset %>% 
  mutate(Group_fct = fct_recode(factor(Group), 
                                MS = "1",
                                MM = "2",
                                SS = "3",
                                SM = "4",
                                McM = "5")
  )
  
# Steps 1 & 2: check that the IV is a factor and check the level order
levels(dataset$Group_fct)
[1] "MS"  "MM"  "SS"  "SM"  "McM"
# Step 3: build contrasts
## MS v. MM
contrast1 <- c(1,-1,0,0,0) 

## MS + MM v. SS + SM + McM
contrast2 <- c(1/2,1/2,-1/3,-1/3,-1/3)

my_contrasts <- list("MS v. MM" = contrast1,
                      "MS + MM v. SS + SM + McM" = contrast2
                     )

# Step 4: build the ANOVA model
morphine_mdl <- aov(Time~Group_fct, data = dataset) 

# Step 5: run the contrasts
## note: the familywise adjustment is supplied to contrast(), so that it
## applies to the planned comparisons themselves
emmeans(object = morphine_mdl, specs = ~Group_fct) %>% 
  contrast(method = my_contrasts, adjust = "bonferroni")
 contrast                 estimate   SE df t.ratio p.value
 MS v. MM                     -6.0 2.83 35  -2.121  0.0822
 MS + MM v. SS + SM + McM    -14.3 1.83 35  -7.851  <.0001

P value adjustment: bonferroni method for 2 tests 

37.2 A few considerations regarding planned contrasts

If you have theory-driven reasons to predict differences between particular sets of group means, then you may perform a priori or planned contrasts. Logically, planned contrasts are similar to post-hoc tests in that we are comparing two means, but there are some differences that make planned contrasts more powerful.

  • Since your predictions are made prior to collecting data, you technically do not need a significant result on the omnibus ANOVA to run your contrasts. With post-hoc tests, however, convention holds that if the omnibus ANOVA fails to reject the null, you are not permitted to run follow-up tests.

  • Since you are making predictions prior to seeing the outcome of your observed data, you are safe to make a limited number of comparisons without the charge of cherry-picking. For example, if you predicted ahead of time that there would be differences between groups “MS” and “McM”, or groups “MM” and “MS”, then you are free to run those comparisons, and those comparisons only. This is especially useful because limiting the number of comparisons keeps the required corrections relatively minimal, effectively reducing Type I error inflation while limiting the possibility of Type II error. For example, recall that using a Bonferroni correction on these data in the post-hoc case mandates an adjusted \(\alpha\) of .005 (.05/10 comparisons), even if I was only really interested in these two comparisons. Here, I am allowed to perform only these two, so my adjusted \(\alpha\) is .025 (.05/2).

  • Depending on the number of comparisons, you may be justified in not performing any \(p\) correction at all. For example, some recommend that no correction is needed when the number of contrasts is low or when the comparisons are complementary (e.g., orthogonal; a quick check for orthogonality is sketched just after this list). See here, here, and here for discussion of this issue.

  • We can easily perform a variety of comparisons using planned contrasts. For example, say we are interested in whether MS is different from the mean of the remaining groups (McM+MM+SM+SS), or that MM+MS is different from McM+SM+SS. We can test this using planned contrasts.
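As an aside to the correction point above: for equal group sizes, two contrasts are orthogonal when the products of their weights sum to zero. A quick check using the two contrasts from the TLDR:

## orthogonality check: the element-wise products of the weights should sum to 0
sum(c(1, -1, 0, 0, 0) * c(1/2, 1/2, -1/3, -1/3, -1/3))
[1] 0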

37.3 The logic of planned contrasts
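In brief: a planned contrast tests a weighted combination of group means, \(\psi = c_1\bar{X}_1 + c_2\bar{X}_2 + \dots + c_k\bar{X}_k\), where the weights \(c_i\) sum to zero. The null hypothesis is that \(\psi = 0\), and the test statistic is a \(t\)-ratio, \(t = \hat{\psi}/SE(\hat{\psi})\), with the standard error built from the Mean Square Error of the omnibus ANOVA. This is why the contrast tests below carry the model's residual degrees of freedom (35). The next section puts this logic into practice.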

37.4 Performing planned contrasts in R

As outlined in both Winter and Field, planned contrasts begin by creating contrast weights. The idea with contrast weights is that the groups on the two sides of the comparison should have weights summing to +1 and -1, respectively, resulting in a null test against 0 (i.e., that the two weighted means are equal). Any groups that are not included in the comparison should be assigned weights of 0. You’ve already seen contrasts before, when looking at the aov() ANOVA output; we remarked that this output allows you to compare each group against the control group.

First let’s load in the data and recode the factors:

## load in data
dataset <- read_table("https://www.uvm.edu/~statdhtx/methods8/DataFiles/Tab12-1.dat")

── Column specification ────────────────────────────────────────────────────────
cols(
  ID = col_character(),
  Group = col_double(),
  Time = col_double()
)
## Group is numerically coded. Fixing this by converting it to a factor
dataset <- 
  dataset %>% 
  mutate(Group_fct = fct_recode(factor(Group), 
                                MS = "1",
                                MM = "2",
                                SS = "3",
                                SM = "4",
                                McM = "5")
  )

Looking at Siegel’s data, we see that MS is treated as the control (reference) level by default:

# Step 1: Check levels
levels(dataset$Group_fct)
[1] "MS"  "MM"  "SS"  "SM"  "McM"

We can check which contrasts any model we run will default to, by invoking contrasts():

contrasts(dataset$Group_fct)
    MM SS SM McM
MS   0  0  0   0
MM   1  0  0   0
SS   0  1  0   0
SM   0  0  1   0
McM  0  0  0   1

These sorts of contrasts are known as treatment contrasts: each treatment is measured against a control. While there are several types of contrasts built into R, we can create custom contrasts as well.
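For the curious, two of the other built-in schemes (shown only for reference; below we build custom weights instead):

contr.sum(5)      # deviation coding: each level against the grand mean
contr.helmert(5)  # each level against the mean of the preceding levels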

For example assume we are comparing “MS” to “MM” and were not concerned with the remaining groups. Then our contrast weights would be:

  • MS = +1
  • MM = -1
  • McM = SM = SS = 0

How about if we want to compare MS + MM against the remaining three groups?

  • MS + MM = 1
  • McM + SM + SS = -1

From there we distribute the weight equally across the groups on each side of the contrast, so

  • MS = +1/2; MM = +1/2
  • McM = -1/3; SM = -1/3; SS = -1/3
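(Check: \(1/2 + 1/2 = +1\) on one side and \(-1/3 - 1/3 - 1/3 = -1\) on the other, so each side sums to \(\pm 1\) and the full set of weights sums to 0.)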

In R we can construct each of these contrasts like so:

# checking the order of groups:
levels(dataset$Group_fct)
[1] "MS"  "MM"  "SS"  "SM"  "McM"
# building contrasts

# MS v. MM
contrast1 <- c(1,-1,0,0,0) 

# MS + MM v. SS + SM + McM
contrast2 <- c(1/2,1/2,-1/3,-1/3,-1/3)

When performing planned contrasts in R, I recommend using the emmeans package rather than the base-R approach shown in the Field text. For those interested as to why: in my experience, the base method has a hard time dealing with non-orthogonal contrasts.

To perform the contrast requires 3 steps:

  • run your omnibus ANOVA aov() model and save it as an object
  • input your aov() model into the emmeans() function, specifying the name of your IV and save it to an object
  • submit your emmeans object to the contrast() function, indicating what your contrasts are.
morphine_mdl <- aov(formula = Time~Group_fct,data = dataset)
emm_mdl <- emmeans(morphine_mdl,specs = ~Group_fct)
contrast(emm_mdl, method = list(contrast1))
 contrast          estimate   SE df t.ratio p.value
 c(1, -1, 0, 0, 0)       -6 2.83 35  -2.121  0.0411

The above result tells me:

  • contrast: the defined contrast
  • estimate: the difference in means between my contrasted groups
  • SE: the standard error of the contrast estimate
  • t.ratio: obtained from the t-test between groups
  • p.value: the resulting p-value
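Note that the t.ratio is simply estimate / SE (here, \(-6.0 / 2.83 \approx -2.121\)) and is evaluated against the residual \(df\) from the omnibus model (35).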

Based upon these results I conclude that “MS” and “MM” are significantly different from one another. I make no claims about the other three conditions.

Recall that contrast2 tested [MS + MM] v. [SS + SM + McM]. If this were my only contrast, I could run it independently as above. However, we can also test multiple contrasts simultaneously in a single call; we just add additional contrasts to the list passed to method. So, for both contrast1 and contrast2:

contrast(emm_mdl, 
         method = list(contrast1,contrast2),
         )
 contrast                                                                  estimate   SE df t.ratio p.value
 c(1, -1, 0, 0, 0)                                                             -6.0 2.83 35  -2.121  0.0411
 c(0.5, 0.5, -0.333333333333333, -0.333333333333333, -0.333333333333333)      -14.3 1.83 35  -7.851  <.0001

37.4.1 Making the contrasts output easier to read

One thing you may have noticed above is that your contrasts aren’t labeled very transparently. Looking at the output, you would have to remember what was being contrasted in contrast1 and contrast2. You can save yourself a little headache if you label your contrasts while constructing the list. Let’s say I wanted to run these four contrasts based upon my levels:

levels(dataset$Group_fct)
[1] "MS"  "MM"  "SS"  "SM"  "McM"

Now setting up my contrasts to match:

contrast1 <- c(0,-1,0,0,1)
contrast2 <- c(-1,0,1,0,0)
contrast3 <- c(0,1,-1,0,0)
contrast4 <- c(-1/3,-1/3,-1/3,1/2,1/2)

From here you can modify the previous code to assign a name to each contrast in the list. For the sake of ease, I like to create an object, my_contrasts, that includes the list of all contrasts. FWIW, one of the annoying things about the contrast() function is that it requires the method argument to be a list(), even if it’s just one contrast.

my_contrasts <- list("MM v. McM" = contrast1,
                     "MS v. SS" = contrast2,
                     "MM v. SS" = contrast3,
                     "MS+MM+SS v. SM+McM" = contrast4
                     )

my_contrasts
$`MM v. McM`
[1]  0 -1  0  0  1

$`MS v. SS`
[1] -1  0  1  0  0

$`MM v. SS`
[1]  0  1 -1  0  0

$`MS+MM+SS v. SM+McM`
[1] -0.3333333 -0.3333333 -0.3333333  0.5000000  0.5000000

Now if I run my contrasts it includes my descriptions of what the contrasts are:

contrast(emm_mdl, method = my_contrasts)
 contrast           estimate   SE df t.ratio p.value
 MM v. McM              19.0 2.83 35   6.718  <.0001
 MS v. SS                7.0 2.83 35   2.475  0.0183
 MM v. SS               -1.0 2.83 35  -0.354  0.7258
 MS+MM+SS v. SM+McM     18.2 1.83 35   9.950  <.0001

Note that no p-value adjustment was applied here; with a custom list of contrasts, contrast() defaults to adjust = "none" (notice there is no adjustment note under the table). Can we think of any way to run a “bonferroni” adjustment on the same comparisons?

contrast(emm_mdl, 
         method = my_contrasts,
         adjust = "bonferroni"
)
 contrast           estimate   SE df t.ratio p.value
 MM v. McM              19.0 2.83 35   6.718  <.0001
 MS v. SS                7.0 2.83 35   2.475  0.0733
 MM v. SS               -1.0 2.83 35  -0.354  1.0000
 MS+MM+SS v. SM+McM     18.2 1.83 35   9.950  <.0001

P value adjustment: bonferroni method for 4 tests 
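For intuition, the Bonferroni adjustment simply multiplies each unadjusted p-value by the number of tests (capped at 1). A quick sketch with base R’s p.adjust(), using the two rounded unadjusted p-values from the earlier table (the small discrepancy from 0.0733 above is rounding):

p.adjust(c(0.0183, 0.7258), method = "bonferroni", n = 4)
[1] 0.0732 1.0000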

37.4.2 Effect size for planned contrasts

This is some text recycled from the post-hoc test walkthrough; I’m going to shout it again, one more time, with feeling. Typically, when reporting the effect size of the difference between two means, we use Cohen’s \(d\). However, calculating Cohen’s \(d\) for a planned contrast is slightly more involved than the method used for a regular t-test. This is because with a regular t-test you only have 2 means from the 2 samples you collected. In the case of planned contrasts in ANOVA, while you are only comparing two means, those means are nested within a larger design (e.g., comparing MS and MM, we still need to account for the fact that we also collected samples from SS, SM, and McM) or may be derived from multiple samples (e.g., contrasting the mean of MS + MM against the mean of SS + SM + McM).

Simply put, in our calculations we need to account for the influence of all of our collected groups. This is done by placing the contrasted difference in the context of the root mean square error, i.e., the square root of the Mean Square Error of the residuals in our ANOVA model. Recall that typically Cohen’s \(d\) is the difference between the two means divided by their pooled standard deviation. Here, \(d\) is the difference between the two (weighted) means divided by \(\sigma\), the estimated standard deviation of the errors of the linear model.
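In formula form: \(d = \hat{\psi}/\hat{\sigma}\), where \(\hat{\psi}\) is the contrast estimate and \(\hat{\sigma} = \sqrt{MS_{error}}\) from the omnibus model.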

To do this we’re going to need two things from the original model. Let’s take a look at the model summary:

morphine_mdl %>% summary()
            Df Sum Sq Mean Sq F value   Pr(>F)    
Group_fct    4   3498   874.4   27.32 2.44e-10 ***
Residuals   35   1120    32.0                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The values that are important to us come from the Residuals line: the residual degrees of freedom (35) and the residual Mean Sq (32.0). The square root of the latter, \(\sqrt{32.0} = 5.657\), is the residual standard error (i.e., sigma).

Assuming you have run your comparisons using emmeans() you can calculate your effect sizes for each comparison using emmeans::eff_size().

eff_size() takes four arguments:

  • the emmeans object
  • the estimated standard deviation of the errors of the linear model, sigma
  • the residual degrees of freedom from the model, df.residual
  • a vector (or list) specifying our contrasts, passed to the method argument, e.g., c(-1/2,-1/2,1/3,1/3,1/3)

The functions sigma() and df.residual() allow us to get the relevant information directly from the model. A typical call would look something like this:

# create an emmeans object specifying your factor
emm_mdl <- emmeans(morphine_mdl, specs = ~Group_fct)

# specify your contrasts, this goes in the method call
my_contrasts <- list("MM v. McM" = contrast1,
                     "MS v. SS" = contrast2,
                     "MM v. SS" = contrast3,
                     "MS+MM+SS v. SM+McM" = contrast4
                     )

# run your contrasts
planned_contrasts <- contrast(emm_mdl, # emm_mdl from above
                              method = my_contrasts,
                              adjust = "bonferroni"
                              )

# compute effect sizes using the emmeans object plus sigma and df from the model
eff_size(emm_mdl,
         sigma = sigma(morphine_mdl), 
         edf = df.residual(morphine_mdl),
         method = my_contrasts
         )
 contrast           effect.size    SE df lower.CL upper.CL
 MM v. McM                3.359 0.641 35    2.057    4.660
 MS v. SS                 1.237 0.521 35    0.179    2.296
 MM v. SS                -0.177 0.500 35   -1.193    0.839
 MS+MM+SS v. SM+McM       3.211 0.501 35    2.193    4.230

sigma used for effect sizes: 5.657 
Confidence level used: 0.95 
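As a quick sanity check against the formula above: for MM v. McM, \(d = 19.0/5.657 \approx 3.36\), matching the first row of the table.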

37.5 Reporting your results

When reporting your results, we typically report:

  • the omnibus ANOVA
  • a statement on what multiple comparisons were run and how corrected
  • note which comparisons were significant.

Since our comparisons, if corrected, rely on adjusted p-values, it’s kosher to simply state that \(p<\alpha\). If you elect to report the actual p-values, then you need to note that they are corrected.

For planned comparisons, we might say:

Our prevailing hypothesis predicted that response latency for groups SS, MM, and MS would be significantly less than for groups SM and McM. To test this hypothesis we performed a planned contrast (no corrections). Our results revealed a statistically significant difference between these groups, \(t\)(35) = 9.95, \(p\) < .001, \(d\) = 3.21, where response latency in MS, MM, and SS was lower than in SM and McM.

Note that the \(df\) in the t-test are the error \(df\) from the omnibus model (35).