How to Perform One-Way ANOVA in R (With Example Dataset)

Renesh Bedre    3 minute read

The one-way ANOVA (Analysis of Variance) is used to test whether there are statistically significant differences among more than two groups by comparing their group means.

The one-way ANOVA is also known as one-factor ANOVA as there is only one independent variable (factor or group variable) to analyze.

A one-way ANOVA tests the null hypothesis that all group means are equal against the alternative hypothesis that the group means are not all equal (i.e. at least one group mean differs significantly from the others).

You can use the following code to perform one-way ANOVA in R:

# model
model <- aov(y ~ x, data = df)

# view ANOVA summary
summary(model)

Where,

Parameter Description  
y Response variable (should be a continuous variable)  
x Group variable  
df Data frame containing the group and response variables  
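As a minimal sketch of this template on real data, the block below uses `PlantGrowth`, a dataset that ships with base R. It is only a stand-in chosen so the code runs without any download (it is not the dataset used in this article), and the names `pg_model`, `pg_summary` and `pg_df` are illustrative.

```r
# Stand-in data: PlantGrowth ships with base R (package 'datasets').
# 'weight' is the continuous response, 'group' is a factor with 3 levels.
data(PlantGrowth)

# fit the one-way ANOVA model: response ~ group
pg_model <- aov(weight ~ group, data = PlantGrowth)

# view the ANOVA table
pg_summary <- summary(pg_model)
print(pg_summary)

# degrees of freedom: (groups - 1) and (observations - groups)
pg_df <- pg_summary[[1]][["Df"]]
print(pg_df)
```

With 3 groups and 30 observations, the degrees of freedom are 2 and 27.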

The following example illustrates how to use one-way ANOVA for analyzing the group differences.

How to Perform One-Way ANOVA in R

For example, a researcher wants to analyze whether plant height differs among plant genotypes. The researcher collects plant height data for four plant genotypes.

The researcher has the following null and alternative hypotheses:

Null hypothesis: The plant height is equal among plant genotypes, i.e. the mean plant height is the same for all genotypes
Alternative hypothesis: The plant height is not equal among plant genotypes, i.e. the mean plant height of at least one genotype is significantly different

Here, the alternative hypothesis is non-directional (two-sided), as the plant height can be lower or higher in one plant genotype than in another genotype.
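For readers who prefer notation, the two hypotheses can be written compactly as follows. The symbols are an assumption introduced here for illustration: \mu_A, \mu_B, \mu_C and \mu_D stand for the mean plant height of genotypes A, B, C and D.

```latex
H_0:\ \mu_A = \mu_B = \mu_C = \mu_D
\qquad
H_1:\ \mu_i \neq \mu_j \ \text{for at least one pair of genotypes } i \neq j
```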

The following code shows how to perform one-way ANOVA in R:

Load and view the dataset,

# load dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/anova/one_way_anova.csv")

# view first six rows of data frame
head(df)

  genotype height
1        A      5
2        A      6
3        A      7
4        A      8
5        A      8
6        B     12

Check descriptive statistics (mean and variance) for each plant genotype,

# load package
library(dplyr)

# get descriptive statistics
df %>% group_by(genotype) %>% summarise(mean = mean(height), var = var(height))

# A tibble: 4 × 3
  genotype  mean   var
  <fct>    <dbl> <dbl>
1 A          6.8   1.7
2 B         13.6   2.3
3 C          7     3.5
4 D          7.2   1.7

From the descriptive statistics, we can see that plant height is highest for genotype B and lowest for genotype A. The variance is roughly similar for all genotypes.
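If you want more than a visual comparison of the variances, one option is Bartlett's test from base R. The sketch below is a hedged illustration rather than part of the original analysis: it again uses the built-in `PlantGrowth` data as a stand-in so that it runs offline, and computes the group means and variances without dplyr. For the dataset in this article, you would substitute `height ~ genotype` and `df`.

```r
# Stand-in data (assumption): built-in PlantGrowth, not the article's dataset
data(PlantGrowth)

# per-group mean and variance with base R (no extra packages needed)
groups      <- split(PlantGrowth$weight, PlantGrowth$group)
group_means <- sapply(groups, mean)
group_vars  <- sapply(groups, var)
print(data.frame(mean = group_means, var = group_vars))

# Bartlett's test: the null hypothesis is that all group variances are equal
bt <- bartlett.test(weight ~ group, data = PlantGrowth)
print(bt)
```

A large p value from Bartlett's test gives no evidence against equal variances. Note that this test is sensitive to departures from normality, so treat it as one check among several.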

Now, we will perform a one-way ANOVA to check whether these differences in plant height are statistically significant.

Perform a one-way ANOVA and summarise the results using the summary() function,

# fit model
model <- aov(height ~ genotype, data = df)

# summary statistics
summary(model)

            Df Sum Sq Mean Sq F value   Pr(>F)    
genotype     3  163.8   54.58   23.73 3.93e-06 ***
Residuals   16   36.8    2.30                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The one-way ANOVA reports the following important statistics for interpretation,

Parameter Value  
F 23.73  
p value 3.93e-06  
Degrees of freedom 3 and 16  
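To see where these numbers come from, the F value is the ratio of the genotype mean square to the residual mean square, and the p value is the upper-tail probability of the F distribution with 3 and 16 degrees of freedom. The short sketch below redoes that arithmetic using the rounded values printed in the ANOVA table above. The variable names are illustrative.

```r
# values copied from the ANOVA table above (rounded as printed)
ms_genotype  <- 54.58  # Mean Sq for genotype (between groups)
ms_residuals <- 2.30   # Mean Sq for residuals (within groups)
df_genotype  <- 3      # number of genotypes - 1
df_residuals <- 16     # number of observations - number of genotypes

# the F statistic is the ratio of the two mean squares
f_value <- ms_genotype / ms_residuals
print(round(f_value, 2))

# the p value is the upper-tail probability of the F distribution
p_value <- pf(f_value, df_genotype, df_residuals, lower.tail = FALSE)
print(p_value)
```

Because the inputs are rounded, the computed p value should be close to, but not necessarily identical to, the value reported by summary().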

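According to the one-way ANOVA results, the p value is significant at the 0.05 level [F(3, 16) = 23.73, p = 3.93e-06]. Hence, we reject the null hypothesis and conclude that the mean plant height differs significantly for at least one genotype. The ANOVA alone does not indicate which genotypes differ from each other.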
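A significant ANOVA is commonly followed by a post-hoc test to find out which pairs of groups differ. One option in base R is Tukey's HSD via `TukeyHSD()`. The sketch below is an illustration under assumptions, not part of the original analysis: it uses the built-in `PlantGrowth` data as a stand-in so that it runs offline. For the model fitted in this article, the equivalent call would be `TukeyHSD(model)`.

```r
# Stand-in data (assumption): built-in PlantGrowth, not the article's dataset
data(PlantGrowth)
pg_model <- aov(weight ~ group, data = PlantGrowth)

# Tukey's HSD compares every pair of groups with adjusted p values
pg_tukey <- TukeyHSD(pg_model, conf.level = 0.95)
print(pg_tukey)

# pairwise results as a matrix with columns: diff, lwr, upr, p adj
pairwise <- pg_tukey$group
print(rownames(pairwise))
```

Each row is one pairwise comparison. A pair whose adjusted p value is below 0.05 (equivalently, whose confidence interval excludes 0) differs significantly at that level.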

This work is licensed under a Creative Commons Attribution 4.0 International License
