Two-sample Kolmogorov-Smirnov test in R
The two-sample Kolmogorov-Smirnov test is used for comparing two independent samples to determine whether they come from the same distribution.
The test checks the null hypothesis that the two independent samples come from the same continuous probability distribution against the alternative hypothesis that they do not.
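To make the hypothesis concrete, the test statistic D is the largest vertical gap between the empirical cumulative distribution functions (ECDFs) of the two samples. The sketch below computes D by hand and compares it with the value reported by ks.test(). The seed and the variable names (Fx, Fy, pooled, d_manual, d_builtin) are illustrative choices, not part of the original tutorial.

```r
# Minimal sketch: compute the two-sample KS statistic D by hand.
# The seed is arbitrary and only makes this sketch reproducible.
set.seed(1)
x <- rnorm(50)
y <- rnorm(50)

# Empirical CDFs of each sample
Fx <- ecdf(x)
Fy <- ecdf(y)

# Both ECDFs are step functions, so the largest gap occurs at an observed point
pooled <- c(x, y)
d_manual <- max(abs(Fx(pooled) - Fy(pooled)))

# Statistic reported by the built-in test
d_builtin <- unname(ks.test(x, y)$statistic)

# The two values should agree up to floating-point tolerance
all.equal(d_manual, d_builtin)
```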
In R, you can perform a two-sample Kolmogorov-Smirnov test using the built-in ks.test() function.
The general syntax of ks.test() looks like this:
# two-sample Kolmogorov-Smirnov test
ks.test(x, y)
Where x and y are numeric vectors containing the two independent samples.
Note: The Kolmogorov-Smirnov test is only valid for continuous distributions.
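Beyond the basic call, ks.test() also accepts the optional arguments alternative and exact. The following is a small sketch of how they might be used. The simulated data, the seed, and the result names (res_two_sided, res_greater, res_asym) are assumptions made for illustration only.

```r
# Illustrative data: the seed and the shift in mean are arbitrary choices
set.seed(2)
x <- rnorm(50)
y <- rnorm(50, mean = 1)

# Two-sided test (the default alternative)
res_two_sided <- ks.test(x, y, alternative = "two.sided")

# One-sided test: the alternative is that the CDF of x lies above that of y
res_greater <- ks.test(x, y, alternative = "greater")

# Request the asymptotic p-value instead of the exact one
res_asym <- ks.test(x, y, exact = FALSE)

res_two_sided
```

Each call returns an object of class "htest", so the results print in the same layout as the outputs shown in the examples.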
The following examples demonstrate how to perform a two-sample Kolmogorov-Smirnov test in R.
Example 1
Suppose we have two datasets that both follow a normal distribution:
# generate two random samples from the standard normal distribution
x <- rnorm(50)
y <- rnorm(50)
Now, check whether datasets x and y come from the same distribution using a two-sample Kolmogorov-Smirnov test.
# two-sample Kolmogorov-Smirnov test
ks.test(x, y)
# output
Exact two-sample Kolmogorov-Smirnov test
data: x and y
D = 0.12, p-value = 0.8693
alternative hypothesis: two-sided
As the p value (D = 0.12, p = 0.8693) obtained from the two-sample Kolmogorov-Smirnov test is greater than the significance level (0.05), we fail to reject the null hypothesis. There is not enough evidence to conclude that the two datasets come from different distributions.
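The same decision can be made programmatically by reading the statistic and p value from the object that ks.test() returns. This is a minimal sketch. The seed, the alpha variable, and the decision string are illustrative assumptions, and the numbers it prints will differ from the output shown above because the samples are random.

```r
# Illustrative data: the seed is arbitrary
set.seed(3)
x <- rnorm(50)
y <- rnorm(50)

res <- ks.test(x, y)

alpha <- 0.05                  # significance level
d <- unname(res$statistic)     # KS statistic D
p <- res$p.value               # p value of the test

# Compare the p value with the significance level
decision <- if (p < alpha) "reject H0" else "fail to reject H0"
cat(sprintf("D = %.3f, p = %.4f: %s\n", d, p, decision))
```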
Example 2
Suppose we have two datasets that come from different distributions,
# generate random dataset from normal distribution
x <- rnorm(50)
# generate random dataset from uniform distribution
y <- runif(50)
Now, check whether these two datasets come from the same distribution using a two-sample Kolmogorov-Smirnov test.
# two-sample Kolmogorov-Smirnov test
ks.test(x, y)
# output
Exact two-sample Kolmogorov-Smirnov test
data: x and y
D = 0.54, p-value = 4.929e-07
alternative hypothesis: two-sided
As the p value (D = 0.54, p = 4.929e-07) obtained from the two-sample Kolmogorov-Smirnov test is less than the significance level (0.05), we reject the null hypothesis and conclude that the two datasets do not come from the same distribution.
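As a rough cross-check of both examples, D can be compared with the commonly used large-sample critical value c(alpha) * sqrt((n + m) / (n * m)), where c(alpha) = sqrt(-log(alpha / 2) / 2). This is only an approximation: the outputs above come from the exact test, so the sketch illustrates the scale of D and does not replace the reported p values. The variable names are illustrative.

```r
alpha <- 0.05   # significance level
n <- 50         # size of sample x
m <- 50         # size of sample y

# Large-sample approximation to the two-sided critical value of D
c_alpha <- sqrt(-log(alpha / 2) / 2)
d_crit <- c_alpha * sqrt((n + m) / (n * m))
d_crit   # roughly 0.27

# Compare the D values reported in the two examples with the critical value
exceeds <- c(example1 = 0.12 > d_crit, example2 = 0.54 > d_crit)
exceeds
```

Under this approximation, D = 0.12 falls below the critical value and D = 0.54 exceeds it, which agrees with the decisions reached from the p values.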
Related: one-sample Kolmogorov-Smirnov test in R
This work is licensed under a Creative Commons Attribution 4.0 International License