Two-sample Kolmogorov-Smirnov test in R

Renesh Bedre

The two-sample Kolmogorov-Smirnov test is used for comparing two independent samples to determine whether they come from the same distribution.

The two-sample Kolmogorov-Smirnov test checks the null hypothesis that two independent samples come from the same continuous probability distribution against the alternative hypothesis that they do not come from the same continuous probability distribution.

In R, you can perform the two-sample Kolmogorov-Smirnov test using the built-in ks.test() function.

The general syntax of ks.test() looks like this:

# two-sample Kolmogorov-Smirnov test
ks.test(x, y)

where x and y are numeric vectors containing the two independent samples.

Note: The Kolmogorov-Smirnov test is only valid for continuous distributions.

The following examples demonstrate how to perform the two-sample Kolmogorov-Smirnov test in R.
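Repeated values (ties) across the pooled data are a practical sign that the data may not be continuous. The sketch below is one way to check for them before running the test. The helper name has_ties is hypothetical and not part of base R, and how ks.test() handles ties has varied across R versions, so check the documentation for your version.

```r
# Hypothetical helper (not part of base R): TRUE if the pooled sample
# contains any repeated value
has_ties <- function(x, y) {
  anyDuplicated(c(x, y)) > 0
}

# Small hand-made vectors, used only for illustration
has_ties(c(1.2, 2.5, 3.1), c(0.4, 4.8))   # no value is repeated
has_ties(c(1, 2, 3), c(2, 5))             # the value 2 appears in both samples
```

If the check reports ties, treat the continuity assumption with caution.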

Example 1

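Suppose we have two datasets that both follow a standard normal distribution:

# generate two random samples from the standard normal distribution
# (no seed is set, so the exact values differ from run to run)
x = rnorm(50)
y = rnorm(50)

Now, check whether datasets x and y come from the same distribution using a two-sample Kolmogorov-Smirnov test.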

# two-sample Kolmogorov-Smirnov test
ks.test(x, y)

# output
	Exact two-sample Kolmogorov-Smirnov test

data:  x and y
D = 0.12, p-value = 0.8693
alternative hypothesis: two-sided

As the p value (D = 0.12, p = 0.8693) obtained from the two-sample Kolmogorov-Smirnov test is greater than the significance level (0.05), we fail to reject the null hypothesis. The data provide no evidence that the two datasets come from different distributions.
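To make the D statistic more concrete, the sketch below recomputes it by hand as the largest vertical gap between the two empirical cumulative distribution functions (ECDFs). The seed and the variable names (Fx, Fy, pooled, D_manual, D_ks) are illustrative assumptions and not part of the original example, so the numbers will differ from the output shown above.

```r
set.seed(1)  # arbitrary seed, chosen only so the run is repeatable
x <- rnorm(50)
y <- rnorm(50)

# Empirical CDFs of each sample
Fx <- ecdf(x)
Fy <- ecdf(y)

# The largest gap between two step functions occurs at an observed data point,
# so it is enough to evaluate both ECDFs on the pooled sample
pooled <- sort(c(x, y))
D_manual <- max(abs(Fx(pooled) - Fy(pooled)))

# Statistic reported by ks.test(), with the "D" name dropped for comparison
D_ks <- unname(ks.test(x, y)$statistic)

all.equal(D_manual, D_ks)
```

all.equal() is used rather than == because the two values are reached through different floating-point calculations.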

Example 2

Suppose we have two datasets that come from different distributions:

# generate random dataset from normal distribution
x = rnorm(50)
# generate random dataset from uniform distribution
y = runif(50)

Now, check whether these two datasets come from the same distribution using a two-sample Kolmogorov-Smirnov test.

# two-sample Kolmogorov-Smirnov test
ks.test(x, y)

# output
	Exact two-sample Kolmogorov-Smirnov test

data:  x and y
D = 0.54, p-value = 4.929e-07
alternative hypothesis: two-sided

As the p value (D = 0.54, p = 4.929e-07) obtained from the two-sample Kolmogorov-Smirnov test is less than the significance level (0.05), we reject the null hypothesis and conclude that the two datasets do not come from the same distribution.
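The decision rule used in both examples can also be applied in code rather than by reading the printed output. This is a minimal sketch, assuming an arbitrary seed and illustrative variable names (res, alpha, d_stat, p_val). ks.test() returns an object of class "htest", whose statistic and p.value components hold the numbers printed above.

```r
set.seed(123)  # arbitrary seed, chosen only so the run is repeatable
x <- rnorm(50)   # sample from the standard normal distribution
y <- runif(50)   # sample from the uniform distribution on (0, 1)

res <- ks.test(x, y)

alpha  <- 0.05                    # significance level used in the examples
d_stat <- unname(res$statistic)   # the D statistic
p_val  <- res$p.value             # the p value

if (p_val < alpha) {
  message("Reject the null hypothesis at alpha = ", alpha)
} else {
  message("Fail to reject the null hypothesis at alpha = ", alpha)
}
```

Storing the result in an object makes it easy to reuse the statistic and p value later, for example when running the test on many pairs of samples.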

This work is licensed under a Creative Commons Attribution 4.0 International License