One-sample Kolmogorov-Smirnov test in R
The one-sample Kolmogorov-Smirnov test is used for assessing whether a sample dataset follows a particular theoretical distribution (e.g. standard normal distribution).
The one-sample Kolmogorov-Smirnov test checks the null hypothesis that the sample data comes from a particular theoretical distribution against the alternative hypothesis that the sample data does not come from that distribution.
In R, you can perform the one-sample Kolmogorov-Smirnov test using the built-in ks.test() function.
The general syntax of ks.test() looks like this:
# one-sample Kolmogorov-Smirnov test
ks.test(data, "pnorm")
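The "pnorm" argument names the cumulative distribution function of the normal distribution. Without further arguments, it corresponds to the standard normal distribution (mean = 0 and sd = 1).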
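The second argument of ks.test() can be given either as the name of a cumulative distribution function or as the function itself, and any further arguments are passed on to that function. The sketch below illustrates this with an exponential distribution. The simulated data, the rate value, and the variable names are assumptions made only for this illustration.

```r
# Simulated data for illustration only (not from the examples in this article)
set.seed(123)
x <- rexp(40, rate = 2)

# CDF given by name; rate = 2 is passed through to pexp()
res_name <- ks.test(x, "pexp", rate = 2)

# The same test with the CDF given as a function object
res_fun <- ks.test(x, pexp, rate = 2)

# Both calls test the same hypothesis and should agree
res_name
res_fun
```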
Note: The Kolmogorov-Smirnov test is only valid for continuous distributions.
The following examples demonstrate how to perform the one-sample Kolmogorov-Smirnov test in R.
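One practical sign that data may not be continuous is repeated values (ties). As a minimal sketch, assuming a small made-up vector with one duplicated value, ks.test() is expected to issue a warning about ties. The vector and the variable names below are hypothetical.

```r
# Made-up sample containing a repeated value (0.1 appears twice)
x_ties <- c(-0.3, 0.1, 0.1, 0.5, 1.2)

# Capture the warning text instead of letting it print
msg <- tryCatch(
  {
    ks.test(x_ties, "pnorm")
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)

msg
```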
Example 1
Suppose we have a dataset that follows a standard normal distribution:
# generate random dataset
data = rnorm(50)
Now, check whether this dataset comes from a standard normal distribution using a one-sample Kolmogorov-Smirnov test.
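Note: When "pnorm" is passed without further arguments, ks.test() compares the data against the standard normal distribution (mean = 0 and sd = 1), because these are the defaults of pnorm(). To test against a normal distribution with a different mean and standard deviation, pass those values after "pnorm" (see Example 2 below). These parameters should be specified in advance rather than estimated from the same sample, since the reported p value is not valid otherwise.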
# one-sample Kolmogorov-Smirnov test
ks.test(data, "pnorm")
# output
Exact one-sample Kolmogorov-Smirnov test
data: data
D = 0.094387, p-value = 0.729
alternative hypothesis: two-sided
As the p value (D = 0.09, p = 0.729) obtained from the one-sample Kolmogorov-Smirnov test is greater than the significance level (0.05), we fail to reject the null hypothesis. The sample data is consistent with a standard normal distribution.
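The D value in the output is the largest vertical distance between the empirical cumulative distribution function of the sample and the hypothesized cumulative distribution function. The following sketch computes it by hand on freshly simulated data (so the numbers will differ from the output above) and compares it with the statistic returned by ks.test(). The variable names are illustrative.

```r
# Simulated data for illustration; values differ from the example output
set.seed(1)
x <- sort(rnorm(50))
n <- length(x)

# Hypothesized CDF (standard normal) evaluated at the sorted sample
F0 <- pnorm(x)

# Largest gap where the empirical CDF is above / below the hypothesized CDF
d_plus  <- max(seq_len(n) / n - F0)
d_minus <- max(F0 - (seq_len(n) - 1) / n)
D <- max(d_plus, d_minus)

# Compare with the statistic reported by ks.test()
res <- ks.test(x, "pnorm")
all.equal(D, unname(res$statistic))
```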
Example 2
Suppose we have a dataset that follows a normal distribution with known mean and standard deviation:
# generate random dataset
data = rnorm(50, mean = 60, sd = 10)
Now, check whether this dataset comes from a normal distribution using a one-sample Kolmogorov-Smirnov test.
# one-sample Kolmogorov-Smirnov test
ks.test(data, "pnorm", mean = 60, sd = 10)
# output
Exact one-sample Kolmogorov-Smirnov test
data: data
D = 0.092274, p-value = 0.7535
alternative hypothesis: two-sided
As the p value (D = 0.09, p = 0.7535) obtained from the one-sample Kolmogorov-Smirnov test is greater than the significance level (0.05), we fail to reject the null hypothesis. The sample data is consistent with a normal distribution with mean 60 and standard deviation 10.
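Example 2 uses a mean and standard deviation that were fixed in advance. The simulation sketch below, under assumed settings (1000 repetitions, n = 50, seed 42), illustrates why estimating them from the same sample is problematic: the test then rejects far less often than the nominal 5% level, so its p values are too large. The variable names are hypothetical.

```r
set.seed(42)
n_sim <- 1000  # number of simulated datasets (arbitrary choice)

# p values when the parameters are estimated from the same sample
p_est <- replicate(n_sim, {
  x <- rnorm(50, mean = 60, sd = 10)
  ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value
})

# p values when the parameters are fixed in advance
p_known <- replicate(n_sim, {
  x <- rnorm(50, mean = 60, sd = 10)
  ks.test(x, "pnorm", mean = 60, sd = 10)$p.value
})

# Share of simulated datasets rejected at the 0.05 level
mean(p_est < 0.05)
mean(p_known < 0.05)
```

With pre-specified parameters, the rejection rate should be close to 0.05, while with estimated parameters it should be much lower.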
Example 3
Suppose we have a dataset that does not follow a normal distribution:
# generate random dataset
data = runif(50)
Now, check whether this dataset comes from a standard normal distribution using a one-sample Kolmogorov-Smirnov test.
# one-sample Kolmogorov-Smirnov test
ks.test(data, "pnorm")
# output
Exact one-sample Kolmogorov-Smirnov test
data: data
D = 0.50439, p-value = 2.653e-12
alternative hypothesis: two-sided
As the p value (D = 0.50, p < 0.05) obtained from the one-sample Kolmogorov-Smirnov test is less than the significance level (0.05), we reject the null hypothesis and conclude that the sample data does not follow a standard normal distribution.
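The decision rule used in these examples can also be applied in code, because ks.test() returns an object whose statistic and p.value components can be read directly. This is a minimal sketch on newly simulated uniform data. The seed, the alpha variable, and the decision labels are assumptions for illustration.

```r
# Simulated uniform data, tested against the standard normal distribution
set.seed(7)
x <- runif(50)
res <- ks.test(x, "pnorm")

# Extract the test statistic and p value from the result
D <- unname(res$statistic)
p <- res$p.value

# Apply the 0.05 significance level used in this article
alpha <- 0.05
decision <- if (p < alpha) "reject H0" else "fail to reject H0"
decision
```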
In addition to the one-sample Kolmogorov-Smirnov test, the normality of the data can also be assessed using the Shapiro-Wilk test and a Q-Q plot.
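As a brief sketch of these two alternatives, base R provides shapiro.test() for the Shapiro-Wilk test and qqnorm() with qqline() for a normal Q-Q plot. The simulated data and variable names below are illustrative only.

```r
# Simulated data for illustration
set.seed(99)
x <- rnorm(50)

# Shapiro-Wilk test of normality
sw <- shapiro.test(x)
sw$p.value

# Normal Q-Q plot with a reference line; points close to the line
# suggest the sample is roughly normal
qqnorm(x)
qqline(x)
```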
Related: two-sample Kolmogorov-Smirnov test in R
This work is licensed under a Creative Commons Attribution 4.0 International License