How to Use the describe() Function in R
The describe() function summarises the descriptive statistics for each column of a data frame or matrix.
In contrast to the summary() function, the describe() function reports additional descriptive statistics such as the sample count (n), range, standard error (se), standard deviation (sd), trimmed mean, and skew.
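To make the contrast with summary() concrete, the sketch below uses base R only. The vector x is made up for illustration, and the by-hand formulas are the standard textbook definitions rather than code taken from the psych package.

```r
# Hypothetical values, chosen only for illustration
x <- c(2, 4, 4, 5, 10)

# summary() reports the minimum, quartiles, median, mean, and maximum
summary(x)

# Statistics that summary() does not report, computed by hand
n       <- length(x)          # sample count
sd_x    <- sd(x)              # standard deviation
se_x    <- sd_x / sqrt(n)     # standard error of the mean
range_x <- max(x) - min(x)    # range as a single number (max - min)

c(n = n, sd = sd_x, se = se_x, range = range_x)
```

The describe() function collects these values, along with several others, into one table with a row per variable.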
The describe() function is available in the psych R package. Its basic syntax is:
describe(df)
In the above syntax, df can be a data frame or a matrix.
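Beyond the basic call, describe() accepts optional arguments. The sketch below shows a few of them (trim, IQR, quant, and fast). Treat the exact argument names and defaults as assumptions and confirm them with ?describe for your installed version of psych. The data frame is typed in by hand with illustrative values so that the example runs without a network connection.

```r
library(psych)

# Small hand-typed data frame (illustrative values)
df <- data.frame(A = c(25, 30, 28, 36, 29),
                 B = c(45, 55, 29, 56, 40))

# Default call
describe(df)

# Trim 20% from each tail for the trimmed mean, and add the
# interquartile range and the requested quantiles
res <- describe(df, trim = 0.2, IQR = TRUE, quant = c(0.25, 0.75))
res

# fast = TRUE reports a reduced set of statistics, which can help on large data
describe(df, fast = TRUE)
```

With five observations, trim = 0.2 drops the smallest and largest value of each column before averaging, so the trimmed mean can differ from the ordinary mean.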
The following two examples illustrate how to use the describe() function to summarise a data frame or a matrix.
1. Summary statistics for a data frame
The following example shows how to use the describe() function on a data frame to summarise the statistical results.
# load package
library(psych)
# load example dataset
df <- read.table("https://reneshbedre.github.io/assets/posts/anova/onewayanova.txt",
header = TRUE)
# view data frame
df
   A  B  C  D
1 25 45 30 54
2 30 55 29 60
3 28 29 33 51
4 36 56 37 62
5 29 40 27 73
# get descriptive statistics
describe(df)
  vars n mean    sd median trimmed   mad min max range  skew kurtosis   se
A    1 5 29.6  4.04     29    29.6  1.48  25  36    11  0.49    -1.39 1.81
B    2 5 45.0 11.20     45    45.0 14.83  29  56    27 -0.27    -1.85 5.01
C    3 5 31.2  3.90     30    31.2  4.45  27  37    10  0.39    -1.72 1.74
D    4 5 60.0  8.51     60    60.0  8.90  51  73    22  0.41    -1.61 3.81
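Several columns in this output are less familiar than the mean or sd. The base R sketch below reproduces the trimmed, mad, skew, and kurtosis values for column A. The skew and kurtosis formulas are the ones that reproduce the printed values here; they appear to correspond to the default type = 3 setting in psych, but that mapping is an assumption, so check ?describe if you need a different definition.

```r
# Column A from the example data frame
a <- c(25, 30, 28, 36, 29)
n <- length(a)

# Trimmed mean: drops 10% of the values from each tail before averaging.
# With n = 5 no value is dropped, so it equals the ordinary mean.
trimmed <- mean(a, trim = 0.1)

# Median absolute deviation, scaled by 1.4826 (the base R default)
mad_a <- mad(a)

# Central moments about the mean
d  <- a - mean(a)
m2 <- mean(d^2)
m3 <- mean(d^3)
m4 <- mean(d^4)

# Skew and kurtosis with a small-sample correction
skew <- (m3 / m2^1.5) * ((n - 1) / n)^1.5
kurt <- (m4 / m2^2) * (1 - 1 / n)^2 - 3

round(c(trimmed = trimmed, mad = mad_a, skew = skew, kurtosis = kurt), 2)
```

The rounded results agree with the values printed for row A (29.6, 1.48, 0.49, and -1.39).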
If a data frame contains character variables, the describe() function converts them to numeric values and then summarises their descriptive statistics.
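This conversion can be pictured with base R. The sketch assumes that describe() recodes a character column the way as.numeric(as.factor(x)) would, that is, as integer codes assigned in alphabetical order of the labels. It is worth confirming this on your installed version of psych. The group vector is hypothetical.

```r
# Hypothetical character column
group <- c("low", "high", "mid", "high", "low")

# Factor levels are sorted alphabetically: high = 1, low = 2, mid = 3
codes <- as.numeric(as.factor(group))
codes

# Statistics computed on these codes depend on the arbitrary ordering,
# so interpret the mean or sd of a recoded column with caution
mean(codes)
```

In the describe() output, recoded variables of this kind should be flagged with an asterisk (*) after the row name, which is a reminder that their statistics describe the codes and not the original labels.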
Note: By default, the describe() function drops NA values when computing the descriptive statistics for a data frame.
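What dropping NA values means for a single column can be sketched in base R. The vector below is hypothetical. The point is that the count n reflects only the non-missing values, and the other statistics are computed from those values.

```r
# Hypothetical column with one missing value
x <- c(25, 30, NA, 36, 29)

# Number of non-missing observations (the n that would be reported)
n_obs <- sum(!is.na(x))

# Without na.rm = TRUE, base R propagates the NA
mean(x)

# With na.rm = TRUE, the NA is dropped before computing
mean_x <- mean(x, na.rm = TRUE)
sd_x   <- sd(x, na.rm = TRUE)

c(n = n_obs, mean = mean_x, sd = sd_x)
```

Because missing values are removed column by column, n can differ between the columns of the same data frame.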
2. Summary statistics for a matrix
The describe() function returns descriptive summary statistics for each column of the matrix (similar to a data frame).
If you convert a data frame to a matrix with data.matrix(), factor and character columns are converted to integer values.
# load package
library(psych)
# load example dataset
df <- read.table("https://reneshbedre.github.io/assets/posts/anova/onewayanova.txt", header = TRUE)
# convert to matrix
df_mat <- data.matrix(df)
# get summary statistics
describe(df_mat)
  vars n mean    sd median trimmed   mad min max range  skew kurtosis   se
A    1 5 29.6  4.04     29    29.6  1.48  25  36    11  0.49    -1.39 1.81
B    2 5 45.0 11.20     45    45.0 14.83  29  56    27 -0.27    -1.85 5.01
C    3 5 31.2  3.90     30    31.2  4.45  27  37    10  0.39    -1.72 1.74
D    4 5 60.0  8.51     60    60.0  8.90  51  73    22  0.41    -1.61 3.81
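The integer conversion performed by data.matrix() is easier to see on a small data frame with a categorical column. The data frame below is hypothetical and is not part of the example dataset.

```r
# Hypothetical data frame with one factor column and one numeric column
df_cat <- data.frame(group = factor(c("ctrl", "trt", "ctrl", "trt")),
                     value = c(1.5, 2.5, 3.5, 4.5))

# data.matrix() replaces factor levels with their integer codes
m <- data.matrix(df_cat)
m

# The result is a numeric matrix; the original labels are no longer stored
is.numeric(m)
```

Here ctrl becomes 1 and trt becomes 2, so any statistics reported for the group column describe these codes and not the categories themselves.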
This work is licensed under a Creative Commons Attribution 4.0 International License