Solving the Most Common R Error: ‘Duplicate row.names are not allowed’

Renesh Bedre 2 minute read

When you read a CSV file with the base R function read.csv(), you may encounter the error duplicate 'row.names' are not allowed if you use the row.names parameter.

Row names in an R data frame must be unique. This error therefore occurs when the column specified by the row.names parameter contains duplicated values.

For example, suppose you have the following dataset in CSV format:

name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63

In this dataset, one of the values in the name column is duplicated. If you import this dataset using the read.csv() function with the name column specified as row.names, you will get a duplicate 'row.names' are not allowed error.

Let’s replicate the error using the above dataset:

df = read.csv('https://reneshbedre.github.io/assets/posts/other/data.csv', row.names = "name")

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed
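
Before choosing a fix, it can help to confirm which values are actually duplicated. The following is a minimal sketch, assuming the same CSV content is supplied inline through the text argument instead of the URL; the object names csv_text, df_check, and dup_values are illustrative only.

```r
# Same data as above, supplied inline (assumption: mirrors the CSV at the URL)
csv_text <- "name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63"

# Import without row.names so the import itself cannot fail on duplicates
df_check <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# duplicated() flags the second and later occurrences of each value
dup_values <- unique(df_check$name[duplicated(df_check$name)])
dup_values

# anyDuplicated() returns 0 when every value is unique
has_dups <- anyDuplicated(df_check$name) > 0
has_dups
```

If dup_values is empty, the column can be used as row names directly.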

To fix the duplicate 'row.names' are not allowed error, you can use either of the following two solutions.

Solution 1: Import without `row.names` parameter

In this solution, you fix the error by importing the CSV file without specifying row.names, or by setting row.names = NULL. R then assigns sequential numbers as the row names.

df <- read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv") 
# same as df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv", row.names = NULL) 

# view data frame
df

  name height weight
1    x   1.80     65
2    y   1.62     67
3    z   1.55     62
4    z   1.56     63

Now, if you want to use the first column (name) as row names, you can make the values in the name column unique with the make.names() function and unique = TRUE.

# make values in name column unique
uniq_name <- make.names(df$name, unique = TRUE)
row.names(df) <- uniq_name

# view data frame
df

   name height weight
x      x   1.80     65
y      y   1.62     67
z      z   1.55     62
z.1    z   1.56     63

You have created a data frame with unique row names. You can either drop the name column (df[, -1]) or keep it in the data frame.
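
As a variation on the step above, here is a minimal sketch that uses make.unique() and then drops the name column. make.unique() only appends a suffix to repeated values, whereas make.names() also rewrites values that are not syntactically valid R names. The data frame is rebuilt inline for illustration, and df_num is a hypothetical object name.

```r
# Rebuild the example data inline (assumption: same values as the CSV above)
df <- data.frame(
  name   = c("x", "y", "z", "z"),
  height = c(1.80, 1.62, 1.55, 1.56),
  weight = c(65, 67, 62, 63),
  stringsAsFactors = FALSE
)

# make.unique() appends .1, .2, ... to repeated values only
row.names(df) <- make.unique(df$name)

# Drop the now-redundant name column; the row names carry the labels
df_num <- df[, -1]
df_num
```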

Solution 2: Create a matrix

In this solution, you can fix this error by creating a matrix from a data frame.

In R, a data frame cannot have duplicated row names, but a matrix can.

# load dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv") 

# view data frame
df

  name height weight
1    x   1.80     65
2    y   1.62     67
3    z   1.55     62
4    z   1.56     63

Create a matrix from the data frame using the data.matrix() function:

# convert data frame to matrix
df_mat <- data.matrix(df)

# assign row names to the matrix
row.names(df_mat) <- df$name

# drop the first (name) column
df_mat <- df_mat[, -1]

# view matrix
df_mat

  height weight
x   1.80     65
y   1.62     67
z   1.55     62
z   1.56     63
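
One caveat with duplicated row names in a matrix is that character indexing matches only the first row with a given name. The following is a minimal sketch of that behavior and of a logical comparison that returns every matching row. The matrix is rebuilt inline for illustration, and the names m, first_only, and z_rows are hypothetical.

```r
# Rebuild the matrix inline (assumption: same values as the example above)
m <- matrix(
  c(1.80, 1.62, 1.55, 1.56,
    65, 67, 62, 63),
  ncol = 2,
  dimnames = list(c("x", "y", "z", "z"), c("height", "weight"))
)

# Character indexing returns only the first row named "z"
first_only <- m["z", ]
first_only

# A logical comparison on rownames() returns all rows named "z"
z_rows <- m[rownames(m) == "z", , drop = FALSE]
z_rows
```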

This work is licensed under a Creative Commons Attribution 4.0 International License
