IUPAC Nucleotide Codes

Shreya Udawant

The International Union of Pure and Applied Chemistry (IUPAC) nucleotide code is a nomenclature in which a single letter represents either one nucleotide base or a set of possible bases. It is used to express biological concepts such as nucleotide degeneracy and single-base variation in sequences.

IUPAC codes are useful for representing a sequence when the nucleotide at a given position is uncertain, for example a position that carries a possible single nucleotide polymorphism (SNP).
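As a minimal sketch of this idea, the Python snippet below replaces one position of a sequence with the IUPAC code that covers a set of observed alleles. The names `IUPAC_FROM_BASES`, `iupac_code`, and `encode_snp` are hypothetical helpers introduced only for illustration, and the sketch assumes a DNA alphabet in which U is treated as T.

```python
# Map each set of possible bases to its IUPAC code (DNA alphabet).
IUPAC_FROM_BASES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("AC"): "M", frozenset("GT"): "K",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("ACT"): "H", frozenset("CGT"): "B",
    frozenset("ACG"): "V", frozenset("AGT"): "D",
    frozenset("ACGT"): "N",
}


def iupac_code(bases):
    """Return the IUPAC code that covers every base in `bases`."""
    # Assumption: U is folded into T so RNA and DNA inputs share one table.
    normalized = frozenset(b.upper().replace("U", "T") for b in bases)
    try:
        return IUPAC_FROM_BASES[normalized]
    except KeyError:
        raise ValueError(f"no IUPAC code for bases: {sorted(normalized)}")


def encode_snp(reference, position, alleles):
    """Replace the base at 0-based `position` with the code for `alleles`."""
    return reference[:position] + iupac_code(alleles) + reference[position + 1:]


# A T/C polymorphism at index 3 becomes Y (pyrimidine).
print(encode_snp("ACGTTGCA", 3, {"T", "C"}))  # ACGYTGCA
```

The lookup is keyed on `frozenset` so the order in which alleles are listed does not matter.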

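Another use of IUPAC codes is inferring a possible nucleotide sequence from a protein sequence when the nucleotide sequence of the gene is unknown. Because of codon degeneracy, most amino acids can be encoded by more than one codon, so the inferred sequence contains degenerate positions.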
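The back-translation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes the standard genetic code, the names `DEGENERATE_CODON` and `back_translate` are hypothetical, and stop codons are left out. For the three amino acids with six codons (Leu, Arg, Ser), a single degenerate codon necessarily matches some codons of other amino acids, as noted in the comments.

```python
# One degenerate codon per amino acid (standard genetic code assumed).
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "M": "ATG",
    "N": "AAY", "P": "CCN", "Q": "CAR", "T": "ACN", "V": "GTN",
    "W": "TGG", "Y": "TAY",
    # Six-codon amino acids: one degenerate codon over-covers.
    "L": "YTN",  # CTN + TTR, but also matches TTY (Phe)
    "R": "MGN",  # CGN + AGR, but also matches AGY (Ser)
    "S": "WSN",  # TCN + AGY, but also matches ACN, TGN, and AGR
}


def back_translate(protein):
    """Return a degenerate DNA sequence that could encode `protein`."""
    codons = []
    for residue in protein.upper():
        if residue not in DEGENERATE_CODON:
            raise ValueError(f"unsupported residue: {residue!r}")
        codons.append(DEGENERATE_CODON[residue])
    return "".join(codons)


# Met-Lys-Trp: only lysine has a degenerate (R = A or G) third position.
print(back_translate("MKW"))  # ATGAARTGG
```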

IUPAC codes for single nucleotides

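| IUPAC code | Nucleotide base | Complementary base |
|------------|-----------------|--------------------|
| A          | A (Adenine)     | T                  |
| C          | C (Cytosine)    | G                  |
| G          | G (Guanine)     | C                  |
| T          | T (Thymine)     | A                  |
| U          | U (Uracil)      | A                  |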

IUPAC codes for nucleotide degeneracy (Wobbles)

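| IUPAC code | Nucleotide bases                    | Complementary code |
|------------|-------------------------------------|--------------------|
| R          | A or G (Purines)                    | Y                  |
| Y          | C or T/U (Pyrimidines)              | R                  |
| M          | A or C (Amino group)                | K                  |
| K          | G or T/U (Keto group)               | M                  |
| S          | C or G (Strong interaction)         | S                  |
| W          | A or T/U (Weak interaction)         | W                  |
| H          | A or C or T/U (not G)               | D                  |
| B          | C or G or T/U (not A)               | V                  |
| V          | A or C or G (not T/U)               | B                  |
| D          | A or G or T/U (not C)               | H                  |
| N          | A or G or C or T/U (any base)       | N                  |
| -          | Gap                                 | -                  |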
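The two tables above can be put to work in a short Python sketch that reverse-complements a sequence containing IUPAC codes and expands a degenerate sequence into every concrete sequence it represents. The names `IUPAC_BASES`, `COMPLEMENT`, `reverse_complement`, and `expand` are hypothetical and introduced only for illustration. The sketch assumes DNA output, so U is complemented to A and expanded as T.

```python
from itertools import product

# Bases represented by each code; the gap character stands for itself.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
    "-": "-",
}

# Complement of every code, in the same order as the tables above.
COMPLEMENT = str.maketrans("ACGTURYMKSWHBVDN-", "TGCAAYRKMSWDVBHN-")


def reverse_complement(seq):
    """Complement each IUPAC code, then reverse the sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def expand(seq):
    """List every concrete sequence a degenerate sequence can stand for."""
    choices = [IUPAC_BASES[code] for code in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(reverse_complement("ATGCRYN"))  # NRYGCAT
print(expand("ARY"))  # ['AAC', 'AAT', 'AGC', 'AGT']
```

The number of expanded sequences is the product of the number of bases at each position, so `expand` is practical only for short sequences such as primers with a few degenerate positions.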


Author: Shreya Udawant

Shreya Udawant is an experienced molecular biology researcher with a passion for unraveling the complexities of diseases at the molecular level. She specializes in genetic analysis and molecular biology techniques and is committed to advancing scientific knowledge and contributing to breakthroughs in biomedical research.


This work is licensed under a Creative Commons Attribution 4.0 International License
