IUPAC Nucleotide Codes
The International Union of Pure and Applied Chemistry (IUPAC) nucleotide code is a nomenclature in which one letter stands for a single nucleotide base or a set of possible bases. It is used to represent biological concepts such as nucleotide degeneracy and single-base variation in sequences.
IUPAC codes are useful for representing a sequence when the nucleotide at a given position is uncertain, for example at the site of a possible single nucleotide polymorphism (SNP).
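
As a minimal sketch of the SNP use case, the Python snippet below collapses several equal-length sequences into one IUPAC-coded sequence. The function names `iupac_code` and `consensus_with_snps` are hypothetical helpers chosen for illustration. The sketch assumes a DNA alphabet (U is treated as T) and input without gaps.

```python
# Map each set of possible bases to its IUPAC code (DNA alphabet).
IUPAC_BY_BASES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("AC"): "M", frozenset("GT"): "K",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("ACT"): "H", frozenset("CGT"): "B",
    frozenset("ACG"): "V", frozenset("AGT"): "D",
    frozenset("ACGT"): "N",
}


def iupac_code(alleles):
    """Return the IUPAC code covering all bases observed at one position."""
    # Upper-case the input and treat U as T so RNA bases map to the same codes.
    bases = frozenset(a.upper().replace("U", "T") for a in alleles)
    return IUPAC_BY_BASES[bases]


def consensus_with_snps(sequences):
    """Collapse equal-length sequences into one IUPAC-coded sequence."""
    # zip(*sequences) yields one column (position) at a time.
    return "".join(iupac_code(column) for column in zip(*sequences))


# Position 3 is G or A (R); position 5 is A or C (M).
print(consensus_with_snps(["ACGTA", "ACATA", "ACGTC"]))  # ACRTM
```

Looking up a `frozenset` of bases keeps the mapping independent of the order in which alleles are observed.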
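Another use of IUPAC codes is estimating (back-translating) the nucleotide sequence from a protein sequence when the nucleotide sequence of the gene is unknown. Because of codon degeneracy, an amino acid can be encoded by more than one codon, so the ambiguous codon positions are written with degenerate codes.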
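
The back-translation idea can be sketched in Python as follows. The table `DEGENERATE_CODON` and the function `back_translate` are hypothetical names. The table deliberately lists only a few amino acids of the standard genetic code, so treat it as an illustration rather than a complete implementation.

```python
# Partial back-translation table: one-letter amino acid -> degenerate codon.
# Only a few amino acids of the standard genetic code are included here.
DEGENERATE_CODON = {
    "M": "ATG",  # Met: ATG only
    "W": "TGG",  # Trp: TGG only
    "F": "TTY",  # Phe: TTT, TTC
    "K": "AAR",  # Lys: AAA, AAG
    "H": "CAY",  # His: CAT, CAC
    "D": "GAY",  # Asp: GAT, GAC
    "I": "ATH",  # Ile: ATT, ATC, ATA
    "G": "GGN",  # Gly: GGT, GGC, GGA, GGG
}


def back_translate(protein):
    """Return a degenerate DNA sequence that could encode the protein.

    Raises KeyError for amino acids missing from the partial table.
    """
    return "".join(DEGENERATE_CODON[aa] for aa in protein.upper())


print(back_translate("MKFG"))  # ATGAARTTYGGN
```

Amino acids whose codons differ at more than one position (for example Leu, Ser, and Arg) cannot be captured exactly by a single degenerate codon, which is one reason the result is only an estimate of the original sequence.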
IUPAC codes for single nucleotides
IUPAC code | Nucleotide bases | Complementary Base |
---|---|---|
A | A (Adenine) | T |
C | C (Cytosine) | G |
G | G (Guanine) | C |
T | T (Thymine) | A |
U | U (Uracil) | A |
IUPAC codes for nucleotide degeneracy (Wobbles)
IUPAC code | Nucleotide bases | Complementary code |
---|---|---|
R | A or G (Purines) | Y |
Y | C or T/U (Pyrimidines) | R |
M | A or C (Amino group) | K |
K | G or T/U (Keto group) | M |
S | C or G | S |
W | A or T/U | W |
H | A or C or T/U | D |
B | C or G or T/U | V |
V | A or C or G | B |
D | A or G or T/U | H |
N | A or C or G or T/U (any base) | N |
- | Gap | - |
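
The two tables above translate directly into lookup dictionaries. The following Python sketch uses them to reverse-complement a sequence containing degenerate codes and to enumerate every concrete sequence a degenerate sequence can stand for. The names `COMPLEMENT`, `BASES`, `reverse_complement`, and `expand` are illustrative choices. The sketch assumes DNA output (U is complemented to A, as in the table) and does not expand gaps.

```python
from itertools import product

# Complementary code for every symbol in the tables above.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A", "U": "A",
    "R": "Y", "Y": "R", "M": "K", "K": "M", "S": "S", "W": "W",
    "H": "D", "B": "V", "V": "B", "D": "H", "N": "N", "-": "-",
}

# Bases represented by each code (DNA alphabet, so T/U is written as T).
BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def reverse_complement(seq):
    """Reverse the sequence and complement each IUPAC code."""
    return "".join(COMPLEMENT[code] for code in reversed(seq.upper()))


def expand(seq):
    """List every unambiguous sequence matched by a degenerate sequence."""
    # product() takes one base from each position's set of possibilities.
    choices = (BASES[code] for code in seq.upper())
    return ["".join(combo) for combo in product(*choices)]


print(reverse_complement("ATGRYN"))  # NRYCAT
print(expand("ARY"))                 # ['AAC', 'AAT', 'AGC', 'AGT']
```

The number of expanded sequences grows multiplicatively with each degenerate position (a single N multiplies it by four), so `expand` is only practical for short sequences such as primers.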
This work is licensed under a Creative Commons Attribution 4.0 International License