How to Get the Reverse Complement of a DNA Sequence in Bash
The reverse complement of a DNA sequence is obtained by reversing the DNA sequence and replacing each nucleotide with its complementary base. For example, the reverse complement of the “AATGTACTAT” sequence is “ATAGTACATT”.
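To make the two steps in this definition concrete, the short sketch below performs them one at a time on the same example sequence. The variable names (seq, reversed, rc) are illustrative only, and the sketch assumes the rev and tr utilities are available on your system.

```bash
# Illustrative variable names; assumes `rev` and `tr` are installed.
seq="AATGTACTAT"

# Step 1: reverse the order of the bases.
reversed=$(printf '%s\n' "$seq" | rev)

# Step 2: swap each base for its complement (A<->T, C<->G).
rc=$(printf '%s\n' "$reversed" | tr 'ATCGatcg' 'TAGCtagc')

echo "$reversed"   # TATCATGTAA
echo "$rc"         # ATAGTACATT
```

Because complementing works on one character at a time, the two steps can be applied in either order and give the same result.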
The following examples demonstrate how to get the reverse complement of a DNA sequence using Bash commands:
Example 1
The tr (translate) and rev (reverse) commands can be combined to get the reverse complement of a DNA sequence.
# example sequence AATGTACTAT
echo "AATGTACTAT" | tr 'ATCGatcg' 'TAGCtagc' | rev
Output:
ATAGTACATT
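If you need this pipeline more than once, one option is to wrap it in a small shell function. The sketch below is a minimal example; the function name revcomp is hypothetical and not part of any standard tool. Note that tr leaves any character outside the listed set (for example, N) unchanged.

```bash
# Hypothetical helper: prints the reverse complement of its first argument.
revcomp() {
    printf '%s\n' "$1" | tr 'ATCGatcg' 'TAGCtagc' | rev
}

revcomp "AATGTACTAT"   # ATAGTACATT
revcomp "aatgNNtact"   # agtaNNcatt (N is passed through as-is)
```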
Example 2
The following awk command can be used to get the reverse complement of all DNA sequences in a FASTA file.
For example, the input.fasta file contains the following two sequences:
# input.fasta
>1
ATGGGAAAC
TGGAGGAAA
>2
TGAAACCTT
Now, get the reverse complement of the DNA sequences in the FASTA file using awk:
awk '/^>/ { if (seq) system("echo " seq " | tr ATCGatcg TAGCtagc | rev"); print; seq = ""; next } { seq = seq $0 } END { if (seq) system("echo " seq " | tr ATCGatcg TAGCtagc | rev") }' input.fasta
Output:
>1
TTTCCTCCAGTTTCCCAT
>2
AAGGTTTCA
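The command above starts a new shell pipeline for every record, which can be slow for files with many sequences. As an alternative, the sketch below does the reverse complement inside awk itself. It is only one possible approach; the function names revcomp_fasta and rc are hypothetical, and the example recreates the same input.fasta file shown earlier.

```bash
# Recreate the example input file.
cat > input.fasta <<'EOF'
>1
ATGGGAAAC
TGGAGGAAA
>2
TGAAACCTT
EOF

# Hypothetical helper: reverse complement every record of a FASTA file.
revcomp_fasta() {
    awk '
        BEGIN {
            # Lookup table mapping each base to its complement.
            split("A T C G a t c g", from, " ")
            split("T A G C t a g c", to, " ")
            for (i = 1; i <= 8; i++) comp[from[i]] = to[i]
        }
        # Walk the sequence backwards, emitting the complement of each base.
        # Characters without an entry in the table are kept unchanged.
        function rc(s,    i, c, out) {
            out = ""
            for (i = length(s); i >= 1; i--) {
                c = substr(s, i, 1)
                out = out ((c in comp) ? comp[c] : c)
            }
            return out
        }
        # Header line: flush the previous record, then print the header.
        /^>/ { if (seq != "") print rc(seq); print; seq = ""; next }
        # Sequence line: join multi-line sequences into one string.
        { seq = seq $0 }
        END { if (seq != "") print rc(seq) }
    ' "$1"
}

revcomp_fasta input.fasta
```

For the example file, this prints the same four lines as the output shown above.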
This work is licensed under a Creative Commons Attribution 4.0 International License