blastn vs blastp: Which One Should You Use?
BLAST (Basic Local Alignment Search Tool) is a widely used bioinformatics tool for finding regions of similarity between biological sequences, from which homology can be inferred.
blastn and blastp are part of the BLAST family of tools and are used to search a query sequence against a reference database for similar sequences.
The main difference between blastn and blastp is that blastn searches a nucleotide query against a nucleotide database, whereas blastp searches a protein query against a protein database.
Tool | Query sequence | Target database sequence
---|---|---
blastn | Nucleotide | Nucleotide
blastp | Protein | Protein
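The choice in the table can be sketched in Python. This is a minimal illustration, not part of BLAST itself: the helper names (`guess_sequence_type`, `choose_blast_program`, `build_command`), the 90% cutoff, and the file and database names are assumptions made for the example. Only the program names and the `-query`, `-db`, `-outfmt`, and `-evalue` options come from the BLAST+ command-line tools.

```python
# Letters that commonly appear in DNA/RNA sequences (N = unknown base).
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_sequence_type(sequence: str) -> str:
    """Guess 'nucleotide' or 'protein' from the letters in the sequence."""
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")
    nucleotide_fraction = sum(c in NUCLEOTIDE_LETTERS for c in letters) / len(letters)
    # Assumption: the 0.9 cutoff is an arbitrary heuristic chosen for this sketch.
    return "nucleotide" if nucleotide_fraction >= 0.9 else "protein"


def choose_blast_program(sequence: str) -> str:
    """Nucleotide query -> blastn, protein query -> blastp."""
    return "blastn" if guess_sequence_type(sequence) == "nucleotide" else "blastp"


def build_command(sequence: str, query_path: str, database: str) -> list[str]:
    """Build (but do not run) a BLAST+ command line as an argument list."""
    return [
        choose_blast_program(sequence),
        "-query", query_path,   # FASTA file holding the query (hypothetical path)
        "-db", database,        # database of the matching type (hypothetical name)
        "-outfmt", "6",         # tabular output
        "-evalue", "1e-5",      # illustrative E-value cutoff
    ]


if __name__ == "__main__":
    print(build_command("ATGGCCATTGTAATGGGCCGCTGA", "query.fa", "my_nucl_db"))
    print(build_command("MAIVMGRWKLLV", "query.faa", "my_prot_db"))
```

The resulting list can be passed to `subprocess.run` on a machine where BLAST+ is installed and the database has already been built. A letter-frequency heuristic can misjudge short or unusual sequences, so in practice you usually know the query type in advance.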
The blastn algorithm is commonly used when you want to find similarities between DNA or RNA sequences. For example, you can use blastn to find similar genes, identify conserved regions in genomic DNA, or map sequences to a genome.
blastn is less suitable if your goal is to identify protein-coding sequences for functional annotation. Because the genetic code is degenerate, multiple nucleotide sequences can encode the same amino acid sequence, so related coding sequences can differ at the nucleotide level while the protein stays conserved.
In addition, nucleotide alignments generally contain more gaps than protein alignments.
blastp, on the other hand, can retrieve matches whose underlying nucleotide sequences share little similarity, making it more suitable for finding identical amino acid sequences encoded by different nucleotide sequences.
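The effect of codon degeneracy can be shown with a small Python sketch. The two DNA sequences below are made up for illustration, and `percent_identity` is a simple position-by-position comparison of equal-length strings, not a BLAST alignment. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    usable_length = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable_length, 3))


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Hypothetical sequences: both encode Leu-Ser-Arg using different codons.
    seq_a = "TTATCTCGT"
    seq_b = "CTGAGCAGG"
    print(translate(seq_a), translate(seq_b))
    print(round(percent_identity(seq_a, seq_b), 1))                  # nucleotide level
    print(percent_identity(translate(seq_a), translate(seq_b)))      # protein level
```

In this toy case the two sequences match at only 2 of 9 nucleotide positions, yet they translate to the same three-residue peptide, which is why a protein-level comparison can find a relationship that a nucleotide-level comparison would miss.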
Furthermore, when assessing homology, nucleotide alignments (blastn) typically require a higher percentage similarity (generally >70%), whereas protein alignments (blastp) can indicate homology at a percentage similarity of >30-40%.
However, you should also assess the E-value and bit score (in addition to percentage similarity) when inferring homology, because sequences with <30% similarity can still be homologous when the alignment has a significant E-value.
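As a rough sketch of this kind of check, the Python below parses hits in the default BLAST+ tabular format (`-outfmt 6`, which has 12 tab-separated columns) and filters them by E-value and bit score rather than by percentage alone. The sample lines are invented for illustration, and the cutoffs (`max_evalue=1e-5`, `min_bit_score=50.0`) are assumptions, not values recommended by BLAST or by this article. Note that the `pident` column reports percent identity.

```python
from dataclasses import dataclass

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


@dataclass
class Hit:
    query: str
    subject: str
    percent_identity: float
    evalue: float
    bit_score: float


def parse_hit(line: str) -> Hit:
    """Parse one tab-separated line of default -outfmt 6 output."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(columns)}")
    row = dict(zip(FIELDS, columns))
    return Hit(
        query=row["qseqid"],
        subject=row["sseqid"],
        percent_identity=float(row["pident"]),
        evalue=float(row["evalue"]),
        bit_score=float(row["bitscore"]),
    )


def is_candidate_homolog(hit: Hit, max_evalue: float = 1e-5,
                         min_bit_score: float = 50.0) -> bool:
    """Illustrative filter: judge a hit by E-value and bit score, not identity alone."""
    return hit.evalue <= max_evalue and hit.bit_score >= min_bit_score


if __name__ == "__main__":
    # Hypothetical hits, not real BLAST results.
    sample_lines = [
        "query1\tsubjA\t27.5\t200\t140\t5\t1\t198\t3\t201\t2e-20\t95.1",
        "query1\tsubjB\t45.0\t20\t11\t0\t5\t24\t50\t69\t3.2\t22.3",
    ]
    for line in sample_lines:
        hit = parse_hit(line)
        print(hit.subject, hit.percent_identity, hit.evalue, is_candidate_homolog(hit))
```

In this made-up example, the first hit has under 30% identity but a very small E-value over a long alignment, so it passes the filter. The second hit has higher identity over a short stretch but a poor E-value, so it does not.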
References
- Pearson, W. R. (2013). An Introduction to Sequence Similarity ("Homology") Searching. Current Protocols in Bioinformatics, Chapter 3, Unit 3.1.
This work is licensed under a Creative Commons Attribution 4.0 International License