How to Use blastdbcmd to Extract Sequences from BLAST Database
blastdbcmd is a command-line utility from the NCBI BLAST toolkit that extracts sequences from a formatted BLAST database based on their sequence identifiers.
The general syntax of blastdbcmd looks like this:
# extract specific sequences
blastdbcmd -db blast_db_name -entry seq_id -out out.fasta
# extract all sequences
blastdbcmd -db blast_db_name -entry all -out out.fasta
Where,
Parameter | Description |
---|---|
-db | BLAST database name |
-entry | Comma-delimited sequence identifiers of the sequences to extract. Use “all” to extract all sequences from the formatted BLAST database |
-out | Writes the output to a file instead of printing it to the console |
Note: The BLAST database should be created with the -parse_seqids option so that specific sequences can be extracted from it by their identifiers.
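As a minimal sketch of the note above, a database with parsed sequence IDs could be built with makeblastdb as shown here. The helper name build_blast_db is hypothetical, and the file name, database type, and database name in the example call are assumptions taken from the examples on this page. The -dbtype value (nucl or prot) has to match the kind of sequences in your FASTA file.

```shell
# Hypothetical helper: build a BLAST database with parsed sequence IDs.
# $1 = input FASTA file, $2 = database type (nucl or prot), $3 = database name
build_blast_db() {
    makeblastdb -in "$1" -dbtype "$2" -parse_seqids -out "$3"
}

# Example call (assumed names); runs only if makeblastdb is installed
# and the input file is present in the current directory.
if command -v makeblastdb >/dev/null 2>&1 && [ -f sample_nucl.fasta ]; then
    build_blast_db sample_nucl.fasta nucl sample
fi
```

After this step, the database name passed as the third argument is the value you would give to -db in blastdbcmd.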
The following examples explain how to use blastdbcmd to extract sequences from a formatted BLAST database.
Extract specific sequences from BLAST database
Extract a single sequence from the sample_nucl.fasta BLAST database
# extract single sequence
blastdbcmd -db sample -entry seq1
Output:
>seq1
MERLNSKLYVENCYIMKENEKLRKKAELLNQENQQLLVQLKQKLSKANKNPNGSNNDNNVSSSSSASGKS
Extract multiple sequences from the sample_nucl.fasta BLAST database
# extract multiple sequences
blastdbcmd -db sample -entry seq1,seq2
Output:
>seq1
MERLNSKLYVENCYIMKENEKLRKKAELLNQENQQLLVQLKQKLSKANKNPNGSNNDNNVSSSSSASGKS
>seq2
KQKLSKANKNPNGSNNDNNVSSSSSASGKSNCYIMKENEKLRKKAELLNQENQQLL
Extract sequences from the sample_nucl.fasta BLAST database and redirect the output to a file
# extract multiple sequences and write them to a file
blastdbcmd -db sample -entry seq1,seq2 -out out.fasta
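When the identifiers are kept in a text file with one ID per line, the comma-delimited value for -entry can be assembled in the shell. The sketch below is only an illustration: the helper name join_ids and the file name ids.txt are hypothetical and not part of blastdbcmd.

```shell
# Hypothetical helper: turn a file with one sequence ID per line
# into a single comma-delimited string such as "seq1,seq2".
join_ids() {
    # drop blank lines, then join the remaining lines with commas
    grep -v '^[[:space:]]*$' "$1" | paste -sd, -
}

# Example call (assumed file names); runs only if blastdbcmd is installed
# and ids.txt is present in the current directory.
if command -v blastdbcmd >/dev/null 2>&1 && [ -f ids.txt ]; then
    blastdbcmd -db sample -entry "$(join_ids ids.txt)" -out out.fasta
fi
```

blastdbcmd also provides an -entry_batch option that reads identifiers from a file directly, which may be the simpler choice for long ID lists.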
Extract all sequences from BLAST database
Extract all sequences from the sample_nucl.fasta BLAST database and redirect the output to a file
# extract all sequences
blastdbcmd -db sample -entry all -out out.fasta
The out.fasta file should contain all sequences from the formatted BLAST database.
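One quick way to check the extracted file is to count its FASTA header lines, since each record starts with ">". This is a sketch under the assumption that the output is in the default FASTA format; the helper name count_fasta_records is hypothetical.

```shell
# Hypothetical helper: count FASTA records by counting header lines,
# i.e. lines that begin with ">".
count_fasta_records() {
    grep -c '^>' "$1"
}

# Example call (assumed file name); runs only if out.fasta exists.
if [ -f out.fasta ]; then
    count_fasta_records out.fasta
fi
```

The count can then be compared with the number of sequences reported by blastdbcmd -db sample -info.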
This work is licensed under a Creative Commons Attribution 4.0 International License