How to Use blastdbcmd to Extract Sequences from BLAST Database
          
  
blastdbcmd is a command-line utility from the NCBI BLAST+ toolkit that extracts sequences from a formatted BLAST database based on their sequence identifiers.
The general syntax of blastdbcmd looks like this:
# extract specific sequences
blastdbcmd -db blast_db_name -entry seq_id  -out out.fasta
# extract all sequences
blastdbcmd -db blast_db_name -entry all -out out.fasta
Where,
| Parameter | Description | 
|---|---|
| -db | BLAST database name | 
| -entry | Comma-delimited sequence identifiers of the sequences to extract. Use “all” to extract all sequences from the formatted BLAST database | 
| -out | Redirects the output to a file instead of printing to the console | 
Note: The BLAST database should be created with the -parse_seqids option of makeblastdb. Without it, specific sequences cannot be looked up by their original identifiers in the formatted BLAST database.
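As a minimal sketch of the note above, the database could be built as follows. The file name `sample.fasta` is hypothetical, and `-dbtype prot` is an assumption based on the protein sequences shown in this article's example output; use `-dbtype nucl` for nucleotide input.

```shell
# Write a small example FASTA file (hypothetical file name; the two
# records mirror the example output shown in this article).
cat > sample.fasta <<'EOF'
>seq1
MERLNSKLYVENCYIMKENEKLRKKAELLNQENQQLLVQLKQKLSKANKNPNGSNNDNNVSSSSSASGKS
>seq2
KQKLSKANKNPNGSNNDNNVSSSSSASGKSNCYIMKENEKLRKKAELLNQENQQLL
EOF

# Build the database only if BLAST+ is installed.
# -parse_seqids indexes the FASTA identifiers so that blastdbcmd -entry
# can look sequences up by ID. -dbtype prot is an assumption (see above).
if command -v makeblastdb >/dev/null 2>&1; then
    makeblastdb -in sample.fasta -dbtype prot -parse_seqids -out sample
else
    echo "makeblastdb not found; install NCBI BLAST+ first" >&2
fi
```

The `-out sample` value is the database name that is later passed to `blastdbcmd -db`.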
The following examples explain how to use blastdbcmd to extract sequences from a formatted BLAST database.
Extract specific sequences from BLAST database
Extract a single sequence from the sample_nucl.fasta BLAST database
# extract single sequence
blastdbcmd -db sample -entry seq1
Output:
>seq1
MERLNSKLYVENCYIMKENEKLRKKAELLNQENQQLLVQLKQKLSKANKNPNGSNNDNNVSSSSSASGKS
Extract multiple sequences from the sample_nucl.fasta BLAST database
# extract multiple sequences
blastdbcmd -db sample -entry seq1,seq2
Output:
>seq1
MERLNSKLYVENCYIMKENEKLRKKAELLNQENQQLLVQLKQKLSKANKNPNGSNNDNNVSSSSSASGKS
>seq2
KQKLSKANKNPNGSNNDNNVSSSSSASGKSNCYIMKENEKLRKKAELLNQENQQLL
Extract sequences from the sample_nucl.fasta BLAST database and redirect the output to a file
# extract multiple sequences and write them to a file
blastdbcmd -db sample -entry seq1,seq2 -out out.fasta
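When the list of identifiers grows, a comma-delimited `-entry` value becomes unwieldy. One option is the `-entry_batch` flag of blastdbcmd, which reads one identifier per line from a file. The sketch below assumes a database named `sample` already exists; the file name `ids.txt` is hypothetical.

```shell
# Write the identifiers to a file, one per line (hypothetical file name).
printf '%s\n' seq1 seq2 > ids.txt

# Extract every listed sequence in one call, if BLAST+ is installed.
# Assumes the "sample" database was built with -parse_seqids.
if command -v blastdbcmd >/dev/null 2>&1; then
    blastdbcmd -db sample -entry_batch ids.txt -out out.fasta
else
    echo "blastdbcmd not found; install NCBI BLAST+ first" >&2
fi
```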
Extract all sequences from BLAST database
Extract all sequences from the sample_nucl.fasta BLAST database and redirect the output to a file
# extract all sequences and write them to a file
blastdbcmd -db sample -entry all -out out.fasta
The out.fasta file should contain all sequences from the formatted BLAST database.
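A quick way to sanity-check the extraction is to count the FASTA header lines in the output file. This is a minimal sketch; the helper name `count_fasta_records` is hypothetical and not part of BLAST+.

```shell
# Count FASTA records by counting lines that start with ">".
# count_fasta_records is a hypothetical helper, not a BLAST+ command.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Report the count only if the extraction has already produced out.fasta.
if [ -f out.fasta ]; then
    echo "out.fasta contains $(count_fasta_records out.fasta) sequences"
fi
```

The count can then be compared with the number of sequences in the original FASTA file used to build the database.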
This work is licensed under a Creative Commons Attribution 4.0 International License