Find Max and Min Sequence Length in Fasta
You can get the maximum and minimum sequence lengths in a FASTA file with a Python package or with command-line tools.
This article describes how to find the maximum and minimum sequence lengths in a FASTA file using Python, seqkit, and samtools.
Python
You can use the max_min_len() function from bioinfokit (v2.1.4) to find the maximum and minimum sequence lengths in a FASTA file.
# import package
from bioinfokit.analys import Fasta
Fasta.max_min_len("file.fasta")
# output
Max Length Seq: KU562861.1 153
Min Length Seq: MH150936.1 114
In the example file.fasta, the maximum and minimum sequence lengths are 153 bp and 114 bp, respectively.
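If bioinfokit is not installed, the same idea can be sketched in plain Python. This is a minimal sketch, not bioinfokit's implementation: the function name fasta_max_min is hypothetical, and it assumes a standard FASTA layout where each record starts with a ">" header and the sequence ID is the first whitespace-delimited token of that header.

```python
def fasta_max_min(path):
    """Return ((max_id, max_len), (min_id, min_len)) for a FASTA file.

    Hypothetical helper for illustration; not part of bioinfokit.
    """
    records = []  # one [sequence_id, length] entry per FASTA record
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # Header line: keep only the ID (text before the first space)
                fields = line[1:].split()
                records.append([fields[0] if fields else "", 0])
            elif records:
                # Sequence line: add its length to the current record
                records[-1][1] += len(line)
    if not records:
        raise ValueError("no sequences found in " + path)
    longest = max(records, key=lambda rec: rec[1])
    shortest = min(records, key=lambda rec: rec[1])
    return tuple(longest), tuple(shortest)
```

Calling fasta_max_min("file.fasta") returns the ID and length of the longest record followed by the ID and length of the shortest one. If several sequences share the same extreme length, only the first one encountered is reported.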
seqkit
You can use the fx2tab subcommand from seqkit to find the maximum and minimum sequence lengths in a FASTA file.
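# get max length (ascending numeric sort, so the largest value is last)
seqkit fx2tab --length --name file.fasta | cut -f2 | sort -n | tail -1
# output
153
# get min length (ascending numeric sort, so the smallest value is first)
seqkit fx2tab --length --name file.fasta | cut -f2 | sort -n | head -1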
# output
114
samtools
You can also use a samtools FASTA index to find the maximum and minimum sequence lengths.
You first need to create an index of the FASTA file.
samtools faidx file.fasta
The first two columns in the FASTA index file (file.fasta.fai) contain the sequence names and their lengths.
You can get the maximum and minimum sequence lengths from file.fasta.fai like this:
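# get max length (ascending numeric sort, so the largest value is last)
cut -f2 file.fasta.fai | sort -n | tail -1
# output
153
# get min length (ascending numeric sort, so the smallest value is first)
cut -f2 file.fasta.fai | sort -n | head -1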
# output
114
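The cut commands above report only the lengths. If you also want the sequence names, you can read the index in Python. This is a minimal sketch under the assumption that file.fasta.fai is tab-separated with the sequence name in the first column and the length in the second, as described above; the function name fai_max_min is hypothetical and not part of samtools.

```python
def fai_max_min(fai_path):
    """Return ((max_name, max_len), (min_name, min_len)) from a .fai index.

    Hypothetical helper for illustration; not part of samtools.
    """
    records = []
    with open(fai_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            # Column 1 is the sequence name, column 2 is its length
            records.append((fields[0], int(fields[1])))
    if not records:
        raise ValueError("no entries found in " + fai_path)
    longest = max(records, key=lambda rec: rec[1])
    shortest = min(records, key=lambda rec: rec[1])
    return longest, shortest
```

Calling fai_max_min("file.fasta.fai") returns the name and length of the longest sequence followed by the name and length of the shortest one.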
This work is licensed under a Creative Commons Attribution 4.0 International License