Convert Multi-line FASTA into Single-line FASTA
Most FASTA files obtained from biological databases contain sequences in multi-line format, but some bioinformatics tools and scripts require single-line FASTA files.
You can use the multi_to_single_line() function from the Python bioinfokit package (v2.1.3) to convert a multi-line FASTA file into a single-line FASTA file.
The general syntax looks like this:
# load package
from bioinfokit.analys import Fasta
# convert multi line FASTA into single line FASTA
Fasta.multi_to_single_line(file="eg.fasta")
The above function generates an output file (output.fasta) in the same directory. In that file, each sequence is written on a single line.
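To make the transformation concrete, here is a minimal pure-Python sketch of the same idea. It is not bioinfokit's implementation. The function name multi_to_single_line_sketch and its two path parameters are hypothetical, chosen only for illustration, and the sketch assumes a plain-text FASTA file in which every record starts with a ">" header line.

```python
def multi_to_single_line_sketch(in_path, out_path):
    """Write each FASTA record as one header line plus one sequence line.

    Hypothetical helper for illustration; not part of bioinfokit.
    """
    with open(in_path) as src, open(out_path, "w") as dst:
        header = None      # header of the record currently being read
        seq_parts = []     # sequence chunks collected for that record
        for line in src:
            line = line.strip()
            if not line:
                continue   # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    dst.write(header + "\n" + "".join(seq_parts) + "\n")
                header = line
                seq_parts = []
            else:
                # Sequence lines seen before any header are discarded
                # when the first header resets seq_parts.
                seq_parts.append(line)
        # Write the last record, which has no following header to close it.
        if header is not None:
            dst.write(header + "\n" + "".join(seq_parts) + "\n")
```

The sketch reads the input line by line, so it does not need to hold the whole file in memory, only the record currently being joined.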
The following example shows how to use the multi_to_single_line() function.
Suppose you have the following multi-line FASTA file:
head eg.fasta
>seq
GAATGAGATTATTCTCATAGCGAAGCTTCAACATCGGAATCTTGTGAGATTACTTGGATGTTGCTTCGAG
GGAGAAGAGAAAATGCTTGTTTATGAGTATATGCCTAACAAGAGCTTGGATTTCTTCCTCTTTGATGAAA
Now convert it to single-line FASTA using the multi_to_single_line() function:
# load package
from bioinfokit.analys import Fasta
# convert multi line FASTA into single line FASTA
Fasta.multi_to_single_line(file="eg.fasta")
The single-line FASTA file (output.fasta) will be saved in the same directory.
head output.fasta
>seq
GAATGAGATTATTCTCATAGCGAAGCTTCAACATCGGAATCTTGTGAGATTACTTGGATGTTGCTTCGAGGGAGAAGAGAAAATGCTTGTTTATGAGTATATGCCTAACAAGAGCTTGGATTTCTTCCTCTTTGATGAAA
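For larger files, where inspecting the output with head is not practical, a quick programmatic check can confirm that every record ended up on one line. The following is a small sketch under the assumption that the file is plain text with ">" header lines. The helper names sequence_lines_per_record and is_single_line_fasta are hypothetical and are not part of bioinfokit.

```python
def sequence_lines_per_record(path):
    """Return a list of (header, number_of_sequence_lines) per FASTA record.

    Hypothetical helper for illustration; not part of bioinfokit.
    """
    records = []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue                         # ignore blank lines
            if line.startswith(">"):
                records.append([line[1:], 0])    # start a new record
            elif records:
                records[-1][1] += 1              # count one sequence line
    return [(header, count) for header, count in records]


def is_single_line_fasta(path):
    """True if no record spans more than one sequence line."""
    return all(count <= 1 for _, count in sequence_lines_per_record(path))
```

With the files from this example, is_single_line_fasta("eg.fasta") should return False and is_single_line_fasta("output.fasta") should return True.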
This work is licensed under a Creative Commons Attribution 4.0 International License