FASTA File Documentation

Feature	Value
File Extension	`.fasta`, `.fa`, `.mpfa`, `.fna`, `.fsa`
MIME Type	`text/x-fasta`
Primary Usage	Storing Biological Sequences
Type of Data	Nucleotide Sequences, Amino Acid Sequences
Header Line Indicator	`>` at the start of the line, followed by the sequence identifier and an optional description
Sequence Data	Nucleotides (A, C, G, T/U) or Amino Acids (single-letter codes)
Character Encoding	ASCII
Line Width in Files	Typically 60-80 characters (not strictly enforced)
File Creation Software	Various bioinformatics tools (e.g., BLAST, Clustal)
Support for Multiple Sequences	Yes
Usage in Databases	NCBI, EMBL, DDBJ
Compression	Often gzipped (`.fasta.gz`)
Comment Lines	Start with `;` (less common)
Blank Lines	Ignored by most parsers, but not recommended
Special Characters	N (any nucleotide), X (any amino acid), `-` (gap), `*` (translation stop)
Case Sensitivity	Upper and lower case letters are both accepted; lower case often marks soft-masked regions, but the meaning varies by tool
Modifications and Annotations	Limited; use other formats (e.g., GenBank) for detailed annotations
File Concatenation	Simple, because records are self-delimiting; make sure each file ends with a newline so the next header does not merge into the previous sequence
Space within Sequences	Spaces are not allowed in the sequence data
Popularity	Widely used in bioinformatics and computational biology
Origins	Developed in the 1980s for the FASTA sequence alignment software
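
The rules in the table above (a `>` header line, sequence data that may span several lines, optional `;` comment lines, ignorable blank lines, and optional gzip compression) can be sketched as a small reader. This is a minimal illustration in Python, not a reference implementation; the names `parse_fasta` and `open_fasta` and the sample records are hypothetical, and the sketch assumes compression can be detected from a `.gz` suffix.

```python
import gzip
import io
from typing import Iterator, TextIO, Tuple


def parse_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open text handle."""
    header = None
    chunks = []
    for raw in handle:
        line = raw.strip()
        # Skip blank lines and the older ';' comment lines.
        if not line or line.startswith(";"):
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Drop any stray whitespace inside the sequence line.
            chunks.append("".join(line.split()))
    if header is not None:
        yield header, "".join(chunks)


def open_fasta(path: str) -> TextIO:
    """Open a plain or gzipped FASTA file as ASCII text (assumes a .gz suffix)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="ascii")
    return open(path, "r", encoding="ascii")


# Hypothetical sample data: one nucleotide record and one protein record.
example = io.StringIO(
    ">seq1 example nucleotide record\n"
    "ACGTNN\n"
    "acgt\n"
    "\n"
    ";an old-style comment\n"
    ">seq2 example protein record\n"
    "MKVX\n"
)
records = list(parse_fasta(example))
print(records)
```

Case is preserved here on purpose, since the meaning of lower-case letters depends on the tool that produced the file.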

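Writing follows the same conventions in reverse: one `>` header line per record, then the sequence wrapped to a fixed width. The sketch below is illustrative only; `write_fasta` and its `width` parameter are hypothetical names, and the default of 60 is simply one value from the typical 60-80 character range. Because every line ends with a newline, files written this way can be concatenated without a header merging into the preceding sequence.

```python
import io
from typing import Iterable, TextIO, Tuple


def write_fasta(records: Iterable[Tuple[str, str]], handle: TextIO, width: int = 60) -> None:
    """Write (header, sequence) pairs, wrapping sequence lines at `width`."""
    if width < 1:
        raise ValueError("width must be a positive integer")
    for header, sequence in records:
        handle.write(f">{header}\n")
        # Emit the sequence in fixed-size slices, one per line.
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


# Hypothetical record, wrapped at 4 characters to keep the output short.
buffer = io.StringIO()
write_fasta([("seq1 demo", "ACGTACGTAC")], buffer, width=4)
output = buffer.getvalue()
print(output, end="")
```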