FASTA File Documentation
Overview
Feature | Value |
---|---|
File Extension | .fasta, .fa, .mpfa, .fna, .fsa |
MIME Type | text/x-fasta |
Primary Usage | Storing Biological Sequences |
Type of Data | Nucleotide Sequences, Amino Acid Sequences |
Header Line Indicator | > at the start of the line, followed by the sequence identifier and an optional description |
Sequence Data | Nucleotides (A, C, G, T/U) or Amino Acids (single-letter codes) |
Character Encoding | ASCII |
Line Width in Files | Typically 60-80 characters (not strictly enforced) |
File Creation Software | Various bioinformatics tools (e.g., BLAST, Clustal) |
Support for Multiple Sequences | Yes |
Usage in Databases | NCBI, EMBL, DDBJ |
Compression | Often gzipped (.fasta.gz) |
Comment Lines | Start with ; (legacy convention; not recognized by many modern tools) |
Blank Lines | Ignored by many parsers, but not recommended |
Special Characters | N (any nucleotide), X (any or unknown amino acid) |
Case Sensitivity | Upper and lower case letters are accepted (lower case often marks soft-masked regions, but the meaning varies by tool) |
Modifications and Annotations | Limited; use other formats (e.g., GenBank) for detailed annotations |
File Concatenation | Simple, because each record is self-contained (make sure every file ends with a newline and that sequence identifiers stay unique) |
Space within Sequences | Spaces are not allowed in the sequence data |
Popularity | Widely used in bioinformatics and computational biology |
Origins | Developed in the 1980s for the FASTA sequence alignment software |
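
The rules in the table (a > header line, sequence data wrapped over several lines, optional ; comments, blank lines, multiple records per file, optional gzip compression) can be illustrated with a small reader and writer. This is a minimal sketch in Python using only the standard library, not a reference implementation: the function names parse_fasta, open_fasta, and format_record are hypothetical, and the choices to skip ; lines and to ignore any data that appears before the first header are assumptions made here. Real tools may behave differently.

```python
import gzip
import io
from typing import Iterator, TextIO, Tuple


def parse_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open text handle."""
    header = None
    chunks = []
    for raw in handle:
        line = raw.strip()
        # Assumption: skip blank lines and the rarely used ';' comment lines.
        if not line or line.startswith(";"):
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over many lines; join them.
            # Assumption: data before the first header is silently ignored.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def open_fasta(path: str) -> TextIO:
    """Open a plain or gzip-compressed FASTA file as ASCII text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="ascii")
    return open(path, "r", encoding="ascii")


def format_record(header: str, sequence: str, width: int = 60) -> str:
    """Render one record with the sequence wrapped at `width` characters."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + header] + lines) + "\n"


# In-memory example with two records, one wrapped line, and one blank line.
example = io.StringIO(
    ">seq1 example nucleotide record\n"
    "ACGTACGTNN\n"
    "ACGT\n"
    "\n"
    ">seq2 example protein record\n"
    "MKTAYIAKQRX\n"
)
records = list(parse_fasta(example))
```

To read from disk, the same generator can be fed a handle from open_fasta, for example `with open_fasta("reads.fasta.gz") as fh:` followed by a loop over `parse_fasta(fh)` (the file name is a placeholder). For production work, an established parser such as the one in Biopython is usually a safer choice than a hand-written loop.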