FASTA File Documentation
Overview
| Feature | Value |
|---|---|
| File Extension | .fasta, .fa, .mpfa, .fna, .fsa |
| MIME Type | text/x-fasta (unofficial) |
| Primary Usage | Storing Biological Sequences |
| Type of Data | Nucleotide Sequences, Amino Acid Sequences |
| Header Line Indicator | > at the start of the line, followed by the sequence identifier and an optional description |
| Sequence Data | Nucleotides (A, C, G, T/U) or Amino Acids (single-letter codes) |
| Character Encoding | ASCII |
| Line Width in Files | Typically 60-80 characters (not strictly enforced) |
| File Creation Software | Various bioinformatics tools (e.g., BLAST, Clustal) |
| Support for Multiple Sequences | Yes |
| Usage in Databases | NCBI, EMBL, DDBJ |
| Compression | Often gzipped (.fasta.gz) |
| Comment Lines | Start with ; (less common) |
| Blank Lines | Ignored by many parsers, but not recommended |
| Special Characters | N (any or unknown nucleotide), X (any or unknown amino acid), - (gap), * (translation stop) |
| Case Sensitivity | Upper and lower case letters are both accepted; lower case often marks soft-masked (e.g., repetitive) regions |
| Modifications and Annotations | Limited; use other formats (e.g., GenBank) for detailed annotations |
| File Concatenation | Files can be joined by simple concatenation, provided each file ends with a newline so the next header starts on its own line |
| Space within Sequences | Spaces are not allowed in the sequence data |
| Popularity | Widely used in bioinformatics and computational biology |
| Origins | Developed in the 1980s for the FASTA sequence alignment software |
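
The structure described in the table (a > header line, wrapped sequence lines, optional ; comments, optional gzip compression) can be illustrated with a small reader and writer. This is a minimal sketch in Python, not a reference implementation: the function names `read_fasta`, `open_fasta`, and `format_record` are hypothetical, the 60-character default width is just one of the common choices, and real parsers differ in how they treat blank lines and ; comments.

```python
import gzip
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open text handle.

    Assumptions of this sketch: blank lines and lines starting with ';'
    are skipped, and any sequence text before the first '>' is ignored.
    """
    header = None
    chunks = []
    for raw in handle:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or legacy comment line
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()  # drop the '>' indicator
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may span many lines
    if header is not None:
        yield header, "".join(chunks)


def open_fasta(path: str) -> TextIO:
    """Open a plain or gzipped FASTA file as ASCII text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="ascii")
    return open(path, "r", encoding="ascii")


def format_record(header: str, sequence: str, width: int = 60) -> str:
    """Return one FASTA record with the sequence wrapped at `width`."""
    lines = [">" + header]
    lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = ">seq1 example record\nACGTN\nacgt\n\n>seq2\nMKVX\n"
    for name, seq in read_fasta(io.StringIO(sample)):
        print(name, len(seq))
        print(format_record(name, seq, width=4), end="")
```

Because `format_record` always ends a record with a newline, records written this way can be concatenated without a header running into the preceding sequence line. For production work, an established library parser is usually a safer choice than a hand-written one.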