EMB File Documentation


Overview

Feature                   Value
Format Name               EMBL (European Molecular Biology Laboratory) Nucleotide Format
File Extension            .emb, .embl
MIME Type                 chemical/x-embl-dl-nucleotide
File Type                 Text
Developed By              European Molecular Biology Laboratory
Format Type               Biological sequence data format
Encoding                  ASCII
Primary Use               Storing nucleotide sequences along with their associated annotation
Advantages                Widely used, supports rich annotation, compatible with many bioinformatics tools
Disadvantages             Text-based, so files can be large and slow to parse for very long sequences
Data Contained            Nucleotide sequences, sequence features, annotations, references
Structure                 Line-based; each line begins with a two-letter code identifying its type (ID, FT, SQ, etc.), and each record ends with a // line
ID Line                   First line of a record; contains the sequence identifier and summary information such as molecule type and sequence length
FT Lines                  Feature table lines; describe the annotated features of the sequence and their locations
SQ Line                   Sequence header; precedes the actual nucleotide sequence
Sequence Representation   Single-letter IUPAC nucleotide codes (A, C, G, T, U, etc.)
Annotations Supported     Gene names, protein products, functional regions, and more
Compatibility             Compatible with a wide range of bioinformatics software and databases
Usage                     Research, academia, drug discovery, genome annotation projects
Accessibility             Human-readable text; editable with any text editor or with specialized software
File Size                 Varies with sequence length and amount of annotation; small to moderate for typical entries, large for whole-genome records
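
The line-based structure described above (a two-letter code at the start of each line, an SQ header followed by the sequence, and a closing // line) can be illustrated with a minimal Python sketch. The sample record, its accession X00001, and the function name parse_embl_record are made up for illustration, and the sketch assumes a single well-formed record; it is not a complete EMBL parser.

```python
from collections import defaultdict

# Hypothetical, hand-written record used only for illustration.
SAMPLE_RECORD = """\
ID   X00001; SV 1; linear; genomic DNA; STD; UNC; 12 BP.
XX
DE   Hypothetical example entry
XX
FH   Key             Location/Qualifiers
FT   source          1..12
FT   gene            1..12
FT                   /gene="demo"
XX
SQ   Sequence 12 BP; 3 A; 3 C; 3 G; 3 T; 0 other;
     acgtacgtac gt                                                            12
//
"""


def parse_embl_record(text):
    """Group the lines of one record by line code and collect the sequence.

    Returns (lines_by_code, sequence). Assumes a single, well-formed record.
    """
    lines_by_code = defaultdict(list)
    sequence_parts = []
    in_sequence = False

    for line in text.splitlines():
        if line.startswith("//"):
            break  # end-of-record terminator
        if in_sequence:
            # Sequence lines hold letters in blocks plus a trailing position
            # count; keep only the letters.
            sequence_parts.append("".join(ch for ch in line if ch.isalpha()))
            continue

        code, content = line[:2], line[5:].rstrip()
        if not code.strip() or code == "XX":
            continue  # skip blank lines and XX spacer lines
        lines_by_code[code].append(content)
        if code == "SQ":
            in_sequence = True  # everything up to // is sequence data

    return dict(lines_by_code), "".join(sequence_parts)


if __name__ == "__main__":
    fields, sequence = parse_embl_record(SAMPLE_RECORD)
    # Split the ID line into its semicolon-separated parts.
    id_parts = [part.strip() for part in fields["ID"][0].rstrip(".").split(";")]
    print("ID parts:", id_parts)
    print("Feature lines:", len(fields["FT"]))
    print("Sequence:", sequence)
```

For real data, an established parser is likely a safer choice than hand-rolled code; Biopython, for example, reads this format through Bio.SeqIO with the format name "embl".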