Arabidopsis Information Resource (TAIR)
The Arabidopsis Information Resource (TAIR) collects information and maintains a database of genetic and molecular biology data for Arabidopsis thaliana, a widely used model plant.
Cancer Chromosomes
Integrates data from three sources: the NCI/NCBI SKY/M-FISH and CGH Database, the NCI Mitelman Database of Chromosome Aberrations in Cancer, and the NCI Recurrent Aberrations in Cancer database. The integrated databases can be searched for cytogenetic, clinical, and/or reference information.
Consensus CDS (CCDS)
The Consensus CDS (CCDS) project is a collaborative effort to identify a core set of human and mouse protein coding regions that are consistently annotated and of high quality. The long term goal is to support convergence towards a standard set of gene annotations.
EST (Expressed Sequence Tags)
NCBI's EST database is a collection of short single-read transcript sequences from GenBank. These sequences provide a resource to evaluate gene expression, find potential variation, and annotate genes.
FlyBase
A database of Drosophila genes and genomes.
GenBank
The NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI. These three organizations exchange data on a daily basis. GenBank consists of several divisions, most of which can be accessed through the Nucleotide database.
Gene Expression Omnibus (GEO)
A public functional genomics data repository supporting MIAME-compliant data submissions. Array- and sequence-based data are accepted and tools are provided to help users query and download experiments and curated gene expression profiles.
Influenza Virus Resource
Presents data from the NIAID Influenza Genome Sequencing Project and from GenBank, and provides tools for flu sequence analysis, annotation, and submission to GenBank. It also provides links to other flu sequence resources, publications, and general information about flu viruses.
JSNP (Japanese Single Nucleotide Polymorphisms)
Database of over 197,000 single nucleotide polymorphisms distributed throughout the human genome.
Molecular Modeling Database (MMDB)
The Molecular Modeling DataBase (MMDB), also known as "Entrez Structure," is a database of experimentally determined structures obtained from the RCSB Protein Data Bank (PDB).
NCBI (National Center for Biotechnology Information)
BioSystems NCBI's database that groups biomedical literature, small molecules, and sequence data in terms of biological relationships.
BioSample NCBI's BioSample database contains descriptions of biological source materials used in experimental assays.
Clone DB NCBI's Clone DB is a database that integrates information about clones and libraries, including sequence data, map positions and distributor information. It replaces the former NCBI Clone Registry.
Conserved Domains Database NCBI's collection of sequence alignments and profiles representing protein domains conserved in molecular evolution. It also includes alignments of the domains to known 3-dimensional protein structures in the MMDB database.
dbGaP (Database of Genotypes and Phenotypes) NCBI's database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype.
dbMHC (Database of Major Histocompatibility Complex) Provides an open, publicly accessible platform where the HLA community can submit, edit, view, and exchange data related to the human Major Histocompatibility Complex. It consists of an interactive Alignment Viewer for HLA and related genes, an MHC microsatellite database, a sequence interpretation site for Sequencing Based Typing (SBT), and a Primer/Probe database.
dbVar (Database of Genomic Structural Variation) NCBI's dbVar database has been developed to archive information associated with large scale genomic variation, including large insertions, deletions, translocations and inversions. In addition to archiving variation discovery, dbVar also stores associations of defined variants with phenotype information.
Entrez Gene NCBI's searchable database of genes, focusing on genomes that have been completely sequenced and that have an active research community to contribute gene-specific data. Information includes nomenclature, chromosomal localization, gene products and their attributes (e.g., protein interactions), associated markers, phenotypes, interactions, and links to citations, sequences, variation details, maps, expression reports, homologs, protein domain content, and external databases.
Epigenomics This NCBI resource enables users to explore and visualize richly-annotated epigenomics datasets. It provides a unique interface to search and navigate epigenomic data in the context of biological sample information, as well as tools to select, download and view multiple sets of epigenomic data as tracks on genome browsers.
GEO DataSets This NCBI database stores curated gene expression DataSets, as well as original Series and Platform records in the Gene Expression Omnibus (GEO) repository. Enter search terms to locate experiments of interest. DataSet records contain additional resources including cluster tools and differential expression queries.
GEO Profiles This NCBI database stores individual gene expression profiles from curated DataSets in the Gene Expression Omnibus (GEO) repository. Search for specific profiles of interest based on gene annotation or pre-computed profile characteristics.
Nucleotide NCBI's Nucleotide database is a collection of sequences from several sources, including GenBank, RefSeq, TPA and PDB. Genome, gene and transcript sequence data provide the foundation for biomedical research and discovery.
OMIA (Online Mendelian Inheritance in Animals) NCBI's database of genes, inherited disorders and traits in animal species (other than human and mouse), with textual information and references, as well as links to relevant records from other NCBI databases, such as PubMed and Gene.
OMIM (Online Mendelian Inheritance in Man) NCBI's catalog of human genes and genetic disorders, with links to associated literature references, sequence records, maps, and related databases.
Trace Archive NCBI's repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects.
UniGene NCBI's database that provides sets of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.
UniGene Library Browser NCBI's database that contains libraries of Expressed Sequence Tags (ESTs) organized by organism, tissue type and developmental stage.
UniSTS NCBI's comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information, such as genomic position, genes, and sequences.
Viral Genomes NCBI's wide range of resources, including a brief summary of the biology of viruses, links to viral genome sequences in Entrez Genome, and information about viral Reference Sequences, a collection of reference sequences for thousands of viral genomes.
Virus Variation NCBI's extension of the Influenza Virus Resource to other organisms, providing an interface to download sequence sets of selected viruses; analysis tools, including virus-specific BLAST pages; and genome annotation pipelines (in progress).
The Protein database is a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB. Protein sequences are the fundamental determinants of biological structure and function.
Protein Data Bank (PDB)
The Protein Data Bank (PDB) archive is the single worldwide repository of information about the 3D structures of large biological molecules, including proteins and nucleic acids. It is managed by the Research Collaboratory for Structural Bioinformatics (RCSB).
RefSeq (Reference Sequence)
A collection of curated, non-redundant genomic DNA, transcript (RNA), and protein sequences produced by NCBI. RefSeqs provide a stable reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis, expression studies, and comparative analyses. The RefSeq collection is accessed through the Nucleotide and Protein databases.
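Because RefSeq records are reached through the Nucleotide and Protein databases, a search can be limited to RefSeq by adding a source filter to an ordinary Entrez query. The following is a minimal sketch, assuming NCBI's E-utilities ESearch endpoint, the Entrez database names "nuccore" (Nucleotide) and "protein" (Protein), and the `srcdb_refseq[PROP]` filter. The helper name `build_refseq_search_url` and the example search term are hypothetical and chosen only for illustration. The code builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base URL of NCBI's E-utilities ESearch endpoint (assumed here as the access route).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Entrez property filter that limits results to RefSeq records.
REFSEQ_FILTER = "srcdb_refseq[PROP]"


def build_refseq_search_url(term, db="nuccore", retmax=20):
    """Return an ESearch URL that restricts `term` to RefSeq records.

    Hypothetical helper for illustration: `db` must be "nuccore"
    (Nucleotide) or "protein" (Protein).
    """
    if db not in ("nuccore", "protein"):
        raise ValueError("RefSeq is accessed through 'nuccore' or 'protein'")
    # Parenthesize the user's term so the filter applies to the whole query.
    query = f"({term}) AND {REFSEQ_FILTER}"
    params = {"db": db, "term": query, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Example term is illustrative only; nothing is sent over the network.
    print(build_refseq_search_url("BRCA1[Gene Name] AND Homo sapiens[Organism]"))
```

The resulting URL can be opened in a browser or passed to any HTTP client. Keeping the URL construction separate from the request makes the query easy to inspect before it is sent.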
RefSeqGene sequences are a collection of gene-specific reference genomic sequences from the NCBI RefSeq collection. They form a stable foundation for reporting mutations, for establishing consistent intron and exon numbering conventions, and for defining the coordinates of other biologically significant variation.
Sequence Read Archive (SRA)
The Sequence Read Archive (SRA) stores sequencing data from the next generation of sequencing platforms, including Roche 454 GS System®, Illumina Genome Analyzer®, Life Technologies AB SOLiD System®, Helicos Biosciences Heliscope®, Complete Genomics®, and Pacific Biosciences SMRT®.
SNP (Single Nucleotide Polymorphisms)
Includes single nucleotide polymorphisms, microsatellites, and small-scale insertions and deletions. SNP contains population-specific frequency and genotype data, experimental conditions, molecular context, and mapping information for both neutral polymorphisms and clinical mutations.
Third Party Annotation (TPA)
A database that contains sequences built from the existing primary sequence data in GenBank. The sequences and corresponding annotations are experimentally supported and have been published in a peer-reviewed scientific journal. TPA records are retrieved through the Nucleotide Database.
UniProt (Universal Protein Resource)
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt Metagenomic and Environmental Sequences (UniMES) database is a repository specifically developed for metagenomic and environmental data.
WormBase
Facilitates insights into nematode biology.
Yeast Resource Center Public Data Repository
The YRC PDR provides access to scientific experimental data resulting from collaborations and technology development involving the Yeast Resource Center.
ZFIN (Zebrafish Model Organism Database)
ZFIN serves as the zebrafish model organism database.