-
PROTEOMIC DATABASES
-
-
-
- Proteome (Proteome
Databases): Caenorhabditis elegans (WormPD), Saccharomyces cerevisiae
(YPD), and S. pombe. Extensive information on protein expression, function,
homologs, etc.
-
- PDB (Protein Data Bank):
3-D structure of proteins, nucleic acids and some other biological molecules
-
- PIR (Protein Information Resource): supports research
on molecular evolution, functional genomics, and computational biology by
providing an integrated system of protein sequence databases, derived related
databases, and access facilities.
-
- MIPS: Munich Information Center for Protein Sequences
- Protein extraction, description, and
analysis tools at MIPS.
-
- Protein analysis:
-
- BLOCKS
- http://www.blocks.fhcrc.org/
- Blocks are multiply aligned ungapped segments corresponding
to the most highly conserved regions of proteins.
- The blocks for the Blocks Database are made automatically
by looking for the most highly conserved regions in groups of proteins documented
in the PROSITE database.
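- The idea of an ungapped, highly conserved segment can be illustrated with
a toy alignment. The sketch below is not the method used to build the Blocks
Database; the function name, the width and identity thresholds, and the
sequences are all assumptions made for illustration.

```python
def conserved_blocks(alignment, min_width=3, min_identity=1.0):
    """Return (start, end) column ranges (end exclusive) of ungapped,
    conserved runs in a list of equal-length aligned sequences."""
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    def column_ok(col):
        residues = [row[col] for row in alignment]
        if "-" in residues:  # a gap disqualifies the column
            return False
        top = max(residues.count(r) for r in set(residues))
        return top / len(residues) >= min_identity

    blocks, start = [], None
    # Iterate one column past the end so a trailing run is closed.
    for col in range(n_cols + 1):
        ok = col < n_cols and column_ok(col)
        if ok and start is None:
            start = col
        elif not ok and start is not None:
            if col - start >= min_width:
                blocks.append((start, col))
            start = None
    return blocks


# Toy alignment (made-up sequences, "-" marks a gap).
toy = [
    "MKV-LACDEFG",
    "MKVALACDQFG",
    "MKI-LACDEFG",
]
blocks = conserved_blocks(toy)
print(blocks, [toy[0][s:e] for s, e in blocks])
```

Lowering min_identity (a hypothetical parameter) would admit columns that are
conserved in most, but not all, sequences.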
-
- EPD - Eukaryotic
Promoter Database, Current release 63
- http://www.epd.isb-sib.ch/
- The Eukaryotic Promoter Database is an annotated non-redundant
collection of eukaryotic POL II promoters, for which the transcription start
site has been determined experimentally.
-
- ENZYME - Enzyme
nomenclature database
- http://www.expasy.ch/enzyme/
- ENZYME is a repository of information relative to the
nomenclature of enzymes. It is primarily based on the recommendations of
the Nomenclature Committee of the International Union of Biochemistry
and Molecular Biology (IUBMB) and it describes each type of characterized
enzyme for which an EC (Enzyme Commission) number has been provided.
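- An EC number is a four-level hierarchical code, for example EC 1.1.1.1
(alcohol dehydrogenase). A minimal sketch of splitting such a number into
its levels follows; the function name and the returned dictionary layout are
assumptions, and this is not code from the ENZYME database itself.

```python
import re

# Top-level EC classes in the IUBMB nomenclature.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}

# Optional "EC" prefix, then four dot-separated levels; "-" marks an
# unspecified lower level (e.g. "3.4.-.-").
EC_PATTERN = re.compile(r"^(?:EC\s*)?(\d+)\.(\d+|-)\.(\d+|-)\.(\d+|-)$")


def parse_ec(ec):
    """Split an EC number into its four levels and name its top class."""
    match = EC_PATTERN.match(ec.strip())
    if match is None:
        raise ValueError(f"not a recognisable EC number: {ec!r}")
    levels = [None if g == "-" else int(g) for g in match.groups()]
    return {
        "levels": levels,
        "class_name": EC_CLASSES.get(levels[0], "unknown"),
    }


adh = parse_ec("EC 1.1.1.1")
partial = parse_ec("3.4.-.-")
print(adh, partial)
```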
-
- GeneCards
- http://bioinformatics.weizmann.ac.il/cards/
- GeneCards is a database of human genes, their products
and their involvement in diseases. It offers concise information about the
functions of all human genes that have an approved symbol, as well as selected
others.
-
- KEGG: Kyoto
Encyclopedia of Genes and Genomes
- http://www.genome.ad.jp/kegg/
- Kyoto Encyclopedia of Genes and Genomes (KEGG) is an
effort to computerize current knowledge of molecular and cellular biology
in terms of the information pathways that consist of interacting molecules
or genes and to provide links from the gene catalogs produced by genome sequencing
projects.
- KEGG consists of the following five types of data:
- Pathway maps - represented by graphical diagrams
- Ortholog group tables - represented by HTML tables
- Molecular catalogs - represented by HTML tables or
hierarchical texts
- Genome maps - represented by Java graphics
- Gene catalogs - represented by hierarchical texts
-
- Library of Protein Family Cores
- http://www-camis.stanford.edu/projects/helix/LPFC/
- We have taken structural alignments of protein families
and computed average core structures for each family. The core structures
can be divided into residues with low spatial variation and those with high
spatial variation. Amino acids with low spatial variance occupy essentially
the same relative position in all family members. This library is useful
for building models, threading, and exploratory analysis. It is also a useful
mechanism for summarizing variability in NMR structures.
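- The split into low-variance and high-variance positions can be sketched as
follows. This assumes the family members are already superimposed and
aligned position by position; the function names, the threshold, and the toy
coordinates are illustrative assumptions, not the library's actual procedure.

```python
def positional_variance(structures):
    """For each aligned position, return the mean squared distance of the
    members' coordinates from their average (core) position.

    structures: list of family members, each a list of (x, y, z) tuples.
    """
    n_pos = len(structures[0])
    variances = []
    for i in range(n_pos):
        points = [member[i] for member in structures]
        mean = [sum(p[k] for p in points) / len(points) for k in range(3)]
        var = sum(
            sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in points
        ) / len(points)
        variances.append(var)
    return variances


def split_core(variances, threshold):
    """Partition position indices into low- and high-variance sets."""
    low = [i for i, v in enumerate(variances) if v <= threshold]
    high = [i for i, v in enumerate(variances) if v > threshold]
    return low, high


# Three made-up family members with three aligned positions each.
members = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
    [(0, 0, 0), (1, 0, 0), (2, 3, 0)],
    [(0, 0, 0), (1, 0, 0), (2, -3, 0)],
]
variances = positional_variance(members)
low, high = split_core(variances, threshold=1.0)
print(variances, low, high)
```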
-
- Pfam
- http://www.sanger.ac.uk/Software/Pfam/
- Pfam is a collection of protein families and domains.
Pfam contains multiple protein alignments and profile-HMMs of these families.
Pfam is a semi-automatic protein family database, which aims to be comprehensive
as well as accurate.
-
- PRINTS - PROTEIN FINGERPRINT DATABASE
- http://bioinf.man.ac.uk/dbbrowser/PRINTS/
- PRINTS is a compendium of protein fingerprints. A fingerprint
is a group of conserved motifs used to characterise a protein family; its
diagnostic power is refined by iterative scanning of a SWISS-PROT/TrEMBL
composite. Usually the motifs do not overlap, but are separated along a
sequence, though they may be contiguous in 3D-space. Fingerprints can encode
protein folds and functionalities more flexibly and powerfully than can
single motifs, full diagnostic potency deriving from the mutual context
provided by motif neighbours.
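- A fingerprint match can be pictured as finding every motif of the group, in
order and without overlap, along one sequence. The sketch below uses
fixed-width literal motifs as a stand-in; PRINTS itself does not work on
literal strings, and the function name, motifs, and sequence are assumptions.

```python
import re


def match_fingerprint(sequence, motifs):
    """Return the start position of each motif if all motifs occur in the
    given order without overlapping, otherwise None."""
    positions, cursor = [], 0
    for motif in motifs:
        # Search only to the right of the previous motif's end.
        hit = re.compile(motif).search(sequence, cursor)
        if hit is None:
            return None
        positions.append(hit.start())
        cursor = hit.end()
    return positions


# Made-up sequence and a made-up three-motif fingerprint.
seq = "AAGWCHKLLLPDEFTRRRNYSGW"
in_order = match_fingerprint(seq, ["GWCH", "PDEF", "NYS"])
wrong_order = match_fingerprint(seq, ["PDEF", "GWCH", "NYS"])
print(in_order, wrong_order)
```

A sequence that carries the motifs in a different order is rejected, which is
the sense in which the motifs provide context for one another.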
-
- Protein
Data Bank - PDB
- http://pdb-browsers.ebi.ac.uk//
- http://www.rcsb.org/pdb/index.html
- PDB is the single international repository for the processing
and distribution of 3-D macromolecular structure data, primarily determined
experimentally by X-ray crystallography and NMR.
-
- InterPro - Integrated Resource of Protein Domains and Functional
Sites
- http://www.ebi.ac.uk/interpro/databases.html
- InterPro release 1.2 (June 2000) was built from Pfam 5.2,
PRINTS 26.1, PROSITE 16, ProDom 2000.1 and the current SWISS-PROT + TrEMBL
data. This release of InterPro contains 3052 entries, representing 574 domains,
2418 families, 46 repeats and 14 post-translational modification sites.
InterPro is a useful resource for whole genome analysis and has already
been used for the proteome analysis of a number of completely sequenced
organisms. A preliminary proteome analysis was also produced for the human
genome.
-
- S. cerevisiae functional analysis: Eisenberg,
UCLA
-
- SWISS-PROT - Annotated protein
sequence database
- TrEMBL - Computer-annotated supplement to SWISS-PROT
- http://expasy.hcuge.ch/sprot/sprot-top.html
- SWISS-PROT is a curated protein sequence database which
strives to provide a high level of annotations (such as the description
of the function of a protein, its domains structure, post-translational
modifications, variants, etc.), a minimal level of redundancy and high level
of integration with other databases.
- TrEMBL is a computer-annotated supplement of SWISS-PROT
that contains all the translations of EMBL nucleotide sequence entries not
yet integrated in SWISS-PROT.
-
- TRANSFAC
- The Transcription Factor Database
- http://transfac.gbf.de/TRANSFAC/
- http://www.cbi.pku.edu.cn/TRANSFAC/
- http://www.hgmp.mrc.ac.uk/Bioinformatics/Databases/transfac-help.html
- TRANSFAC is a database of eukaryotic cis-acting regulatory
DNA elements and trans-acting factors. It covers the whole range from yeast
to human. TRANSFAC (TRANScription regulatory FACtors) is maintained at the
GBF Braunschweig. It combines data about transcription factors and their
DNA binding sites with additional important information (e.g. the sources
of the factors, systematic classification of transcription factors). All
experimental data have been extracted from the literature. These data are
accessible through two main tables, FACTORS and SITES. The first table
holds data about the binding proteins; the second holds data about the DNA
sequences that these proteins recognize. Besides these experimental data,
TRANSFAC also comprises information derived from them. As many transcription
factors can be classified by their DNA binding domains and/or their
dimerization domains, we introduced the CLASS table to TRANSFAC. We also
prepared a GENES table, which contains data about the corresponding genes
and their promoters/enhancers (Knueppel et al.) and which will be part of
the ASCII flatfile version in future.
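- The relationship between the FACTORS and SITES tables can be sketched with
two small record types. The field names, identifiers, and records below are
hypothetical and do not reproduce the TRANSFAC flatfile format.

```python
from dataclasses import dataclass, field


@dataclass
class Factor:
    factor_id: str      # placeholder identifier, not a TRANSFAC accession
    name: str
    factor_class: str   # stands in for a reference to the CLASS table


@dataclass
class Site:
    site_id: str        # placeholder identifier, not a TRANSFAC accession
    sequence: str
    factor_ids: list = field(default_factory=list)


def sites_for_factor(factor_id, sites):
    """Return every site record that lists the given factor as a binder."""
    return [s for s in sites if factor_id in s.factor_ids]


# Illustrative records only.
factors = [
    Factor("F1", "TBP", "TATA-binding"),
    Factor("F2", "Sp1", "zinc finger"),
]
sites = [
    Site("S1", "TATAAA", ["F1"]),
    Site("S2", "GGGCGG", ["F2"]),
]
tbp_sites = sites_for_factor("F1", sites)
print([s.sequence for s in tbp_sites])
```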
-
- Transpath - Signal
Transduction Browser
- http://193.175.244.148/
- The database on gene-regulatory pathways.
-
- SCOP - Structural
Classification of Proteins
- http://pdb.weizmann.ac.il/scop/
- The scop database aims to provide a detailed and comprehensive
description of the structural and evolutionary relationships between all
proteins whose structure is known, including all entries in Brookhaven National
Laboratory's Protein Data Bank (PDB). It is available as a set of tightly
linked hypertext documents which make the large database comprehensible
and accessible. In addition, the hypertext pages offer a panoply of representations
of proteins, including links to PDB entries, sequences, references, images
and interactive display systems.
-
- CATH - Protein Structure Classification
- http://www.biochem.ucl.ac.uk/bsm/cath/
- The CATH database is a hierarchical domain classification
of protein structures in the Brookhaven protein databank. Non-protein,
model, and "C-alpha only" structures are not classified in CATH. Only crystal
structures solved to resolution better than 3.0 angstroms are considered,
together with NMR structures.
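- The inclusion criteria above amount to a simple filter. The sketch below
assumes a hypothetical record type with made-up field names and method
labels; it is not CATH's own code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    entry_id: str                 # hypothetical label, not a real PDB code
    method: str                   # "X-RAY" or "NMR" (assumed labels)
    resolution: Optional[float]   # in angstroms; None for NMR entries
    is_protein: bool = True
    is_model: bool = False
    ca_only: bool = False


def eligible(entry, max_resolution=3.0):
    """Apply the stated criteria: protein, not a model, not C-alpha only,
    and either NMR or a crystal structure better than max_resolution."""
    if not entry.is_protein or entry.is_model or entry.ca_only:
        return False
    if entry.method == "NMR":
        return True
    if entry.method == "X-RAY":
        # "Better than 3.0" means a numerically smaller resolution value.
        return entry.resolution is not None and entry.resolution < max_resolution
    return False


entries = [
    Entry("xtal_good", "X-RAY", 1.8),
    Entry("xtal_poor", "X-RAY", 3.5),
    Entry("nmr", "NMR", None),
    Entry("model", "X-RAY", 2.0, is_model=True),
    Entry("ca_trace", "X-RAY", 2.5, ca_only=True),
]
kept = [e.entry_id for e in entries if eligible(e)]
print(kept)
```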
-
- FSSP - Fold classification based on Structure-Structure alignment
of Proteins
- http://www2.ebi.ac.uk/dali/fssp/fssp.html
- The FSSP database is based on exhaustive all-against-all
3D structure comparison of protein structures currently in the Protein Data
Bank (PDB). The classification and alignments are automatically maintained
and continuously updated using the Dali search engine.
-
- 3Dee - Database of Protein Domain Definitions
- http://jura.ebi.ac.uk:8080/3Dee/help/help_intro.html
- 3Dee contains structural domain definitions for all protein
chains in the Brookhaven Protein Databank (PDB) that have 20 or more residues
and are not theoretical models. In addition, the domains have
been clustered on sequence similarity and structural similarity. The resulting
families are stored as a hierarchy.
-
- PRESAGE
- http://presage.berkeley.edu/
- PRESAGE is a collaborative resource for structural genomics.
It provides a database of proteins, each of which has a collection of annotations
reflecting current experimental status, structural assignments, models, and
suggestions. PRESAGE is a tool for scientists to keep track of structural
knowledge of their proteins of interest.
-
- GeneCensus Genome Comparisons
- http://bioinfo.mbb.yale.edu/genome/
- GeneCensus is intended to give a comprehensive statistical
accounting of protein features, particularly structural ones, in genomes
-- in the sense of a demographic census.
-
- TRRD - Transcription Regulatory Region Database
- http://wwwmgs.bionet.nsc.ru/mgs/dbases/trrd4/
- The Transcription Regulatory Regions Database (TRRD 4.x)
collects information on the structural and functional organisation of transcription
regulatory regions of eukaryotic genes. The hierarchical organisation of
transcription regulatory regions of eukaryotic genomes is reflected in the database
schema. It includes information on transcription factor binding
sites, eukaryotic gene promoters, enhancers, transcription regulatory regions,
and gene expression regulation.
-
- TargetDB
- http://molbio.nmsu.edu:81/
- A database of peptides that target proteins to cellular
locations.
-
- Metabolic Pathways of Biochemistry
- http://www.media.gwu.edu/~mpb/index.html
- This site is designed to graphically represent all major
metabolic pathways, primarily those important to human biochemistry.
-
- COMPEL
- http://compel.bionet.nsc.ru/
- COMPEL collects information about composite regulatory
elements (CEs): pairs of closely situated sites and the transcription factors
binding to them. We define a composite element as a minimal functional unit
within which both protein-DNA and protein-protein interactions contribute
to a highly specific pattern of gene transcriptional regulation. The factors
that cooperate at an individual CE mostly belong to different classes with
respect to the structure of protein domains, namely the DNA-binding and
activation domains. The factors also differ in their functional properties:
cell specificity, inducibility and others. Thus, composite regulatory elements
contribute to one of the fundamental principles of genome functioning: the
combinatorial nature of gene transcriptional regulation.
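- The notion of "pairs of closely situated sites" can be illustrated by
listing pairs of binding sites for different factors that lie within a
chosen distance. COMPEL entries are curated, not derived this way; the
function name, the distance cutoff, and the coordinates are assumptions.

```python
from itertools import combinations


def candidate_pairs(sites, max_gap=20):
    """Return pairs of sites bound by different factors whose gap
    (start of the downstream site minus end of the upstream site)
    is at most max_gap.

    sites: list of (factor_name, start, end) with end exclusive.
    """
    ordered = sorted(sites, key=lambda s: s[1])
    pairs = []
    for upstream, downstream in combinations(ordered, 2):
        if upstream[0] == downstream[0]:
            continue  # same factor: not a composite of two factors
        gap = downstream[1] - upstream[2]
        if gap <= max_gap:
            pairs.append((upstream, downstream))
    return pairs


# Made-up coordinates on a hypothetical promoter.
sites = [
    ("NFAT", 110, 116),
    ("AP-1", 100, 107),
    ("Sp1", 300, 306),
]
pairs = candidate_pairs(sites)
print(pairs)
```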
-
- RegulonDB: a database on transcriptional regulation in Escherichia
coli
- http://tula.cifn.unam.mx:8850/regulondb/regulon_intro.framese
- RegulonDB is a database on transcription regulation
and operon organization in Escherichia coli. It describes regulatory signals
of transcription initiation, promoters, regulatory binding sites of specific
regulators, ribosome binding sites and terminators, as well as information
on genes clustered in operons. These annotations have been gathered
from an ongoing search of the literature, as well as from computational
sequence predictions. The genomic coordinates of all these objects in
the E. coli K-12 chromosome are clearly indicated. Every known object
has a link to at least one MEDLINE reference.