Bioinformatics & Genomics
[Stanford University] This program takes a PDB file (or ID) as input and compares the query structure to 13,073 domains from the PDB.
AAA: Amino Acid Analysis Server
[EMBL-Heidelberg] Protein and amino acid identification in SwissProt and PIR using amino acid composition.
[ExPASy] A tool used to identify a protein from its amino acid composition.
[ExPASy] Compares the amino acid composition of a SWISS-PROT entry with all other SWISS-PROT entries to find the proteins whose amino acid compositions are closest to that of the selected entry.
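The composition-based comparison described in the entries above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and the Euclidean distance between percentage compositions is an assumed metric, not necessarily the scoring used by the ExPASy or EMBL servers.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the percentage of each standard residue in a sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def composition_distance(seq_a, seq_b):
    """Euclidean distance between two composition vectors (assumed metric)."""
    comp_a, comp_b = composition(seq_a), composition(seq_b)
    return sqrt(sum((comp_a[aa] - comp_b[aa]) ** 2 for aa in AMINO_ACIDS))


def rank_by_composition(query, database):
    """Sort (name, sequence) pairs from closest to farthest composition."""
    return sorted(database, key=lambda entry: composition_distance(query, entry[1]))
```

Because only residue frequencies are compared, two sequences with the same residues in a different order are at distance zero, which is why composition searches can find relatives that alignment methods miss.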
AMAS: Analyze Multiple Alignments of Protein Sequences
[EBI] Allows the identification of functional residues by comparison of sub-groups of sequences arranged on a tree.
[Institute of Biology and Chemistry of Proteins] (ANalyse THE PROTeins) This free software application integrates into a single package most of the methods designed for protein sequence analysis. Methods include: sequence information, sequence editing, dot matrix plot, FASTA/BLAST, sites/signatures detection, physico-chemical profiles, secondary structure prediction, helical wheel projections, multiple alignments, prediction of signal peptide and cleavage sites, 3D display of molecules, amphiphilicity, and
[Glyko] Protein gel image acquisition, documentation, and analysis.
BCM Search Launcher: General Protein Sequence/Pattern Searches
[Baylor College of Medicine] A site with multiple tools to search a protein sequence for patterns.
[EMBL] BLITZ performs comparisons of protein sequences against the SWISS-PROT protein sequence database using the Smith and Waterman best local similarity algorithm.
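The Smith and Waterman local similarity algorithm mentioned above can be illustrated with a minimal scoring sketch. The match, mismatch, and gap values here are arbitrary assumptions for illustration; a production server would use a substitution matrix and its own gap penalties, which this listing does not specify.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score using a linear gap penalty.

    Only the score is computed (no traceback), keeping two rows of the
    dynamic programming matrix in memory at a time.
    """
    previous = [0] * (len(seq_b) + 1)
    best = 0
    for a in seq_a:
        current = [0]
        for j, b in enumerate(seq_b, start=1):
            diagonal = previous[j - 1] + (match if a == b else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best
```

The floor at zero is what makes the alignment local: a poorly matching prefix is discarded rather than dragging down the score of a well-conserved region.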
[Fred Hutchinson Cancer Research Center] "Block Maker finds conserved blocks in a group of two or more unaligned protein sequences, which are assumed to be related, using two different algorithms. At least two protein sequences must be provided to make blocks."
The Brutlag Bioinformatics Group
[Stanford University] This group is interested in the problems of predicting biological function of genes and proteins from their primary sequence (sometimes known as functional genomics), predicting structure of protein and DNA from its sequence, and understanding how and when genes are expressed. This site links to tools being developed toward these interests.
CAME: Center of Applied Molecular Engineering
[Institute of Chemistry and Biochemistry, University of Salzburg] CAME offers a variety of internet services such as: a protein structure superimposition server [ProSup], the structural genome annotation for C. elegans [WILMA], and a protein structure analysis tool [PROSAII].
Computational Biology Tools
[Cornell Theory Center] This site provides several downloadable programs for analyzing protein sequences for predictions in the areas of folding, structure determination, energetics, and dynamics. It also is home to the LOOPP program (Learning, Observing and Outputting Protein Patterns), a program for potential optimization and alignments. LOOPP aligns sequence to sequence, sequence to structure, and structure to structure.
[EBI] Service for comparing protein structures in 3D. Protein structure coordinates are submitted for comparison against entries in the Protein Data Bank.
DNA vs Protein
[The Sanger Centre] Compares a protein sequence or a protein profile-HMM to genomic DNA. It uses the 'GeneWise' class of algorithms to provide a combined homology and gene prediction alignment of the protein to the DNA sequence.
[UCSF] DOCK addresses the problem of "docking" molecules to each other. It explores ways in which two molecules, such as a drug and an enzyme or protein receptor, might fit together.
[Stanford Univ.] Ranks the motifs that it finds by both their specificity (expected false positives) and the number of supplied sequences that they cover (true positives). The twenty highest-scoring motifs are returned. This site also contains several other tools for sequence alignment and similarity searching, protein function identification and genome analysis.
[Stanford Univ.] A protein identification tool allowing a protein sequence to be entered and submitted for identification.
[National Center for Biotechnology Information] Provides a general search for nucleotide sequences, protein sequences, biomolecule 3D structures, genomes, taxonomy or literature.
Finding 3-D Similarity in Protein Structures
[San Diego Supercomputing Center] This site provides two methods for comparing 3D protein structures, and structure neighbor databases derived from them. Combinatorial Extension (CE) determines an optimal alignment between aligned fragment pairs (AFPs). AFPs are determined from local geometry averaged over 8 C alpha positions. Heuristics are used to prevent a combinatorial explosion. Final alignments are by dynamic programming. Compound Likeness (CL) uses a probabilistic approach to comparing a wide variety of properties.
FSSP ftp Site
[EBI] The FSSP database is based on exhaustive all-against-all 3D structure comparison of protein structures currently in the Protein Data Bank (PDB).
GeneFIND: Gene Family Identification Network Design
[Protein Information Resource] A database search system combining search/alignment tools and the ProClass database. Output includes global and motif scores, alignments to the best-matched members of the Pro-Site protein groups and PIR superfamilies, motif pattern matches, and links to the corresponding ProClass family records.
[European Bioinformatics Institute] "GeneQuiz is an integrated system for large-scale biological sequence analysis, that goes from a protein sequence to a biochemical function, using a variety of search and analysis methods and up-to-date protein and DNA databases."
[CuraGen] This commercial site offers a fully integrated genomics suite of services, including discovery-oriented software for SeqCalling with sequence and SNP data, GeneCalling with gene expression data, PathCalling with protein-protein interactions and pathways, a pharmacogenomics database of drug profiles, and SNPCalling with our database of 200,000 cSNPs (gene-based single nucleotide polymorphisms).
[Washington Univ., St. Louis] A protein sequence analysis tool that uses profile hidden Markov models (profile HMMs) to do sensitive database searching using statistical descriptions of a sequence family's consensus.
[Institute of Enzymology] Predicts transmembrane helices and topology of proteins.
ISREC Software Homepage
[Swiss Institute for Experimental Cancer Research] This site provides a number of servers in the areas of sequence analysis, domain/motif searches, gene prediction, and prediction of protein features.
[University of Cambridge] JOY is a program to annotate protein sequence alignments with three-dimensional (3D) structural features.
LAMA: Local Alignment of Multiple Alignments
[Fred Hutchinson Cancer Research Center] This program compares multiple protein sequence alignments with each other. The program can search databases of such multiple alignments. The search is for sequence similarities between conserved regions of protein families.
[Incyte Genomics] LifeSeq provides fee-based access to Incyte's human gene sequence databases, complete with integrated bioinformatics tools.
Ligand-Protein Contacts & Contacts of Structural Units
[Weizmann Institute] LPC analyzes the interatomic contacts in ligand-protein complexes, and CSU analyzes the interatomic contacts in PDB protein entries.
[Univ. College London] A program for automatically plotting protein-ligand interactions. The interactions shown are those mediated by hydrogen bonds and hydrophobic contacts.
[Stanford University] A server that provides a hierarchical protein structure superposition.
[CBRG] A tool for searching SwissProt or EMBL by protein mass after digestion.
[University of Namur] "The Match-Box software proposes protein sequence alignment tools based on strict statistical criteria. The Match-Box program is particularly suitable for finding and aligning conserved structural motives, in particular in protein core."
[San Diego Supercomputing Center] Discovers motifs (highly conserved regions) in groups of related DNA or protein sequences with MEME, and searches sequence databases for those motifs with MAST.
[Columbia University Bioinformatics Center] This server submits your protein sequence to other servers through one interface. Analyses are in the areas of protein structure and function.
MOTIF: Searching Protein and Nucleic Acid Sequence Motifs
[Genome Net] Finds protein motifs in query sequence and gives structural information on the found motifs.
[ExPASy] Identifies proteins using pI, MW, amino acid composition, sequence tag, and peptide mass fingerprinting data.
[Weizmann Institute] OCA is a browser/database for retrieving rich content annotation on structure and function for proteins found in the Protein Data Bank.
[Stanford University] A protein structure classification tool that ranks an input protein using its PDB file in a hierarchical classification of 600 representative structures from
PDB at a Glance
[National Institutes of Health] PDB At A Glance consists of a set of pre-defined biochemically meaningful search contexts (accessed by keyword) that represent the entire territory of the database.
[University College London] This service provides summaries and structural analyses of PDB data files.
[Munich Information Center for Protein Sequences] (Protein Extraction, Description, and ANalysis Tool) "PEDANT is a software system for completely automatic and exhaustive analysis of protein sequence sets - from individual sequences to complete genomes."
[ExPASy] Cleaves one or more protein sequences from the SWISS-PROT and TrEMBL databases, or a user-entered protein sequence with a chosen enzyme, and computes the masses of the generated peptides. The tool also returns theoretical isoelectric point and mass values for the proteins.
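The cleave-then-weigh step described above can be sketched as follows, assuming trypsin as the chosen enzyme (cleavage after K or R unless followed by P) and no missed cleavages or modifications. The function names are hypothetical and this is not the ExPASy implementation; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056


def tryptic_digest(sequence):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    seq = sequence.upper()
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER_MASS


def mass_list(sequence):
    """Return (peptide, mass) pairs for a tryptic digest of the sequence."""
    return [(p, round(peptide_mass(p), 4)) for p in tryptic_digest(sequence)]
```

A list of this kind is what peptide mass fingerprinting tools compare against experimentally measured masses.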
[ExPASy] Identifies proteins using pI, MW, and peptide mass fingerprinting data.
[EMBL Protein & Peptide Group] A tool for protein database searching by mass spectrometric data, such as peptide mass maps or partial amino acid sequences.
[Washington University, St. Louis] Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains based on the Swissprot 38 and SP-TrEMBL 11 protein sequence databases.
Pratt - A Pattern Discovery Tool
[University of Bergen] Discovers patterns conserved in sets of unaligned protein sequences.
Pratt Pattern Discovery
[EMBL] A tool that allows the user to search for patterns conserved in a set of protein sequences.
PRINTS BLAST Search
[University of Manchester] "This is an interface to a BLAST search of all protein sequences contained within the PRINTS database. The user entered sequence may be a protein or DNA sequence."
[Alexey M. Eroshkin] This software is used in the analysis of multiple protein alignments and in the study of structure-function and structure-activity relationships in protein/peptide families.
PROCRUSTES: Gene Recognition via Spliced Alignment
[Univ. of Southern California] (Gelfand, Mironov, Pevzner, Roytberg, Sing-Hoi Sze) Based on the spliced alignment algorithm, which explores all possible exon assemblies and finds the multi-exon structure with the best fit to a related protein, it uses related proteins and cDNAs for gene prediction.
ProDom: The Protein Domain Database
[INRA] "The ProDom protein domain database consists of an automatic compilation of homologous domains."
[ISREC] Searches a single sequence against currently available profile databases. Also available is Frame-ProfileScan Server, which uses the new frame search option of the pfscan program to search a single DNA sequence against currently available protein profile databases.
[Univ. College London] Analyzes a protein coordinate file and provides details of the structural motifs in the protein.
[University College London] "Promotif analyzes a protein coordinate file and provides details of the structural motifs in the protein."
[EMBL] PROPSEARCH was designed to find the putative protein family when querying a new sequence with alignment methods has failed. Neglecting the order of amino acid residues in a sequence, PROPSEARCH uses the amino acid composition instead.
Protein Data Bank 3DB Browser
[PDB] Allows the user to rapidly search the entire PDB archive for entries obeying certain constraints.
[Virtual Genome Center] Searches to determine if a protein motif is encoded by a DNA sequence or a database of DNA sequences.
Protein Sequence Analysis
[Sanger Centre] Performs a secondary structure prediction or prosite pattern search.
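A PROSITE pattern search of the kind mentioned above can be approximated by translating the pattern into a regular expression. The sketch below is an assumption about how such a search might be done, not the Sanger Centre's code; the function names are hypothetical, and it handles only the common pattern elements (x, [..], {..}, repeat counts, and the < and > anchors outside brackets).

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = repeat = ""
        if element.startswith("<"):       # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):         # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        if element.endswith(")"):         # repetition such as (3) or (2,4)
            element, _, count = element[:-1].partition("(")
            repeat = "{" + count + "}"
        if element == "x":                # any residue
            core = "."
        elif element.startswith("{"):     # any residue except those listed
            core = "[^" + element[1:-1] + "]"
        else:                             # a single residue or a [..] set
            core = element
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return (1-based position, matched text) pairs, including overlaps."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]
```

For example, the N-glycosylation pattern N-{P}-[ST]-{P} becomes the regular expression N[^P][ST][^P].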
Protein Sequence Analysis Launcher
[Swedish Foundation for Strategic Research] At ProSAL you paste a single protein sequence and are offered analyses in the following areas: sequence similarity (local and global), domain and motif searches, and predictions of chemical and structural properties.
Protein Structure Prediction Center and CASP
[Lawrence Livermore National Laboratory] "Our goal is to help advance the methods of identifying protein structure from sequence. The Center has been organized to provide the means of objective testing of these methods via the process of blind prediction. In addition to the support of the CASP meetings our goal is to promote an objective evaluation of prediction methods on a continuing basis." CASP stands for Critical Assessment of techniques for protein Structure Prediction.
Protein Topology Home Page
[EBI] This site offers services devoted to protein structural topology and protein topology cartoons.
[UCSF] Tools for mining sequence databases in conjunction with mass spectrometry experiments.
Proteins Plus Search
[Nucleic Acid Database Project] A search engine for locating protein structures contained in the Protein Data Bank (PDB).
[ExPASy] An extensive annotated list of links to tools that provide analysis of protein sequences and structures.
[The Hospital for Sick Children, Toronto] (Alex Dong Li) Searches for DNA or protein sequence repeats.
SAPS: Statistical Analysis of Protein Sequences
[ISREC] Analyzes proteins for statistically-significant features.
[MRC Laboratory of Molecular Biology] A Perl5 program that evaluates the accuracy of protein sequence alignments with very low sequence identity.
[ExPASy] Allows the user to browse through a number of databases, such as SWISS-PROT, PROSITE, SWISS-2DPAGE, SWISS-3DIMAGE, ENZYME, CD40Lbase, and SeqAnalRef; as well as other cross-referenced databases (EMBL/GenBank/DDBJ, OMIM, Medline, FlyBase, ProDom, SGD, and SubtiList). It also allows access to many analytical tools for the identification of proteins, the analysis of their sequences, and the prediction of their tertiary structures.
[EBI] A local alignment method for searching protein sequence databases with a query sequence or multiple alignment. It also allows all pairwise comparisons to be made between a set of sequences and can estimate the statistical significance of the alignments.
Search LITDB database using DBGET
[Protein Research Foundation] Searches the literature on molecular aspects of proteins from about 1000 journals.
SMART: Simple Modular Architecture Research Tool
[EMBL] Performs a search for proteins with specific combinations of domains in defined taxonomic ranges, and allows for rapid identification and annotation of signaling domain sequences.
SOPM Secondary Structure Prediction Method
[Pole Bio-Informatique Lyonnais] Makes a binary comparison of all protein sequences and takes into account the prediction of structural classes of proteins.
[University College London] The Sequential Structure Alignment Program is a method for automatically comparing three dimensional protein structures using double dynamic programming.
[The Barton Group] A suite of programs for the comparison and alignment of protein three dimensional structures. The suite will multiply align structures and produce a corresponding sequence alignment with confidence values associated with each aligned position.
[EBI] "STAMP is a package for the alignment of protein sequence based on three-dimensional (3D) structure. It provides not only multiple alignments and the corresponding 'best-fit' superimpositions, but also a systematic and reproducible method for assessing the quality of such alignments."
[ExPASy] This program can display several molecules simultaneously. Each molecule is loaded into an individual layer and grouped according to its atomic composition and respective coordinates.
[ExPASy at the Swiss Institute of Bioinformatics] Data on proteins identified on various 2-D PAGE reference maps.
[ExPASy] Provides an interface allowing analysis of several proteins at the same time by superimposing them in order to deduce structural alignments and compare their active sites or any other relevant parts.
SWISS-PROT and TrEMBL
[ExPASy] SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotations (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy, and high level of integration with other databases. TrEMBL is a computer-annotated supplement of SWISS-PROT that contains all the translations of EMBL nucleotide sequence entries not yet integrated in SWISS-PROT.
[ExPASy] Creates lists of proteins from one or more organisms that are within a user-specified pI or MW range. The program can also identify proteins from 2-D gels by virtue of their estimated pI, MW, and a short protein sequence tag of up to 6 amino acids.
[Purdue Univ.] Protein topological comparison program that can detect similarities between two protein structures.
TOPS: Protein Structural Topology
[European Bioinformatics Institute] This server allows searches of the TOPS database of topological patterns in protein structures.
[GBF] A database program that compiles data about gene regulatory DNA sequences; from this data programs have been developed to identify putative promoter or enhancer structures.
[UCSC] A hidden Markov model (HMM) protein structure prediction server. It hosts a library of approximately 2500 HMMs, one per PDB structure.
[NCBI] The Vector Alignment Search Tool was used to compare all of the structures in the NCBI MMDB structure database to each other and can now be used as a search to find structural neighbors.
Yale Structure Query Server
[Yale Gerstein Lab] PartsList: a web-based system for dynamically ranking protein folds based on disparate attributes, including whole-genome expression and interaction information