Module 6: Structure-Function Relationships


Protein Structure and Function


  • Primary structure: The gene sequence directly correlates with the primary structure of a polypeptide.

    • Searching a DNA sequence for open reading frames (ORFs) to identify a possible polypeptide is computationally feasible. An open reading frame in a DNA (or mRNA) sequence is a stretch that, when read as triplet codons specifying amino acids, runs without interruption by a stop codon. In general, a biologically relevant ORF should be at least 70-80 amino acids (triplet codons) long. In eukaryotes, conversion of genomic DNA to ORFs should take into account the removal of intervening sequences (introns) from the coding sequences (exons) in the process called splicing. (See the Module 7 section on intron/exon boundary sequences and splice variants.)
      • Grail: This service provides analysis of the protein coding potential of a DNA sequence, and an option for protein sequence database searches of putative coding regions. It has an e-mail server option and an interactive graphical X-based client-server option called Xgrail.

      • BCM GeneFinder: This is a comprehensive search server, one feature of which can locate ORFs in sequences of either prokaryotic or eukaryotic origin.
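As a rough illustration of the ORF definition above, the following Python sketch scans the three forward reading frames of a DNA string for runs that begin at ATG and continue to the first in-frame stop codon. The function name `find_orfs`, the `min_codons` parameter, and the requirement that an ORF start at ATG are assumptions made for this example. It ignores the reverse strand and does not handle introns, and it makes no claim to reproduce the methods used by Grail or BCM GeneFinder.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=70):
    """Return a list of (start, end, frame) tuples for forward-strand ORFs.

    start and end are 0-based and end-exclusive; the span includes the
    stop codon. min_codons counts the codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the current candidate ATG, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


# Toy sequence: far shorter than a biologically relevant ORF,
# so the length threshold is lowered for the demonstration.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
print(find_orfs(toy, min_codons=3))
```

With the default threshold of 70 codons, the toy sequence yields no ORFs, which mirrors the length guideline given in the text.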

    • Two factors can contribute to differences or similarities in the primary sequences of polypeptides that share functional homology in certain regions of the molecules or across the whole molecules. This is especially true for two proteins with the same function (e.g., β-globin) from two very closely related species (e.g., humans and chimpanzees). (1) Amino acid properties are defined by the R group: hydrophobic, hydrophilic, or charged (acidic or basic). Because of this classification, two homologous proteins can carry different but chemically similar residues (based on R group chemistry) at corresponding positions of the polypeptides. (2) Due to the degeneracy of the genetic code, and the tendency of DNA substitutions to accumulate at the third position of codons, it is possible to find two different DNA sequences giving rise to polypeptides with identical residues.
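The two factors above can be sketched in a few lines of Python. The example builds the standard genetic code table, translates two short DNA sequences that differ only at third codon positions, and checks whether two residues fall into the same R-group class. The names `translate`, `R_GROUP_CLASS`, and `is_conservative` are hypothetical, the two DNA sequences are illustrative, and the four-way residue grouping is one common textbook convention assumed here (glycine, histidine, and cysteine in particular are placed differently by different authors).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Assumed grouping of residues by R-group chemistry (conventions vary).
R_GROUP_CLASS = {}
for residues, label in [
    ("AVLIMFWPG", "hydrophobic"),
    ("STNQYC", "hydrophilic"),
    ("DE", "acidic"),
    ("KRH", "basic"),
]:
    for aa in residues:
        R_GROUP_CLASS[aa] = label


def is_conservative(aa1, aa2):
    """True if both residues belong to the same assumed R-group class."""
    return R_GROUP_CLASS[aa1] == R_GROUP_CLASS[aa2]


# Factor (2): different DNA, identical peptide.
seq_a = "GTGCACCTGACT"
seq_b = "GTTCATTTAACC"
print(translate(seq_a), translate(seq_b))

# Factor (1): leucine to isoleucine keeps the R-group class,
# aspartate to lysine does not.
print(is_conservative("L", "I"), is_conservative("D", "K"))
```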

  • Secondary structure

    • The alpha helix (Figure 13 KB), beta sheet (Figure 20 KB) and beta turn (Figure 5 KB) are the three structural elements that proteins tend to adopt as localized conformations within a global tertiary structure. Currently available software predicts the secondary structure of a test protein much better from a multiple alignment with proteins carrying similar structures than from pairwise comparisons.

    • Web servers available to perform secondary structure predictions are:
    • Hydropathy plots can suggest features of secondary structure using the Kyte-Doolittle hydropathy values or the Hopp-Woods hydrophilicity values.
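A hydropathy plot is, at its core, a sliding-window average of per-residue scale values along the sequence. The sketch below uses the Kyte-Doolittle scale; the function name `hydropathy_profile`, the default window of 9 residues, and the toy peptide are assumptions for this example. Plotting is omitted, so the function only returns the averaged values, one per window position.

```python
# Kyte-Doolittle hydropathy values (positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Return the mean Kyte-Doolittle value for each full window.

    The result has len(protein) - window + 1 entries; an empty list is
    returned when the sequence is shorter than the window.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Toy peptide: a hydrophobic stretch flanked by charged residues.
peptide = "KKDELLIVVALLIVGDEKK"
for position, score in enumerate(hydropathy_profile(peptide, window=7)):
    print(position, round(score, 2))
```

Windows centered on the hydrophobic stretch give the highest averages, which is the kind of peak that such plots are read for.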

  • Tertiary structure: The polypeptide, as a whole, is involved in shaping the tertiary structure of the molecule. An example of the tertiary structure of an insulin-like protein identified in C. elegans is shown as a Swiss Model at the Expasy site.

    • The three-dimensional structure of a polypeptide cannot be accurately predicted from primary sequence alone (ab initio).

    • Threading is the concept of comparing a test protein with proteins of known structure that share at least 25% sequence similarity, in order to deduce the structure of the test protein.

    • Web sites for tertiary structure predictions
    • As might be expected, tertiary structure predictions are computationally intensive and can require long run times.
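The 25% threshold mentioned for threading can be checked on a pair of sequences that have already been aligned. The sketch below computes percent identity, which is a stricter measure than the "similarity" named in the text, since it counts only exact matches. The function name `percent_identity`, the use of "-" as the gap character, and the choice to divide by the number of gap-free columns are assumptions for this example; other denominators are also in common use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns with identical residues.

    Both inputs must be the same length (i.e. already aligned).
    Returns 0.0 when no column is free of gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned pair (not real proteins).
test_protein = "MKV-LAAGIT"
template = "MRVALSAG-T"
identity = percent_identity(test_protein, template)
print(round(identity, 1), identity >= 25.0)
```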

  • Quaternary structure

    • Polypeptide-polypeptide interactions, or polypeptide interactions with DNA, RNA or ligands (in ribosomes, chromosomes, etc.), are some instances in which quaternary structures are important. (Example Figure of p53 Bound to DNA)

    • Some limited computational analysis of structures is available. The DNA-Binding Protein Database, based at Rutgers University, can be searched for DNA-binding protein structures as determined by X-ray crystallography.


The Southwest Biotechnology and Informatics Center WWW server is located at "".
Please send comments and suggestions to: [email protected]
SWBIC 2001