Module 7: Molecular Computing with RNA Sequences


Exercise 7


We will search for potential eukaryotic PolII promoter sequences and then model a secondary structure for a putative mRNA molecule.

I. Searching for promoter sequences with Promoter Scan II

  • Link to the ProScan site at the BioInformatics & Molecular Analysis Section of NIH.
  • Copy the murine p53 intron 1 sequence from this file and paste it into the sequence window. Check the box labeled "Echo input sequence" to return your input sequence as part of your results.
  • Click the "Submit" button.
  • Results from the promoter sequence scan on murine p53 intron 1 are saved in this file (in case you have problems with the real-time search).
  • Interpretation of results
    • A promoter region was identified from position 4484 to 4734 on the forward strand with a promoter score of 88.36 (Promoter Cutoff = 53.00000). A TATA site was found at 4715, and the estimated Transcription Start Site (TSS) is at 4745.
    • The scan detected signals for a variety of known transcription element binding sites in the sequence. The weight of a signal is a relative number that reflects how well that signal discriminates promoter from non-promoter sequences; it is based on the relative frequency with which the signal is found in promoter versus non-promoter sequences. The higher this number, the more likely it is that the transcription element binding site is actually present in the test sequence. The results suggest that an Sp1 binding site may be located at 4733 on the reverse strand. A more significant result may be the EARLY-SEQ1 binding site located at 4526 on the forward strand, provided that the promoter region identified on this strand is of any relevance in vivo.
  • You may wish to scan a sequence of interest from your research that has shown the presence of a promoter based on preliminary in vivo results.
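To make the forward-strand and reverse-strand positions in the results more concrete, here is a minimal Python sketch that scans both strands of a sequence for a TATA-like motif and reports 1-based coordinates on the forward strand. It is a toy illustration only and is not the Promoter Scan II algorithm: the function names and the consensus pattern `TATA[AT]A[AT]` are assumptions chosen for this example, and no signal weights or promoter scores are computed.

```python
import re

# Complement table for DNA bases; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, pattern: str = "TATA[AT]A[AT]"):
    """Find a motif on both strands.

    Returns a list of (strand, position) tuples, where position is the
    1-based forward-strand coordinate of the motif's 5' base.
    The default pattern is an assumed TATA-like consensus.
    """
    seq = seq.upper()
    n = len(seq)
    # A lookahead lets overlapping matches be reported as well.
    regex = re.compile(f"(?=({pattern}))")

    hits = [("forward", m.start() + 1) for m in regex.finditer(seq)]

    # Index i on the reverse complement corresponds to forward index
    # n - 1 - i (0-based), i.e. forward coordinate n - i (1-based).
    rc = reverse_complement(seq)
    hits += [("reverse", n - m.start()) for m in regex.finditer(rc)]
    return hits


if __name__ == "__main__":
    # Short made-up test sequence, not the murine p53 intron 1 sequence.
    print(find_motif("GGTATAAAAGGCC"))
```

Running the same function on the pasted intron 1 sequence would list candidate TATA-like sites only; unlike the ProScan output, it says nothing about how likely each site is to belong to a real promoter.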

II. Secondary structure prediction

  • Connect to the Mfold server for RNA secondary structure prediction. This site is maintained by Michael Zuker at the Rensselaer Polytechnic Institute.
  • Copy and paste the murine p53 intron 1 sequence from this file. The sequence in this file has been truncated at the 5' end of intron 1 to meet the length limit of the RNA folding software (3000 bases maximum).
  • Do not include any constraints for the search, and leave the default values for all parameters shown. Since the query sequence is longer than 500 bases, the job will be run as a batch operation. Provide your e-mail address and select the 'A batch' option. Click the "Fold RNA" button to submit the job. The results will be e-mailed to you later.
  • The results from the RNA folding algorithm are stored on the Mfold server for one day (after the e-mail has been received) before being deleted. The folding results are saved in this file for permanent review. This is actually a folder of zipped JPEG files for the various secondary structure foldings within the sequence. You will have to open this file with an unzipping utility (such as PKZIP) and then view the individual foldings. You can look at the format of the initial results from the Mfold server in this file. The links within this HTML file stopped working one day after the results were e-mailed.
  • Interpretation of results
    • The results include several kinds of data: folded structures, energy dot plots, computed folding structures, dot plot comparisons, and so on. The most relevant result from a structural standpoint is the set of secondary structure folding patterns. In our case, the results show 50 folding patterns with different initial free energies (dG) and decreased free energies as a result of the folding. These figures (Structure 1 and Structure 10) show examples of two RNA folding patterns. Various stem-and-loop structures can be seen in the two structures, and the resulting decreased free energies are shown.
    • The stem-and-loop structures may play a role in post-transcriptional and/or translational regulation. In the case of the murine p53 intron 1 sequence, there is no in vivo evidence for a transcript arising from it (although the corresponding human p53 intron 1 sequence has been shown to produce a 1.5 kb transcript).
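If you want to fold one of your own sequences, it has to fit the same length limit described above. The following Python sketch shows one way to prepare a sequence: it strips a FASTA header and whitespace, then drops bases from the 5' end so that at most 3000 bases remain. The function name `prepare_for_folding` and its parameters are hypothetical, and the choice to keep the 3' portion simply mirrors the 5' truncation used for the exercise file; whether that is the right region to keep depends on your own sequence.

```python
def prepare_for_folding(fasta_text: str, max_len: int = 3000) -> str:
    """Return a cleaned sequence no longer than max_len bases.

    FASTA header lines (starting with '>') and whitespace are removed.
    If the sequence is too long, bases are dropped from the 5' end,
    so the 3'-most max_len bases are kept.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")

    lines = (line.strip() for line in fasta_text.splitlines())
    seq = "".join(
        line for line in lines if line and not line.startswith(">")
    ).upper()

    # Keep the 3' end; a sequence within the limit is returned unchanged.
    return seq[-max_len:] if len(seq) > max_len else seq


if __name__ == "__main__":
    # Made-up example input, truncated to 5 bases for illustration.
    example = ">demo\nacgt\nACGT\n"
    print(prepare_for_folding(example, max_len=5))
```

The returned string can be pasted directly into the sequence box on the folding server.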

As noted above, the murine p53 intron 1 sequence may not have any biologically significant role in transcription. That is, (1) the promoter site identified in Part I may not be an in vivo promoter of a transcript, and (2) the predicted secondary structures in Part II may not be involved in in vivo transcriptional or post-transcriptional regulation. For a more relevant exercise, pick a sequence of interest from your own research, or use the sequences in this file, which contains unfinished human chromosome 9 sequences from the Sanger Laboratory.



The Southwest Biotechnology and Informatics Center WWW server is located at "".
Please send comments and suggestions to: [email protected]
SWBIC 2001