SeqEST is a utility that helps you prepare an Expressed Sequence Tag (EST) submission for the NCBI dbEST database. Full instructions for submitting to dbEST are found at the NCBI dbEST information page. In SeqEST, you supply the EST sequences in FASTA format and fill in a form with the information to be supplied with each sequence. SeqEST then prepares a file, which you may download and edit further before submitting to NCBI.
The values supplied in the form are placed in fields in each sequence header. Some fields hold information that is the same for all sequences; SeqEST fills these in. Other fields hold information that is specific to each sequence; SeqEST creates these fields, but you will need to edit the output file and add the specific information. You may choose which fields SeqEST outputs in each sequence header.
Note: The NCBI dbEST submission page allows submission of several file “TYPES.” The only “TYPE” of file output by SeqEST is EST.
- Field: a specific type of information to be supplied with an EST sequence; it consists of a tag and descriptive information
- Tag: a short, capitalized word that defines the type of field (e.g., STATUS and CITATION)
- Sequence header: all tags and their descriptions associated with an EST sequence. Following the header is the tag SEQUENCE and the EST sequence.
- Output file: the downloadable file created by SeqEST; you will want to edit this file further to include information specific to each sequence.
The EST sequences are input from a file on your computer. This must be a text file with the sequences in FASTA format: each sequence has a definition line starting with the “>” character, followed by one or more sequence lines.
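The FASTA layout described above can be checked before uploading. The following is a minimal sketch, not SeqEST's own validation code: the function name `parse_fasta` and the exact error conditions are assumptions made for illustration.

```python
def parse_fasta(text):
    """Split FASTA text into (definition, sequence) pairs.

    Raises ValueError if sequence data appears before any ">" line,
    or if a definition line is not followed by at least one sequence line.
    """
    records = []
    definition = None   # text of the current ">" line, without the ">"
    chunks = []         # sequence lines collected for the current record

    def finish():
        # Store the record collected so far, if there is one.
        if definition is None:
            return
        if not chunks:
            raise ValueError(f"no sequence lines after {definition!r}")
        records.append((definition, "".join(chunks)))

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored in this sketch
        if line.startswith(">"):
            finish()
            definition = line[1:].strip()
            chunks = []
        else:
            if definition is None:
                raise ValueError("sequence data before the first '>' line")
            chunks.append(line)
    finish()
    return records
```

Calling `parse_fasta(open("my_ests.fasta").read())` (a hypothetical file name) returns one pair per sequence, or raises an error that points at the first problem found.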
The EST information form is separated into two sections: obligatory and non-obligatory. The first section consists of information that must be filled in for an output file to be created. A field in the second (non-obligatory) section is printed in the output file only if its check box has been checked; if the box is checked but no information is entered in the text box, the tag alone is output so that you may edit the file on your computer to add the information specific to each sequence.
Also note the following:
- Information in the form text fields is included in every sequence header in the output file.
- Information in the form text fields is not checked for errors; it is copied directly into the output file headers.
- The tag “EST#” is not in the form, but the tag is printed in the output file. This is because the EST number is unique to every sequence and must be entered separately.
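The behavior in these notes (shared form values repeated in every header, plus an empty EST# tag to fill in by hand) can be pictured with a short sketch. It is illustrative only: the `TAG: value` layout, the tag order, the blank line between records, the `TYPE: EST` line, and the sample value `New` are assumptions, so check the NCBI dbEST information page for the required format.

```python
def build_records(sequences, shared_fields):
    """Return submission text with one header and one sequence per EST.

    sequences     -- list of (name, sequence) pairs
    shared_fields -- list of (tag, value) pairs copied into every header;
                     an empty value produces a bare tag to edit later
    """
    blocks = []
    for _name, seq in sequences:
        lines = ["TYPE: EST"]  # assumed layout, see the note above
        for tag, value in shared_fields:
            # Values are copied as-is, with no error checking.
            lines.append(f"{tag}: {value}".rstrip())
        lines.append("EST#:")  # left empty: unique to each sequence
        lines.append("SEQUENCE:")
        lines.append(seq)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    ests = [("est1", "ACGTTTGA"), ("est2", "GGCC")]
    form = [("STATUS", "New"), ("CITATION", "")]
    print(build_records(ests, form))
```

In the printed result, every header carries the same STATUS line, while CITATION and EST# appear as bare tags waiting for sequence-specific values.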
Fields are presented in the following order in the form.
Information for the PUBLIC field. There are two options:
- Enter an exact date of release. The first field is the day, the second is the month, and the third is the year. Use the pull-down menus to choose the day, month, and year.
- “Immediate” option. This indicates that the information is for immediate release. Choosing this option will leave the date blank.
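The two PUBLIC options can be sketched as a small helper. This is a hedged illustration rather than SeqEST's implementation: the function name `public_field` and, in particular, the date format written to the file (`%m/%d/%Y` by default here) are assumptions; confirm the expected format on the NCBI dbEST information page.

```python
from datetime import date


def public_field(day=None, month=None, year=None, fmt="%m/%d/%Y"):
    """Return the PUBLIC line for a header.

    With no arguments the date is left blank (immediate release).
    Otherwise day, month and year must all be given, in the same order
    as the form's pull-down menus.
    """
    parts = (day, month, year)
    if all(p is None for p in parts):
        return "PUBLIC:"
    if any(p is None for p in parts):
        raise ValueError("day, month and year must all be supplied")
    release = date(year, month, day)  # raises ValueError for impossible dates
    return "PUBLIC: " + release.strftime(fmt)
```

Building a `datetime.date` first means an impossible choice such as 30 February is rejected before anything is written.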
This page contains a link to the dbEST submission file, which you may view or download using your browser's “save as” function. If the lines run together in the downloaded file, load the file into a word processing program (which will display it correctly) and save it as text.
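As an alternative to the word processor route, the file can be repaired with a few lines of code. This sketch assumes that the run-together lines are caused by a line-ending convention your editor does not recognize, which the page does not state; the function name and the file names in the usage note are hypothetical.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\r\n") -> bytes:
    """Rewrite every line ending (CRLF, lone CR, or LF) as `newline`."""
    # First reduce all conventions to a single LF, then expand to the target.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)
```

For example, read the download with `open("dbest_submission.txt", "rb").read()`, pass the bytes through `normalize_newlines`, and write the result to a new file opened in `"wb"` mode. Pass `newline=b"\n"` if your editor expects Unix-style line endings.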
If the input sequence file is not in valid FASTA format, the output file will not be created and a warning will be given.
If invalid information is entered in the form, the output file will be created but a warning will be shown with a message about the invalid information.