IEDB Analysis Resource

Epitope Conservancy Analysis - Tutorial

This tool computes the degree of conservancy of an epitope within a given set of protein sequences at a given identity level. Conservancy is defined as the fraction of protein sequences that contain the epitope at the specified identity level, and identity is the degree of correspondence (similarity) between two sequences. Two types of calculation are available: 1) linear and 2) discontinuous epitope sequence conservancy analysis.
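The two definitions above can be illustrated with a minimal sketch for linear epitopes. It assumes that identity is the fraction of identical positions in the best ungapped window of the protein, and that a protein "contains" the epitope when that identity is greater than or equal to the chosen level. Both are assumptions made for illustration, and the function names are hypothetical; the tool's own matching procedure may differ.

```python
from typing import List


def best_identity(epitope: str, protein: str) -> float:
    """Highest fraction of identical positions over all ungapped windows.

    Assumption: the epitope is compared against every protein window of
    the same length, without gaps.
    """
    k = len(epitope)
    if k == 0 or len(protein) < k:
        return 0.0
    best = 0
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        matches = sum(a == b for a, b in zip(epitope, window))
        best = max(best, matches)
    return best / k


def conservancy(epitope: str, proteins: List[str], identity_level: float) -> float:
    """Fraction of proteins whose best window reaches identity_level (0..1)."""
    if not proteins:
        return 0.0
    hits = sum(best_identity(epitope, p) >= identity_level for p in proteins)
    return hits / len(proteins)


if __name__ == "__main__":
    # Made-up protein set for demonstration only.
    proteins = ["MSASKEVRSFLWTQSLRR", "MSASKEVRSFLWAQSLRR", "MKKKKKKKKK"]
    print(conservancy("SFLWTQ", proteins, identity_level=1.0))
    print(conservancy("SFLWTQ", proteins, identity_level=0.8))
```

Lowering the identity level can only keep or increase the conservancy, since every protein that matches at a stricter level also matches at a looser one.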
How to use the tool
Step 1. Specify epitope sequences
Epitope sequences can be either entered directly in the text area or uploaded from a file. To upload data from a file, click the "Browse" button to select a file, then click the "Click here to upload" button. The file content will then be shown in the text area.
Two sequence formats are accepted: PLAIN and FASTA. In PLAIN format, each sequence is on its own line. A sequence in FASTA format begins with a single-line description, followed by one or more lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column.
Two types of epitopes are accepted: linear and discontinuous. Linear epitopes are specified as sequences of single-letter amino acid codes. Discontinuous epitopes are specified as alphanumeric residues separated by commas. A discontinuous epitope must contain at least 3 residues for the calculation to run.
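As a sketch of how a discontinuous epitope entry could be checked before submission, the snippet below splits the comma-separated residues and enforces the 3-residue minimum. It assumes each residue is written as a single-letter amino acid code followed by its position (for example `E15`); that token layout and the function name are assumptions for illustration, not a statement of the tool's exact input grammar.

```python
import re
from typing import List, Tuple

# Assumed token layout: one letter followed by a position, e.g. "E15".
RESIDUE = re.compile(r"^([A-Za-z])(\d+)$")


def parse_discontinuous(text: str) -> List[Tuple[str, int]]:
    """Parse 'E15, D17, K20' into [('E', 15), ('D', 17), ('K', 20)]."""
    tokens = [t.strip() for t in text.split(",") if t.strip()]
    residues = []
    for token in tokens:
        m = RESIDUE.match(token)
        if m is None:
            raise ValueError(f"not a residue token: {token!r}")
        residues.append((m.group(1).upper(), int(m.group(2))))
    # The tutorial requires at least 3 residues for a discontinuous epitope.
    if len(residues) < 3:
        raise ValueError("a discontinuous epitope needs at least 3 residues")
    return residues


if __name__ == "__main__":
    print(parse_discontinuous("E15, D17, K20"))
```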

Step 2. Specify protein sequences
Protein sequences can be either entered directly in the text area or uploaded from a file. To upload data from a file, click the "Browse" button to select a file, then click the "Click here to upload" button. The file content will then be shown in the text area.
Two sequence formats are accepted: PLAIN and FASTA. In PLAIN format, each sequence is on its own line. A sequence in FASTA format begins with a single-line description, followed by one or more lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. Only linear sequences of single-letter amino acid codes are accepted.

Example of a linear protein sequence: MSASKEVRSFLWTQSLRRELSGYCSNIKLQVVKDAQALLHGLDFS
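The difference between the PLAIN and FASTA formats can be sketched with a small reader that accepts either one. This is an illustrative helper with a hypothetical name, not the tool's own parser; it treats any input containing a line that starts with ">" as FASTA and everything else as PLAIN, and it generates placeholder names (`seq1`, `seq2`, ...) for PLAIN input.

```python
from typing import List, Tuple


def parse_sequences(text: str) -> List[Tuple[str, str]]:
    """Return (name, sequence) pairs from PLAIN or FASTA text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not any(ln.startswith(">") for ln in lines):
        # PLAIN: one sequence per line; names are generated placeholders.
        return [(f"seq{i}", ln.upper()) for i, ln in enumerate(lines, start=1)]

    records = []
    name, chunks = None, []
    for ln in lines:
        if ln.startswith(">"):
            # A new description line closes the previous record.
            if name is not None:
                records.append((name, "".join(chunks).upper()))
            name, chunks = ln[1:].strip(), []
        elif name is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(ln)
        # Lines before the first description line are ignored.
    if name is not None:
        records.append((name, "".join(chunks).upper()))
    return records


if __name__ == "__main__":
    fasta = ">protein 1\nMSASKEVRSFLWTQSLRRELSGY\nCSNIKLQVVKDAQALLHGLDFS\n"
    plain = "MSASKEVRSFLWTQ\nSLRRELSGYCSNIK\n"
    print(parse_sequences(fasta))
    print(parse_sequences(plain))
```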

To obtain FASTA protein sequences for a given organism, click on the "browse for sequences in NCBI" link and follow the on-screen instructions.

Step 3. Specify calculation options
Step 4. Submission
Click "Submit" to start the calculation or "Reset" to clear the input parameters.

Example input

How the results are presented
Calculation results are presented in a summary view (for all epitope sequences) and a detail view (for an individual epitope). For each epitope, the summary view shows the calculated degree of conservancy (the percentage of protein sequences that match the epitope at the specified identity level) and the minimum and maximum matching identity levels within the protein sequence set. To view the detailed sequence mapping of an epitope, click on the "Go" link in the "View details" column. The detail view of an epitope shows the positions and the matching protein sub-sequences for all sequences in the protein data set. The matching identity level of the epitope in each protein sequence is also displayed. All calculated results can be saved to a file by clicking the "Download data to file" button.
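To make the relationship between the two views concrete, the sketch below builds detail rows (position, matching sub-sequence, identity) for one linear epitope and then derives a summary row from them. It is an illustration under stated assumptions: the best ungapped window is taken as the match, positions are 1-based, a protein counts toward conservancy when its identity is greater than or equal to the chosen level, and the minimum and maximum are taken over the per-protein best identities. The names `Match`, `best_match`, and `summarize` are hypothetical and do not describe the tool's internals or its download format.

```python
from typing import Dict, List, NamedTuple, Tuple


class Match(NamedTuple):
    protein_name: str
    start: int         # 1-based start of the best window (0 if none)
    end: int           # 1-based end of the best window (0 if none)
    subsequence: str   # matching protein sub-sequence
    identity: float    # percent of identical positions


def best_match(epitope: str, name: str, protein: str) -> Match:
    """Detail row: first best ungapped window of the protein for this epitope."""
    k = len(epitope)
    if k == 0 or len(protein) < k:
        # Protein too short to hold the epitope: report an empty match.
        return Match(name, 0, 0, "", 0.0)
    best_start, best_count = 0, -1
    for start in range(len(protein) - k + 1):
        count = sum(a == b for a, b in zip(epitope, protein[start:start + k]))
        if count > best_count:  # strict '>' keeps the earliest best window
            best_start, best_count = start, count
    sub = protein[best_start:best_start + k]
    return Match(name, best_start + 1, best_start + k, sub, 100.0 * best_count / k)


def summarize(epitope: str, proteins: List[Tuple[str, str]],
              identity_level: float) -> Tuple[List[Match], Dict[str, float]]:
    """Return (detail rows, summary row) for one epitope; level in percent."""
    if not proteins:
        raise ValueError("at least one protein sequence is required")
    rows = [best_match(epitope, name, seq) for name, seq in proteins]
    identities = [r.identity for r in rows]
    hits = sum(i >= identity_level for i in identities)
    summary = {
        "conservancy_percent": 100.0 * hits / len(rows),
        "min_identity_percent": min(identities),
        "max_identity_percent": max(identities),
    }
    return rows, summary


if __name__ == "__main__":
    # Made-up protein set for demonstration only.
    proteins = [("p1", "MSASKEVRSFLWTQSLRR"), ("p2", "MSASKEVRSFLWAQSLRR")]
    rows, summary = summarize("SFLWTQ", proteins, identity_level=80.0)
    for row in rows:
        print(row)
    print(summary)
```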

Example summary view



Example detail view



Sequence conservancy view

From the detail view, the actual sequence conservancy can be viewed as an alignment by clicking the link in the "Protein name" column.