DispHScan

DispHScan web server

Description

DispHScan is a multi-sequence web tool for predicting protein disorder as a function of pH. It is based on the assumption that both hydrophobicity and net charge depend on the solution pH (Ref DispHred). DispHScan supports pH-dependent disorder predictions for single protein sequences but is especially intended for analyzing large datasets. For each sequence, the program computes hydrophobicity and net charge as a function of pH to predict the disorder tendency and identifies possible folding transitions in the selected pH range.
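To illustrate why net charge depends on pH, the following minimal sketch estimates the charge of a sequence with the Henderson-Hasselbalch equation. The pKa values and the function names are assumptions chosen for illustration; they are not the parameters or the code used by DispHScan.

```python
# Illustrative pKa values (assumed, textbook-style numbers; NOT DispHScan's parameters).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.0


def net_charge(sequence, ph):
    """Approximate net charge of a sequence at a given pH (Henderson-Hasselbalch)."""
    # Basic groups contribute +1 when protonated.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    # Acidic groups contribute -1 when deprotonated.
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def net_charge_per_residue(sequence, ph):
    """Net charge normalized by sequence length."""
    return net_charge(sequence, ph) / len(sequence)


if __name__ == "__main__":
    # Hypothetical sequence, scanned over a few pH values.
    for ph in (2.0, 7.0, 12.0):
        print(ph, round(net_charge_per_residue("MKKDEHRAYC", ph), 3))
```

In this approximation every ionizable group loses charge as the pH rises, so the net charge decreases monotonically with pH.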

Input

Under the "Submission" section, upload a valid FASTA file or paste the sequences in FASTA format into the text area. Then enter the pH range over which disorder should be calculated (from -1 to 15), together with the desired step and window size (defaults are 0.5 and 51, respectively).
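Before submitting a large job, it can help to check the input locally. The sketch below parses FASTA text and lists the pH values implied by a range and a step. The function names are hypothetical, and including the upper bound of the range as a datapoint is an assumption about how the server counts pH points.

```python
def parse_fasta(text):
    """Return a list of (identifier, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def ph_points(ph_min, ph_max, step=0.5):
    """List the pH values from ph_min to ph_max (assumed inclusive) at the given step."""
    if not (-1 <= ph_min < ph_max <= 15):
        raise ValueError("pH range must satisfy -1 <= ph_min < ph_max <= 15")
    if step <= 0:
        raise ValueError("step must be positive")
    # Small epsilon guards against floating-point error in the division.
    n_steps = int((ph_max - ph_min) / step + 1e-9)
    return [round(ph_min + i * step, 6) for i in range(n_steps + 1)]
```

For example, `ph_points(1.0, 7.0)` yields 13 values with the default step of 0.5.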

Final scores and global protein analysis remain largely unaffected by the window size; however, the window size influences local predictions along the sequence and the resulting profile. The default window of 51 residues is suitable for the large majority of full-length proteins and long disorder analyses. For region-specific sequence analysis, we suggest window sizes of 11 or 21 residues, depending on the protein length and the required detail.
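The effect of the window size on local profiles can be illustrated with a generic centered moving average. This is only a sketch: the function name, the truncation of windows at the sequence ends, and the requirement of an odd window are assumptions, not a description of DispHScan's actual smoothing.

```python
def windowed_profile(values, window=51):
    """Centered moving average over per-residue values; windows shrink at the ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    profile = []
    for i in range(len(values)):
        start = max(0, i - half)
        end = min(len(values), i + half + 1)
        segment = values[start:end]
        profile.append(sum(segment) / len(segment))
    return profile


if __name__ == "__main__":
    # Hypothetical per-residue signal: a 20-residue region inside a 100-residue sequence.
    signal = [0.0] * 40 + [1.0] * 20 + [0.0] * 40
    print(max(windowed_profile(signal, 11)))  # short window resolves the region
    print(max(windowed_profile(signal, 51)))  # long window flattens it
```

A short window preserves the local feature, whereas the default window averages it with its surroundings, which is why shorter windows are suggested for region-specific analysis.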

Clicking the "Example" button loads five sequences with predefined pH ranges. These model IDPs are ready to be submitted and illustrate how DispHScan works.

Users can also calculate disorder at only one pH by selecting the checkbox option below the text area.

Output

The output displays two clickable links to the results. The first is a JSON file with raw results for integration into external bioinformatics pipelines. The second downloads a zip file with all graphs, tables, and files generated by the predictor. Below the links, an interactive table lists all the scores computed by the predictor; clicking an identifier shows the corresponding figure for that entry. Graphs represent how disorder varies across the pH range, with positive DispH scores predicted as folded and negative DispH scores as unfolded. To limit the computational load, graphs are not generated when more than 10 sequences are submitted.

Columns description in the online output and the downloadable summary table:

ID: sequence identifier, tag line.

Transition prediction: indicates whether a transition has been detected (True) or not (False). It also warns about Possible Multitransitions.

Folding state: the predicted folding state of the protein (Folded/Unfolded). If a transition is detected, it details the conformational switch: conditional folding (from disordered to ordered) or conditional unfolding (from ordered to disordered) at the specified transition pH. For example, if a protein shows a transition with conditional unfolding at pH 3.75, its conformation is predicted to be unfolded from that pH onward.

Transition pH: the pH (or pHs) at which a transition occurs. If no transition is predicted, no transition pH appears.

Maximum DispH: maximum DispH score obtained in the prediction.

pH of maximum DispH: maximum pH associated with the maximum DispH score.

Minimum DispH: minimum DispH score obtained in the prediction.

pH of minimum DispH: minimum pH associated with the minimum DispH score.
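The column definitions above can be sketched as a summary of a DispH-versus-pH profile. The sketch assumes that a transition corresponds to a sign change of the DispH score between consecutive pH points, that the transition pH is located by linear interpolation, and that a score of exactly 0 counts as unfolded. The function name and dictionary keys are hypothetical; the server's actual transition criteria may differ.

```python
def summarize_profile(ph_values, scores):
    """Summarize a DispH profile: transitions, folding state, and score extremes."""
    if len(ph_values) != len(scores) or not scores:
        raise ValueError("ph_values and scores must be non-empty and equally long")

    transitions = []
    for i in range(1, len(scores)):
        prev, curr = scores[i - 1], scores[i]
        # Assumed criterion: the predicted state flips when the score changes sign.
        if (prev > 0) != (curr > 0):
            span = ph_values[i] - ph_values[i - 1]
            crossing = ph_values[i - 1] - prev * span / (curr - prev)
            kind = "conditional unfolding" if prev > 0 else "conditional folding"
            transitions.append((round(crossing, 2), kind))

    max_score, min_score = max(scores), min(scores)
    if transitions:
        state = "; ".join(f"{kind} at pH {ph}" for ph, kind in transitions)
    else:
        state = "Folded" if scores[0] > 0 else "Unfolded"

    return {
        "transition": bool(transitions),
        "possible_multitransitions": len(transitions) > 1,
        "folding_state": state,
        "transition_ph": [ph for ph, _ in transitions],
        "max_disph": max_score,
        # Highest pH at which the maximum score occurs.
        "ph_of_max_disph": max(p for p, s in zip(ph_values, scores) if s == max_score),
        "min_disph": min_score,
        # Lowest pH at which the minimum score occurs.
        "ph_of_min_disph": min(p for p, s in zip(ph_values, scores) if s == min_score),
    }
```

With the hypothetical profile `summarize_profile([3.0, 3.5, 4.0, 4.5], [0.2, 0.1, -0.1, -0.3])`, the score crosses zero between pH 3.5 and 4.0, so the sketch reports a conditional unfolding at pH 3.75.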

If the option "Predict disorder only at one pH" is selected, the table will show only the ID, average DispH Score, hydrophobicity, NCPR, and Length. Positive DispH scores are predicted as folded at that pH, whereas negative DispH scores are predicted as unfolded. If the number of sequences is less than or equal to 10, clickable IDs will be available to show the corresponding DispH score per residue for each sequence.

Background queues for big jobs

The DispHScan architecture limits direct online output to 200 sequences at 13 pH datapoints, or a similar load. Larger jobs are passed to a background queue for computational efficiency. The user receives a link, along with an estimated calculation time, to access the results once the predictions finish. Be aware that jobs with long amino acid sequences (e.g., many globular proteins) might take somewhat longer.
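A rough way to anticipate whether a job will be queued is to compare its size with the documented limit. The sketch below assumes that the load is the number of sequences multiplied by the number of pH datapoints; the server's actual criterion is not specified here and may also depend on sequence length. The function name is hypothetical.

```python
# Documented limit for direct online output: 200 sequences at 13 pH datapoints.
ONLINE_LIMIT = 200 * 13


def likely_queued(n_sequences, n_ph_points):
    """Guess whether a job exceeds the online limit (assumed load: sequences x pH points)."""
    if n_sequences < 1 or n_ph_points < 1:
        raise ValueError("both counts must be positive")
    return n_sequences * n_ph_points > ONLINE_LIMIT


if __name__ == "__main__":
    print(likely_queued(200, 13))   # at the documented limit
    print(likely_queued(1000, 33))  # a much larger job
```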

Pre-calculated data for model organisms

CSV files with pH-dependent disorder predictions for the proteomes of 4 different model organisms are available for download:

Primary citations:

Additional references

For any suggestions, questions or doubts, contact us at: