Sep 30, 2021

Analysis of protein structure using Molprobity V.3

This protocol is a draft, published without a DOI.
  • James Madison University
Protocol Citation: Chris Berndsen 2021. Analysis of protein structure using Molprobity. protocols.io https://protocols.io/view/analysis-of-protein-structure-using-molprobity-bynvpve6
Version created by Chris Berndsen
Manuscript citation:
Williams et al. (2018) MolProbity: More and better reference data for improved all-atom structure validation. Protein Science 27: 293-315.
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Working
We use this protocol and it’s working
Created: September 30, 2021
Last Modified: September 30, 2021
Protocol Integer ID: 53685
Abstract
Molprobity is a valuable collection of structural analysis tools for the validation or "correctness" of protein structures and homology models. This protocol walks through the basics of running the webserver version of Molprobity and provides some information on how to interpret the results.


Materials
computer with an internet connection
structural coordinates in PDB format
molecular viewing software (optional)
Before start
A .pdb file is required, either from experimental data such as X-ray crystallography or cryo-EM, or from a model produced by one of the many available modeling programs.
This protocol describes submitting a .pdb file to the Molprobity server hosted at Duke University.

Molprobity has now been incorporated into SWISS-MODEL results as well. The analysis from either source is comparable, so the analysis sections of this protocol can also be applied to SWISS-MODEL output. In that case, see the sections on the multi-criterion table and the Ramachandran plot.
Prepare file
Navigate to the Molprobity home page
Upload the .pdb file and press the upload file button.

Upload screen for Molprobity
Note
The server will then process the file. This can take a few seconds to a few minutes depending on the size of the molecule, the file format, etc.


If successful, a screen similar to the following will appear. Press Continue to move to the analysis phase.

Successful upload to the server

Record the name of the PDB file that you used in your notebook.
In the analysis menu, select Add Hydrogens.

Note
Adding hydrogens is not required, but it will improve the reliability of the analysis. Most homology modeling programs do not include hydrogens because hydrogens are not observed (for the most part) in X-ray crystallography experiments.

The defaults are fine for most analyses. Press Start adding H to begin adding hydrogens to the structure.

  1. Flips optimize hydrogen bonding, which can be good in models. If you have a crystal structure, consider running the analysis twice, with and without flips, and comparing the results.
  2. Electron cloud is appropriate for most structures.



Record any amino acids that were flipped in the table below. Add rows as needed.


Flipped amino acids

If hydrogens were added and side chains were flipped, the model has been changed and the new model should be kept for further analysis.

Molprobity will ask if you want to download the file; say that you do. Upload this file to your project folder and name it:
[date]_[sequence_name]_[group_name]_Model_h_added.pdb
Replace [group_name] with your name/group name without the brackets. Replace [sequence_name] with the name of the sequence.
Indicate your project file location as a link within a note on this step.
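The file names used in this protocol all join a date, a sequence name, and a group name with underscores, followed by a fixed ending. A small sketch of a helper that builds such names is shown here. The function name protocol_filename is hypothetical, and the ISO date format (YYYY-MM-DD) is an assumption, since the protocol does not specify how the date should be written.

```python
from datetime import date


def protocol_filename(sequence_name, group_name, suffix, when=None):
    """Build '[date]_[sequence_name]_[group_name]_<suffix>'.

    The date is written as YYYY-MM-DD (an assumption, not from the protocol).
    Whitespace inside names is replaced by underscores.
    """
    when = when or date.today()

    def clean(text):
        return "_".join(text.split())

    return f"{when.isoformat()}_{clean(sequence_name)}_{clean(group_name)}_{suffix}"


# Example with made-up sequence and group names.
print(protocol_filename("MDH1", "Team A", "Model_h_added.pdb", date(2021, 9, 30)))
```

The same helper can be reused for the later steps by changing the suffix to molprobity.pdf or ramachandran.pdf.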
Analyze the structure
When the addition and optimization are complete, the screen will show the options below. Select Analyze all-atom contacts and geometry.




The menu on the next page allows selection of the parameters to analyze. In general, the selections shown below are appropriate for analysis of protein homology models.

Press Run programs to perform analyses and wait about 60 seconds.




The results screen brings up many tables and files.
The summary table is at the top of the screen and should appear similar to that shown below.

Summary table of a homology model

Score descriptions:

  • Clashscore, all atoms: This score rates steric clashes. A lower number is better and a higher percentile is better. The percentile is the value that should be reported.
  • Rotamers: Refers to the geometry of the amino acid side chains. The number indicates the number of amino acids in the poor or favored category and the percentage of amino acids that fall into those categories. If there is a high percentage of poor rotamers (>1.5%), this can be concerning.
  • Ramachandran: Shows number of amino acids with poor or favored phi/psi angles. Phi/Psi angles are the dihedral angles in the protein backbone. Only certain angles are typically found in proteins.
  • Rama distribution Z-score: Can show over fitting of angles to unrealistic levels. A value -4 < x < 2 is considered appropriate for normal structures. This pre-print provides more details: Link
  • Molprobity score: Combines all the geometric scores into a single value to suggest quality. Lower values and higher percentiles are better.
  • Cβ deviations, bad bonds, bad angles, cis-Prolines: Indicate the number of amino acids with poor geometry or chemical parameters.
  • CaBLAM outliers: C-Alpha Based Low-resolution Annotation Method, looks at the backbone geometry.
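The two numeric guidelines in the list above (poor rotamers above 1.5% and a Rama distribution Z-score outside -4 < x < 2) can be checked with a short sketch like the one below. The function name flag_summary_scores is hypothetical, and the values are assumed to be typed in by hand from the summary table; this does not read Molprobity output directly.

```python
def flag_summary_scores(poor_rotamer_percent, rama_z):
    """Return a list of warnings based on the guidelines in this protocol."""
    flags = []
    # More than 1.5% poor rotamers is described as concerning.
    if poor_rotamer_percent > 1.5:
        flags.append(f"poor rotamers {poor_rotamer_percent}% exceeds 1.5%")
    # A Rama distribution Z-score between -4 and 2 (exclusive) is considered appropriate.
    if not (-4 < rama_z < 2):
        flags.append(f"Rama Z-score {rama_z} is outside -4 < x < 2")
    return flags


# Made-up example values, not results from a real structure.
for warning in flag_summary_scores(3.0, -5.1):
    print(warning)
```

An empty list means neither guideline was exceeded; it does not replace inspection of the other rows of the table.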

Multi-criterion table
At the bottom of the page are several additional files that can be viewed for deeper analysis.



The Multi-Criterion chart provides the highest level of detail and shows residue-level problems in the protein.

  • Boxes shown in pink indicate a bad parameter at that amino acid position.
  • More than 5 consecutive amino acids with bad parameters may mean an issue with that part of the protein and should be inspected manually as well as reported.
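For long proteins, stretches of consecutive flagged residues can be hard to spot by eye. The sketch below finds runs of more than 5 consecutive residue numbers in a list that you have typed in from the chart. The function name consecutive_runs is hypothetical, and the sketch assumes a single chain with plain integer residue numbers (no insertion codes).

```python
def consecutive_runs(residue_numbers, min_length=6):
    """Return (start, end) pairs for runs of at least min_length consecutive residues.

    min_length=6 corresponds to "more than 5 consecutive amino acids".
    """
    runs = []
    current = []
    for number in sorted(set(residue_numbers)):
        if current and number == current[-1] + 1:
            current.append(number)
        else:
            if len(current) >= min_length:
                runs.append((current[0], current[-1]))
            current = [number]
    # Close the final run after the loop ends.
    if len(current) >= min_length:
        runs.append((current[0], current[-1]))
    return runs


# Made-up residue numbers with bad parameters.
flagged = [10, 11, 12, 13, 14, 15, 40, 41, 90]
print(consecutive_runs(flagged))
```

Any region returned by the function is a candidate for manual inspection in molecular viewing software and should be reported.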



Save the multi-criterion chart as a PDF. Printing to PDF typically works best in Chrome with a landscape orientation. Upload the PDF to your project folder and name the file:

[date]_[sequence_name]_[group_name]_molprobity.pdf
Replace [group_name] with your name/group name without the brackets. Replace [sequence_name] with the name of the sequence.

Indicate your file location as a link within a note on this step.

THIS IS ONE OF YOUR DATA FILES FOR THE ANALYSIS!
Ramachandran analysis
The Ramachandran plot is also important for analysis. Select the Ramachandran plot PDF.


The Ramachandran plot PDF should appear similar to those shown below.

  • There are 6 plots; the main one to be concerned with early on is the general case, shown at top left.
  • Ideally the dots (which represent the angles for each amino acid) are all enclosed inside the blue lines.
  • Outliers are marked with amino acid three letter code and position number.
  • More than 5 consecutive amino acids with bad parameters may mean an issue with that part of the protein and should be inspected manually as well as reported.
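Each dot in the plot is one (phi, psi) pair, and each of those angles is a dihedral angle defined by four consecutive backbone atoms: phi uses C of the previous residue, N, CA, C, and psi uses N, CA, C, and N of the next residue. Molprobity calculates these angles for you; the sketch below only illustrates how a dihedral angle is obtained from four sets of coordinates. The function name dihedral_degrees is hypothetical, and the coordinates in the example are made up.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p1, p2, p3, p4):
    """Dihedral angle in degrees (-180 to 180) defined by four points."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane of p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane of p2, p3, p4
    b2_length = math.sqrt(_dot(b2, b2))
    y = b2_length * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: a planar trans arrangement gives +/-180 degrees.
print(dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0)))
```

For phi, the four points would be the coordinates of C (residue i-1), N, CA, and C (residue i) taken from the .pdb file.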



Save the Ramachandran plot as a PDF. Upload this to your project folder and name the file:

[date]_[sequence_name]_[group_name]_ramachandran.pdf
Replace [group_name] with your name/group name without the brackets. Replace [sequence_name] with the name of the sequence.
Indicate your file location as a link within a note on this step.
Save this protocol as a PDF and include in your project folder.