application_documentation/analysis/cl complex rescore

Creator Names:

Zachary Drake (drake.463@osu.edu)

PI: Steffen Lindert (lindert.1@osu.edu)

Updated: March 26, 2025

Uses changes in residue covalent labeling (CL) data obtained by mass spectrometry to penalize decoys generated by RosettaDock, by predicting modification changes from interface distances. It was developed to supplement the REF15 score function by quantifying the agreement of docked complexes with CL data.

Reference: https://www.nature.com/articles/s41467-022-35593-8

Description of algorithm

This application is designed to rescore large sets of docked models with differential covalent labeling data to improve the quality of the top-scoring model. It can be used, for example, to generate a structure for a protein complex of interest that has been studied using covalent labeling but has no existing crystal structure.

Full tutorial is available in the SI of the source paper.

Usage

To use the application, the following command line options can be specified:

-in:file:l         #file listing PDBs of docked complexes
-cl_data           #file containing per-residue labeling data. Formatting is specified in later section
-interface         #chains comprising each docking partner with '_' indicating the interface

Example

cl_complex_rescore.default.linuxgccrelease -in:file:l pdblist.txt -cl_file insulin_data.txt -interface ABCDEFGH_IJKL
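
The list file passed to -in:file:l is a plain-text file with one PDB path per line. As a convenience, it can be generated from a directory of docked models. The following is a minimal Python sketch, not part of the application; the function name write_pdb_list and the directory layout are hypothetical.

```python
from pathlib import Path


def write_pdb_list(model_dir, list_path="pdblist.txt"):
    """Write one PDB path per line (sorted by name) and return the paths.

    Hypothetical helper: assumes all docked models sit directly inside
    model_dir and end in ".pdb".
    """
    paths = sorted(str(p) for p in Path(model_dir).glob("*.pdb"))
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

For example, write_pdb_list("input_files") would list every .pdb file found in a directory named input_files.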

Format of Input CL

The input is a tab-separated text file with one line per labeled residue, giving the degree of modification in the unbound and bound states and the chain(s) on which the residue is located. The first column is the sequence number of the residue (note: number residues so that the first residue of each chain is at position 1). The second and third columns contain the modification rate (or other metric of modification) for the unbound and bound states of the complex, respectively. The fourth column lists every chain on which the residue is located, with no spaces.

Example

#ResidueIndex   UnboundModif.  BoundModif.   Chains
7                   56.4          34.7       ACEGIK 
8                   56.4          34.7       ACEGIK 
10                  56.4          34.7       ACEGIK 
13                  56.4          34.7       ACEGIK 
14                  56.4          34.7       ACEGIK 
19                  56.4          34.7       ACEGIK 
1                   1.4           0.4        BDFHJL 
5                   14.9          7.6        BDFHJL 
7                   14.9          7.6        BDFHJL 
10                  14.9          7.6        BDFHJL
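
Before running the application, it can help to check that a labeling data file follows the four-column layout described above. The following Python sketch is not part of the application and does not reproduce its parser. The names LabeledResidue and parse_cl_data are hypothetical, and treating lines that start with '#' as a header is an assumption based on the example.

```python
from typing import NamedTuple


class LabeledResidue(NamedTuple):
    index: int      # per-chain sequence number, starting at 1
    unbound: float  # modification metric in the unbound state
    bound: float    # modification metric in the bound state
    chains: str     # chain IDs with no separators, e.g. "ACEGIK"


def parse_cl_data(text):
    """Parse the contents of a labeling data file into a list of records."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        # Assumption: blank lines and '#' lines carry no data.
        if not line or line.startswith("#"):
            continue
        # Splitting on any whitespace accepts tabs as well as aligned spaces.
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"line {line_number}: expected 4 columns, got {len(fields)}")
        index, unbound, bound, chains = fields
        if int(index) < 1:
            raise ValueError(f"line {line_number}: residue numbering starts at 1")
        records.append(LabeledResidue(int(index), float(unbound), float(bound), chains))
    return records
```

Each record keeps the unbound and bound values side by side, so the per-residue change in labeling (for example record.unbound - record.bound) can be inspected directly.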

Output

After the application runs successfully, a “cl_complex_score.out” file is generated. The first column specifies the model, the second column is the raw penalty score for the model (unweighted, before normalization), and the third column is the weighted and normalized penalty score.

Example

Example output file:

Model                              Raw_Penalty      Weighted_CL_Score_Term
input_files/model_00001.pdb          7.71483               12.2308
input_files/model_00002.pdb          7.48138               11.8607
... 
input_files/model_10000.pdb          7.00656               11.108
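
The output file can be post-processed with a short script. The Python sketch below reads the three columns and sorts the models by the weighted term. It is not part of the application; the names parse_scores and rank_by_weighted_term are hypothetical. Treating a lower penalty as better agreement with the CL data is an assumption here, and the sketch does not combine the term with the REF15 score.

```python
def parse_scores(text):
    """Return (model, raw_penalty, weighted_term) tuples from the output text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank lines and anything that is not a 3-column row
        model, raw, weighted = fields
        try:
            rows.append((model, float(raw), float(weighted)))
        except ValueError:
            continue  # skip the column header line
    return rows


def rank_by_weighted_term(rows):
    """Sort rows so the smallest weighted penalty comes first (assumed best)."""
    return sorted(rows, key=lambda row: row[2])
```

A typical use would be to read cl_complex_score.out, call parse_scores on its contents, and inspect the first few entries returned by rank_by_weighted_term.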