Author: Steven Lewis (smlewi@gmail.com)
Last edited 8/23/13. Code by Steven Lewis. Corresponding PI Brian Kuhlman (bkuhlman@email.unc.edu).
This page documents three applications, UBQ_E2_thioester, UBQ_Gp_CYD-CYD, and UBQ_Gp_LYX-Cterm. The code and documentation were originally written for the thioester version. The other versions are modifications of the first, and they all behave nearly the same from the outside, so they are all documented here. Assume the document is talking about the thioester case, but also assume what it says is true for the other cases.
The code is at rosetta/main/source/src/apps/public/scenarios/chemically_conjugated_docking/; there's an integration test+demo at rosetta/main/tests/integration/tests/UBQ_E2_thioester/. The same is true for the other versions of the application (altering the path as appropriate). Note that the integration test is vastly under-cycled: the number of cycles it runs should be sufficient to show some remodeling, but not enough to produce useful models. To run that demo, go to that directory and run [path to executable] -database [path to database] @options
The code was written for this paper, which treats the attachment of ubiquitin to an E2. You may want to look at the online supplemental info for that paper for a different presentation of how the code works.
The UBQ_Gp series of executables were created for this paper, which needed a disulfide connection as an experimental mimic of the native ubiquitin isopeptide linkage, and the native isopeptide itself.
The code was further updated (particularly with multi-ubiquitin code written for the first paper, genericizations written for the paper, and new multibody code) for this paper:
This code (UBQ_E2_thioester) was written for a relatively singular application: modeling the thioester-linked state of an E2 enzyme charged with ubiquitin. We had hypotheses about what this state might look like and used this code to generate models to examine those hypotheses and generate testable mutations. It was not initially expected to serve other purposes. Reading the previous section, boy did that guess prove wrong...
A second, unrelated experiment was built on the code for that first project. Here, we wished to examine how a G-protein (ras) would behave when chemically conjugated to ubiquitin. The LYX-Cterm and CYD-CYD versions mimic the native and experimental versions of the linkage, respectively. These are separate executables because the differences between them, while slight, do not lend themselves to straightforward code modularization and reuse; the code really needs to be rewritten.
A third set of experiments forced us to upgrade the code to handle multiple "fixed" bodies: the thing that "ubiquitin" (the moving chain) is attached to may be a multiprotein complex that is internally rigid except for any loops you allow to move. The same upgrade added support for a second ubiquitin chemically attacking the first ubiquitin at the active site.
So...at this point, it's code that has a "moving thing", chemically attached via either its C-terminus or a terminal disulfide, to some collection of nonmoving things, and you can also have loops moving and a second "moving thing" attacking the first "moving thing".
In ALL CASES, the code does not require use of an E2 enzyme, a GTPase, or ubiquitin. Use whatever proteins you need. I am willing to entertain renames to something that captures their function better.
There are two important novel chunks of code associated with this algorithm.
The thioester-linked structure contains an E2 enzyme (treated as a rigid body) and a ubiquitin molecule (treated mostly rigidly). The C-terminus of ubiquitin (glycine) is chemically linked to a cysteine of the E2, resulting in a thioester bond between the proteins. It is this bond that this protocol remodels. Phil Bradley deserves credit for helping set up this chemical conjugation code.
The remodeling algorithm is straightforward. It uses Rosetta's standard Metropolis Monte Carlo random sampling tools. On each Monte Carlo cycle, one move is chosen at random from a set of possible Pose modifications. These include modification of all torsions close to the thioester, including chi1 and chi2 of the cysteine, the thioester bond, and the effective psi and phi angles of ubiquitin's terminal glycine. These are treated directly by TorsionDOFMover instead of the more familiar sidechain/backbone movers because the extra chemical bond changes the torsional preferences at these bonds, meaning that the Ramachandran and Dunbrack libraries do not apply. TorsionDOFMover internally checks against a molecular-mechanics bond torsion term (although this term is not in the broader scorefunction). Other possible Monte Carlo moves include standard Small/Shear moves on the residues just before ubiquitin's C-terminus (the number of mobile residues is set by a command-line flag), and also KIC loop modeling. After a random move, the Pose runs through Rotamer Trials (to quickly pack sidechains) and a minimization step before the Metropolis criterion is applied. Some fraction of MC cycles instead perform a full repack of the UBQ/E2 interface.
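The sampling loop described above can be illustrated with a minimal, self-contained sketch. This is not the Rosetta implementation: toy_score, perturb_one_torsion, and run_mc are hypothetical names invented for this example, the "score" is a made-up cosine term with its minimum at 180 degrees, and the Rotamer Trials, minimization, repack, Small/Shear, and KIC steps are omitted. It only shows the shape of a Metropolis Monte Carlo search over a handful of torsion angles.

```python
import math
import random


def metropolis_accept(old_score, new_score, kT, rng):
    """Metropolis criterion: always accept downhill moves, and accept
    uphill moves with probability exp(-delta / kT)."""
    if new_score <= old_score:
        return True
    return rng.random() < math.exp(-(new_score - old_score) / kT)


def toy_score(torsions):
    # Hypothetical stand-in for a scorefunction (NOT a Rosetta term):
    # each torsion contributes 0 at 180 degrees and up to 2 at 0 degrees.
    return sum(1.0 - math.cos(math.radians(t - 180.0)) for t in torsions)


def perturb_one_torsion(torsions, rng, max_step=20.0):
    # Pick one torsion at random and rotate it by a small random amount,
    # loosely analogous to choosing one move per Monte Carlo cycle.
    trial = list(torsions)
    i = rng.randrange(len(trial))
    trial[i] = (trial[i] + rng.uniform(-max_step, max_step)) % 360.0
    return trial


def run_mc(start, cycles, kT, seed=0):
    """Run a toy Metropolis search; return the lowest-scoring state seen."""
    rng = random.Random(seed)
    current = list(start)
    current_score = toy_score(current)
    best, best_score = list(current), current_score
    for _ in range(cycles):
        trial = perturb_one_torsion(current, rng)
        # The real protocol runs Rotamer Trials and minimization here,
        # before the Metropolis test; this sketch skips both.
        trial_score = toy_score(trial)
        if metropolis_accept(current_score, trial_score, kT, rng):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score


if __name__ == "__main__":
    # Five torsions, by analogy with chi1, chi2, the thioester bond, psi, phi.
    start = [60.0, 300.0, 90.0, 120.0, 240.0]
    best, best_score = run_mc(start, cycles=2000, kT=0.5)
    print(best_score, [round(t, 1) for t in best])
```

The temperature kT, the step size, and the cycle count here are arbitrary illustration values, not the protocol's defaults.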
The thioester-building code first replaces the E2's cysteine with a CYX residue type, which has an open residue connection on its sulfur atom (and no hydrogen there). The UBQ glycine is then appended to this with a chemical bond, removing its terminus type. The remainder of ubiquitin is then prepended before the final residue, resulting in a normal residue ordering but a reversed atom-tree folding order (ubiquitin folds from its C-terminus, rather than its N-terminus).
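The append-then-prepend construction can be pictured with a toy model that tracks only residue labels. This is a sketch under stated assumptions, not Rosetta code: build_conjugate and the residue labels are hypothetical, and neither the CYX replacement nor the chemical bond is modeled. It shows only why the final residue numbering comes out normal while the build (folding) order runs from the C-terminus backwards.

```python
def build_conjugate(e2_residues, ubq_residues):
    """Toy model of the build order.

    Returns (final_order, build_order), where final_order is the residue
    sequence of the finished pose and build_order is the order in which
    ubiquitin residues were added.
    """
    pose = list(e2_residues)
    build_order = []

    # Step 1: append ubiquitin's C-terminal residue after the E2.
    # In the real protocol this is the glycine bonded to the cysteine.
    last = ubq_residues[-1]
    pose.append(last)
    build_order.append(last)

    # Step 2: prepend the remaining residues, each one going immediately
    # before the residue added just before it.
    insert_at = len(pose) - 1
    for res in reversed(ubq_residues[:-1]):
        pose.insert(insert_at, res)
        build_order.append(res)

    return pose, build_order


if __name__ == "__main__":
    e2 = ["E2_1", "E2_2", "E2_CYX"]    # hypothetical labels
    ubq = ["M1", "Q2", "G75", "G76"]   # truncated for illustration
    final_order, build_order = build_conjugate(e2, ubq)
    print(final_order)   # E2 residues followed by ubiquitin in normal order
    print(build_order)   # ubiquitin residues from the C-terminus backwards
```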
The protocol once included embedded machinery for generating constraints which are based on experimental data from Saha, Kleiger, and Deshaies; these are now provided in the integration test. (Use -publication to get the original publication behavior, but also supply those constraints!) The remaining original-system-hardcodedness is the initial thioester geometry, based on PDB 1FXT, which is also a model of a thioester-linked E2/UBQ complex.
At the end of the protocol, there is filtering machinery to automatically reject models with no significant interface.
This code is not safe to use with silent-file output. The contortions used to build the sidechain hookup cannot be re-processed by Rosetta's silent file input; output works fine, but you can never read the files back in again. PDB-silent-file output might work, but I've found that to be buggy in general, so I can't recommend it.
The code was not originally meant for generic use and may be fragile for new uses.
See tests/integration/tests/UBQ_E2_thioester/ for example usage. Basically all you need is an input structure.
UBQ_E2_thioester supports three types of options: general Rosetta options (packing, etc.), generic protocol options like "how many cycles" borrowed from the AnchoredDesign application, and UBQ_E2_thioester-specific options.
UBQ_E2_thioester options
Options introduced for that third paper above
Experimental UBQ_E2_thioester options
Experimental UBQ_Gp* options
AnchoredDesign options (borrowed for simplicity, not tied to AnchoredDesign in any other way); all are in the AnchoredDesign namespace
General options: All packing namespace options loaded by the PackerTask are respected. jd2 namespace options are respected, although input modes are not. Anything very low-level, like the database paths, is respected.
Pick the best models by total score and look at the satisfaction of your experimentally-derived constraints to decide which you think is most plausible. We used the models to successfully predict a mutation to rescue a defect caused by UBQ I44A.
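As one way to do that ranking, here is a minimal sketch that reads a whitespace-delimited Rosetta scorefile (lines prefixed with SCORE:, the first of which is the column header) and returns the lowest-scoring models. The helper names parse_scorefile and best_by_total_score are hypothetical, and the sample rows are made-up numbers for illustration only. Which constraint column is relevant depends on the constraints you supplied; atom_pair_constraint is used here as an assumption.

```python
def parse_scorefile(text):
    """Parse 'SCORE:' lines into a list of dicts keyed by column name.

    The first SCORE: line is taken as the header. Values that look like
    numbers are converted to float; everything else stays a string.
    """
    rows = []
    header = None
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue  # skip SEQUENCE: lines, blanks, and anything else
        fields = line.split()[1:]
        if header is None:
            header = fields
            continue
        if len(fields) != len(header):
            continue  # skip malformed or truncated rows
        row = {}
        for name, value in zip(header, fields):
            try:
                row[name] = float(value)
            except ValueError:
                row[name] = value
        rows.append(row)
    return rows


def best_by_total_score(rows, n):
    """Return the n rows with the lowest (best) total_score."""
    return sorted(rows, key=lambda r: r["total_score"])[:n]


# Made-up example data, not real results.
SAMPLE = """SEQUENCE:
SCORE: total_score atom_pair_constraint description
SCORE: -310.5 4.2 model_0001
SCORE: -325.1 12.9 model_0002
SCORE: -318.0 0.7 model_0003
"""

if __name__ == "__main__":
    for row in best_by_total_score(parse_scorefile(SAMPLE), 2):
        print(row["description"], row["total_score"], row["atom_pair_constraint"])
```

Sorting by total score only narrows the field; comparing the constraint column among the top models is still a manual judgment.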
Most of the terms in the scorefile will be the regular scorefunction terms. Assuming you ran with relatively default options, this will be score12; you can change that with -weights. Here is a brief description of the remaining terms. These do not contribute to total_score, but may be useful for filtering/postprocessing. Some terms are from InterfaceAnalyzer and not relevant/applicable.
Rosetta 3.3 was the first release. For the 3.4 release, the UBQ_Gp series of applications was added. For 3.5, the constraint code was factored out into a constraint file publication.cst, and some system-specific details in the code were altered to run under a "publication" flag. With these changes, UBQ_Gp_CYX-Cterm was deprecated and deleted. Two-ubiquitins, multi-body, and loops modes were added. Also for 3.5, an issue with omega angles near the conjugation bond was corrected. The omega nearest the bond now starts at 180 degrees instead of 140.