Author: P. Douglas Renfrew (renfrew@nyu.edu) Author: Andrew Watkins (amw579@nyu.edu)
The documentation was updated in February 2016, by Andy Watkins. Notes about how current usage differs from what is documented here were added on 9 November 2018 by Vikram K. Mulligan (vmulligan@flatironinstiutute.org) and on 27 November 2018 by Andy Watkins. Minor tweaks were made on 13 May 2021 and 17 May 2021 by Vikram K. Mulligan. The PI for this application is Brian Kuhlman (bkuhlman@email.unc.edu).
The main mover is located in the application code at main/source/src/protocols/make_rot_lib
. The application code is located at main/source/src/apps/public/ncaa_utilities/MakeRotLib.cc
. The integration tests can be found at main/tests/integration/tests/make_rot_lib/
. The demo can be found at demos/public/make_rot_lib/
.
P. Douglas Renfrew, Eun Jung Choi, Brian Kuhlman, "Using Noncanonical Amino Acids in Computational Protein-Peptide Interface Design" (2011) PLoS One. It is strongly recomended to read the paper as it provided addition details.
A very early version of this code was used to generate a bound dye rotamer library for Gulyani A, Vitriol E, Allen R, Wu J, Gremyachinskiy D, Lewis S, Dewar B, Graves LM, Kay BK, Kuhlman B, Elston T, Hahn KM. A biosensor generated via high-throughput screening quantifies cell edge Src dynamics. Nat Chem Biol. 2011 Jun 12;7(7):437-44. doi: 10.1038/nchembio.585. Erratum in: Nat Chem Biol. 2012 Aug;8(8):737. PubMed PMID: 21666688; PubMed Central PMCID: PMC3135387.
This code creats a Noncanonical Amino Acid (NCAA) rotamer library and is the second of three steps toward being able to use a NCAA in Rosetta.
Rotamer libraries are sets of common side chain conformations that generally correspond to local minima on the side chain conformational energy landscape. Side chain conformations are usually represented as a set of mean angles and a standard deviation to indicate variability. Rotamer libraries are used in for two main purposes in Rosetta: to provide starting points for side chain optimization routines, and the relative frequency is used as a pseudo-energy. Traditionally rotamer libraries are created by collecting statistics from protein structures. Rosetta uses the backbone dependent Drunbrack rotamer libraries. Since there are not enough structures containing NCAAs they must be generated.
Running the MakeRotLib protocol consists of four steps
This code was originally designed to make Dunbrack 2002 style rotamer libraries for noncanonical side chains with an alpha-peptide backbone. It is reasonably well extended to handle peptoid backbones and beta+-peptide backbones; moreover, it has the facility to generate the continuous distribution necessary for a Dunbrack 2010-style approach to a non-rotameric chi (such as chi2 of aspartate).
Rosetta primarily uses backbone dependent rotamer libraries. Backbone-dependent rotamer libraries list provide side chain conformations sampled from residue positions whose backbone dihedral angles fall in particular bins. In the case of the Drunbrack rotamer libraries used by Rosetta the bins are in 10 degree intervals for for both phi and psi for a total of 1296 (36*36) phi/psi bins. (For beta amino acids, for the moment, we employ 30 degree bins, for a total of 1728 (12*12*12).) To replicate this for the NCAAs we need to create a set of side chain rotamers for each member of a set of phi/psi bins.
The MakeRotLib protocol takes an option file as input. It requires an options file for each phi/psi bin. The first step in running it is creating these 1296 options files. Continuing from the HowToMakeResidueTypeParamFiles protocol capture we are again using ornithine as an example. Ornithine has 3 sidechain dihedral angles (chi). We want to sample each chi angle from 0 to 360 degrees in 30 degree intervals, and based on the chemistry of the side chain we predict that were will probably be three preferred angles for each chi angle at 60, 180, and 300 degrees for a total of 27 rotamers (3x3x3). We setup our MakeRotLib options file template as shown bellow.
The C40_rot_lib_options_XXX_YYY.in
file is shown bellow.
AA_NAME C40
OMG_RANGE 180 180 1
PHI_RANGE XXX XXX 1
PSI_RANGE YYY YYY 1
EPS_RANGE 180 180 1
NUM_CHI 3
NUM_BB 2
CHI_RANGE 1 0 330 30
CHI_RANGE 2 0 330 30
CHI_RANGE 3 0 330 30
CENTROID 300 1 300 1 300 1
CENTROID 300 1 300 1 180 2
CENTROID 300 1 300 1 60 3
CENTROID 300 1 180 2 300 1
CENTROID 300 1 180 2 180 2
CENTROID 300 1 180 2 60 3
CENTROID 300 1 60 3 300 1
CENTROID 300 1 60 3 180 2
CENTROID 300 1 60 3 60 3
CENTROID 180 2 300 1 300 1
CENTROID 180 2 300 1 180 2
CENTROID 180 2 300 1 60 3
CENTROID 180 2 180 2 300 1
CENTROID 180 2 180 2 180 2
CENTROID 180 2 180 2 60 3
CENTROID 180 2 60 3 300 1
CENTROID 180 2 60 3 180 2
CENTROID 180 2 60 3 60 3
CENTROID 60 3 300 1 300 1
CENTROID 60 3 300 1 180 2
CENTROID 60 3 300 1 60 3
CENTROID 60 3 180 2 300 1
CENTROID 60 3 180 2 180 2
CENTROID 60 3 180 2 60 3
CENTROID 60 3 60 3 300 1
CENTROID 60 3 60 3 180 2
CENTROID 60 3 60 3 60 3
TEMPERATURE 1
TEMPERATURE | Boltzmann distribution style temperature to translate from the energies generated by the protocol into probabilities.
CENTROID input is laborious, especially for residues with four chis and multiple nonrotameric chi angles (for example, tryptophan has twelve bins for chi2 in its Dun10 representation); manually specifying a four-chi residue with two such chis would require almost 1500 centroids. Consequently, we have developed an alternative syntax:
ROTWELLS 1 3 60 180 300
ROTWELLS 2 3 60 180 300
ROTWELLS 3 3 60 180 300
To generate the 1296 input files we use a provided script that simply replace the XXX and YYY with the phi and psi values. The script is run as shown bellow. (Note: As of 9 Nov 2018, it is not necessary to generate thousands of input files. Ranges can be specified in a single input file.)
$ cd inputs
$ ../scripts/make_inputs C40_rot_lib_options_XXX_YYY.in
The number of chi angles and the CHI_RANGE sampling interval are the primary determinants of the run time as they determine the number of rotamers that will be tested for each phi/psi bin. It is recommended to have at least 500 samples per chi. In the ornithine example we sample in 30 degree intervals for each of the 3 chi angles giving us a total of 1728 (12x12x12) conformations tested for each phi/psi bin. For a residue with a single chi 1 degree bins will suffice.
The next step is to run the MakeRotLib protocol on each of the input files we created in step one. This is the most time consuming portion of the process and should probably be done on a cluster. As cluster setups vary, an example for a single MakeRotLib options file is provided. The other 1295 should be run identically. (Note: As of 9 Nov 2018, it is not necessary to generate thousands of input files. Ranges can be specified in a single input file.)
$ cd outputs
$ PATH/TO/ROSETTA/bin/MakeRotLib.default.macosclangrelease -database PATH/TO/rosetta/main/database -options_file ../inputs/C40_rot_lib_options_-60_-40.in >C40_rot_lib_options_-60_-40.log &
Note: The extension on your executable may be different.
The only options passed to the executable are the path to the database and the MakeRotLib options file. After the run completes a file called C40_-60_-40_bbdep.rotlib
should be in the output directory. This is the backbone dependent rotamer library for a phi of -60 and a psi of -40.
The log file from the rosetta run in includes quite a bit of useful output.
There are three main sections to the log output, ROTAMERS
, CENTROIDS
and FINAL ROTAMERS
sections. Each one shows the following data: phi, psi, omega and epsilon backbone dihedral angles, probability, total energy, torsion energy, intra-residue repulsive, intra-residue attractive, the number of chi angles, the assigned cluster number, the set of input chi angles, the set of minimized chi angles, the standard deviation, and the distance from that point to each of the cluster centroids.
The log file also displays the number of conformations per cluster, the average distance between the cluster center and the members of that cluster. Lack of conformations in a cluster and a large (>30) average cluster centroid distance suggests that that cluster is higher in energy.
Mandatory input flags:
Optional input flags:
You should have 1296 rotamer libraries, one for each phi/psi bin.
After the MakeRotLib protocol has been run on all of the MakeRotLib options files the individual rotamer libraries for each phi psi bin need to be assembled in to a single file. This is accomplished with a provided script as shown bellow.
$ cd outputs
$ ../scripts/make_final_from_phi_psi.pl C40
The single file rotamer library should be called C40.rotlib
. The file should be placed in the ncaa_rotlibs directory in the database.
$ cp C40.rotlibs PATH/TO/DATABASE/rosetta/main/database/ncaa_rotlibs/
The last step is modifying the residue type parameter file to use the new rotamer library. To do this we need to add the name of the rotamer library, the number chi angles it describes, and how many bins there are for each chi angle to the Residue type parameter file. The ornithine rotamer library is called C40.rotlib and the rotamer library describes 3 chi angles and each of those 3 chi angle has 3 rotamer numbers. So we would use the following commands to add that information to the file we created in HowToMakeResidueTypeParamFiles.
$ echo "NCAA_ROTLIB_PATH C40.rotlib" >> PATH/TO/rosetta/main/database/chemical/residue_type_sets/fa_standard/residue_types/l-ncaa/ornithine.params
$ echo "NCAA_ROTLIB_NUM_ROTAMER_BINS 3 3 3 3" >> PATH/TO/rosetta/main/database/chemical/residue_type_sets/fa_standard/residue_types/l-ncaa/ornithine.params
This application is new for the Rosetta 3.4 release.