The scripts and input files that accompany this demo can be found in the demos/public directory of the Rosetta weekly releases.
KEYWORDS: EXPERIMENTAL_DATA STRUCTURE_PREDICTION
Written October 26, 2010 Modified August 27, 2013 Modified June 24, 2016
This document briefly walks through the use of Rosetta to solve difficult molecular replacement problems. These tools assume that the user has access to the Phenix suite of crystallographic software (in particular, phaser and the mapbuilding script mtz2map); however, all intermediate files are included so that if the user does not, most of the demo may still be run.
The basic protocol is done in 5 steps; each step has a corresponding script in the folder:
Using HHSearch, find potential homology to the target sequence.
Use a Rosetta "helper script" to prepare templates (and Rosetta inputs for subsequent computations).
Use PHASER to search for placement of the trimmed templates within the unit cell.
Generate a map corresponding to each putative MR solution.
Using Rosetta, rebuild gaps and refine each template/orientation, constrained by the density of each solution. After rescoring with PHASER, the best template/orientation should be clear (if the correct solution was among the starting models).
This command line illustrates the use of a helper script for preparing templates for an initial Phaser run. Functionally, it does the same thing as the crystallographic software Sculptor, but it does not remap the residues as Sculptor does, and it is easier to run with different alignments. The script takes just one argument: an HHR-format alignment file.
Alignments generally come from HHsearch's web interface (http://toolkit.tuebingen.mpg.de/hhpred). After submitting the sequence through their website, export the results to a .hhr file. Results may be trimmed so only alignments with a reasonable e-value and sequence coverage are included.
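The trimming described above can be sketched as a simple filter. This is only an illustration: the Hit record, its field names, and the cutoff values (1e-3 for the e-value, 0.5 for coverage) are assumptions made for this example and are not part of the demo scripts; what counts as "reasonable" depends on the target.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment taken from an .hhr file (hypothetical fields for this sketch)."""
    template: str
    evalue: float
    query_start: int  # first aligned target residue, 1-based
    query_end: int    # last aligned target residue, 1-based


def coverage(hit: Hit, query_length: int) -> float:
    """Fraction of the target sequence spanned by the alignment."""
    return (hit.query_end - hit.query_start + 1) / query_length


def trim_hits(hits, query_length, max_evalue=1e-3, min_coverage=0.5):
    """Keep hits that pass both (illustrative) cutoffs, best e-value first."""
    kept = [
        h for h in hits
        if h.evalue <= max_evalue and coverage(h, query_length) >= min_coverage
    ]
    return sorted(kept, key=lambda h: h.evalue)


# Made-up hits for a 133-residue target; none of these are real results.
hits = [
    Hit("tmplA", 1e-20, 3, 130),   # good e-value, high coverage
    Hit("tmplB", 5e-2, 1, 133),    # full coverage, weak e-value
    Hit("tmplC", 1e-8, 60, 100),   # good e-value, low coverage
]
kept = trim_hits(hits, query_length=133)
print([h.template for h in kept])
```

Only the hit that passes both cutoffs is kept; loosening either cutoff lets more templates through to the Phaser step.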
The script parses the .hhr file, downloads each template PDB, and trims the PDB to the aligned residues. In addition, the script produces a 'rosetta-style' alignment file; the format is briefly introduced below. These alignment files are used in Rosetta model-building.
## 1CRB_ 2qo4b
# hhsearch
scores_from_program: 0 1.00
2 DFNGYWKMLSNENFEEYLRALDVNVALRKIANLLKPDKEIVQDGDHMIIRTLSTFRNYIMDFQVGKEFEEDLTGIDDRKCMTTVSWDGDKLQCVQKGEKEGRGWTQWIEGDELHLEMRAEGVTCKQVFKKV
0 AFSGTWQVYAQENYEEFLRAISLPEEVIKLAKDVKPVTEIQQNGSDFTITSKTPGKTVTNSFTIGKEAEIT--TMDGKKLKCIVKLDGGKLVCRTD----RFSHIQEIKAGEMVETLTVGGTTMIRKSKKI
--
The first line is '##' followed by a code for the target and one for the template. The second line identifies the source of the alignment; the third should be kept as it is. The fourth line is the target sequence and the fifth is the template sequence. The number at the start of each sequence line is an 'offset' identifying where the sequence starts. This offset does not use the PDB residue number; it simply counts residues starting at 0. The sixth line is '--'.
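As a rough illustration of the six-line layout just described, the following sketch parses one block and lists which target residue is aligned to which template residue, using the zero-based offsets. The class and function names are made up for this example, and the toy block uses invented identifiers and short sequences; this is not the parser Rosetta itself uses.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    target_id: str
    template_id: str
    source: str
    target_offset: int
    target_seq: str
    template_offset: int
    template_seq: str


def parse_block(text: str) -> Alignment:
    """Parse one six-line block laid out as described in the text."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    if len(lines) != 6 or not lines[0].startswith("##") or lines[5] != "--":
        raise ValueError("expected six lines starting with '##' and ending with '--'")
    target_id, template_id = lines[0][2:].split()
    source = lines[1].lstrip("#").strip()
    t_off, t_seq = lines[3].split()   # fourth line: target offset and sequence
    m_off, m_seq = lines[4].split()   # fifth line: template offset and sequence
    if len(t_seq) != len(m_seq):
        raise ValueError("aligned sequences differ in length")
    return Alignment(target_id, template_id, source,
                     int(t_off), t_seq, int(m_off), m_seq)


def aligned_pairs(aln: Alignment):
    """Return (target_index, template_index) for every gap-free column.

    Indices count residues from 0, as the offsets do; they are not PDB numbers.
    """
    pairs = []
    t, m = aln.target_offset, aln.template_offset
    for a, b in zip(aln.target_seq, aln.template_seq):
        if a != "-" and b != "-":
            pairs.append((t, m))
        if a != "-":
            t += 1
        if b != "-":
            m += 1
    return pairs


# Toy block with invented identifiers and sequences.
toy = """\
## tgt_ tmplA
# hhsearch
scores_from_program: 0 1.00
2 DFNGYWKM
0 AFSG--QV
--
"""
aln = parse_block(toy)
pairs = aligned_pairs(aln)
print(aln.target_id, aln.template_id, pairs)
```

In the toy block, target residues 6 and 7 face a gap in the template row, so they have no template partner and would have to be rebuilt.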
The results for this demo appear in the folder 'templates'. For each alignment in the starting .hhr file, three files are produced.
You can run this step either with the provided .sh file or directly:
$> $ROSETTA3/src/apps/public/electron_density/prepare_template_for_MR.pl inputs/1crb.hhr
where $ROSETTA3=path-to-Rosetta/main/source
This command line shows the use of Phaser to generate initial molecular replacement solutions. For each template, we run Phaser to find potential placements of that template in the unit cell.
NOTE: these steps require PHENIX to be installed.
The example scripts here only generate a single model from a single template, but for a real-world case, one will often want to use many different templates and may want to generate more than one possible solution using 'TOPFILES n'. In general, though, we have found it is better to use fewer potential solutions from more templates than many solutions from few templates.
Sometimes weak hits may be found by lowering the rotation function cutoff in phaser by adding the line 'SELECT ROT FINAL PERCENT 0.65' (or even 0.5) to the phaser script. Increasing the packing function threshold (with PACK 10) may also help in some cases.
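When running many templates, it can help to generate these optional keyword lines in one place. The helper below is a hypothetical convenience function, not part of the demo; it only emits the three keyword lines quoted in the text (TOPFILES, SELECT ROT FINAL PERCENT, PACK) and assumes they are appended to an existing Phaser keyword script. The spellings are copied from the text and should be checked against the installed Phaser version.

```python
def extra_phaser_keywords(topfiles=None, rot_final_percent=None, pack=None):
    """Build optional Phaser keyword lines; options left as None are skipped."""
    lines = []
    if topfiles is not None:
        # Number of solutions to write out per template.
        lines.append(f"TOPFILES {topfiles}")
    if rot_final_percent is not None:
        # Lowered rotation function cutoff, to pick up weak hits.
        lines.append(f"SELECT ROT FINAL PERCENT {rot_final_percent}")
    if pack is not None:
        # Raised packing function threshold.
        lines.append(f"PACK {pack}")
    return lines


keywords = extra_phaser_keywords(topfiles=3, rot_final_percent=0.65, pack=10)
print("\n".join(keywords))
```

Calling the function with no arguments returns an empty list, which leaves the original Phaser script unchanged.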
Finally, for each template/orientation, we generate the 2mFo-DFc map for input to Rosetta in the next step.
The final step illustrates the use of Rosetta's comparative modeling into density. After running the template-preparation script and an initial Phaser run, density maps are generated from each Phaser hit, and comparative modeling into density is performed. The flag -MR::mode cm is used to run this mode. This first application does not try to rebuild gaps in the alignment; it just performs the threading and runs relax into density. Thus, the only inputs needed are the target fasta file, the Rosetta-style ali file, and the template pdb. Because there is no rebuilding, not many models are needed to adequately cover conformational space; generally 10-20 are sufficient.
This script is the same as above, but also rebuilds gaps in the alignment. The main difference is that a non-zero value is given for -MR::max_gaplength_to_model; additionally, some flags must be given that describe how Rosetta should rebuild gaps.
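To get a feel for a sensible value, one can list the gap lengths in the template row of the Rosetta-style alignment. The sketch below assumes that -MR::max_gaplength_to_model acts as an upper bound on the length of gap that is rebuilt, which the flag name suggests but this text does not spell out. The function names are invented for this example, and how Rosetta itself counts gaps may differ.

```python
import re


def template_gap_lengths(template_row: str):
    """Lengths of consecutive '-' runs in the template row of an alignment.

    Each run marks target residues that have no template coordinates.
    """
    return [len(m.group()) for m in re.finditer(r"-+", template_row)]


def split_by_limit(gap_lengths, max_gaplength):
    """Separate gaps at or below the limit from those above it."""
    within = [g for g in gap_lengths if g <= max_gaplength]
    over = [g for g in gap_lengths if g > max_gaplength]
    return within, over


# Excerpt of the template row from the example alignment shown earlier.
row = "EIT--TMDGKKLKCIVKLDGGKLVCRTD----RFSH"
gaps = template_gap_lengths(row)
within, over = split_by_limit(gaps, max_gaplength=3)
print(gaps, within, over)
```

Here the excerpt contains one two-residue gap and one four-residue gap, so an illustrative limit of 3 would cover only the first.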
Several additional input files must be provided as well. Rebuilding of gaps is done by fragment insertion (as in Rosetta ab initio); thus two backbone fragment files (3-mers and 9-mers) must be given. The application for building these is included with Rosetta but requires several external tools and databases. The easiest way to generate fragments is to use the Robetta server (http://robetta.bakerlab.org/fragmentsubmit.jsp). The fragment files
$ bin/extract_pdbs.default.linuxgccrelease -database $DB -in:file:silent <silent_filename> -silent_struct_type binary