The scripts and input files that accompany this demo can be found in the
demos/public
directory of the Rosetta weekly releases.
KEYWORDS: MEMBRANES STRUCTURE_PREDICTION
This demo shows how to generate a homology model of a membrane protein. It only works for helical membrane proteins. In the example, a dopamine receptor structure is used as the template for a homologous dopamine receptor. This is a relatively easy homology modeling problem.
There is a non-trivial amount of preparatory work that is required before running a homology modeling task in Rosetta.
Find a structural template for homology modeling.
The first step is to find a structural alignment to a template protein. In
this demo, the fasta sequence file for the target (DXDR_.fasta
) was
pasted into the input panel of the HHpred web server. The server gives you
a list of structural homologs ranked by the quality of alignment. For
each, you get a secondary structure prediction and a template structure.
The next task is to manipulate this information to create a combined fasta
file that shows the alignment between the target and the template file. In
this demo this combined fasta file is named DXDR_D3DR.HHp
.
Process the template structure to accept the target sequenced.
Next you need to acquire the pdb file for the template, and you need to renumber the residues in the file to match the target protein numbering. Scripts for performing this are provided. First, the perl script fastAln2zones.pl is invoked on the combined fasta file:
fastAln2zones.pl DXDR_D3DR.HHp <zones_file>
This creates a "zones" file that shows the correspondence between residues in the target and residues in the template.
Using several of these pieces of information, the script
createTemplate.pl
creates the renumbered template structure:
createTemplate.pl -zonesfile <zones_file> -fastafile DXDR_.fasta -parentpdb 3PBL_A_renum.pdb -outpdb DXDR_D3DR.pdb
3PBL_A_renum.pdb was previously created by taking 3PBL.pdb and cutting out
the T4 lysozyme insertion from a loop in the membrane domain, and
renumbering the remaining residues with the awk script renum.awk
.
Similar manipulations may be required for each specific application.
The end product of step 2 is the file DXDR_D3DR.pdb
Create a fragment library for the target protein.
You also need to generate a fragments file as for any other loop/homology modeling application. See the appropriate demo for fragment generation.
Identify the regions of the protein that must be rebuilt by Rosetta.
Now we need to tell Rosetta the regions that must be rebuilt by Rosetta.
These are the regions that are poorly defined. This is defined in
DXDR.loopfile. The format is: first line, a comment, other lines are
LOOP start-res end-res cutpoint skip-rate extend
for each flexible region
to rebuild. The cutpoint column is used to tell Rosetta where to cut the
loop. A zero tells Rosetta to use its default. Skip rate allows you to
spend more time on one loop versus the others. Here we don't ask for this.
Extend "X" indicates not to use any information from the starting template
as part of the flexible regions.
Create membrane specific input files.
There are two membrane protein-specific input files that are required. The first is a weight file with specialized scoring function weights. It also triggers the initialization of a number of membrane-specific scoring function capabilities. The second membrane specific file tells Rosetta where the membrane is. This file contains more information than is necessary. It is called DXDR.span. The format is:
To sum up, we now have:
DXDR_D3DR.pdb
– the template pdb file generated aboveDXDR.span
– the membrane helix informationFrags/
a directory with fragment data for the target proteinDXDR.loopfile
– the file that tells which regions are to be rebuiltThe following command line options are used as arguments to the minirosetta executable (minirosetta.(os)(options)(mode), for example:
minirosetta.linuxgccrelease
minirosetta.macosclangrelease
Provide the path to your own rosetta database (not necessary if you would like to use the database located in the same Rosetta/main/ directory as the executable):
-database /Users/patrickbarth/RosettaCon2011/tutorial/trunk_r43621/minirosetta_database
Membrane proteins use the same protocol as regular proteins:
-run:protocol looprelax
Information for fragments files:
-loops:frag_sizes 9 3 1
-loops:frag_files ./rosetta_inputs/frags/aaDXDR_09_05.200_v1_3 ./rosetta_inputs/frags/aaDXDR_03_05.200_v1_3
Some input information. For benchmark purposes, replace the following PDB by the experimentally-determined native structure:
-in:file:native ./rosetta_inputs/DXDR_D3DR.pdb
-in:file:fullatom
-s DXDR.pdb
The following PDB corresponds to the starting template:
-loops:input_pdb /Users/patrickbarth/RosettaCon2011/tutorial/dopamine/input/DXDR_D3DR.pdb
Loop file generated from the zone file:
-loops:loop_file /Users/patrickbarth/RosettaCon2011/tutorial/dopamine/input/DXDR.loopfile
Membrane specific input information:
-score:weights ./rosetta_inputs/membrane_highres_t1.wts
-in:file:spanfile ./rosetta_inputs/DXDR.span
Output format:
-out:file:fullatom
-out:file:silent_struct_type binary
-out:file:silent DXDR_lprlxmb.out
Number of structures generated:
-nstruct 1
Loop remodeling step: for additional details, see the regular looprelax protocol options:
-loops:remodel quick_ccd
#-loops:refine refine_ccd
-loops:random_order
-loops:idealize_before_loop_close
If you want FULL STRUCTURE relax then do this, otherwise don't:
-loops::relax fastrelax
Fail on bad H-bonds:
-fail_on_bad_hbond false
The expected output is a silent file with the model coordinates. In this demo, the name of the file is:
DXDR_lprlxmb.out.
For example, your full commandline could look like this:
$> $ROSETTA3/bin/minirosetta.default.linuxgccrelease -run:protocol looprelax -loops:frag_sizes 9 3 -loops:frag_files ./rosetta_inputs/frags/aaDXDR_09_05.200_v1_3 ./rosetta_inputs/frags/aaDXDR_03_05.200_v1_3 -in:file:native ./rosetta_inputs/DXDR_D3DR.pdb -in:file:fullatom -s ./rosetta_inputs/DXDR_D3DR.pdb -loops:input_pdb ./rosetta_inputs/DXDR_D3DR.pdb -loops:loop_file ./rosetta_inputs/DXDR.loopfile -score:weights ./rosetta_inputs/membrane_highres_t1.wts -in:file:spanfile ./rosetta_inputs/DXDR.span -out:file:fullatom -out:file:silent_struct_type binary -out:file:silent DXDR_lprlxmb.out -nstruct 1 -loops:remodel quick_ccd -loops:random_order -loops:idealize_before_loop_close -fail_on_bad_hbond false