Author: Steven Lewis, P. Ben Stranges, Jared Adolf-Bryfogle
Last updated June 29, 2016 by Steven Lewis. The PI is Brian Kuhlman, bkuhlman@email.unc.edu .
InterfaceAnalyzer has a suite of integration test/demos at rosetta/main/tests/integration/tests/InterfaceAnalyzer. The application resides at rosetta/main/source/src/apps/public/analysis/InterfaceAnalyzer.cc, and its mover at rosetta/main/source/src/protocols/anchored_design/InterfaceAnalyzerMover.*
The underlying Mover is documented for use (for example, in RosettaScripts) here: InterfaceAnalyzerMover.
No reference is directly associated with this protocol. It was used with the AnchoredDesign application (see that app's documentation), in CAPRI21 interface discrimination, and in the reference below.
Stranges, P.B. and B. Kuhlman, A comparison of successful and failed protein interface designs highlights the challenges of designing buried hydrogen bonds. Protein Science, 2013. 22(1): p. 74-82.
This application combines a set of tools to analyze protein-protein interfaces. It calculates binding energies, buried interface surface areas, and other useful interface metrics. The bulk of the code is a Mover intended for use at the tail end of interface modeling protocols; that is its primary use case. This application is a front end directly onto that Mover, which is a secondary purpose. This application will not work on protein-ligand interfaces.
This code does not purposefully modify the input; instead it scores it using interface-related metrics and reports the scores.
For the purposes of packing, the PackerTask is set up like so:
None of these operations (except the resfile) affect the choice of what residues are at the interface.
For two-chain interfaces, the interface is detected via InterfaceNeighborDefinitionCalculator. For the multichain constructor, it is detected via InterGroupNeighborsCalculator, which serves basically the same purpose but can detect interfaces between groups of chains. The set of residues allowed to pack from either of these calculators is logical-ANDed with the set allowed to pack from the resfile (if present); the residues that both allow to pack form the interface for the purposes of repacking. Note that design is not possible; if the resfile specifies design, the design commands are ignored.
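The logical AND described above can be illustrated with plain Python sets. This is only a sketch of the set logic, not the Rosetta implementation; the function name `repackable_interface` and the residue numbers are hypothetical.

```python
def repackable_interface(calculator_residues, resfile_residues=None):
    """Return the residues that would repack.

    calculator_residues: residues the interface calculator flags (hypothetical input).
    resfile_residues: residues the resfile allows to pack, or None if no resfile is given.
    """
    calculator_residues = set(calculator_residues)
    if resfile_residues is None:
        # No resfile: the calculator alone decides.
        return calculator_residues
    # Resfile present: only residues allowed by BOTH sources repack.
    return calculator_residues & set(resfile_residues)


# Hypothetical residue numbers, for illustration only.
detected = {12, 13, 45, 46, 101}
allowed_by_resfile = {13, 45, 77}
print(sorted(repackable_interface(detected, allowed_by_resfile)))
```

Residue 77 is allowed by the resfile but not detected at the interface, so it does not repack; residues 12, 46, and 101 are detected but not allowed by the resfile, so they do not repack either.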
This code does not directly support ddGs of binding, but you can get that effect by running it a few times with pre-mutated structures. Any metric not listed in this file cannot be directly calculated. The -fixedchains option has not been thoroughly tested beyond three chains. It will probably work, but use it with care.
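As a sketch of the workaround for ddG of binding: run the application once on the wild-type complex and once on each pre-mutated complex, then subtract the reported interface dG values by hand. The function name, mutation labels, and energies below are hypothetical placeholders, not output from the application.

```python
def ddg_of_binding(dg_wild_type, dg_mutant):
    """ddG of binding as the mutant interface dG minus the wild-type interface dG."""
    return dg_mutant - dg_wild_type


# Hypothetical interface dG values, one per separate run of the application.
dg_wt = -20.0
dg_by_mutant = {"mutant_1": -18.5, "mutant_2": -21.0}

for name, dg in dg_by_mutant.items():
    # Positive values mean the mutation weakened the predicted binding.
    print(name, ddg_of_binding(dg_wt, dg))
```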
There is no way to output the intermediate separated chains when using the separated packing options - you are free to poke around in the code and place dump_pdb statements if you want them.
The two major modes are for two-chain and multichain interfaces. For two-chain interfaces, you need to do nothing, although defining the interface won't hurt. For interfaces involving more than two chains, you need to tell the code which chains are in which group. Define the interface through either the -fixedchains or the -interface option.
Another variable that might count as a "mode" is tracer vs. PDB output. For this standalone executable, printing results to the screen (without outputting "result" PDBs matching the input) is the desired behavior and the default. When InterfaceAnalyzerMover is used as part of a protocol, PDB output is more commonly preferred.
-fixedchains (string) - Multichain option. Specifies which chains are grouped together to define the interface. Example: -fixedchains A B keeps chains A and B together, and C separate, in a pose that contains A, B, and C. Note the space between A and B. Analogous to the -interface option. Includes all chains of the pose. Not tested thoroughly beyond three chains.
-compute_packstat (bool) - activates packstat calculation; can be slow for large interfaces so it defaults to off. See the paper on RosettaHoles to find out more about this statistic (Protein Sci. 2009 Jan;18(1):229-39.). Packstat has a significant random component; if you are interested in the score, -packstat::oversample 100 is recommended as a companion flag; this increases runtime of packstat but reduces variance.
General Rosetta/JD2 options are accepted: in:file:s, in:file:l, in:file:silent for input; -database for the database, etc.
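A minimal sketch of assembling a command line from the flags described above. The executable name, the input file name, and the explicit `true` value passed to -compute_packstat are assumptions for illustration; the real binary name depends on your build, and only flags mentioned in this document are used. The helper `build_command` is hypothetical and does not run anything.

```python
import shlex


def build_command(executable, pdb_path, fixed_chains, packstat=False, oversample=None):
    """Assemble an argument list; nothing is executed here."""
    cmd = [executable, "-in:file:s", pdb_path, "-fixedchains", *fixed_chains]
    if packstat:
        # Assumption: the boolean flag is given an explicit value.
        cmd += ["-compute_packstat", "true"]
        if oversample is not None:
            # Companion flag that reduces packstat variance at the cost of runtime.
            cmd += ["-packstat::oversample", str(oversample)]
    return cmd


# Hypothetical executable and input names.
cmd = build_command("InterfaceAnalyzer", "complex.pdb", ["A", "B"],
                    packstat=True, oversample=100)
print(shlex.join(cmd))
```

Building the arguments as a list keeps the space-separated chain IDs for -fixedchains as distinct arguments, matching the "A B" form described above.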
The SASA radii have changed since 3.5 to be more correct. Previously, they were using a set that was parameterized for a no-longer-in-use score function. That said, the changes are relatively minute. Rosetta now uses the radii from reduce by default (see options attached for original references to this radii set). You can now change both the radii set that is used and whether hydrogens are considered implicitly or explicitly. Also note that currently the only method is the LeGrand sasa method, which does not calculate the exact SASA. Most methods do not calculate the exact SASA for speed.
As for the hSASA: polar atoms were previously included in the calculation, and now they are not. This should also be more correct, but it depends on what you think the hSASA should be. There is very little systematic analysis of this number. The default is to exclude an atom from the hSASA if the charge on it is greater than 0.4, which, for proteins, only means excluding carbonyl and carboxyl carbons, as these are polar. Again, you can change this via an option. These general options are now in the SASA option namespace. Refer to this page for more.
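The charge cutoff can be pictured with a small sketch. It follows the rule literally as stated here (exclude an atom when its charge is greater than the cutoff) and is not the Rosetta implementation; the function name `hydrophobic_sasa` and the per-atom charges and SASA values are made up for illustration.

```python
def hydrophobic_sasa(atoms, charge_cutoff=0.4):
    """Sum per-atom SASA, skipping atoms whose partial charge exceeds the cutoff.

    atoms: iterable of (partial_charge, sasa) pairs (hypothetical input format).
    """
    return sum(sasa for charge, sasa in atoms if charge <= charge_cutoff)


# Hypothetical (partial charge, SASA) pairs.
atoms = [
    (0.62, 3.0),    # highly charged carbon: excluded at the default cutoff
    (-0.18, 12.5),  # counted
    (0.07, 8.0),    # counted
]
print(hydrophobic_sasa(atoms))
```

Raising `charge_cutoff` in this sketch would bring the first atom back into the sum, which mirrors how changing the corresponding option changes what counts toward the hSASA.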
InterfaceAnalyzer generates a large amount of data about your input structure. The following are fields in the scorefile or PDB (if -tracer_data_print is false) that will be added by this application.
The following is output to either tracer or the output pdb (depending on the -tracer_data_print option), but (as of this writing) cannot be sent to a scorefile:
The following are fields that appear in the tracer output if -tracer_data_print is true:
This application is part of your post-processing for other executables; there is really no post-processing to do for the application itself.
However, a typical approach in design is to take the top 10 percent of models by total energy and, from those, select the top 10 by interface dG.
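The selection described above might be sketched as follows, assuming the scores have already been parsed into one dictionary per model. The key names `total_score` and `dG_separated` and the function `select_models` are assumptions for illustration; substitute whatever your scorefile columns are called. Lower values are treated as better for both terms.

```python
def select_models(models, percent=10, n_keep=10,
                  energy_key="total_score", dg_key="dG_separated"):
    """Keep the best `percent` of models by total energy, then the best `n_keep` by interface dG."""
    by_energy = sorted(models, key=lambda m: m[energy_key])
    # Integer arithmetic avoids floating-point surprises; always keep at least one model.
    pool_size = max(1, len(by_energy) * percent // 100)
    pool = by_energy[:pool_size]
    return sorted(pool, key=lambda m: m[dg_key])[:n_keep]


# Hypothetical scores for 100 models.
models = [
    {"name": f"m{i}", "total_score": float(i), "dG_separated": float((i * 3) % 10)}
    for i in range(100)
]
for model in select_models(models, n_keep=3):
    print(model["name"], model["total_score"], model["dG_separated"])
```

Filtering on total energy first removes models whose interface looks good only because the rest of the structure is strained.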