Autogenerated Tag Syntax Documentation:
A PerResidueProbabilitiesMetric that stores amino acid probabilities predicted by the MIF-ST model.
References and author information for the MIFSTProbabilitiesMetric simple metric:
MIFSTProbabilitiesMetric SimpleMetric's author(s): Moritz Ertelt, University of Leipzig (moritz.ertelt@gmail.com)
<MIFSTProbabilitiesMetric name="(&string;)" custom_type="(&string;)"
residue_selector="(&string;)" feature_selector="(&string;)"
multirun="(true &bool;)" use_gpu="(false &bool;)" />
A metric for estimating the probability of an amino acid at a given position, as predicted by the Masked Inverse Folding with Sequence Transfer (MIF-ST) model from Yang et al. This metric requires Rosetta to be built with extras=torch; see Building Rosetta with TensorFlow and Torch for the compilation setup.
By default, the MIFSTProbabilitiesMetric uses multiple processors during prediction. (Torch determines the number of processors automatically, based on the number of processors on the machine.)
To limit the number of processors used, set the following environment variables before running Rosetta (the commands assume Bash and restrict usage to one CPU):
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
export TORCH_NUM_THREADS=1
export TORCH_INTRAOP_NUM_THREADS=1
export TORCH_INTEROP_NUM_THREADS=1
Restricting the thread count increases the runtime, but may be necessary on systems where you need to control CPU usage explicitly.
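When Rosetta is launched from a Python wrapper rather than from an interactive shell, the same limits can be applied to the child process alone instead of the whole session. The following is a minimal sketch, assuming only the variable names listed above; the helper name `limited_thread_env` is hypothetical and not part of Rosetta.

```python
import os

# Variable names copied from the export commands above.
THREAD_VARS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "TORCH_NUM_THREADS",
    "TORCH_INTRAOP_NUM_THREADS",
    "TORCH_INTEROP_NUM_THREADS",
)


def limited_thread_env(n_threads=1, base_env=None):
    """Return a copy of the environment with every thread variable set.

    Hypothetical helper for illustration; it does not modify os.environ.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    # Copy so that the calling process keeps its own settings.
    env = dict(os.environ if base_env is None else base_env)
    for var in THREAD_VARS:
        env[var] = str(n_threads)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, together with the Rosetta command line for your build.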
The example below predicts the amino acid probabilities for chain A using only the coordinates and sequence of chain A. It does so by running one prediction for each position while masking its residue type. With multirun=true and use_gpu=true, all predictions are batched together and run on the GPU (if available). Lastly, it uses these predictions to score the current sequence using the pseudo-perplexity metric.
<ROSETTASCRIPTS>
    <RESIDUE_SELECTORS>
        <Chain name="res" chains="A" />
    </RESIDUE_SELECTORS>
    <SIMPLE_METRICS>
        <MIFSTProbabilitiesMetric name="prediction" residue_selector="res" feature_selector="res" multirun="true" use_gpu="true"/>
        <PseudoPerplexityMetric name="perplex" metric="prediction"/>
    </SIMPLE_METRICS>
    <FILTERS>
    </FILTERS>
    <MOVERS>
        <RunSimpleMetrics name="run" metrics="perplex"/>
    </MOVERS>
    <PROTOCOLS>
        <Add mover_name="run"/>
    </PROTOCOLS>
</ROSETTASCRIPTS>
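To make the final scoring step more concrete, the sketch below computes a pseudo-perplexity from per-position probabilities using the commonly used definition: the exponential of the mean negative log-probability assigned to the residue actually present at each position. It is an illustration only. Whether PseudoPerplexityMetric uses exactly this normalization is an assumption, and the function name and input layout (one dictionary of one-letter codes per position) are hypothetical.

```python
import math


def pseudo_perplexity(sequence, probabilities):
    """Pseudo-perplexity of `sequence` given per-position probabilities.

    sequence:      string of one-letter amino acid codes.
    probabilities: one dict per position, mapping one-letter code to the
                   probability predicted with that position masked.
    Hypothetical layout chosen for this sketch, not the Rosetta data format.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    if len(sequence) != len(probabilities):
        raise ValueError("need exactly one probability table per position")

    log_sum = 0.0
    for position, (aa, probs) in enumerate(zip(sequence, probabilities)):
        p = probs.get(aa, 0.0)
        if p <= 0.0:
            raise ValueError(f"no positive probability for {aa} at index {position}")
        log_sum += math.log(p)

    # Lower values mean the model finds the current sequence more plausible.
    return math.exp(-log_sum / len(sequence))
```

A sequence whose residues each receive probability 1 scores 1, the minimum. A model that spreads its probability uniformly over four residue types at every position gives a value of approximately 4.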
@article {Yang2022.05.25.493516,
author = {Kevin K. Yang and Hugh Yeh and Niccol{\`o} Zanichelli},
title = {Masked Inverse Folding with Sequence Transfer for Protein Representation Learning},
elocation-id = {2022.05.25.493516},
year = {2023},
doi = {10.1101/2022.05.25.493516},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2023/03/19/2022.05.25.493516},
eprint = {https://www.biorxiv.org/content/early/2023/03/19/2022.05.25.493516.full.pdf},
journal = {bioRxiv}
}