Each features database contains information about the features extracted from the input structures.
See the FeaturesReporters organization.
You can see a graphical schema based on the following here.
<AtomTypesFeatures/>
The atom-level chemical information stored in the Rosetta AtomTypeSet. This includes base parameters for the Lennard-Jones van der Waals term and the Lazaridis-Karplus solvation model.
CREATE TABLE IF NOT EXISTS atom_types (
atom_type_set_name TEXT,
name TEXT,
element TEXT,
lennard_jones_radius REAL,
lennard_jones_well_depth REAL,
lazaridis_karplus_lambda REAL,
lazaridis_karplus_degrees_of_freedom REAL,
lazaridis_karplus_volume REAL,
PRIMARY KEY(atom_type_set_name, name));
CREATE TABLE IF NOT EXISTS atom_type_property_values (
property TEXT,
PRIMARY KEY(property));
INSERT INTO atom_type_property_values VALUES ( 'ACCEPTOR' );
INSERT INTO atom_type_property_values VALUES ( 'DONOR' );
INSERT INTO atom_type_property_values VALUES ( 'POLAR_HYDROGEN' );
INSERT INTO atom_type_property_values VALUES ( 'AROMATIC' );
INSERT INTO atom_type_property_values VALUES ( 'H2O' );
INSERT INTO atom_type_property_values VALUES ( 'ORBITALS' );
INSERT INTO atom_type_property_values VALUES ( 'VIRTUAL' );
INSERT INTO atom_type_property_values VALUES ( 'SP2_HYBRID' );
INSERT INTO atom_type_property_values VALUES ( 'SP3_HYBRID' );
INSERT INTO atom_type_property_values VALUES ( 'RING_HYBRID' );
CREATE TABLE IF NOT EXISTS atom_type_properties (
atom_type_set_name TEXT,
name TEXT,
property TEXT,
FOREIGN KEY(atom_type_set_name, name) REFERENCES atom_types (atom_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY(property) REFERENCES atom_type_property_values (property) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(atom_type_set_name, name));
CREATE TABLE IF NOT EXISTS atom_type_extra_parameters (
atom_type_set_name TEXT,
name TEXT,
parameter TEXT,
value REAL,
FOREIGN KEY(atom_type_set_name, name) REFERENCES atom_types (atom_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(atom_type_set_name, name));
<AtomAtomPairFeatures min_dist="(&real 0)" max_dist="(&real 10)" nbins="(&integer 15)"/>
The distribution of distances between pairs of atoms is an indicator of the packing of a structure. Since there are a large number of atom pairs, the information is summarized here by atom pair distance distributions for each pair of atom types (Rosetta AtomType -> element type). See AtomInResidueAtomInResiduePairFeatures for an alternative binning of atom-atom interactions.
CREATE TABLE IF NOT EXISTS atom_pairs (
struct_id INTEGER,
atom_type TEXT,
element TEXT,
lower_break REAL,
upper_break REAL,
count INTEGER,
FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, atom_type, element, lower_break));
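As a rough illustration of how the binned counts might be used, the Python sketch below pools the counts for one (atom_type, element) pair across structures and converts them to fractions. The in-memory database, the sample rows, and the two coarse bins are made up for the example; foreign keys are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS atom_pairs (
        struct_id INTEGER,
        atom_type TEXT,
        element TEXT,
        lower_break REAL,
        upper_break REAL,
        count INTEGER,
        PRIMARY KEY (struct_id, atom_type, element, lower_break))""")

# Hypothetical rows: two structures, two coarse distance bins.
conn.executemany(
    "INSERT INTO atom_pairs VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "CAbb", "O", 0.0, 5.0, 10),
     (1, "CAbb", "O", 5.0, 10.0, 30),
     (2, "CAbb", "O", 0.0, 5.0, 10)])

# Pool counts over all structures for one atom type / element pair.
bins = conn.execute("""
    SELECT lower_break, upper_break, SUM(count)
    FROM atom_pairs
    WHERE atom_type = ? AND element = ?
    GROUP BY lower_break, upper_break
    ORDER BY lower_break""", ("CAbb", "O")).fetchall()

# Normalize the pooled counts so they sum to one.
total = sum(n for _, _, n in bins)
fractions = [(lo, hi, n / total) for lo, hi, n in bins]
print(fractions)
```

Normalizing in Python rather than in SQL keeps the query simple; a window function would work as well on SQLite versions that support it.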
The distribution of distances between pairs of atoms is an indicator of the packing of a structure. Since there are a large number of atom pairs, the information is summarized here by atom pair distance distributions for each pair of atom types (residue type + atom number). This is very similar in spirit to Lu H, Skolnick J. A distance-dependent atomic knowledge-based potential for improved protein structure selection. Proteins. 2001;44(3):223-32, although they use different distance bins. Here, the bins (0,1], ..., (9,10] are used because they are simple. It may make sense to come up with a better binning upon further analysis. The molar fraction of atom types can be computed by joining with the Residues table, since the atom types are unique within each residue type. If this turns out to be too cumbersome, it may need to be pre-computed. WARNING: Currently, this generates an inordinate amount of data, roughly 250M per structure.
CREATE TABLE IF NOT EXISTS atom_pairs (
struct_id INTEGER,
residue_type1 TEXT,
atom_type1 TEXT,
residue_type2 TEXT,
atom_type2 TEXT,
distance_bin TEXT,
count INTEGER,
FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED,
CONSTRAINT dist_is_nonnegative CHECK (count >= 0),
PRIMARY KEY (struct_id, residue_type1, atom_type1, residue_type2, atom_type2, distance_bin));
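The unit-width bins (0,1], ..., (9,10] described above can be sketched in Python as follows. The function name `distance_bin` and the exact label format are assumptions made for illustration; the text stored in the distance_bin column may be formatted differently.

```python
import math
from collections import Counter


def distance_bin(dist, max_dist=10):
    """Return a label for the half-open bin (k-1, k] containing dist.

    Distances outside (0, max_dist] fall in no bin and yield None.
    The label format is hypothetical.
    """
    if dist <= 0 or dist > max_dist:
        return None
    upper = math.ceil(dist)
    return f"({upper - 1},{upper}]"


# Tally a few made-up distances into bins, skipping out-of-range values.
distances = [0.8, 1.0, 1.2, 9.7, 10.0, 12.5]
bin_counts = Counter(
    label for label in map(distance_bin, distances) if label is not None)
print(bin_counts)
```

Using `math.ceil` makes the upper edge of each bin inclusive, which matches the (k-1, k] notation.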
<BetaTurnDetectionFeatures/>
This reporter scans all available windows of four residues and determines if a β-turn is present, determines the type of β-turn and then writes the starting residue number and turn type to a database.
CREATE TABLE IF NOT EXISTS beta_turns (
struct_id INTEGER,
residue_begin INTEGER,
turn_type TEXT,
FOREIGN KEY (struct_id, residue_begin) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, residue_begin));
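As a hedged illustration of how the beta_turns table might be queried, the Python sketch below builds an in-memory SQLite database with the schema above and tallies turns by type. The sample rows and the turn type labels are made up for the example and are not necessarily the labels the reporter writes; the foreign key is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS beta_turns (
        struct_id INTEGER,
        residue_begin INTEGER,
        turn_type TEXT,
        PRIMARY KEY(struct_id, residue_begin))""")

# Hypothetical rows: (struct_id, first residue of the window, turn type).
conn.executemany(
    "INSERT INTO beta_turns VALUES (?, ?, ?)",
    [(1, 12, "TurnI"), (1, 40, "TurnII"), (2, 7, "TurnI")])

# Count how often each turn type occurs across all structures.
turn_counts = conn.execute("""
    SELECT turn_type, COUNT(*)
    FROM beta_turns
    GROUP BY turn_type
    ORDER BY turn_type""").fetchall()
print(turn_counts)
```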
CREATE TABLE IF NOT EXISTS geometric_solvation (
struct_id INTEGER,
hbond_site_id TEXT,
geometric_solvation_exact REAL,
FOREIGN KEY (struct_id, hbond_site_id) REFERENCES hbond_sites(struct_id, site_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, hbond_site_id));
The HBondFeatures reporter (rosetta/main/source/src/protocols/features/HBondFeatures.hh) measures the geometry of hydrogen bonds. The most current reference is Tanja Kortemme, Alexandre V. Morozov, David Baker, An Orientation-dependent Hydrogen Bonding Potential Improves Prediction of Specificity and Structure for Proteins and Protein-Protein Complexes (JMB 2003).
The features associated with hydrogen bonding include:
CREATE TABLE hbond_sites (
struct_id INTEGER,
site_id INTEGER,
resNum INTEGER,
atmNum INTEGER,
is_donor BOOLEAN,
chain INTEGER,
resType TEXT,
atmType TEXT,
HBChemType TEXT,
FOREIGN KEY(struct_id, resNum) REFERENCES residues(struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, site_id));
hbond_site_atoms : Each hydrogen bond site defines a portion of a frame by bonded atoms.
Donor atoms:
Acceptor atoms:
base2: The alternate second base atom of the acceptor.
The base to acceptor unit vector is defined by the hybridization type (see rosetta/main/database/chemical/atom_type_sets/fa_standard/atom_properties.txt) of the acceptor atom and the above atoms.
CREATE TABLE IF NOT EXISTS hbond_site_atoms (
struct_id INTEGER,
site_id INTEGER,
atm_x REAL,
atm_y REAL,
atm_z REAL,
base_x REAL,
base_y REAL,
base_z REAL,
bbase_x REAL,
bbase_y REAL,
bbase_z REAL,
base2_x REAL,
base2_y REAL,
base2_z REAL,
FOREIGN KEY(site_id, struct_id) REFERENCES hbond_sites(site_id, struct_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, site_id));
CREATE TABLE IF NOT EXISTS hbond_site_environment (
struct_id INTEGER,
site_id INTEGER,
sasa_r100 REAL,
sasa_r140 REAL,
sasa_r200 REAL,
hbond_energy REAL,
num_hbonds INTEGER,
FOREIGN KEY(struct_id, site_id) REFERENCES hbond_sites(struct_id, site_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, site_id));
CREATE TABLE IF NOT EXISTS hbond_sites_pdb (
struct_id INTEGER,
site_id INTEGER,
chain TEXT,
resNum INTEGER,
iCode TEXT,
heavy_atom_temperature REAL,
heavy_atom_occupancy REAL,
FOREIGN KEY(struct_id, site_id) REFERENCES hbond_sites(struct_id, site_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, site_id));
CREATE TABLE IF NOT EXISTS hbond_chem_types (
chem_type TEXT,
label TEXT,
PRIMARY KEY(chem_type));
CREATE TABLE IF NOT EXISTS hbonds (
struct_id INTEGER,
hbond_id INTEGER,
don_id INTEGER,
acc_id INTEGER,
HBEvalType INTEGER,
energy REAL,
envWeight REAL,
score_weight REAL,
donRank INTEGER,
accRank INTEGER,
FOREIGN KEY (struct_id, don_id) REFERENCES hbond_sites (struct_id, site_id) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (struct_id, acc_id) REFERENCES hbond_sites (struct_id, site_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, hbond_id));
CREATE TABLE IF NOT EXISTS hbond_geom_coords (
struct_id INTEGER,
hbond_id INTEGER,
AHdist REAL,
cosBAH REAL,
cosAHD REAL,
chi REAL,
FOREIGN KEY(struct_id, hbond_id) REFERENCES hbonds(struct_id, hbond_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, hbond_id));
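The hydrogen bond tables are meant to be joined: hbonds refers to its donor and acceptor through hbond_sites, and hbond_geom_coords shares the (struct_id, hbond_id) key. The Python sketch below shows one such join on an in-memory database. All rows are made up, the chemical type strings are illustrative only, and foreign keys are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hbond_sites (
    struct_id INTEGER, site_id INTEGER, resNum INTEGER, atmNum INTEGER,
    is_donor BOOLEAN, chain INTEGER, resType TEXT, atmType TEXT,
    HBChemType TEXT,
    PRIMARY KEY(struct_id, site_id));
CREATE TABLE hbonds (
    struct_id INTEGER, hbond_id INTEGER, don_id INTEGER, acc_id INTEGER,
    HBEvalType INTEGER, energy REAL, envWeight REAL, score_weight REAL,
    donRank INTEGER, accRank INTEGER,
    PRIMARY KEY(struct_id, hbond_id));
CREATE TABLE hbond_geom_coords (
    struct_id INTEGER, hbond_id INTEGER, AHdist REAL, cosBAH REAL,
    cosAHD REAL, chi REAL,
    PRIMARY KEY(struct_id, hbond_id));
""")

# Hypothetical data: one donor site, one acceptor site, one hydrogen bond.
conn.executemany(
    "INSERT INTO hbond_sites VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [(1, 1, 10, 1, 1, 1, "ALA", "Nbb", "hbdon_PBA"),
     (1, 2, 14, 4, 0, 1, "GLU", "OCbb", "hbacc_PBA")])
conn.execute(
    "INSERT INTO hbonds VALUES (1, 1, 1, 2, 0, -1.2, 1.0, 1.0, 1, 1)")
conn.execute(
    "INSERT INTO hbond_geom_coords VALUES (1, 1, 1.9, 0.8, 0.95, 2.1)")

# Look up the donor and acceptor chemical types and the A-H distance
# for every hydrogen bond.
hbond_rows = conn.execute("""
    SELECT don.HBChemType, acc.HBChemType, geom.AHdist, hb.energy
    FROM hbonds AS hb
    JOIN hbond_sites AS don
      ON hb.struct_id = don.struct_id AND hb.don_id = don.site_id
    JOIN hbond_sites AS acc
      ON hb.struct_id = acc.struct_id AND hb.acc_id = acc.site_id
    JOIN hbond_geom_coords AS geom
      ON hb.struct_id = geom.struct_id AND hb.hbond_id = geom.hbond_id
    ORDER BY hb.struct_id, hb.hbond_id""").fetchall()
print(hbond_rows)
```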
CREATE TABLE IF NOT EXISTS hbond_lennard_jones (
struct_id INTEGER,
hbond_id INTEGER,
don_acc_atrE REAL,
don_acc_repE REAL,
don_acc_solv REAL,
don_acc_base_atrE REAL,
don_acc_base_repE REAL,
don_acc_base_solv REAL,
h_acc_atrE REAL,
h_acc_repE REAL,
h_acc_solv REAL,
h_acc_base_atrE REAL,
h_acc_base_repE REAL,
h_acc_base_solv REAL,
FOREIGN KEY (struct_id, hbond_id) REFERENCES hbonds (struct_id, hbond_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, hbond_id));
The parameters for the hydrogen bond potential are specified in the Rosetta database as parameter sets (rosetta/main/database/scoring/score_functions/hbonds). Each parameter set specifies polynomials and fade functions, and which of them are applied to which hydrogen bond chemical types. To select a parameter set, either pass -hbond_params <database_tag> on the command line, or call score_function.energy_method_options().hbond_options()->params_database_tag(<database_tag>). See the HBondDatabase class (rosetta/main/source/src/core/scoring/hbonds/HBondDatabase.hh) for more information.
CREATE TABLE IF NOT EXISTS hbond_fade_interval(
database_tag TEXT,
name TEXT,
junction_type TEXT,
min0 REAL,
fmin REAL,
fmax REAL,
max0 REAL,
PRIMARY KEY(database_tag, name));
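One plausible reading of the four breakpoints in hbond_fade_interval is a weight that is zero outside (min0, max0), one on [fmin, fmax], and interpolated in between. The Python sketch below assumes a piecewise-linear junction; other junction_type values presumably interpolate differently, and the function name `fade_value` is hypothetical rather than part of Rosetta.

```python
def fade_value(x, min0, fmin, fmax, max0):
    """Piecewise-linear fade weight (an assumed interpretation).

    Returns 0 outside (min0, max0), 1 on [fmin, fmax], and a linear
    ramp on the two shoulders. Expects min0 <= fmin <= fmax <= max0.
    """
    if x <= min0 or x >= max0:
        return 0.0
    if x < fmin:
        # Rising shoulder between min0 and fmin.
        return (x - min0) / (fmin - min0)
    if x <= fmax:
        return 1.0
    # Falling shoulder between fmax and max0.
    return (max0 - x) / (max0 - fmax)


# Made-up breakpoints, not values from a real parameter set.
for x in (0.5, 1.5, 3.0, 5.0):
    print(x, fade_value(x, min0=0.0, fmin=1.0, fmax=2.0, max0=4.0))
```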
CREATE TABLE IF NOT EXISTS hbond_polynomial_1d (
database_tag TEXT,
name TEXT,
dimension TEXT,
xmin REAL,
xmax REAL,
root1 REAL,
root2 REAL,
degree INTEGER,
c_a REAL,
c_b REAL,
c_c REAL,
c_d REAL,
c_e REAL,
c_f REAL,
c_g REAL,
c_h REAL,
c_i REAL,
c_j REAL,
c_k REAL,
PRIMARY KEY(database_tag, name));
CREATE TABLE IF NOT EXISTS hbond_evaluation_types (
database_tag TEXT,
don_chem_type TEXT,
acc_chem_type TEXT,
separation TEXT,
AHdist_short_fade TEXT,
AHdist_long_fade TEXT,
cosBAH_fade TEXT,
cosAHD_fade TEXT,
AHdist TEXT,
cosBAH_short TEXT,
cosBAH_long TEXT,
cosAHD_short TEXT,
cosAHD_long TEXT,
weight_type TEXT,
FOREIGN KEY(database_tag, AHdist_short_fade) REFERENCES hbond_fade_interval(database_tag, name) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY(database_tag, AHdist_long_fade) REFERENCES hbond_fade_interval(database_tag, name) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY(database_tag, cosBAH_fade) REFERENCES hbond_fade_interval(database_tag, name) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY(database_tag, cosAHD_fade) REFERENCES hbond_fade_interval(database_tag, name) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY(database_tag, AHdist) REFERENCES hbond_polynomial_1d(database_tag, name) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY(database_tag, cosBAH_short) REFERENCES hbond_polynomial_1d(database_tag, name) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY(database_tag, cosBAH_long) REFERENCES hbond_polynomial_1d(database_tag, name) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY(database_tag, cosAHD_short) REFERENCES hbond_polynomial_1d(database_tag, name) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY(database_tag, cosAHD_long) REFERENCES hbond_polynomial_1d(database_tag, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (database_tag, don_chem_type, acc_chem_type, separation));
Store string, string-string, and string-real data associated with a job. As an example, the ligand docking code stores data this way when it uses the DatabaseJobOutputter.
CREATE TABLE IF NOT EXISTS string_data (
struct_id INTEGER,
data_key TEXT,
FOREIGN KEY (struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, data_key));
CREATE TABLE IF NOT EXISTS string_string_data (
struct_id INTEGER,
data_key TEXT,
data_value TEXT,
FOREIGN KEY (struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, data_key));
CREATE TABLE IF NOT EXISTS string_real_data (
struct_id INTEGER,
data_key TEXT,
data_value REAL,
FOREIGN KEY (struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, data_key));
<LoopAnchorFeatures min_loop_length="5" max_loop_length="7"/>
This reporter scans all available windows of a specified number of residues and calculates the translation and rotation to optimally superimpose the landing onto the takeoff of the loop. The translation and rotation data can then be used to compare different "classes" of loop anchors.
CREATE TABLE IF NOT EXISTS loop_anchors (
struct_id INTEGER,
residue_begin INTEGER,
residue_end INTEGER,
FOREIGN KEY (struct_id, residue_begin)
REFERENCES residues (struct_id, resNum)
DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (struct_id, residue_end)
REFERENCES residues (struct_id, resNum)
DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, residue_begin, residue_end));
CREATE TABLE IF NOT EXISTS loop_anchor_transforms (
struct_id INTEGER,
residue_begin INTEGER,
residue_end INTEGER,
x REAL,
y REAL,
z REAL,
phi REAL,
psi REAL,
theta REAL,
FOREIGN KEY (struct_id, residue_begin, residue_end)
REFERENCES loop_anchors (struct_id, residue_begin, residue_end)
DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, residue_begin, residue_end));
The OrbitalFeatures reporter stores information about chemical interactions involving orbitals. Orbitals are atomically localized electrons that can form weak, orientation-dependent interactions with polar and aromatic functional groups and with other orbitals. Orbital geometry is defined in the residue type sets in the database. Following the orbitals score term, orbital interactions are defined between residues whose action centers are at most 11 Å apart.
CREATE TABLE IF NOT EXISTS orbital_polar_hydrogen_interactions (
struct_id TEXT,
resNum1 INTEGER,
orbNum1 INTEGER,
orbName1 TEXT,
resNum2 INTEGER,
hpolNum2 INTEGER,
dist REAL,
angle REAL,
FOREIGN KEY (struct_id, resNum1) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (struct_id, resNum2) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, resNum1, orbNum1, resNum2, hpolNum2));
CREATE TABLE IF NOT EXISTS orbital_aromatic_hydrogen_interactions (
struct_id TEXT,
resNum1 INTEGER,
orbNum1 INTEGER,
orbName1 TEXT,
resNum2 INTEGER,
haroNum2 INTEGER,
dist REAL,
angle REAL,
FOREIGN KEY (struct_id, resNum1) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (struct_id, resNum2) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, resNum1, orbNum1, resNum2, haroNum2));
CREATE TABLE IF NOT EXISTS orbital_orbital_interactions (
struct_id TEXT,
resNum1 INTEGER,
orbNum1 INTEGER,
orbName1 TEXT,
resNum2 INTEGER,
orbNum2 INTEGER,
dist REAL,
angle REAL,
FOREIGN KEY (struct_id, resNum1) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (struct_id, resNum2) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, resNum1, orbNum1, resNum2, orbNum2));
PairFeatures measures the distances between residues.
CREATE TABLE IF NOT EXISTS residue_pairs (
struct_id INTEGER,
resNum1 INTEGER,
resNum2 INTEGER,
res1_10A_neighbors INTEGER,
res2_10A_neighbors INTEGER,
actcoord_dist REAL,
polymeric_sequence_dist INTEGER,
FOREIGN KEY (struct_id, resNum1) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (struct_id, resNum2) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
CONSTRAINT res1_10A_neighbors_is_positive CHECK (res1_10A_neighbors >= 1),
CONSTRAINT res2_10A_neighbors_is_positive CHECK (res2_10A_neighbors >= 1),
CONSTRAINT actcoord_dist_is_nonnegative CHECK (actcoord_dist >= 0));
PdbDataFeatures records information that is stored in the Protein Data Bank (PDB) structure format.
<PdbDataFeatures/>
CREATE TABLE IF NOT EXISTS residue_pdb_identification (
struct_id INTEGER,
residue_number INTEGER,
chain_id TEXT,
insertion_code TEXT,
pdb_residue_number INTEGER,
FOREIGN KEY (struct_id, residue_number)
REFERENCES residues (struct_id, resNum)
DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, residue_number));
CREATE TABLE IF NOT EXISTS residue_pdb_confidence (
struct_id INTEGER,
residue_number INTEGER,
max_temperature REAL,
max_bb_temperature REAL,
max_sc_temperature REAL,
min_occupancy REAL,
min_bb_occupancy REAL,
min_sc_occupancy REAL,
FOREIGN KEY (struct_id, residue_number)
REFERENCES residues (struct_id, resNum)
DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, residue_number));
Arbitrary textual information may be associated with a pose in the form of (key, val) comments. The PoseCommentsFeatures stores this information as a feature.
CREATE TABLE IF NOT EXISTS pose_comments (
struct_id INTEGER,
key TEXT,
value TEXT,
FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, key));
PoseConformationFeatures measures the conformation-level information in a Pose. Together with the ProteinResidueConformationFeatures, it allows the atomic coordinates to be reconstructed. To facilitate creating poses from conformation structure data stored in the features database, PoseConformationFeatures has a load_into_pose method.
CREATE TABLE IF NOT EXISTS pose_conformations (
struct_id INTEGER PRIMARY KEY,
annotated_sequence TEXT,
total_residue INTEGER,
fullatom BOOLEAN,
FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED);
CREATE TABLE IF NOT EXISTS fold_trees (
struct_id INTEGER,
start_res INTEGER,
start_atom TEXT,
stop_res INTEGER,
stop_atom TEXT,
label INTEGER,
keep_stub_in_residue BOOLEAN,
FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED);
CREATE TABLE IF NOT EXISTS jumps (
struct_id INTEGER,
jump_id INTEGER,
xx REAL,
xy REAL,
xz REAL,
yx REAL,
yy REAL,
yz REAL,
zx REAL,
zy REAL,
zz REAL,
x REAL,
y REAL,
z REAL,
FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED);
CREATE TABLE IF NOT EXISTS chain_endings (
struct_id INTEGER,
end_pos INTEGER,
FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED);
The ProteinBackboneAtomAtomPairFeatures reporter measures all the atom pair distances between backbone atoms in pairs of residues whose action coordinates are within 10 Å of each other. This follows the analysis done in Song Y, Tyka M, Leaver-Fay A, Thompson J, Baker D. Structure guided forcefield optimization. Proteins: Structure, Function, and Bioinformatics. 2011. There, they looked at these distances for pairs of residues that form secondary structure.
CREATE TABLE IF NOT EXISTS protein_backbone_atom_atom_pairs (
struct_id TEXT,
resNum1 INTEGER,
resNum2 INTEGER,
N_N_dist REAL, N_Ca_dist REAL, N_C_dist REAL, N_O_dist REAL, N_Ha_dist REAL,
Ca_N_dist REAL, Ca_Ca_dist REAL, Ca_C_dist REAL, Ca_O_dist REAL, Ca_Ha_dist REAL,
C_N_dist REAL, C_Ca_dist REAL, C_C_dist REAL, C_O_dist REAL, C_Ha_dist REAL,
O_N_dist REAL, O_Ca_dist REAL, O_C_dist REAL, O_O_dist REAL, O_Ha_dist REAL,
Ha_N_dist REAL, Ha_Ca_dist REAL, Ha_C_dist REAL, Ha_O_dist REAL, Ha_Ha_dist REAL,
FOREIGN KEY (struct_id, resNum1) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (struct_id, resNum2) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, resNum1, resNum2));
The ProteinBackboneTorsionAngleFeatures reporter stores the backbone torsion angle degrees of freedom needed to represent proteins made with canonical backbones.
CREATE TABLE IF NOT EXISTS protein_backbone_torsion_angles (
struct_id TEXT,
resNum INTEGER,
phi REAL,
psi REAL,
omega REAL,
FOREIGN KEY (struct_id, resNum) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, resNum));
The conformation of protein residues is described by the coordinates of each atom. A reduced representation specifies only the value of each torsional angle degree of freedom, which includes the backbone and sidechain torsional angles. Since proteins have only canonical amino acids, there are at most 4 torsional angles in the sidechains.
CREATE TABLE IF NOT EXISTS protein_residue_conformation (
struct_id INTEGER,
seqpos INTEGER,
secstruct STRING,
phi REAL,
psi REAL,
omega REAL,
chi1 REAL,
chi2 REAL,
chi3 REAL,
chi4 REAL,
FOREIGN KEY (struct_id, seqpos) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED);
CREATE TABLE IF NOT EXISTS residue_atom_coords (
struct_id INTEGER,
seqpos INTEGER,
atomno INTEGER,
x REAL,
y REAL,
z REAL,
FOREIGN KEY (struct_id, seqpos) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED);
Compute the atom-wise root mean squared deviation between the conformation being reported and a previously saved conformation. The usage of this mover is more involved than other feature movers:
<ROSETTASCRIPTS>
    <MOVERS>
        <SavePoseMover name="spm_init_struct" reference_name="init_struct"/>
        <ReportToDB name="features_reporter" db="features_SAMPLE_SOURCE_ID.db3" sample_source="SAMPLE_SOURCE_DESCRIPTION">
            <ProteinRMSDFeatures reference_name="init_struct"/>
        </ReportToDB>
    </MOVERS>
    <PROTOCOLS>
        <Add mover_name="spm_init_struct"/>
        <Add mover_name="features_reporter"/>
    </PROTOCOLS>
</ROSETTASCRIPTS>
CREATE TABLE IF NOT EXISTS protein_rmsd (
struct_id INTEGER,
reference_tag TEXT,
protein_CA REAL,
protein_CA_or_CB REAL,
protein_backbone REAL,
protein_backbone_including_O REAL,
protein_backbone_sidechain_heavyatom REAL,
heavyatom REAL,
nbr_atom REAL,
all_atom REAL,
FOREIGN KEY (struct_id) REFERENCES structures (struct_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, reference_tag));
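For orientation, the atom-wise root mean squared deviation between two conformations with matching atom order can be sketched in Python as below. This is a minimal illustration rather than the reporter's implementation: it performs no superposition, and which atoms enter each column (for example protein_CA versus all_atom) is determined by the reporter, not by this function.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equal-length lists of
    (x, y, z) tuples, compared atom by atom without superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(squared_sum / len(coords_a))


# Made-up coordinates: the second conformation is shifted by 2 along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
current = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(reference, current))
```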
A protocol is represented as all the information necessary to reproduce the results of a Rosetta application execution. The features associated with each application execution are ultimately linked with a single row in the protocols table. Note that since struct_id is an autoincremented primary key of the structures table, the results from different application executions are often attached but not merged.
CREATE TABLE IF NOT EXISTS protocols (
protocol_id INTEGER PRIMARY KEY AUTOINCREMENT,
command_line TEXT,
specified_options TEXT,
svn_url TEXT,
svn_version TEXT,
script TEXT);
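One way to read "attached but not merged" is SQLite's ATTACH DATABASE, which lets a single connection query several database files side by side. The Python sketch below assumes that reading and uses two in-memory databases with a trimmed-down structures table; the tags and the schema name `run2` are made up. It shows why a plain merge would be awkward: each database starts its autoincremented struct_id at 1.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS {db}.structures (
    struct_id INTEGER PRIMARY KEY AUTOINCREMENT,
    protocol_id INTEGER,
    tag TEXT)
"""

conn = sqlite3.connect(":memory:")
# Attach a second, independent database under the schema name run2.
conn.execute("ATTACH DATABASE ':memory:' AS run2")

# Each database gets its own structures table and one hypothetical row.
for db, tag in (("main", "run1_model"), ("run2", "run2_model")):
    conn.execute(SCHEMA.format(db=db))
    conn.execute(
        f"INSERT INTO {db}.structures (protocol_id, tag) VALUES (1, ?)",
        (tag,))

# Query both databases together while keeping track of the source.
rows = conn.execute("""
    SELECT 'main' AS source, struct_id, tag FROM main.structures
    UNION ALL
    SELECT 'run2' AS source, struct_id, tag FROM run2.structures
    ORDER BY source""").fetchall()
print(rows)
```

Because both rows carry the same struct_id, the source label (or the database file itself) is what distinguishes them.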
Measure the radius of gyration for each structure. The radius of gyration is a measure of how compact a structure is and can be computed in O(n). It is the root-mean-square displacement of mass from the center of mass. The Wikipedia page has some information. See also Lobanov MY, Bogatyreva NS, Galzitskaya OV. Radius of gyration as an indicator of protein structure compactness. Molecular Biology. 2008;42(4):623-628.
CREATE TABLE IF NOT EXISTS radius_of_gyration (
struct_id INTEGER,
radius_of_gyration REAL,
FOREIGN KEY(struct_id) REFERENCES structures(struct_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id));
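The O(n) computation can be sketched in Python as follows, taking the radius of gyration as the root-mean-square distance of the points from their centroid. Giving every point unit mass is an assumption of this sketch; which atoms the reporter uses and how it weights them is not stated here.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of (x, y, z) points from their centroid.

    Every point is given unit mass (an assumption of this sketch).
    Two linear passes: one for the centroid, one for the deviations.
    """
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one coordinate")
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
        for x, y, z in coords) / n
    return math.sqrt(mean_sq)


# Four made-up points at unit distance from the origin.
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
print(radius_of_gyration(points))
```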
Measures of burial are important for determining solvation and desolvation effects.
CREATE TABLE IF NOT EXISTS residue_burial (
struct_id TEXT,
resNum INTEGER,
ten_a_neighbors INTEGER,
twelve_a_neighbors INTEGER,
neigh_vect_raw REAL,
sasa_r100 REAL,
sasa_r140 REAL,
sasa_r200 REAL,
FOREIGN KEY (struct_id, resNum) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, resNum));
Store the geometry of residues that have canonical backbones but possibly non-canonical sidechains. The geometry is broken into backbone torsional degrees of freedom (nonprotein_residue_conformation), sidechain degrees of freedom (nonprotein_residue_angles), and atomic coordinates (residue_atom_coords).
This differs from ProteinResidueConformationFeatures in that the residue angles are stored as a chinum -> chiangle lookup together with atomic xyz-coordinates, rather than as a table with slots for 4 chi values. If you know you are only going to be working with protein residues, you can conserve space by using the ProteinResidueConformationFeatures.
CREATE TABLE IF NOT EXISTS nonprotein_residue_conformation (
struct_id INTEGER,
seqpos INTEGER,
phi REAL,
psi REAL,
omega REAL,
FOREIGN KEY (struct_id, seqpos) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, seqpos));
CREATE TABLE IF NOT EXISTS nonprotein_residue_angles (
struct_id INTEGER,
seqpos INTEGER,
chinum INTEGER,
chiangle REAL,
FOREIGN KEY (struct_id, seqpos) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, seqpos, chinum));
CREATE TABLE IF NOT EXISTS residue_atom_coords (
struct_id INTEGER,
seqpos INTEGER,
atomno INTEGER,
x REAL,
y REAL,
z REAL,
FOREIGN KEY (struct_id, seqpos) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, seqpos, atomno));
The ResidueFeatures stores information about each residue in a conformation.
CREATE TABLE IF NOT EXISTS residues (
struct_id INTEGER,
resNum INTEGER,
name3 TEXT,
res_type TEXT,
FOREIGN KEY (struct_id)
REFERENCES structures (struct_id)
DEFERRABLE INITIALLY DEFERRED,
CONSTRAINT resNum_is_positive CHECK (resNum >= 1),
PRIMARY KEY(struct_id, resNum));
<ResidueScoresFeatures scorefxn="(&scorefxn)"/>
The ResidueScoresFeatures stores the score of a structure at the residue level. Terms that evaluate a single residue are stored in residue_scores_1b. Terms that evaluate pairs of residues are stored in residue_scores_2b. Terms that depend on the whole structure are stored via the StructureScoresFeatures.
CREATE TABLE IF NOT EXISTS residue_scores_1b (
struct_id INTEGER,
resNum INTEGER,
score_type TEXT,
score_value REAL,
context_dependent INTEGER,
FOREIGN KEY (struct_id, resNum) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, resNum, score_type));
CREATE TABLE IF NOT EXISTS residue_scores_2b (
struct_id INTEGER,
resNum1 INTEGER,
resNum2 INTEGER,
score_type TEXT,
score_value REAL,
context_dependent INTEGER,
FOREIGN KEY (struct_id, resNum1) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (struct_id, resNum2) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, resNum1, resNum2, score_type));
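To get a per-structure total for each score type from the residue-level tables, the one-body and two-body rows can be stacked and summed. The Python sketch below does this on an in-memory database with made-up rows. It assumes each residue pair appears once in residue_scores_2b; if pairs were stored in both orders, the two-body totals would be double counted. Foreign keys are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IF NOT EXISTS residue_scores_1b (
    struct_id INTEGER, resNum INTEGER, score_type TEXT,
    score_value REAL, context_dependent INTEGER,
    PRIMARY KEY(struct_id, resNum, score_type));
CREATE TABLE IF NOT EXISTS residue_scores_2b (
    struct_id INTEGER, resNum1 INTEGER, resNum2 INTEGER, score_type TEXT,
    score_value REAL, context_dependent INTEGER,
    PRIMARY KEY(struct_id, resNum1, resNum2, score_type));
""")

# Hypothetical score rows for structure 1.
conn.executemany(
    "INSERT INTO residue_scores_1b VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "fa_dun", 0.5, 0), (1, 2, "fa_dun", 1.5, 0)])
conn.executemany(
    "INSERT INTO residue_scores_2b VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, 2, "fa_atr", -2.0, 0), (1, 1, 3, "fa_atr", -0.5, 0)])

# Stack one-body and two-body terms, then total them per score type.
totals = conn.execute("""
    SELECT score_type, SUM(score_value)
    FROM (
        SELECT struct_id, score_type, score_value FROM residue_scores_1b
        UNION ALL
        SELECT struct_id, score_type, score_value FROM residue_scores_2b)
    WHERE struct_id = ?
    GROUP BY score_type
    ORDER BY score_type""", (1,)).fetchall()
print(totals)
```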
Secondary structure is a classification scheme for residues that participate in regular, multi-residue interactions.
CREATE TABLE IF NOT EXISTS residue_secondary_structure(
struct_id INTEGER,
resNum INTEGER,
dssp TEXT,
FOREIGN KEY(struct_id, resNum) REFERENCES residues(struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, resNum));
ResidueTypes store information about the chemical nature of the residue. The information is read in from /path/to/rosetta/main/database/chemical/residue_type_sets/<residue_type_set_name>/residue_types/.
CREATE TABLE IF NOT EXISTS residue_type (
residue_type_set_name TEXT,
version TEXT,
name TEXT,
name3 TEXT,
name1 TEXT,
aa TEXT,
lower_connect INTEGER,
upper_connect INTEGER,
nbr_atom INTEGER,
nbr_radius REAL,
PRIMARY KEY(residue_type_set_name, name));
CREATE TABLE IF NOT EXISTS residue_type_atom (
residue_type_set_name TEXT,
residue_type_name TEXT,
atom_index INTEGER,
atom_name TEXT,
atom_type_name TEXT,
mm_atom_type_name TEXT,
charge REAL,
is_backbone INTEGER,
FOREIGN KEY(residue_type_set_name, residue_type_name) REFERENCES residue_type(residue_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(residue_type_set_name, residue_type_name, atom_index));
CREATE TABLE IF NOT EXISTS residue_type_bond (
residue_type_set_name TEXT,
residue_type_name TEXT,
atom1 INTEGER,
atom2 INTEGER,
bond_type INTEGER,
FOREIGN KEY(residue_type_set_name, residue_type_name) REFERENCES residue_type(residue_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(residue_type_set_name, residue_type_name, atom1, atom2));
CREATE TABLE IF NOT EXISTS residue_type_cut_bond (
residue_type_set_name TEXT,
residue_type_name TEXT,
atom1 INTEGER,
atom2 INTEGER,
FOREIGN KEY(residue_type_set_name, residue_type_name) REFERENCES residue_type(residue_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(residue_type_set_name, residue_type_name, atom1, atom2));
CREATE TABLE IF NOT EXISTS residue_type_chi (
residue_type_set_name TEXT,
residue_type_name TEXT,
chino INTEGER,
atom1 TEXT,
atom2 TEXT,
atom3 TEXT,
atom4 TEXT,
FOREIGN KEY(residue_type_set_name, residue_type_name) REFERENCES residue_type(residue_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(residue_type_set_name, residue_type_name, atom1, atom2));
CREATE TABLE IF NOT EXISTS residue_type_chi_rotamer (
residue_type_set_name TEXT,
residue_type_name TEXT,
chino INTEGER,
mean REAL,
sdev REAL,
FOREIGN KEY(residue_type_set_name, residue_type_name) REFERENCES residue_type(residue_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(residue_type_set_name, residue_type_name, chino, mean, sdev));
CREATE TABLE IF NOT EXISTS residue_type_proton_chi (
residue_type_set_name TEXT,
residue_type_name TEXT,
chino INTEGER,
sample REAL,
is_extra BOOL,
FOREIGN KEY(residue_type_set_name, residue_type_name) REFERENCES residue_type(residue_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(residue_type_set_name, residue_type_name, chino, sample));
CREATE TABLE IF NOT EXISTS residue_type_property (
residue_type_set_name TEXT,
residue_type_name TEXT,
property TEXT,
FOREIGN KEY(residue_type_set_name, residue_type_name) REFERENCES residue_type(residue_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(residue_type_set_name, residue_type_name, property));
CREATE TABLE IF NOT EXISTS residue_type_variant_type (
residue_type_set_name TEXT,
residue_type_name TEXT,
variant_type TEXT,
FOREIGN KEY(residue_type_set_name, residue_type_name) REFERENCES residue_type(residue_type_set_name, name) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(residue_type_set_name, residue_type_name, variant_type));
Measure how constrained each residue is, following Fleishman, Khare, Koga, & Baker, Restricted sidechain plasticity in the structures of native proteins and complexes.
CREATE TABLE IF NOT EXISTS rotamer_boltzmann_weight (
struct_id INTEGER,
resNum INTEGER,
boltzmann_weight REAL,
FOREIGN KEY (struct_id, resNum) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, resNum));
The RotamerRecoveryFeatures reporter is a wrapper for the rotamer_recovery scientific benchmark so it can be included as a feature.
<RotamerRecovery scfxn="(&string)" protocol="(&string)" comparer="(&string)" mover="(&string)"/>
See the above link for explanations of the parameters.
CREATE TABLE IF NOT EXISTS rotamer_recovery (
struct_id INTEGER,
resNum INTEGER,
divergence REAL,
PRIMARY KEY(struct_id, resNum));
The SaltBridgeFeatures represent salt bridges and related interactions following the definition in:
Donald JE, Kulp DW, DeGrado WF. Salt bridges: Geometrically specific, designable interactions. Proteins: Structure, Function, and Bioinformatics. 2010:n/a-n/a. Available at: http://doi.wiley.com/10.1002/prot.22927 [Accessed November 14, 2010].
CREATE TABLE IF NOT EXISTS salt_bridges (
struct_id INTEGER,
don_resNum INTEGER,
acc_id INTEGER,
psi REAL,
theta REAL,
rho REAL,
orbital TEXT,
FOREIGN KEY (struct_id, don_resNum) REFERENCES residues (struct_id, resNum) DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (struct_id, acc_id) REFERENCES hbond_sites (struct_id, site_id) DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY(struct_id, don_resNum, acc_id));
A structure is a group of spatially organized residues. The definition corresponds to a Pose in Rosetta. Unfortunately, in Rosetta there is not a well-defined way to identify a Pose. For the purposes of the features database, each structure is assigned a unique struct_id. To facilitate connecting structures in the database with structures in Rosetta, the tag field is unique within a protocol.
CREATE TABLE IF NOT EXISTS structures (
struct_id INTEGER PRIMARY KEY AUTOINCREMENT,
protocol_id INTEGER,
tag TEXT,
UNIQUE (protocol_id, tag),
FOREIGN KEY (protocol_id) REFERENCES protocols (protocol_id) DEFERRABLE INITIALLY DEFERRED);
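The Python sketch below shows how a struct_id might be looked up from a tag, and how the UNIQUE (protocol_id, tag) constraint rejects a duplicate tag within one protocol. It uses an in-memory database; the command line string and the tag are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IF NOT EXISTS protocols (
    protocol_id INTEGER PRIMARY KEY AUTOINCREMENT,
    command_line TEXT,
    specified_options TEXT,
    svn_url TEXT,
    svn_version TEXT,
    script TEXT);
CREATE TABLE IF NOT EXISTS structures (
    struct_id INTEGER PRIMARY KEY AUTOINCREMENT,
    protocol_id INTEGER,
    tag TEXT,
    UNIQUE (protocol_id, tag),
    FOREIGN KEY (protocol_id) REFERENCES protocols (protocol_id) DEFERRABLE INITIALLY DEFERRED);
""")

# Register one hypothetical application execution and one structure.
cur = conn.execute(
    "INSERT INTO protocols (command_line) VALUES (?)",
    ("hypothetical command line",))
protocol_id = cur.lastrowid
conn.execute(
    "INSERT INTO structures (protocol_id, tag) VALUES (?, ?)",
    (protocol_id, "1abc_0001"))

# Map the tag back to the autoincremented struct_id.
struct_id = conn.execute(
    "SELECT struct_id FROM structures WHERE protocol_id = ? AND tag = ?",
    (protocol_id, "1abc_0001")).fetchone()[0]

# A second structure with the same tag in the same protocol is rejected.
try:
    conn.execute(
        "INSERT INTO structures (protocol_id, tag) VALUES (?, ?)",
        (protocol_id, "1abc_0001"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(struct_id, duplicate_rejected)
```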
The StructureScoresFeatures stores the overall score information for all enabled EnergyMethods.
CREATE TABLE IF NOT EXISTS structure_scores (
struct_id INTEGER,
score_type_id INTEGER,
score_value REAL,
FOREIGN KEY (struct_id)
REFERENCES structures (struct_id)
DEFERRABLE INITIALLY DEFERRED,
FOREIGN KEY (score_type_id)
REFERENCES score_types (score_type_id)
DEFERRABLE INITIALLY DEFERRED,
PRIMARY KEY (struct_id, score_type_id));
The ScoreTypeFeatures reporter stores the score types for all EnergyMethods.
<ScoreTypeFeatures scorefxn="(default_scorefxn &string)"/>
CREATE TABLE IF NOT EXISTS score_types (
protocol_id INTEGER,
score_type_id INTEGER PRIMARY KEY,
score_type_name TEXT,
FOREIGN KEY (protocol_id)
REFERENCES protocols (protocol_id)
DEFERRABLE INITIALLY DEFERRED);
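Since structure_scores stores only a score_type_id, reading scores by name requires a join with score_types. The Python sketch below shows that join on an in-memory database. The rows are made up, foreign keys are omitted for brevity, and score_value is declared REAL in this sketch because the example scores are fractional.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IF NOT EXISTS score_types (
    protocol_id INTEGER,
    score_type_id INTEGER PRIMARY KEY,
    score_type_name TEXT);
CREATE TABLE IF NOT EXISTS structure_scores (
    struct_id INTEGER,
    score_type_id INTEGER,
    score_value REAL,  -- REAL in this sketch: the sample scores are fractional
    PRIMARY KEY (struct_id, score_type_id));
""")

# Hypothetical score types and scores for structure 1.
conn.executemany(
    "INSERT INTO score_types VALUES (?, ?, ?)",
    [(1, 1, "fa_atr"), (1, 2, "fa_rep")])
conn.executemany(
    "INSERT INTO structure_scores VALUES (?, ?, ?)",
    [(1, 1, -120.5), (1, 2, 14.25)])

# Resolve score_type_id to its name for one structure.
named_scores = conn.execute("""
    SELECT st.score_type_name, ss.score_value
    FROM structure_scores AS ss
    JOIN score_types AS st ON ss.score_type_id = st.score_type_id
    WHERE ss.struct_id = ?
    ORDER BY st.score_type_name""", (1,)).fetchall()
print(named_scores)
```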