Rosetta has support for interacting with SQLite3, MySQL and PostgreSQL database backends. This page describes the backends, how to get started using them, and what has already been done. The SQLite3 backend is tested extensively in the integration_tests with every commit, and the PostgreSQL and MySQL support is tested through the BuildBot framework (PostgreSQL, MySQL).
SQLite3
MySQL
To build Rosetta with MySQL support:
- Put the mysql/lib directory containing libmysqlclient_r.so into your library path by creating a site.settings file. The directory containing libmysqlclient must also be on the LD_LIBRARY_PATH environment variable at runtime.
- To create the site.settings file: copy source/tools/build/site.settings.template to source/tools/build/site.settings, then, in the library_path dictionary, add a string containing the path to the directory containing the libmysqlclient libraries (e.g. "/home/jdoe/lib/mysql-connector-c-6.1.2-linux-glibc2.5-x86_64/lib").
- Symlink or place mysql.h (and other required headers) into external/dbio/mysql.
- Compile with extras=mysql.
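As a sketch, the library_path entry added to site.settings might look like the following. The surrounding dictionary layout should be taken from source/tools/build/site.settings.template in your own checkout, and the MySQL connector path shown here is an example, not a real location:

```python
# Sketch of a site.settings entry. The overall structure mirrors
# source/tools/build/site.settings.template; the library path below is an
# example and must point at your own libmysqlclient install directory.
settings = {
    "site": {
        "prepends": {
            # Directory containing libmysqlclient_r.so
            "library_path": [
                "/home/jdoe/lib/mysql-connector-c-6.1.2-linux-glibc2.5-x86_64/lib"
            ],
        },
        "appends": {},
        "overrides": {},
        "removes": {},
    }
}
```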
Ubuntu-specific (and possibly other Linux) instructions to build Rosetta with MySQL support:
- Install libmysqlclient by installing the libmysqlclient-dev package from your distro's package manager (e.g. apt-get install libmysqlclient-dev).
- Symlink the header files into external/dbio/mysql by running ln -s /usr/include/mysql/* . from within that directory.
- Compile with extras=mysql as above.

PostgreSQL
- Build the libpq client library and install it in a directory on the LD_LIBRARY_PATH environment variable (note: make sure to use the same client library version as the database server).
- Symlink or place the postgreSQL directory into Rosetta/main/source/external/dbio/.
- Compile with extras=postgres.
Database connection information can be specified with these RosettaScripts and/or command-line options.
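As an illustration, connection flags typically come from the -inout:dbms option group; the flag spellings and values below should be checked against your build's -help output, and the host, user, and database names are placeholders:

```
# sqlite3 backend: just a file name
-inout:dbms:mode sqlite3
-inout:dbms:database_name features.db3

# client/server backend (MySQL or PostgreSQL): connection details
-inout:dbms:mode mysql
-inout:dbms:database_name rosetta_features
-inout:dbms:host db.example.org
-inout:dbms:user jdoe
-inout:dbms:port 3306
```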
For many applications, one would like to store and retrieve information about a set of structures. For example, it may be relevant to store the atomic coordinates, how similar each structure is to the native, and the predicted binding energy (say the project is protein interface design, and the set consists of all the structures from various rounds of prediction). We have developed a modular database schema, where each FeaturesReporter
is responsible for a set of tables in the database. Using a particular schema, the features for a set of structures are stored as a batch in the database.
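The batch organization can be sketched with plain sqlite3. The table and column names below (batches, structures, structure_scores, struct_id, batch_id) are illustrative stand-ins, not the exact features schema:

```python
import sqlite3

# Toy sketch of the batch/feature layout: a batch groups structures, and each
# FeaturesReporter-style table keys its rows on struct_id. Names here are
# illustrative, not the real schema.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE batches (batch_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE structures (
    struct_id INTEGER PRIMARY KEY,
    batch_id INTEGER REFERENCES batches(batch_id),
    tag TEXT);
CREATE TABLE structure_scores (
    struct_id INTEGER REFERENCES structures(struct_id),
    score_type TEXT,
    score_value REAL,
    PRIMARY KEY (struct_id, score_type));
""")
con.execute("INSERT INTO batches VALUES (1, 'interface design round 1')")
con.execute("INSERT INTO structures VALUES (1, 1, 'design_0001')")
con.execute("INSERT INTO structure_scores VALUES (1, 'total_score', -312.5)")

# Retrieve every score for a batch with a simple join on struct_id.
rows = con.execute("""
    SELECT s.tag, sc.score_type, sc.score_value
    FROM structures s JOIN structure_scores sc USING (struct_id)
    WHERE s.batch_id = 1""").fetchall()
print(rows)  # [('design_0001', 'total_score', -312.5)]
```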
Structures can be read from, and written to, a relational database when using the JD2 job distributor. Advantages over PDB files or silent files include:
Database IO is implemented simply as a fixed set of FeaturesReporters:
Meta Features:
Whole Structure Features:
Per Residue Features:
Possible issues for cluster based jobs:
Databases can be merged because the features have composite primary keys that include the structure primary key, struct_id, which is at least partially randomized. To merge sqlite3 databases, consider using the merge script in main/tests/features/sample_sources/merge.sh.
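The merge amounts to attaching one database to another and copying rows; because struct_id is at least partially randomized, rows from different runs rarely collide. A minimal sketch for two sqlite3 databases, using a single assumed structures table rather than the full features schema:

```python
import sqlite3, tempfile, os

# Build two toy "feature" databases whose structures tables use
# randomized-looking struct_ids, then merge b.db3 into a.db3 via ATTACH.
# The single-table schema is illustrative; real feature databases hold one
# or more tables per FeaturesReporter, all keyed on struct_id.
tmp = tempfile.mkdtemp()
paths = [os.path.join(tmp, n) for n in ("a.db3", "b.db3")]
for path, struct_id, tag in [(paths[0], 734612, "run1_0001"),
                             (paths[1], 190553, "run2_0001")]:
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE structures (struct_id INTEGER PRIMARY KEY, tag TEXT)")
    con.execute("INSERT INTO structures VALUES (?, ?)", (struct_id, tag))
    con.commit()
    con.close()

con = sqlite3.connect(paths[0])
con.execute("ATTACH DATABASE ? AS other", (paths[1],))
# INSERT OR IGNORE skips the (unlikely) struct_id collisions.
con.execute("INSERT OR IGNORE INTO structures SELECT * FROM other.structures")
con.commit()
tags = sorted(t for (t,) in con.execute("SELECT tag FROM structures"))
print(tags)  # ['run1_0001', 'run2_0001']
```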
-in:use_database
Rosetta can input poses from a database, and output poses to a database. This behavior is supported in any application that uses the JD2 job distributor. The DatabaseJobOutputter is compatible with both serial and parallel jobs, and automatically detects non-ideal poses and handles their output properly.
Multiple executions of Rosetta can be stored in the same database. Each execution will have a separate protocol_id. If -out:database_protocol_id is not specified, the protocol_id field auto-increments. The Rosetta SVN version, command line, XML script (if available) and flags are stored in the database.
Poses can be extracted from the database into PDB or silent files using the score_jd2 application. MySQL and sqlite3 interfaces are also available for Perl, Python, R, and other scripting languages, making it possible to parse and analyze the data directly without extracting it. In code, poses can be extracted from a database using the protocols::features::ProteinSilentReport::load_pose() function.
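As a sketch, extracting every pose from an sqlite3 database to PDB files might look like the following invocation; the flag spellings should be checked against your build's -help output, and features.db3 is a placeholder file name:

```
score_jd2 -in:use_database \
    -inout:dbms:mode sqlite3 \
    -inout:dbms:database_name features.db3 \
    -out:pdb
```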
Database filters allow you to only output poses that meet some criteria based on the existing poses in the database. Database filters are invoked from the command line with the following syntax:
-out:database_filter <database filter name> <list of database filter options>
At present 4 database filters are implemented: