This page is geared towards new members of the Rosetta community as a "fast" way to learn about Rosetta, how to use it and how to develop in Rosetta, if needed. The page is organized by increasing difficulty, starting out with basic tools needed, then tutorials for users and later for developers. If you are going through and find that important parts are missing, please add them to the page!
Most people in the community are using PyMol (download from http://www.pymol.org/). Documentation on how to use it can be found at:
Some people are using Chimera (http://www.cgl.ucsf.edu/chimera/), which seems to have some additional functionality for density maps if you are working with X-ray electron density maps or Cryo-EM maps. Tutorials can be found at http://www.cgl.ucsf.edu/chimera/docs/UsersGuide/frametut.html.
There is also the Discovery Studio Visualizer for Scientific Linux and Windows systems:
To work in Rosetta, you need to know some basic biochemistry about amino acids and protein structure. You can learn this by picking up a biochemistry book (such as Voet & Voet: Biochemistry) or look at:
Foldit is a videogame created for the general public to solve real-life scientific puzzles involving protein structure and function. It is an excellent starting point to learn about protein structure as it does not require any previous knowledge. Once you download it, it will take you through tutorials where you are learning about all the moves and modifications you can do to a protein structure with a lot of fun along the way. Keep in mind that the terminology that Foldit uses is a bit different than what we are using in Rosetta (even though the code underneath Foldit is Rosetta code), but if you understand the concepts using Foldit, you can easily apply them in Rosetta. Also, the score in Foldit is opposite from Rosetta: in Foldit a higher score is better, in Rosetta, a lower energy is desired.
Download it at http://fold.it/portal/
A similar tool to FoldIt is available for RNA. It was developed by Rhiju Das' lab at Stanford (http://www.stanford.edu/~rhiju/). You can play and learn about it here: http://eterna.cmu.edu/web/
To navigate a terminal in Linux, you need to learn some basic commands. Learning a few Bash scripting tricks often speeds up your workflow. There are plenty of resources available on the web:
Rosetta is a unified software package for protein structure prediction and functional design. It has been used to predict protein structures with and without the aid of sparse experimental data, perform protein-protein and protein-small molecule docking, design novel proteins and redesign existing proteins for altered function. Rosetta allows for rapid tests of hypotheses in biomedical research which would be impossible or exorbitantly expensive to perform via traditional experimental methods. Thereby, Rosetta methods are becoming increasingly important in the interpretation of biological findings, e.g., from genome projects and in the engineering of therapeutics, probe molecules and model systems in biomedical research.
Rosetta concepts: The following pages describe a handful of very important Rosetta concepts - like FoldTree, AtomType, MoveMap, etc.
Some really useful papers about Rosetta:
Here are some useful links that will help you understand terminology that is used in Rosetta.
Some concepts are also explained in the PyRosetta tutorials:
There is considerable documentation available within this wiki. Additional documentation is available here: https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/index.html.
Excellent in-depth tutorials covering many aspects of Rosetta can be found here: https://www.rosettacommons.org/demos/latest/Home#tutorials
The main RosettaCommons page is https://www.rosettacommons.org/. There is a Forum available on this page where you can post questions or look for answers for your specific problem (https://www.rosettacommons.org/forum).
There is course material available from several labs who gave lectures about Rosetta. The material can be found on github under teaching resources: https://github.com/RosettaCommons/teaching_resources
Lectures on Rosetta (closely following the PyRosetta book): http://goo.gl/GuUNDK
If you won't be developing in Rosetta, you can download the release version here: https://www.rosettacommons.org/
But if you're reading this, you will most likely be developing or at least running new code. In this case you have to download and compile Rosetta from Github. Follow the instructions on how to download and compile Rosetta here: https://wiki.rosettacommons.org/index.php/GithubWorkflow
You might want to look at the test-server (https://benchmark.graylab.jhu.edu/) to figure out the latest revision for which all tests are running and get this revision. If you get the latest revision, it might be that not all tests are running so you might be downloading a 'broken' version. If you don't really understand much about git and github yet, don't worry, you will be going into much more detail later once you are actively developing in Rosetta (see below).
The Meiler lab has hosted one-week workshops in the past (usually some time in the spring) for members of the community, outside people and people from industry. The workshop tutorials are an excellent starting point on how to run Rosetta and give plenty of information on how to prepare input files and how to process Rosetta output. The zip folder can be downloaded and unpacked; it contains subfolders for several Rosetta applications, including protein folding, docking, design, ligand docking, and loop modeling. Input files are provided and only minimal previous knowledge (such as PyMol or Linux) is assumed to go through these. You need to download and compile Rosetta to run these applications!
Download the tutorials from https://meilerlab.org/tutorials/
Once you have downloaded and compiled Rosetta, you can also check out the demos folder at Rosetta/demos. This folder contains subfolders with easy-to-follow demonstrations and protocol captures on specific applications. Input files, readmes and command lines are available in the folders.
It is useful to know Python when you want to use PyRosetta or to write scripts to prepare input files or analyze data.
interactive tutorial: http://www.learnpython.org/
cheat sheet: http://www.cogsci.rpi.edu/~destem/gamedev/python.pdf
another one: http://pythonkit.com/Python-Cheat-Sheet-download-w370.pdf
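As a small illustration of the kind of analysis script mentioned above, the following sketch picks the lowest-scoring models from score-file text (in Rosetta, lower energy is better). It assumes a whitespace-delimited layout in which data lines start with SCORE: and the first such line holds the column names, including total_score and description. The function names parse_score_lines and best_models and the example values are made up for illustration, so check the layout against your own output files.

```python
def parse_score_lines(lines):
    """Parse lines starting with 'SCORE:' into a list of dicts.

    Assumption: the first SCORE: line is the header with the column names.
    """
    header = None
    records = []
    for line in lines:
        if not line.startswith("SCORE:"):
            continue  # skip anything that is not a score line
        fields = line.split()[1:]  # drop the 'SCORE:' prefix
        if header is None:
            header = fields
            continue
        if len(fields) != len(header):
            continue  # skip malformed lines
        record = {}
        for name, value in zip(header, fields):
            try:
                record[name] = float(value)
            except ValueError:
                record[name] = value  # e.g. the model description
        records.append(record)
    return records


def best_models(records, n=1, column="total_score"):
    """Return the n records with the lowest value in the given column."""
    return sorted(records, key=lambda record: record[column])[:n]


# Made-up example data in the assumed layout.
EXAMPLE = """SEQUENCE:
SCORE: total_score rms description
SCORE: -120.5 2.1 model_0001
SCORE: -135.2 1.4 model_0002
SCORE: -98.7 5.3 model_0003
""".splitlines()

if __name__ == "__main__":
    for model in best_models(parse_score_lines(EXAMPLE), n=2):
        print(model["description"], model["total_score"])
```

Sorting in ascending order reflects the Rosetta convention that a lower energy is desired; for a Foldit-style score you would sort in the opposite direction.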
If you want to develop in Rosetta, you need to learn C++. If you have never scripted before or know nothing about programming, it is recommended that you start with something easier than C++, such as Python (or Perl) to get the basics down. If you have scripting or other programming experience, here are useful references:
short online videos - Bucky's tutorials: http://www.youtube.com/watch?v=tvC1WCdV1XU&list=PLC6E50B89DA30C77A
book: Sam's teach yourself C++ in 21 days: http://www.amazon.com/Sams-Teach-Yourself-Edition-ebook/dp/B0028CK0GW/ref=sr_1_1?ie=UTF8&qid=1375469877&sr=8-1&keywords=c%2B%2B+sams+5th+edition
ROSIE is an abbreviation for "Rosetta Online Server that Includes Everyone". It is a server that runs several Rosetta applications without requiring knowledge of Python, C++ or anything difficult. It is perfect for newcomers; however, only a few applications are available and some of them are very specialized (more to come in the future). Check it out at https://rosie.rosettacommons.org/. Also, if you are a RosettaCommons developer and are planning on setting up a server on ROSIE for your own application, contact Sergey Lyskov at sergey.lyskov[at]gmail.com and he will give you instructions on how to do it. It should be pretty quick and easy with about two pages of code or less!
If you know Python and you are planning on creating your own protocols, PyRosetta (developed in Jeffrey Gray's lab at Johns Hopkins) is an excellent way of using Rosetta and manipulating protein structures. (If you don't know Python, it is much easier to learn it before diving into PyRosetta.) PyRosetta is basically a Python wrapper around the Rosetta C++ code, so you can do pretty much everything with Python without knowing C++. For rewriting code, however, C++ is recommended. The tutorials for PyRosetta can be found at https://www.pyrosetta.org/documentation/tutorials
RosettaScripts is an XML-scriptable interface that allows users to mix and match protocols and carry out customizable tasks for various protocols. Detailed instructions on how to set them up are available at RosettaScripts
CS-Rosetta is a version of Rosetta that uses NMR chemical shift restraints for ab initio folding or structure prediction. The toolbox of CS-Rosetta also includes automatic assignment of NOE resonances.
If you are planning to develop in Rosetta and want to put together your own protocols and write code, now is the time to learn more about version control and all that comes with it. Version control is basically a history of the code base (similar to when you hit the 'save' button in a document) which is needed when over 100 developers are working on the code from all over the world at the same time. It makes it much easier to resolve coding conflicts (when multiple people are working on the same code) and much more difficult to "break" Rosetta - even though it is not completely impossible. ;o)
Extended information from the above intro is available here: GithubWorkflow but it is suggested to become intimately familiar with various git commands. Several online resources are available, among them:
To write and develop code, it is much easier if you have an IDE (Integrated Development Environment) available. It is basically an editor and compiler, which also links to version control and has tons of nice little features like linking to functions somewhere else in the code, bracket completion, indentation, finding errors, etc. The two IDEs commonly used are
Now that you know about version control, have set up your IDE and are knowledgeable about C++, you should read through the coding conventions and comply with them - no excuse!!! This will make it much easier in the long run for other people (and yourself) to read, understand and use your code in the way it is intended for. This also applies for Python code:
When you download Rosetta, there will be a number of different top-level directories
The executables are located in Rosetta/source/bin - they are links from executables somewhere deeper in the code
For Rosetta, people always say "The code is the documentation" - which is nice if you know how to read code, but not so nice, if you don't. Therefore, when you write code: document it in two places:
Write the documentation such that people who are not familiar with every detail of your code will be able to understand it but keep in mind who you are aiming your documentation for: the user or the developer. Please keep the correct documentation style in the correct places.
User level documentation is a useful resource for learning how protocols in Rosetta work and how to use them. However, as developers it is important to also maintain more detailed documentation of the API (Application Programming Interface) which details how to navigate specific concepts and methods in the Rosetta library.
We maintain API documentation using Doxygen, a software package that parses the Rosetta source code and generates HTML documentation from in-code documentation. This is done by using specialized tags (e.g. @brief, @details, @author, @note) when commenting code. Details for writing Doxygen documentation in-code can be found at:
There are plenty of useful links available that are connected with Rosetta, in addition to the ones described above:
New developers page: (For RosettaCommons internal people only) https://wiki.rosettacommons.org/NewDevelopersPage
Robetta server - online server for several applications: http://robetta.bakerlab.org/
RosettaBackrub server - Kortemme lab, UCSF: https://kortemmeweb.ucsf.edu/backrub/
RosettaDesign server - Kuhlman lab, University of North Carolina at Chapel Hill: http://rosettadesign.med.unc.edu/
Rosetta@cloud - pay-per-use service for the experimental community: http://rosetta.insilicos.com/what/
Rosetta@home - donate computer time to solve scientific problems to RosettaCommons when you are not using your computer: http://boinc.bakerlab.org/rosetta/
source code on github - requires a github account: https://github.com/organizations/RosettaCommons
How to generate code templates of common Rosetta classes/apps/unit tests to save development time. Seriously, want to write a mover? Start here! Rosetta/source/code_templates
Here is an example of creating a new mover for carbohydrates:
./generate_templates.py --type mover --class_name TestMover --brief "A simple testing Mover" --namespace protocols carbohydrates
Rosetta is currently licensed on a free-for-noncommercial use basis. Licensing fees for for-profit use of Rosetta help to fund the development & maintenance of Rosetta, as well as funding the scientific mission of the RosettaCommons. For a commercial license, see the information at https://www.rosettacommons.org/software/license-and-download
As such, any code you add to Rosetta must be compatible with those terms. To get the ability to add code to Rosetta as a developer, you signed the developers agreement (http://rosettadesign.med.unc.edu/agreement/agreements.html) or a Contributor Licensing Agreement (insert link) which also contained some information about licensing third party software. Please also check out the wiki page here (https://wiki.rosettacommons.org/index.php/Licensing). A good rule of thumb is NOT to use anything under GPL or LGPL license.
The Rosetta Bootcamp was given in April 2013 for the first time as a one-week workshop for beginning Rosetta developers. It contained both lectures as well as labs where participants applied their just-learned knowledge under the supervision of several Rosetta developers.
The full 24-hour material (split up into lecture/lab videos of up to 90 minutes each) can be watched on YouTube: https://www.youtube.com/watch?v=2qQLdc0tmdg&list=PLaF-DHLR9l7wGTMldDNnZK7nA00eFHEH6
It may be better to not follow the schedule that youtube follows but rather the original schedule (https://docs.google.com/spreadsheet/ccc?key=0ArYNPpJXYWK8dHdjZEc4dGxIWnBvYko5OTZ5RFB3cHc#gid=0) since it was taught in this order.
Below are the contents of the individual lectures/labs:
When you are starting a coding project, make a new branch and start coding. DO NOT PUT YOUR CODE IN /devel/ AS IT WILL BE PHASED OUT!!! If you don't want your code to be released yet, just keep it in your branch until you are ready to merge your branch to main.
If you are writing a completely new framework, it MIGHT make sense to create a subdirectory in /core/. Also, be sure to consider which other code or libraries you are using in your code because they need to be compiled BEFORE any of your code will be compiled, otherwise you will run into unresolvable errors at compile time. If you are unsure, ask someone.
src and test are (sort-of) mirrored directories. Put your classes in src and your unit tests into test.
Code in src:
- Add <class>.hh and <class>.cpp to the <library>.src.settings file.
- Compile with scons.py.
- Add <main>.cc to the pilot_apps.src.settings.all file.
- Compile with scons.py bin mode=release.
- The executables end up in the bin directory.
Unit tests in test:
- List your <name>.cxxtest.hh file in the appropriate test/<module>.test.settings file.
- List any input files (e.g. tracer.u, input.pdb) in the testinputfiles section.
- Compile with scons.py cat=test.
- Test functions must return void and be named like test_mytest().
- Put <module>_init(); into the setUp() method.
- Run with test/run.py -d <database> -1 <class_name, w/o extensions or prefixes> -c <compiler>
- First compile with scons.py -j4 and then compile and run the tests.
- In cxxtest.hh files, don't use block comments (/* */) to comment out tests: compilation will fail.
Below are some useful techniques for debugging code in Rosetta. Most programmers would be incredibly lucky if their code compiled and ran on the first try. Debugging is trial and error and requires some creativity. It is often better to assume your code is wrong (it most likely is) and be surprised when your code compiles and all tests succeed! :o)
Tools/Tricks
When compiling and debugging code, spend your time on the harder bugs and less on the trivial errors. Before compiling, always check:
Some useful tips for debugging: If compilation fails, some good first things to check:
Other tips:
1. Compile in debug mode (./scons.py -j 8 mode=debug bin).
2. Start your application in a debugger:
gdb --args <path_to_Rosetta>/main/source/bin/<your_app>.default.linuxgccdebug @rosetta.flags
or
lldb -- <path_to_Rosetta>/main/source/bin/<your_app>.default.macosdebug @rosetta.flags
3. To stop when an exception is thrown, use catch throw (gdb) or break set -E C++ (lldb).
4. Start the program with run.
5. After the crash, type bt (gdb or lldb) to get a backtrace. If you want to get debugging help, post the backtrace for other developers. It shows what was calling what when the segfault happened.
6. Move between stack frames with f 0, f 1, f 2, etc. (gdb or lldb). You can look at where you are in the code using list, print values of variables using print <variable_name>, etc.
At this point, it's detective work. A segfault generally means that memory is being accessed in an invalid way (e.g. accessing element 11 of a 10-element vector). It's pretty easy to figure out what was being accessed, but you have to figure out why this is happening.
Below is a non-comprehensive list of errors that were pretty ambiguous to debug. Feel free to add to it as you solve more:
Library Code:
Constructor():
utility::pointer::ReferenceCount(),
bool_field1_(true),
int_field2_(0), // trailing comma after the last initializer: this does not compile
{
// code for my constructor
}
Unit Testing:
void
test_myTest() {} // incorrect
vs.
void test_myTest() {} // correct!
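Both unit-test pitfalls on this page (the return type on its own line, and tests commented out with block comments) are easy to miss by eye. The following is a minimal sketch of a checker for them; the function name lint_cxxtest_source and the message texts are hypothetical, it is not part of Rosetta, and the regular expressions only cover the simple cases shown here.

```python
import re

# 'void' alone on a line, followed by a line that starts a test_<name>( declaration.
SPLIT_DECLARATION = re.compile(r"^\s*void\s*\n\s*(test_\w+)\s*\(", re.MULTILINE)
# Any /* ... */ block comment, possibly spanning several lines.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def lint_cxxtest_source(text):
    """Return a list of human-readable problems found in cxxtest source text."""
    problems = []
    for match in SPLIT_DECLARATION.finditer(text):
        problems.append("return type on its own line: " + match.group(1))
    for match in BLOCK_COMMENT.finditer(text):
        if "test_" in match.group(0):
            problems.append("block comment contains a test")
    return problems


if __name__ == "__main__":
    source = "void\ntest_myTest() {}\n"
    for problem in lint_cxxtest_source(source):
        print(problem)
```

A check like this could be run over a .cxxtest.hh file before compiling, so that compile time is spent on the harder bugs.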
Many protocols in Rosetta can be set up by specifying options and flags through the options system. As you begin to develop code, you will most likely find yourself adding an option or using an existing option.
To add an option to the options system:
To include your option in a new application/class, you must #include:
Where 'group' is the name of the option group.
The options system is very useful for switching on/off parameters in protocols and recognizing some basic settings. However, many inputs into Rosetta are often resources, such as a PDB file that will be loaded into a pose or symmetry information that will be stored in a SymmData object. Because the options system is static, we would have to keep recreating these resources which takes a lot of memory, time, and energy.
The resource manager is Rosetta's solution to this problem. By defining your resources initially in a resource definition file, Rosetta will load these resources once and provide you access to them whenever needed.
More information about the resource manager and how to integrate it into your code can be found here: ResourceManager
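The load-once idea behind the resource manager can be sketched in a few lines. This is only a conceptual illustration in Python, not the Rosetta ResourceManager API: the class ResourceCache, its get method and the tag "native" are all made up for this example.

```python
class ResourceCache:
    """Load each resource at most once and hand out the same object afterwards."""

    def __init__(self, loaders):
        # loaders maps a resource tag to a zero-argument function that loads it.
        self._loaders = dict(loaders)
        self._loaded = {}
        self.load_count = 0  # how many times a loader was actually called

    def get(self, tag):
        if tag not in self._loaded:
            self._loaded[tag] = self._loaders[tag]()
            self.load_count += 1
        return self._loaded[tag]


if __name__ == "__main__":
    # Stand-in for an expensive load such as reading a PDB file into a pose.
    cache = ResourceCache({"native": lambda: ["placeholder for a loaded structure"]})
    first = cache.get("native")
    second = cache.get("native")
    print(first is second, cache.load_count)
```

The resource is described once up front (here as a loader function, in Rosetta in a resource definition file) and every later request reuses the loaded object instead of recreating it.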
Unit tests are tests of a small piece of code in isolation. Examples are testing a single function or the correct setup of a constructor (= the creation of an object). Code should be written like an onion and most of the code should be unit tested. Actually, ALL of the code should be unit tested, however, there are functions that are impossible to unit test.
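Unit tests in test/devel:
- Register the test in main/source/test/devel.test.settings.
- Register the code being tested in main/source/src/devel.src.settings.
- Each test suite needs a setUp() method that calls core_init() and a tearDown() method.
- Compile the tests: ./scons.py -j cat=test
- Run a test: python test/run.py -d <database> -1 <class_name>
- If $ROSETTA3_DB is defined (type echo $ROSETTA3_DB), you don't need the flag -d.
Note: All unit tests run every time modifications are made to the master code base.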
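Rosetta's own unit tests are written in C++ with cxxtest, but the structure described above (a set-up step, several small tests of one function in isolation, a tear-down step) is the same in most frameworks. The following sketch shows that structure with Python's unittest module; the function is_valid_sequence and the test names are invented for this example.

```python
import unittest

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def is_valid_sequence(sequence):
    """Return True if sequence is non-empty and uses only standard one-letter codes."""
    return len(sequence) > 0 and all(
        residue in AMINO_ACIDS for residue in sequence.upper()
    )


class SequenceTests(unittest.TestCase):
    def setUp(self):
        # Runs before every test, like setUp() in a cxxtest suite.
        self.sequence = "MKTAYIAK"

    def tearDown(self):
        # Runs after every test; release anything setUp() created.
        self.sequence = None

    def test_valid_sequence(self):
        self.assertTrue(is_valid_sequence(self.sequence))

    def test_rejects_unknown_letters(self):
        self.assertFalse(is_valid_sequence("MKTXB"))

    def test_rejects_empty(self):
        self.assertFalse(is_valid_sequence(""))


def run_tests():
    """Run the suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SequenceTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Each test checks one behavior of one function, which is what keeps a failing unit test easy to trace back to its cause.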
Integration tests check how your piece of code "integrates" into the rest of Rosetta. While running an integration test, the output of the code BEFORE the change is compared to the output of the code AFTER the change. Since Rosetta developers cannot understand all 3 million lines of code, integration tests only serve to test whether there is an expected change in output or not. They do NOT test whether the functions do what they are supposed to do (this is what unit tests are for) NOR do they tell you whether the output of a protocol is scientifically valid (this is what scientific tests are for).
Note: All integration tests run every time modifications are made to the master code base
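The before/after comparison at the heart of an integration test can be sketched as follows. This is an assumption-laden illustration rather than the actual Rosetta test machinery: outputs are represented as plain dictionaries from file name to text, and compare_outputs is a made-up name.

```python
import difflib


def compare_outputs(before, after):
    """Return {file name: unified diff lines} for every output that changed.

    A file present on only one side is reported as well (with an empty
    diff if its content is empty).
    """
    report = {}
    for name in sorted(set(before) | set(after)):
        old_text = before.get(name)
        new_text = after.get(name)
        if old_text == new_text:
            continue  # identical output: nothing to report
        report[name] = list(
            difflib.unified_diff(
                (old_text or "").splitlines(),
                (new_text or "").splitlines(),
                fromfile="before/" + name,
                tofile="after/" + name,
                lineterm="",
            )
        )
    return report


if __name__ == "__main__":
    before = {"score.sc": "total -10.0\n", "log": "ok\n"}
    after = {"score.sc": "total -12.5\n", "log": "ok\n"}
    for name, diff in compare_outputs(before, after).items():
        print(name)
        print("\n".join(diff))
```

Note that such a comparison only reports that the output changed; whether the change was expected is for the developer to decide, and whether the new output is correct or scientifically valid is the job of unit and scientific tests.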
Scientific tests (or benchmarks) are required to test whether the output of a protocol is scientifically sound. If you are refining a protocol, you want to make sure that the results you are getting aren't any worse than from the previous implementation of the protocol. Likewise, when you are implementing a new protocol, you want to see how good or bad your results are and how they compare to other methods.
Note: Scientific tests are run every two weeks
Once you have branched Rosetta, developed your feature and fully tested it, you are ready to contribute it to the main Rosetta code base. You have probably already reviewed the Rosetta Git conventions in GithubWorkflow (if not, do so now!). The required detailed process for committing code to Rosetta master is described here: Rosetta Wiki Page: Committing Code