Rosetta 3 (formerly MiniRosetta) is an object-oriented implementation of Rosetta that has been rewritten in C++ from the ground up by a core team of developers. These guidelines are intended to help new Rosetta developers (and to remind old/current ones) to learn, maintain, and improve the reliability, clarity, and performance of the code while we continue its development and modernization.
The Rosetta Commons copyright header is required for every source code file in Rosetta. Do not make modifications. The header you should use for all C++ source files is:
// -*- mode:c++;tab-width:2;indent-tabs-mode:t;show-trailing-whitespace:t;rm-trailing-spaces:t -*-
// vi: set ts=2 noet:
//
// (c) Copyright Rosetta Commons Member Institutions.
// (c) This file is part of the Rosetta software suite and is made available under license.
// (c) The Rosetta software is developed by the contributing members of the Rosetta Commons.
// (c) For more information, see http://www.rosettacommons.org. Questions about this can be
// (c) addressed to University of Washington UW CoMotion, email: license@uw.edu.
Namespaces are used to wrap associated classes that make up a conceptual component. Namespaces in Rosetta are expected to match the directory hierarchy and vice versa: code in namespace `core::scoring` can be found in the directory `src/core/scoring` (`src` is not part of the namespace hierarchy), and all code in `src/core/scoring/Blah.[cc,hh,fwd.hh]` must live inside namespace `core::scoring`.
Avoid raw pointers.
Consider class `Collection` and class `Object`, where a `Collection` points to many `Object`s and `Object`s point upward to the `Collection` in which they are contained. The `Collection` should hold owning pointers to `Object`s, and `Object`s should hold an access pointer to their `Collection`.
Avoid C-style arrays. Especially avoid writing to C-style `char` arrays with the `scanf` family of functions. This readily introduces buffer overflow bugs that can go unnoticed and are hard to track down. Instead, use `std::string` buffers and C++ stream I/O. `scanf` has also been found to be a security weakness (due to "stack smashing").
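As an illustration of the stream-based alternative to `scanf`, here is a minimal sketch that parses one whitespace-separated line into a `std::string` and three numbers. The function name `parse_atom_line`, the line layout, and the local `Real` typedef (a stand-in for the one in `core/types.hh`) are assumptions made for this example, not existing Rosetta code.

```cpp
#include <sstream>
#include <string>
#include <vector>

typedef double Real; // stand-in for core::Real, assumed for this sketch

// Hypothetical helper: parse "NAME X Y Z" without fixed-size char buffers.
// Returns false and leaves the outputs untouched if the line is malformed.
bool parse_atom_line( std::string const & line, std::string & name, std::vector< Real > & xyz ) {
	std::istringstream stream( line );
	std::string parsed_name;
	Real x( 0.0 ), y( 0.0 ), z( 0.0 );
	// Extraction fails cleanly on bad input instead of overrunning a buffer.
	if ( !( stream >> parsed_name >> x >> y >> z ) ) return false;
	name = parsed_name;
	xyz = { x, y, z };
	return true;
}
```

Because `std::string` grows as needed, no buffer length has to be guessed, and a malformed line is reported through the return value.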
Use include guards in all header files. The include variables defined should be named `INCLUDED_namespace(*_subnamespace)_filename(_FWD)_HH` for consistency, e.g.:
#define INCLUDED_core_scoring_EnergyMap_HH
Use assertions to test for events that should not occur for any normal program inputs and operations: function pre- and post-conditions, object invariants, divisions by zero, and so forth.
Use `PyAssert()`s to prevent PyRosetta from segmentation faulting.
Use `if` blocks to test for errors that could happen in normal program operation. Errors should generate useful reports that include the values of relevant variables.
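The distinction between assertions and `if` checks can be sketched as follows. The two functions are hypothetical, and `std::out_of_range` is used only as a generic stand-in for whatever error-reporting mechanism the surrounding code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Precondition: callers must never pass an empty vector, so this is an assertion.
double mean( std::vector< double > const & values ) {
	assert( !values.empty() );
	double sum = 0.0;
	for ( double const v : values ) sum += v;
	return sum / static_cast< double >( values.size() );
}

// A residue index may come from user input, so it is checked with an if block,
// and the report includes the values of the relevant variables.
double residue_weight( std::vector< double > const & weights, std::size_t resid ) {
	if ( resid < 1 || resid > weights.size() ) {
		std::ostringstream msg;
		msg << "residue index " << resid << " is outside the valid range [1, " << weights.size() << "]";
		throw std::out_of_range( msg.str() );
	}
	return weights[ resid - 1 ];
}
```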
Fix everything that causes a legitimate compiler warning on any platform. (See the List of Common Warnings and How to Avoid Them.)
No code duplication.
If you declare a public function, you have to define it (preferably in a `.cc` file). Undefined private member functions (e.g., private constructor, copy constructor, etc.) are OK.
Each class should have its own `.cc` file, `.hh` file, and `.fwd.hh` file. (See File Inclusion below for details about what to include in each file.)
All persistent data must live as member data of a class. No global data. (See Global Data in Rosetta.)
All data must be private; protected data is forbidden.
Make sure every piece of member data for a class is copied in its copy constructor and assignment operator (`operator=`).
If you add data to a class, and that class has user-defined copy constructors or assignment operators, be sure your new data is copied inside of those methods.
Member functions should be "const correct" (see `constant` below). If they do not change their object, they should be declared `const`; if they do, leave off the `const` at the end of the function to implicitly declare them non-const.
Any base class with virtual functions must have a virtual destructor.
Any derived class that overrides a virtual function from a base class must use the C++11 `override` identifier to allow the compiler to catch unintended differences between the base class function signature and the derived class function signature. Less important: the community stylistic convention is to omit the `virtual` keyword in the derived class function override when `override` is used. The most important thing is to use `override`, though; this is functional, not merely stylistic. (New as of 6 April 2018. For more on what `override` does and why it's useful, see this blog post.)
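A minimal sketch of the `override` rule, using hypothetical `Mover` and `MyMover` classes that are simplified stand-ins and not the real Rosetta classes:

```cpp
#include <string>

class Mover {
public:
	virtual ~Mover() = default; // a base class with virtual functions needs a virtual destructor
	virtual std::string get_name() const { return "Mover"; }
};

class MyMover : public Mover {
public:
	// With override, a mismatched signature (for example, a missing trailing const)
	// becomes a compile error instead of silently declaring a new, unrelated function.
	std::string get_name() const override { return "MyMover"; }
};

// Calls through the base class reference dispatch to the derived implementation.
std::string describe( Mover const & mover ) {
	return mover.get_name();
}
```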
Pass objects into methods by reference and not by value, except when the intent is to make a copy of the object. If the object does not change, pass it as a const reference. Passing objects by value also requires additional, unwanted `#include`s in header files (see File Inclusion below).
Pass primitive data types by value and not by reference when they are strictly input parameters. Pass by non-const reference if the methods are intended to change the values held in the input parameters (e.g., `void max( int first, int second, int & answer );`). Passing primitives by reference is typically slower than passing by value.
Check that polymorphic methods are correctly overridden in derived classes. Function signatures for polymorphic methods must match exactly.
If a class includes owning pointers to other objects, explicitly define the destructor, copy-constructor, and assignment operator (`operator=`) in the `.cc` file. (See the FAQ for more details.) Don't let the compiler do it for you. If you create any constructor, it prevents the default constructor from being synthesized. Class designers should always say exactly what the class should do and keep the class entirely under control. If you don't want a copy-constructor or `operator=`, declare them as private.
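The following sketch shows one way to spell out the destructor, copy-constructor, and assignment operator for a class that holds an owning pointer. `ScoreTerm`, `Scorer`, and the use of `std::shared_ptr` as the owning pointer are assumptions for illustration; the deep copy shown here is one possible policy, and a class that intends to share the pointee would copy the pointer instead.

```cpp
#include <memory>

class ScoreTerm {
public:
	explicit ScoreTerm( double weight ) : weight_( weight ) {}
	double weight() const { return weight_; }
	void weight( double setting ) { weight_ = setting; }
private:
	double weight_;
};

typedef std::shared_ptr< ScoreTerm > ScoreTermOP;

// Declarations (these would live in the .hh file).
class Scorer {
public:
	explicit Scorer( double weight );
	Scorer( Scorer const & src );
	Scorer & operator=( Scorer const & rhs );
	~Scorer();
	ScoreTerm & term() { return *term_; }
	ScoreTerm const & term() const { return *term_; }
private:
	ScoreTermOP term_;
};

// Definitions (these would live in the .cc file).
Scorer::Scorer( double weight ) : term_( std::make_shared< ScoreTerm >( weight ) ) {}

// Deep copy: the new Scorer gets its own ScoreTerm.
Scorer::Scorer( Scorer const & src ) : term_( std::make_shared< ScoreTerm >( *src.term_ ) ) {}

Scorer & Scorer::operator=( Scorer const & rhs ) {
	if ( this != &rhs ) term_ = std::make_shared< ScoreTerm >( *rhs.term_ );
	return *this;
}

Scorer::~Scorer() = default;
```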
Watch for long member function definitions. Complicated functions with too many lines are difficult and expensive to maintain — and they are difficult to expand upon. Break up long member functions into several smaller member functions.
Watch for long argument lists. Function calls then become difficult to write, read, and maintain. Instead, try to move the member function to a class where it is more appropriate and/or pass objects in as arguments. For example, if a function needs to know several things about a rotamer (its chi angles, the way those chi angles were produced, the amino acid type, the chi bins, etc.), wrap all of those individual pieces of data into a `Rotamer` object. Then a single parameter (a `Rotamer const &` perhaps) may be passed to the function instead of many.
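As a sketch of that idea, the hypothetical `Rotamer` class below bundles two of the pieces of data mentioned above, and the free function takes it by const reference while taking its primitive bounds by value. The class layout, the function name, and the local `Real` typedef are illustrative assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

typedef double Real; // stand-in for core::Real, assumed for this sketch

class Rotamer {
public:
	Rotamer( std::string const & aa_type, std::vector< Real > const & chi_angles ) :
		aa_type_( aa_type ),
		chi_angles_( chi_angles )
	{}
	std::string const & aa_type() const { return aa_type_; }
	std::vector< Real > const & chi_angles() const { return chi_angles_; }
private:
	std::string aa_type_;
	std::vector< Real > chi_angles_;
};

// The object is passed by const reference; the primitive bounds are passed by value.
std::size_t count_chi_in_range( Rotamer const & rotamer, Real lower, Real upper ) {
	std::size_t count = 0;
	for ( Real const chi : rotamer.chi_angles() ) {
		if ( chi >= lower && chi <= upper ) ++count;
	}
	return count;
}
```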
Define OP typedefs for classes in their corresponding `.fwd.hh` files. (See the explanation about owning pointers in General above.) Note: It is no longer necessary for classes to be derived from ReferenceCount. See How to use pointers correctly for information on the new pointer system.
Dynamically allocated objects must be put into owning pointers at the time of their allocation.
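The owning-pointer rules, together with the `Collection`/`Object` arrangement described earlier, can be sketched as follows. `std::shared_ptr` and `std::weak_ptr` are used here as stand-ins for Rosetta's owning and access pointers; that mapping and all class members are assumptions made for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

class Collection;
typedef std::shared_ptr< Collection > CollectionOP; // owning pointer
typedef std::weak_ptr< Collection > CollectionAP;   // access pointer (non-owning)

class Object {
public:
	explicit Object( CollectionAP const & parent ) : parent_( parent ) {}
	// Returns a null pointer if the Collection no longer exists.
	CollectionOP parent() const { return parent_.lock(); }
private:
	CollectionAP parent_; // points upward without keeping the Collection alive
};

typedef std::shared_ptr< Object > ObjectOP;

class Collection : public std::enable_shared_from_this< Collection > {
public:
	// The new Object goes into an owning pointer at the moment it is allocated.
	// Requires that this Collection is itself already held in a CollectionOP.
	ObjectOP add_object() {
		ObjectOP object = std::make_shared< Object >( shared_from_this() );
		objects_.push_back( object );
		return object;
	}
	std::size_t size() const { return objects_.size(); }
private:
	std::vector< ObjectOP > objects_; // the Collection owns its Objects
};
```

Because the upward pointer is non-owning, the `Collection` and its `Object`s do not keep each other alive in a reference cycle.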
Avoid multiple inheritance.
Don't use protected inheritance. Private inheritance can be a useful alternative to containment, but we generally prefer containment.
Why does file inclusion matter? Compilation speed, and sometimes the ability to compile at all! Reckless #inclusion produces long build times when compiling from scratch and long re-build times after making modifications. Much of development time is then spent compiling and recompiling; you waste your time and the valuable time of your colleagues with bad #inclusion practices. (Consider that Rosetta++, because of its circular `#include`s, could not be compiled on any BlueGene machines without a special modification to the BlueGene compiler.)
PREFACE: Class forward declarations belong in .fwd.hh files.
class MyClass;
Within the .fwd.hh file, forward declare the class(es) and, if they derive from utility::pointer::ReferenceCount, typedef the owning pointers and const-owning pointers with OP and COP suffixes:
namespace core {
namespace scoring {
class MyClass;
typedef utility::pointer::owning_ptr< MyClass > MyClassOP;
typedef utility::pointer::owning_ptr< MyClass const > MyClassCOP;
} // namespace scoring
} // namespace core
PREFACE: Class declarations belong in .hh files
class MyClass {
public:
	void set_int( int input_int );
private:
	int my_int_;
};
void MyClass::set_int( int input_int ) {
	my_int_ = input_int;
}
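To show how the three files fit together with include guards, the sketch below places the contents of a hypothetical `MyClass.fwd.hh`, `MyClass.hh`, and `MyClass.cc` in a single listing, separated by comments. `std::shared_ptr` stands in for the owning pointer type and `get_int()` is added only so the example can be exercised; both are assumptions.

```cpp
// ---- core/scoring/MyClass.fwd.hh (hypothetical) ----
#ifndef INCLUDED_core_scoring_MyClass_FWD_HH
#define INCLUDED_core_scoring_MyClass_FWD_HH

#include <memory>

namespace core {
namespace scoring {

class MyClass; // forward declaration only
typedef std::shared_ptr< MyClass > MyClassOP;
typedef std::shared_ptr< MyClass const > MyClassCOP;

} // namespace scoring
} // namespace core

#endif // INCLUDED_core_scoring_MyClass_FWD_HH

// ---- core/scoring/MyClass.hh (hypothetical; would #include the .fwd.hh) ----
#ifndef INCLUDED_core_scoring_MyClass_HH
#define INCLUDED_core_scoring_MyClass_HH

namespace core {
namespace scoring {

class MyClass {
public:
	void set_int( int input_int );
	int get_int() const;
private:
	int my_int_ = 0;
};

} // namespace scoring
} // namespace core

#endif // INCLUDED_core_scoring_MyClass_HH

// ---- core/scoring/MyClass.cc (hypothetical; would #include the .hh) ----
namespace core {
namespace scoring {

void MyClass::set_int( int input_int ) { my_int_ = input_int; }
int MyClass::get_int() const { return my_int_; }

} // namespace scoring
} // namespace core
```

Other headers that only need to name `MyClass` or `MyClassOP` can include the small `.fwd.hh` file and avoid pulling in the full class declaration.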
Your #includes should not appear in random order, or alphabetical order, or the order you decided you needed them for the file. Ideally, they should be organized. This organization applies to headers in .cc or .hh files, although in .hh files you should have .fwd.hh inclusions, not .hh inclusions. Count on approximately 5 groupings:
Cyclic inclusion complicates the code and build system and slows compiling. The code is separated into discrete libraries, which compile independently (and more quickly than they would otherwise). Code in any given library may only #include headers from the same library, or a lower-level library.
flow-chart and overview of the different libraries
In order from lowest to highest, the libraries are (with examples of familiar code):
`CamelCase` for class names
`box_car` with underscores separating words for variable names
`box_car` with underscores for namespaces & directories
`box_car_` with underscores and one trailing underscore for class member variables
Class definition files must be named after their classes; e.g., class `PackerTask` is forward declared in `PackerTask.fwd.hh`.
class MyClass {
public:
int my_int() const {
return my_int_;
}
int & my_int() {
return my_int_;
}
private:
int my_int_;
};
An instance of this class may be passed to a function as a `const &` with the wonderful guarantee that its internal data will not be modified within that function. The compiler will prevent calls to the non-const `my_int()` function!
If member data must be updated from within const methods (e.g., during `refold()` evaluation), they should be declared `mutable` so that they may be modified in const methods, and should be made thread-safe. If you are uncertain about how to ensure that data access is thread-safe, please ask the community.
Classes that load data on demand may hold `mutable` data so that these can be loaded in a `const` context. However, these must load their data in a threadsafe manner. Utility functions in `utility/thread/threadsafe_creation.hh` can be used for this, and there are many examples in `core/scoring/ScoringManager.hh`. When in doubt, ask someone from the community about how to do this safely.
Do not use `const_cast` to discard the const-ness of a const object.
Do not use `mutable` if it can be avoided. If it cannot, ensure that mutable data are handled in a threadsafe manner. The community can help you to ensure thread-safety, particularly during the pull request review process. Usually, though, it is possible to simply pass data to the functions that need them, rather than caching data in mutable variables and accessing them further down a chain of function calls. The former is thread-safe; the latter is not, since different threads might invalidate one another's caches.
typedefs
Do not use raw literals for numeric values: `0.0` vs. `Real( 0.0 )`
Primitive variables should be declared using the `typedef`s defined in `core/types.hh`.
Use `Real`, `Distance`, `DistanceSquared`, or `Size`, etc.
Do not use `double`, `float`, or `int`.
There may be exceptions for `int`, but ask.
Never use `float`, and do not `typedef` to `float`. Double precision computations yield better minimizations.
`typedef`s for owning pointers should be placed in the same namespace as the class the owning pointers are templating.
`typedef`s for `typedef`s may be (publicly) declared inside class scope, but should not be made outside of class scope.
The following is OK:
// This is OK
namespace protocols {
namespace movers {
class MyMover {
public:
	typedef scoring::ScoreFunctionOP ScoreFunctionOP;
};
}
}
The following is NOT OK! It produces name ambiguity and compiler errors:
// This is NOT OK!
namespace protocols {
namespace movers {
typedef scoring::ScoreFunctionOP ScoreFunctionOP;
class MyMover { … };
}
}
void
symm_something( core::conformation::Conformation const & conf )
{
	// first assert the dynamic cast (checked in debug builds only)
	assert( dynamic_cast< core::conformation::symmetry::SymmetricConformation const * >( &conf ) );
	// then use the cheaper static_cast for the actual conversion
	core::conformation::symmetry::SymmetricConformation const & symmconf =
		static_cast< core::conformation::symmetry::SymmetricConformation const & >( conf );
}
A convenience function that performs the dynamic_cast assertion is provided to simplify this task:
utility::down_cast< Bar * >( foo ); // returns a pointer to a Bar instance
If you are down-casting an access_ptr or an owning_ptr, you can use the following function, which is available if you've included the owning_ptr or access_ptr headers:
utility::pointer::down_pointer_cast< Bar >( foo ); // returns a BarOP
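To make the pattern behind such a helper concrete, here is a sketch of an assert-then-`static_cast` function. `checked_down_cast` is a hypothetical name and is not the implementation of `utility::down_cast`; the two classes are simplified stand-ins.

```cpp
#include <cassert>

// Hypothetical helper: verify the cast with dynamic_cast in debug builds,
// then perform the conversion with the cheaper static_cast.
template< typename Derived, typename Base >
Derived & checked_down_cast( Base & base ) {
	assert( dynamic_cast< Derived * >( &base ) != nullptr );
	return static_cast< Derived & >( base );
}

class Conformation {
public:
	virtual ~Conformation() = default;
};

class SymmetricConformation : public Conformation {
public:
	int n_subunits() const { return 3; }
};

// Const-ness is preserved by naming the const-qualified derived type.
int subunit_count( Conformation const & conf ) {
	return checked_down_cast< SymmetricConformation const >( conf ).n_subunits();
}
```

When `NDEBUG` is defined the assertion compiles away, so release builds pay only for the `static_cast`.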
`auto` is strongly encouraged in `for` loops.
`auto` elsewhere is allowed, but it is strongly preferred that the type of the variable can be inferred from code within the same function.
Do not use `const_cast`, ever. The ability to make something `const` is there to help you: it ensures that the compiler will catch you if you accidentally try to modify something that should not be modified. It's a safety harness. Casting away the const-ness of something is like throwing away your safety harness.
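A short sketch of `auto` in `for` loops, where it replaces long iterator type names. The `CountMap` typedef and both functions are hypothetical.

```cpp
#include <map>
#include <string>

typedef std::map< std::string, int > CountMap;

// Iterator loop: auto stands in for CountMap::const_iterator.
int total_count( CountMap const & counts ) {
	int total = 0;
	for ( auto it = counts.begin(); it != counts.end(); ++it ) {
		total += it->second;
	}
	return total;
}

// Range-based loop: auto const & avoids copying each key/value pair.
int longest_key_length( CountMap const & counts ) {
	int longest = 0;
	for ( auto const & entry : counts ) {
		int const length = static_cast< int >( entry.first.size() );
		if ( length > longest ) longest = length;
	}
	return longest;
}
```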
####Style
#####Semi-colons
Do not add excess semi-colons. Semi-colons are not needed after for-loops or function bodies. Excess semi-colons following function bodies are flagged as errors by some compilers:
for ( Size ii = 1; ii <= 10; ++ii ) {
	std::cout << ii << " ";
}; // this semi-colon is unnecessary.
class MyClass : public utility::pointer::ReferenceCount {
	MyClass() : my_int_( 0 ) {}; // this semi-colon is unnecessary.
	void my_function() { ++my_int_; }; // this semi-colon is unnecessary.
	int my_int_;
}; // This semi-colon is absolutely necessary.
Label all virtual functions in derived classes as virtual, even though their virtual status is controlled by the base class declaration.
for ( Size ii = 1; ii <= 10; ++ii ) { // this is correct spacing
	std::cout << ii << " ";
}
if ( i == 2 ) { // this is correct spacing
}
std::cout << "this string" << " should be written to output" << std::endl; // this is correct spacing
// now examples with incorrect spacing
for( core::Size i=linker_start_; i<=linker_end_; ++i ) {//missing space after keyword
}
for(Size j=start;j<=end;++j) { //missing space after keyword, missing space after parenthesis, missing space after semicolon
}
if ( i==2 ) { // missing spaces around operator
}
std::cout<<"this string"<<" should be written to output"<<std::endl; // missing spaces around operators
Use tabs to indent on the beginning of a line, not spaces.
Use spaces, not tabs, after the first non-whitespace character on a line. Spaces may be used to align statements on successive lines. Do not try to line up the first non-whitespace character of one line with some non-whitespace character other than the first non-whitespace character on the line above by combining tabs and spaces.
Indent one additional tab per nested level.
Indent one additional tab when function arguments must be wrapped beyond the first line.
Indent two additional tabs when `for` loop declarations must be wrapped beyond the first line (e.g., when iterator classes have long names).
Do not indent for namespace nesting.
Do not indent for raw scoping.
`public:` and `private:` labels inside classes should be indented to the same level as the class itself.
Some compilers accept `and` and `or` written in English. Some do not. In C++, always use `&&` and `||`.
Remove trailing whitespace from source before committing it.
Remove unused variables and functions; if functions are left for debugging purposes, comment about this.
Conditional checks should happen inside the called function rather than in the calling function when possible. This helps keep things a bit more modular and also ensures that your function has no bad side effects if someone calls it but forgets to check for the essential condition. For example:
Instead of
if ( condition_exists ) my_function();
use
my_function();
where `my_function` begins with
if ( !condition_exists ) return;
Every new file should have the Emacs mode information, copyright information, and the Doxygen information (including file name and author name) at the top of it. (You can also copy it from a similar file.)
Don't use the `<cstdio>` functions, such as `printf()`. Learn to use "io streams" instead; they are type-safe and type-extensible.
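"Type-extensible" means that a user-defined type can be given its own stream output operator, which `printf()` cannot offer. The sketch below uses a hypothetical `Point` class and helper function to show this.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class Point {
public:
	Point( double x, double y ) : x_( x ), y_( y ) {}
	double x() const { return x_; }
	double y() const { return y_; }
private:
	double x_;
	double y_;
};

// Extending stream output to a new type; the compiler selects this overload by type.
std::ostream & operator<<( std::ostream & out, Point const & point ) {
	return out << "(" << point.x() << ", " << point.y() << ")";
}

// Works with any std::ostream: std::cout, file streams, or string streams.
std::string point_to_string( Point const & point ) {
	std::ostringstream out;
	out << point;
	return out.str();
}
```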
Don't use the form `MyType a = b;` to define an object. This one feature is a major source of confusion because it calls a constructor instead of the `operator=`. For clarity, always be specific and use the form `MyType a(b);` instead.
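The confusion can be demonstrated with a small hypothetical class that records which special member function last produced its value:

```cpp
#include <string>

class Tracked {
public:
	Tracked() : origin_( "default constructor" ) {}
	Tracked( Tracked const & ) : origin_( "copy constructor" ) {}
	Tracked & operator=( Tracked const & rhs ) {
		if ( this != &rhs ) origin_ = "operator=";
		return *this;
	}
	std::string const & origin() const { return origin_; }
private:
	std::string origin_;
};

// Both definition forms invoke the copy constructor, never operator=.
std::string origin_of_equals_form( Tracked const & source ) {
	Tracked copy = source;
	return copy.origin();
}

std::string origin_of_paren_form( Tracked const & source ) {
	Tracked copy( source );
	return copy.origin();
}

// Assignment to an already constructed object is what calls operator=.
std::string origin_of_assignment( Tracked const & source ) {
	Tracked target;
	target = source;
	return target.origin();
}
```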
The comment blocks at the beginning of Rosetta functions have been formatted to be compatible with the Doxygen auto-documentation program (http://www.stack.nl/~dimitri/doxygen/).
Describe each function with the `@brief` Doxygen command.
Put `@brief` inside header files (`.hh` files) where the function is declared.
Put `@details` in implementation files (`.cc` files) where the function is defined.
/// @brief finds the correct fold for every protein sequence
void fold_protein( std::string const & sequence );
core.scoring.constraints.EnzConstraintIO.custom_fancy_name_for_channel
/// Tracer instance for this file
static basic::Tracer TR("core.io.pdb.file_data");
----
The following templates can assist in learning the coding conventions. They will also save you a lot of time. Please add your own Rosetta 3 coding-convention-compliant templates to the list below, which is organized by IDE and file type.
settings = {
"user" : {
"prepends" : {
},
"appends" : {
"projects" : { "src" : [ "devel", "demo" ], },
},
"overrides" : {
},
"removes" : {
},
}
}
To deal with the formatting problem we could use a code beautifier like...
To Speed up code...