This page was created on 25 July 2019 by Vikram K. Mulligan, Flatiron Institute (vmulligan@flatironinstitute.org).
Historically, Rosetta has operated under a single-CPU-core paradigm: systems with many cores per node could run many entirely independent instances of Rosetta to increase sampling, but the speed at which an individual sample was generated could not be increased by adding more CPU cores.
More recently, effort has been made to ensure that Rosetta's core is fully threadsafe, and to implement infrastructure to facilitate multithreaded protocol development. At the time of this writing (25 July 2019), very little of Rosetta actually supports multithreading, but we anticipate that much more will in the near future.
This document describes how to build Rosetta with multithreading support, the core infrastructure for thread safety and threaded code execution, and the modules that currently take advantage of multithreading.
To compile Rosetta with support for multithreading, append the extras=cxx11thread option to your scons command. For example, to build all binaries in the bin directory in release mode with multithreading support, use the following:
./scons.py -j 8 mode=release bin extras=cxx11thread
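Binaries built with an extras option carry that option in their names; on a typical Linux/GCC system, a release-mode, thread-enabled build of, say, the rosetta_scripts application would be named something like rosetta_scripts.cxx11thread.linuxgccrelease. As a purely illustrative, hedged example (the input files are placeholders, and the -multithreading:total_threads option name is an assumption that should be checked against the current options documentation), launching such a build with eight threads might look like the following:
./bin/rosetta_scripts.cxx11thread.linuxgccrelease -s input.pdb -parser:protocol my_protocol.xml -multithreading:total_threads 8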
The RosettaThreadManager is a global singleton that maintains a pool of threads, persisting for the entire Rosetta session, to which work can be assigned. This avoids the overhead of repeatedly launching and destroying threads, and also ensures that nested requests for threading do not result in an explosion in the number of running threads. All multithreaded code should assign work to threads by passing it to the RosettaThreadManager. For more information, see the documentation page for this class.
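To make the design concrete, below is a minimal, self-contained C++11 sketch of the general pattern that the RosettaThreadManager embodies: a process-wide singleton that owns a pool of long-lived worker threads, which sleep until tasks are handed to them. This is an illustration of the concept only; the names used here (SimpleThreadPool, submit, and so on) are hypothetical and are not the RosettaThreadManager API, which is described on that class's documentation page.
```c++
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Illustrative only: a minimal persistent thread pool.  This is NOT the
// RosettaThreadManager API; it merely demonstrates the same design idea.
class SimpleThreadPool {
public:
	// One pool of worker threads for the whole process (Meyers singleton).
	static SimpleThreadPool & get_instance() {
		static SimpleThreadPool instance(
			std::thread::hardware_concurrency() ? std::thread::hardware_concurrency() : 1 );
		return instance;
	}

	// Queue a task; one of the persistent worker threads will pick it up and run it.
	void submit( std::function< void() > task ) {
		{
			std::lock_guard< std::mutex > lock( mutex_ );
			tasks_.push( std::move( task ) );
		}
		cv_.notify_one();
	}

private:
	explicit SimpleThreadPool( unsigned int const nthreads ) {
		// Threads are launched once, up front, and reused for every task,
		// avoiding the overhead of repeated thread creation and destruction.
		for ( unsigned int i = 0; i < nthreads; ++i ) {
			workers_.emplace_back( [this]{ worker_loop(); } );
		}
	}

	~SimpleThreadPool() {
		{
			std::lock_guard< std::mutex > lock( mutex_ );
			shutting_down_ = true;
		}
		cv_.notify_all();
		for ( std::thread & t : workers_ ) { t.join(); }
	}

	// Each worker sleeps until work is available (or shutdown is requested),
	// runs the task, and goes back to waiting.
	void worker_loop() {
		for ( ;; ) {
			std::function< void() > task;
			{
				std::unique_lock< std::mutex > lock( mutex_ );
				cv_.wait( lock, [this]{ return shutting_down_ || !tasks_.empty(); } );
				if ( shutting_down_ && tasks_.empty() ) return;
				task = std::move( tasks_.front() );
				tasks_.pop();
			}
			task();
		}
	}

	std::vector< std::thread > workers_;
	std::queue< std::function< void() > > tasks_;
	std::mutex mutex_;
	std::condition_variable cv_;
	bool shutting_down_ = false;
};
```
In this pattern, client code never constructs std::thread objects itself; it hands work to the shared pool (for example, SimpleThreadPool::get_instance().submit( []{ /* do work */ } );). Routing all work through one central manager is what allows the total number of live threads to be capped even when parallel code calls other parallel code.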
Note that a handful of multithreaded classes existed prior to the implementation of the RosettaThreadManager, and these launch threads of their own. We plan to switch these to use the RosettaThreadManager in the near future.
TODO
TODO
Interaction graph pre-calculation is currently carried out in threads, using the RosettaThreadManager.
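As a hedged illustration, asking the packer to pre-compute its interaction graph in several threads might look something like the following; the -multithreading:interaction_graph_threads and -multithreading:total_threads option names are assumptions that should be checked against the current options documentation, and the input files are placeholders:
./bin/fixbb.cxx11thread.linuxgccrelease -s input.pdb -resfile my.resfile -multithreading:total_threads 8 -multithreading:interaction_graph_threads 8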
Version 3 of our job distribution system (JD3) includes a multithreaded job distributor class, called MultiThreadedJobDistributor. Currently, this maintains a thread pool of its own, but it will soon be switched to use the RosettaThreadManager.
The simple_cycpep_predict application, which is used for peptide macrocycle structure prediction, has a hybrid MPI/thread-based job distributor. At present, this launches threads of its own, but will soon be switched to use the RosettaThreadManager.
The StepWise protocol is able to parallelize its work. Currently, it maintains its own thread pool, but will be switched to use the RosettaThreadManager in the near future.
Work is underway to parallelize ScoreFunction::operator(). This will also parallelize the minimizer's line search.
Gradient calculation is a planned future target for multi-threading.
At least one scoreterm in the nearly-deprecated score12 scorefunction is not threadsafe.