At a high level, a Rosetta protocol is extraordinarily generic: you take a set of input data that specifies large parts of the sequence and conformation space of a putative protein, you sample the degrees of freedom in question, and then you pick the "best" results drawn from that sampling. This document concerns itself with one aspect of the second step and, primarily, with the third. The glaring question the second step leaves open is how much sampling is necessary to arrive at a reasonable set of structures to analyze. As in a molecular dynamics simulation, you can never truly determine whether you have gotten as close to the "right answer" as possible given your sampling criteria.
The classic example is ab initio folding, where one plots score versus rmsd to native and obtains a "folding funnel": large rmsds correspond to poor scores (with score varying only weakly with rmsd), and scores improve rapidly as rmsd approaches the native, leading to a cluster of near-native conformations. Not every sequence yields a folding funnel at all (granted, not every sequence yields a folded protein in reality either!), but for the most part a folding funnel is some evidence of convergence. (Not to overwhelm with caveats, but in theory conformational space might simply be fifty times larger than what you've actually sampled, and your observed funnel merely a sub-funnel describing some portion of that space. So, unless you have a true native structure, you can't tell how close you are.) But, much as in an MD simulation, you can observe the likelihood that you have not converged declining, and that's good enough. After all, you're only converging with respect to the scoring function you're using. By your millionth decoy of ab initio, the error in the scoring function is almost certainly much greater than the imperfection in your sampling. (Doing fixed-backbone design, you have likely reached that point by decoy one hundred.)
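To make this concrete, here is a minimal sketch of plotting such a funnel from a Rosetta score file. It assumes the usual layout in which scoring lines begin with "SCORE:" and the first such line is the header; the column names ("total_score", "rms") and the file name "score.sc" are assumptions and vary by protocol, so check your own header first.

```python
# Sketch: plot a folding funnel (score vs. rmsd) from a Rosetta score file.
# Assumes the standard layout: lines starting with "SCORE:", the first such
# line being the header. Column names ("total_score", "rms") and the file
# name "score.sc" are protocol-dependent assumptions.
import matplotlib.pyplot as plt

def read_scorefile(path):
    """Parse SCORE: lines into a list of dicts keyed by header column."""
    rows, header = [], None
    with open(path) as f:
        for line in f:
            if not line.startswith("SCORE:"):
                continue
            fields = line.split()[1:]  # drop the "SCORE:" tag
            if header is None:
                header = fields
            else:
                rows.append(dict(zip(header, fields)))
    return rows

rows = read_scorefile("score.sc")
rms = [float(r["rms"]) for r in rows]
score = [float(r["total_score"]) for r in rows]

plt.scatter(rms, score, s=4)
plt.xlabel("rmsd to native (Å)")
plt.ylabel("total_score (REU)")
plt.title("Folding funnel")
plt.savefig("funnel.png", dpi=150)
```

A converged run shows the funnel shape described above; a flat or scattered cloud suggests you should generate more decoys before trusting the low-scoring models.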
The point is that the very first thing you should do is make sure you are making enough structures. To sample more effectively within a fixed budget of structures, you can break your protocol into multiple phases. For example, suppose you are running FastRelax with some amount of design. This is a great protocol and gives people wonderful results! But if you are redesigning twenty residues to ALLAAxc, that is an enormous sequence space to visit (19 allowed amino acids at each of 20 positions is on the order of 10^25 sequences) on top of backbone and sidechain minimization and sidechain repacking. If real-life constraints prevent you from generating the 20,000-100,000 decoys that would be ideal for this search space, you could instead generate 5,000 decoys through FastRelax without design, process the results down to a smaller number of structures (perhaps the 10-20 best cluster centroids, or something similar), and then run a more conservative design protocol (like fixed-backbone design) on the results, for a hundred decoys or so.
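As one concrete illustration, such a two-phase pipeline might be driven from Python by calling the standard Rosetta command-line applications (relax, cluster, fixbb). The executable suffix, the flag spellings, the cluster radius, and the file names below are assumptions about a typical Linux build, not a prescription; consult the application documentation for your installation.

```python
# Sketch of a two-phase protocol: FastRelax without design, cluster the
# results, then fixed-backbone design on selected cluster centers.
# Executable names/suffix, flags, and file names are assumptions about a
# typical Rosetta build; adjust for your installation.
import subprocess

SUFFIX = ".default.linuxgccrelease"  # build-dependent

# Phase 1: FastRelax without design, many decoys.
subprocess.run([
    "relax" + SUFFIX,
    "-in:file:s", "input.pdb",
    "-relax:fast",
    "-nstruct", "5000",
    "-out:file:silent", "relaxed.silent",
], check=True)

# Intermediate step: cluster the relaxed decoys to pick representatives.
subprocess.run([
    "cluster" + SUFFIX,
    "-in:file:silent", "relaxed.silent",
    "-cluster:radius", "2.0",
], check=True)

# Phase 2: fixed-backbone design on each chosen cluster center, with the
# twenty positions set to ALLAAxc in a resfile.
for center in ["c.0.0.pdb", "c.1.0.pdb"]:  # cluster output names vary
    subprocess.run([
        "fixbb" + SUFFIX,
        "-in:file:s", center,
        "-resfile", "design.resfile",
        "-nstruct", "100",
    ], check=True)
```

The design choice here is the one described above: spend the bulk of the decoy budget on the cheaper, design-free sampling, and reserve the expensive combinatorial design for a handful of well-chosen starting structures.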
So, processing results can take many forms. Important highlights of this process include: