John Forrest talked us through some of the C++ objects and organization of the CBC code. One useful note is that the multi-threading code in CBC is entirely contained within CbcModel.cpp. Another point worth remembering is that the CBC code is used in the BONMIN project, as well as for LP/MIP with Clp; there are a number of #ifdefs for BONMIN in the CBC code, and also comments referring to 'odd solvers' - John says that these comments do usually apply to BONMIN. We'll endeavor to add notes about this discussion here.
=== Representation of the Branch and Bound tree ===
See source files CbcTree.hpp and CbcTree.cpp. The nodes on the current frontier of the tree are stored as a 'heap' (or priority queue) in an STL vector container (declared as std::vector nodes_); you'll see references to this 'heap' in the CBC comments. The nodes in the heap are maintained in an order so that, at all times, the node at top of the heap is the one that should be explored next by the Branch and Bound algorithm. To add a node to the tree, the code calls the CbcNode 'push' method. This method uses the comparison function associated with the tree to insert the new node into its proper place in the tree. The comparison function simply compares two nodes and decides which one is 'more promising' i.e. should be explored by Branch and Bound sooner.
Hence, the comparison function makes the 'next node' decision in the Branch and Bound process. There are several comparison functions available in the CBC code, that implement different 'next node' decision rules, and users could write their own comparison functions. The heap can be reordered (e.g. from bread-first search to depth-first search to best-bound, etc.) at any time by setting a new comparison function via the 'setComparison' method.
The function defined in CbcCompareDefault.cpp is the 'default choice' normally used by CBC. Others are defined in CbcCompareDepth.cpp (favors nodes deeper in the tree, closer to depth-first), CbcCompareObjective.cpp (favors nodes with better objectives, closer to breadth-first) and CbcCompareEstimate.cpp (favors better estimated objectives, which are computed using pseudocosts after strong branching has been performed). CbcCompareDefault.cpp uses a more elaborate strategy that combines (i) breadth-first then depth-first, (ii) 'diving' or complete exploration of a promising subtree, and (iii) a criterion using the objective plus a weight times the number of integer infeasibilities.
=== Processing Nodes ===
At the beginning of CbcModel::BranchAndBound(), the root node has been solved. CBC will add one node to the heap, with a 'reference count' of 2 for conventional MIPs (two potential children) and an initial branch direction of -1 for 'branch down'. (CBC is designed to support N-way branching; if used this way, the initial reference count could be greater than 2.) Think of this node as representing the potential children of the root node.
When a node is ready to be processed (for cuts etc.) and solved, it is 'popped' off the heap. For conventional MIPs, a node with a reference count of 2 will be put back on the heap with a reference count of 1, and a change of branch direction; then the LP relaxation of the first subproblem will be solved, by calling CbcModel::solveWithCuts(). If this subproblem yields an infeasible solution, an integer solution, or a solution worse than the current Best Bound, the code simply cleans up and proceeds with whatever node is now at the top of the heap. If this subproblem yields a non-integer solution whose objective is better than the current Best Bound ("objective cutoff"), a new node for this subproblem's potential children is added to the heap, again with reference count = 2 and branch direction = -1 (by default).
When a node with a reference count of 1 is processed, it is 'popped' off the heap, and it won't be pushed back on; CBC will just solve its remaining 'child' subproblem. Hence, the net result of processing a node may be one fewer node on the heap (child is infeasible or worse than cutoff, and the reference count is 1), one more node on the heap (child is feasible and better than cutoff, and the reference count is 2), or the same number of nodes on the heap (the other two cases). All of this happens in CbcModel:DoOneNode().
A variable to be branched on is selected at the time a newly created node is pushed on to the heap. This also occurs in CbcModel:DoOneNode() -- it calls one of the 'choose branch' methods of CbcNode (at present by default this is CbcNode::chooseDynamicBranch). Branch variable selection is very important to overall performance, and can itself involve a fair amount of work, including strong branching.
=== Strong Branching ===
CBC must determine which variables are the 'highest priority' for branching (with complete solution of subproblems as described above). The user can specify branch variable priorities, and these are considered first, but if there are no user-specified priorities, or for all variables with the same priority, CBC will determine 'pseudocosts', values which are used for priority-setting. Strong branching is a powerful method for determining these pseudocosts. In strong branching, CBC chooses a number of integer variables and begins branching on each one, but performs a limited number of LP iterations (say 10), to see how much the objective is worsened (degraded) by the bounds imposed on each variable. At a minimum, pseudocosts are computed in strong branching, but this process may yield other information; for example, an LP may be found to be infeasible within the allowed number of iterations.
Strong branching is performed by calling solver->setIntParam(OsiMaxNumIterationHotStart, N) and solver->SolveFromHotStart() within the 'choose branch' methods of CbcNode (again CbcNode::chooseDynamicBranch() is at present the primary method).
=== Cut Generation ===
Cut generation is another key element of LP/MIP performance. In CBC, cuts are added in the method solveWithCuts(), which iteratively solves, looks for cuts that can be generated, and re-solves. This iterative process stops when one of several criteria are satisfied: The maximum number of passes is reached, at the root or deeper in the tree (separate options); a maximum number of cuts have been generated; an added cut doesn't change the objective when re-solved, or the re-solve executes no LP iterations.
CBC uses the specific cutting plane algorithms available in CGL, the Cut Generation Library. An instance of class CbcCutGenerator is created and bound to a CglCutGenerator and to a CbcModel; it contains parameters that control when and how the generateCuts method of the CglCutGenerator will be called. See further comments in the header file CbcCutGenerator.hpp.
The CGL cut classes in the CBC build include CglFlowCover, CglGomory, CglKnapsackCover, CglLandP (Lift and Project), CglMixedIntegerRounding, CglMixedIntegerRounding2, CglOddHole, CglPreProcess, CglProbing, CglResidualCapacity, and CglTwoMir (two-stage Mixed Integer Rounding).
Classes CbcCutModifier and CbcCutSubsetModifier provide a framework for users to modify (e.g. strengthen, weaken, remove) cuts, but these classes are not used in the CBC code at present.
=== Heuristics ===
Heuristics can either find, or (by fixing variables etc.) help find integer solutions early. CBC has a class for each heuristic, derived from the base class CbcHeuristic which is defined in CbcHeuristic.hpp. The base class defines some common options/controls that determine how often the heuristic should be run. It also defines the methods solution() - called to execute the heuristic, shouldHeurRun() - a lightweight test to determine whether the heuristic is worth running at this node, and smallBranchAndBound() which is discussed below.
Three special heuristic classes are defined in CbcHeuristic.hpp: CbcRounding, CbcHeuristicPartial and CbcSerendipity. Others are defined in their own .hpp and .cpp files: CBCHeuristicDive, CBCHeuristicDiveCoefficient, CBCHeuristicDiveFractional, CBCHeuristicDiveGuided, CBCHeuristicDiveLineSearch, CBCHeuristicDivePseudoCost, CBCHeuristicDiveVectorLength, CBCHeuristicFPump (the Feasibility Pump), CBCHeuristicGreedy, CBCHeuristicLocal, CBCHeuristicPivotAndFix, CBCHeuristicRandRound, and four heuristics originally in CBCHeuristicRINS.cpp (Relaxation Induced Neighborhood Search): CBCHeuristicRINS, CBCHeuristicRENS, CBCHeuristicDINS and CBCHeuristicVND.
Heuristics that successfully fix several variables, thereby creating a smaller problem, will call the CbcHeuristic method smallBranchAndBound(), which does a "small/inner" Branch and Bound search starting from the heuristic result, seeking a new integer solution. Again there must be enough fixing to make the problem materially smaller; the test is new(rows+columns)/old(rows+columns) < fractionSmall_, adjusted by a multiplier for large problems. This "inner" Branch and Bound explores at most numberNodes_ (default 200).
In addition, the main Branch and Bound may reach a point where it has fixed a sufficient number of variables, without necessarily using heuristics, so that a "small/inner" Branch and Bound search may be justified. This occurs in CBCModel.cpp, for both the root node and subtree nodes, where the code creates an instance of the special heuristic class CbcSerendipity and calls its smallBranchAndBound() method.