## How to find biconnected components

Every serious competitive programmer knows how to find all bridges and articulation points in an undirected graph in linear time. Recently, however, I came across a problem that required finding the biconnected components explicitly, and I couldn’t figure out how to do it. The usual suspects, surprisingly, don’t explain it. Sedgewick’s Algorithms, 3rd ed., Part V discusses biconnectivity and explains how to find bridges, but does not explain how to find biconnected components. Skiena’s The Algorithm Design Manual gives code for finding articulation points, but similarly does not explain how to find biconnected components. After consulting some online sources, I saw that the idea is to maintain a stack of visited edges, and upon discovering a complete biconnected component in the course of DFS, pop the corresponding edges off.

Let’s review some definitions. A bridge in an undirected graph is an edge that, if removed, would increase the number of connected components. (Alternatively, a bridge is an edge that, if removed, would disconnect the connected component that it belongs to.) An articulation point of an undirected graph is a vertex that, if removed, would increase the number of connected components. Each endpoint of a bridge is an articulation point (unless its degree is 1) but the converse is false; some articulation points are not associated with any bridge. Consider a graph with 5 vertices, consisting of two triangles (i.e., complete graphs of order 3) that share a common vertex. It has no bridges, but the node common to the two triangles is an articulation point. A biconnected graph is a connected graph that has no articulation points; it remains connected if any single vertex is removed. A biconnected component of an undirected graph is a maximal biconnected subgraph. Every edge in an undirected graph belongs to exactly one biconnected component. Any two biconnected components may have at most one vertex in common; such a vertex is an articulation point.

The well-known classic DFS-based approach to find articulation points reachable from a given starting vertex is due to Hopcroft and Tarjan. Here is pseudocode:

``````// `num_visited` is the next preorder number
global num_visited := 0;
// `pre[v]` is the preorder number of `v`, if it has been visited
global pre: array;
global ap := {};
function dfs(v, parent) {
pre[v] := num_visited;
num_visited := num_visited + 1;
// `low[v]` is the lowest preorder number reachable from `v` by following
// zero or more tree edges and then at most one back edge
low := pre[v];
num_children := 0;
for {w | w is a neighbour of v} {
if (w = p) continue;
if (pre[w] = nil) {
// (v, w) is a tree edge
num_children := num_children + 1;
child_low := dfs(w, v);
if ((p != nil and child_low >= pre[v]) or
(p  = nil and num_children > 1)) {
// a non-root vertex `v` is an articulation point iff from `w` we
// could not reach anything visited strictly before `v`, while the
// root is an articulation point iff it has more than one child
}
low := min(low, child_low);
} else {
// (v, w) is a back edge
low := min(low, pre[w]);
}
}
return low;
}``````

(In case you forgot, the code to find bridges is very similar. The edge `(v, w)` is a bridge if and only if `child_low` is strictly greater than `pre[v]`, with no special case for the root.)

Knowledge of the entire set of articulation points is not enough to construct the biconnected components explicitly. We know that for a non-articulation point, all incident edges belong to the same biconnected component, whereas for an articulation point, not all incident edges belong to the same biconnected component. Running the above algorithm to find all articulation points does not tell us which edges incident upon an articulation point belong to a common biconnected component. Instead, we have to modify the DFS algorithm. Some of the code online was either overly complicated or of dubious correctness, so I wrote my own. In the below pseudocode, changes from the previous code block are bolded, and comments from the previous code block are not repeated.

``````global num_visited := 0;
global pre: array;
global bicomp := {};
global stk: stack;
function dfs(v, parent) {
pre[v] := num_visited;
num_visited := num_visited + 1;
low := pre[v];
num_children := 0;
for {w | w is a neighbour of v} {
if (w = p) continue;
if (pre[w] = nil) {
num_children := num_children + 1;
stk.push((v, w));
child_low := dfs(w, v);
if ((p != nil and child_low >= pre[v]) or
(p  = nil and num_children > 1)) {
// We just finished exploring a biconnected component starting with
// the edge (v, w)
current_bicomp = {};
do {
e := stk.pop();
} while (e != (v, w));
}
low := min(low, child_low);
} else {
low := min(low, pre[w]);
if (pre[w] < pre[v]) {
// (v, w) is a back edge, but still needs to be pushed because it
// is part of the enclosing biconnected component
stk.push((v, w));
}
}
}
return low;
}``````

One change from the code to find the articulation points is that we no longer need to keep track of the number of children. If the root has multiple children, then each time we take a tree edge from the root, we explore one biconnected component containing the root before returning. (The presence of multiple such biconnected components implies that the root is an articulation point.) For any non-root vertex `v`, each time we take a tree edge to a child from which no node that was visited strictly before `v` is reachable, we explore one biconnected component before returning. In each case, upon returning, we pop off all the edges explored and package them as a single biconnected component.

I don’t usually like to write pseudocode, because actual code is more precise, and I’m proficient enough in C++ that I can write actual code in C++ and have it be almost as readable as pseudocode (you might think of this as a weird flex). However, in this particular case, the implementation details are sort of annoying: when we mark an edge as belonging to a particular biconnected component, we generally will want to store the biconnected component ID of that edge alongside the edge itself in the adjacency list structure (for efficient traversal and lookup), and need to make sure that the reverse edge (which is, of course, the same edge, since this is an undirected graph, but is stored in a different row of the adjacency list) is also so marked. To see C++ code that does do this, see my solution to SPOJ STOPCITY. To see (Java) code that doesn’t mark the edges but just the vertices, indy256 comes to the rescue as always. (This representation doesn’t directly give you enough information to solve STOPCITY. You need to convert it into a forest where each node represents a biconnected component of the original graph, and two nodes are joined by an edge if the corresponding biconnected components share a vertex. Then, there exists exactly one pair of nodes `(s', d')` in this tree such that `s'` contains `s`, `d'` contains `d`, and the path from `(s', d')` does not pass through any other vertex that contains `s` or `d`. The answer is the set of all vertices that belong to any biconnected component corresponding to any node on the path from `s'` to `d'`.) 