Mini-explanation of BFS and DFS

Disclaimer: all code in this post is untested-as-posted, but hopefully illustrates some basic concepts. Also, compared to a few of the folks I know, I suck at “algorithmic problems” such as these (i.e. those used on algorithm competitions such as IOI, Code Jam, Topcoder, etc) – so the solutions are almost certainly suboptimal. Post your own solutions in comments.

So you need to perform a search within fenced area in a 2D array of data. Searching includes counting for objects you want to find and flood-filling this area. Think “bucket” in various raster painting programs that fill pixels until reaching a fence – a “different” color. There are two primary algorithms to do this.

Depth-first search DFS or depth first search goes as deep as possible in a particular direction before reaching fence and backing off just as much is needed to change the direction. It’s implemented using a stack. Due to this, it commonly involves simply using recursion to explore surroundings of the “current” tile. (Let’s use “tile” as a name for cells of a 2D array.) Each tile is an entry on the stack — when implemented via recursion, it’s a single function call or a ‘stack frame’.

In the code below, I’ll be using a one-dimensional array. Two-dimensional arrays have… semantical particularities that I prefer to avoid.

#define WIDTH 16
#define HEIGHT 7
char field[WIDTH*HEIGHT] =
"................" // This is a single string.
"...#######.###.." // It's perfectly valid to break them like this.
"..##.....###.#.." // Also, never forget: C strings are arrays of characters.
"..#..........#.."
"..############.."
"..#.#..........."
"..###...........";

void dfs(int x, int y)
{
if(field[y*WIDTH + x] != '.')
{
// we can't do anything here, tile is not 'empty'.
return;
}

// do whatever you want to do here...
// for example, fill tile with a different character.
// in your paint program, that'd be a different color
field[y*WIDTH + x] = '!';

// now let's spread further.
// in case we should not have spread there, we'll
// abort when we get there.
dfs(x, y-1); // up
dfs(x, y+1); // down
dfs(x-1, y); // left
dfs(x+1, y); // right;
}

int main()
{
dfs(6, 3);
return 0;
}

Before utilizing this algorithm, consider that stack may not be as large as you may want it to be, so recursively calling like this might not be a good idea.

dfs(6, 3) makes the algorithm begin somewhere within the large fenced area. (Can you figure out where?) As implemented, it paints the current tile, goes up, paints the current tile, goes up, gives up and steps back, goes down, gives up and steps back, goes left, paints the current tile, goes up, gives up and steps back, goes down, fills the current tile, goes up, gives up and steps back, goes down, gives up and steps back, …

In other words, take a piece of paper and a pencil, and trace the algorithm yourself.

Algorithm terminates when all four tests have failed (and it’s bound to happen eventually, since you’ve been coloring the tiles you’ve covered all along).

If you’re not modifying underlying data, you’ll want to have an additional array filled with “not covered” marks (e.g. boolean value of false – perhaps memset() to zero). As you pass certain tiles, you mark them as having been handled, and you use the “must be not covered” as an additional fencing condition. BFS or breadth first search has a somewhat more natural search order. Like a bucket of spilt water, it spreads from the point where it started, instead of weirdly going into a single direction, then spreading around while it’s backing off. (Again, use paper and pencil and draw both algorithms if you can’t visualize the behavior.)

BFS involves the use of a queue in place of a stack. It can’t be implemented recursively; instead, you can think of it as a “list of tasks we want to execute in the future, first come, first served” (or FIFO – first in, first out).

So, since the earliest tiles were queued earliest, they’ll be processed earliest.

In this example, I’ll only show you the new bfs() function. For implementing the queue, I’ll use STL vectors, since I’m too lazy to code queue myself and I’m too lazy to look up a proper data structure offered in STL.

struct bfs_task // define data type "bfs_task".
{
int x, y;
};

void bfs(int x, int y)
{
std::vector bfs_queue;

// add initial position to the task queue
bfs_queue.push_back((bfs_task){.x = x, .y = y});

// start rolling over the queue.
// not using iterators since we'll be modifying the vector,
// which would break the iterators.
while (bfs_queue.size()) // at least one task in the queue
{
int x = currentTask.x, y = currentTask.y;

// remove the first task
bfs_queue.erase(bfs_queue.begin(), bfs_queue.begin()+1);
// note! the line before actually copies everything "one element up".
// but I'm too lazy to look up an optimal STL template to use.
// I don't use C++ daily anymore, and this was a quickly written post.

// can we be here? if not, skip
if(field[y*WIDTH + x] != '.')
{
// we can't do anything here, tile is not 'empty'.
return;
}

// do whatever you want to do here...
// for example, fill tile with a different character.
// in your paint program, that'd be a different color
field[y*WIDTH + x] = '!';

// now let's spread further.
// in case we should not have spread there, we'll
// abort when we get there.
// please note: we're actually just queueing tasks for future use.
bfs_queue.push_back((bfs_task){.x = x, .y = y-1}); // up
bfs_queue.push_back((bfs_task){.x = x, .y = y+1}); // down
bfs_queue.push_back((bfs_task){.x = x-1, .y = y}); // left
bfs_queue.push_back((bfs_task){.x = x+1, .y = y}); // right
}
}

Once again: std::vector is a bad data structure to use here due to slow .erase().