By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
424,952 Members | 1,722 Online
Bytes IT Community
Submit an Article
Got Smarts?
Share your bits of IT knowledge by writing an article on Bytes.

Graphs and the BFS algorithm

prometheuzz
Expert 100+
P: 197
Hello (Java) enthusiasts,

In this article I’d like to tell you a little bit about graphs and how you can
search a graph using the BFS (breadth first search) algorithm.

I’ll address, and hopefully answer, the following questions:
• what is a graph?
• how can a graph be represented as an ADT?
• how can we search/walk through a graph using the BFS algorithm?


What is a graph?

A graph is a (data) structure consisting of a set of vertices and a collection
of pairs of vertices from that set. These pairs denote how the vertices from
the graph are connected to each other.
Think of Hollywood’s film industry as a graph: there are movies made which are
produced by actors (for simplicity, I left out actresses, directors, producers,
etc. which are also a part of a movie, of course!). The movies and actors are
the vertices from that graph and the relation from an actor to a movie is
called an edge.
Here’s an example of a small graph where the Roman numerals are movies and the
lower-case alphabetic characters are the actors:



Such a graph is called an unweighted, undirected graph. Undirected means there
is a relation (edge) from an actor to a movie and that same relation goes back.
Think of it as a two-way street, you can go both ways. And unweighted means that
it doesn’t "cost" anything to go from one vertex to the other.
A nice example of a weighted, bidirected (both un- and directed) graph is a
street network. The crossings (and dead-ends) are the vertices and the roads
connecting those vertices are the edges. It is bidirected because there are
two- and one-way streets, and it is weighted because there’s a distance between
the vertices.
Well that’s all very nice, I hear you thinking, but …


How can a graph be represented as an ADT?

As been stated in the previous paragraph: the vertices in a graph are unique.
And each vertex "points to" a list of vertices which it is connected to. So,
the java.util.Map interface is an ideal structure for such a thing. For those
of you that don't know what it is, a java.util.Map is an ADT from the standard
JDK which holds a unique key and maps a value to that key. For details, please
see the API documentation:
http://java.sun.com/j2se/1.5.0/docs/.../util/Map.html
And to store a collection of edges, we simply use a java.util.List. This kind
of representation of a graph is called an adjacency list.

What else do we need in our Graph class? To begin with, we will need to build
the following functionality:
• we need to be able to add and connect vertices;
• a way to parse a text file that holds the data of our graph;
• perhaps to override the toString() method to check if the graph is properly built;

and, for our BFS algorithm in the next section, we will need to be able to:
• get the edges of a specific vertex;
• check if a certain vertex is connected to another vertex.

Here’s an implementation of the Graph class so far:

Expand|Select|Wrap|Line Numbers
  1. import java.io.*;
  2. import java.util.*;
  3.  
  4. public class Graph {
  5.  
  6.     private Map<String, List<String>> adjacencyList;
  7.  
  8.     /**
  9.      * Instatiates the 'adjacencyList' and then parse the data file.
  10.      */
  11.     public Graph(String fileName) throws FileNotFoundException {
  12.         adjacencyList = new HashMap<String, List<String>>();
  13.         parseDataFile(fileName);
  14.     }
  15.  
  16.     /**
  17.      * This is an undirected graph, so we connect 'vertexA' to 'vertexB' 
  18.      * and the other way around.
  19.      */
  20.     public void addConnection(String vertexA, String vertexB) {
  21.         connect(vertexA, vertexB);
  22.         connect(vertexB, vertexA);
  23.     }
  24.  
  25.     /**
  26.      * A private helper-method to connect 'vertexA' to 'vertexB'.
  27.      * If 'vertexA' alreay exists in our 'adjacencyList', get it's 
  28.      * edges-list and add 'vertexB' to it. If it doesn't exist,  
  29.      * create a new ArrayList, add 'vertexB' to it and put it all 
  30.      * in our 'adjacencyList'.
  31.      */
  32.     private void connect(String vertexA, String vertexB) {
  33.         List<String> edges;
  34.         if(adjacencyList.containsKey(vertexA)) {
  35.             edges = adjacencyList.get(vertexA);
  36.             edges.add(vertexB);
  37.         } else {
  38.             edges = new ArrayList<String>();
  39.             edges.add(vertexB);
  40.             this.adjacencyList.put(vertexA, edges);
  41.         }
  42.     }
  43.  
  44.     /**
  45.      * Returns true iff 'vertexA' poits to to 'vertexB'.
  46.      * Note that since this is an undirected graph, we do not 
  47.      * need to check the other way around, the case if 'vertexB' 
  48.      * is points to 'vertexA'.
  49.      */
  50.     public boolean isConnectedTo(String vertexA, String vertexB) {
  51.         List<String> edges = getEdges(vertexA);
  52.         return edges.contains(vertexB);
  53.     }
  54.  
  55.     /**
  56.      * Returns all the edges of a certain vertex, or throws an 
  57.      * exception if the vertex doesn't exist in this graph.
  58.      */
  59.     public List<String> getEdges(String vertex) {
  60.         List<String> edges = adjacencyList.get(vertex);
  61.         if(edges == null) {
  62.             throw new RuntimeException(vertex+" not present in the graph.");
  63.         }
  64.         return edges;
  65.     }
  66.  
  67.     /**
  68.      * Reads a text file with the graph-data. The text file contains 
  69.      * N-blocks of lines where each block starts with the movie followed
  70.      * by N-lines of text representing the actors and ending with an 
  71.      * empty line.
  72.      */
  73.     private void parseDataFile(String fileName) throws FileNotFoundException {
  74.         Scanner file = new Scanner(new File(fileName));
  75.         while(file.hasNextLine()) {
  76.             String movie = file.nextLine().trim();
  77.             while(file.hasNextLine()) {
  78.                 String actor = file.nextLine().trim();
  79.                 if(actor.length() == 0) break;
  80.                 addConnection(movie, actor);
  81.             }
  82.         }
  83.     }
  84.  
  85.     /**
  86.      * A Sting representation if this Graph.
  87.      */
  88.     public String toString() {
  89.         StringBuilder builder = new StringBuilder();
  90.         Iterator<String> vertices = adjacencyList.keySet().iterator();
  91.         while(vertices.hasNext()) {
  92.             String vertex = vertices.next();
  93.             List<String> edges = adjacencyList.get(vertex);
  94.             builder.append(vertex);
  95.             builder.append(" --> ");
  96.             builder.append(edges);
  97.             builder.append('\n');
  98.         }
  99.         return builder.toString();
  100.     }
  101.  
  102.     /**
  103.      * main
  104.      */
  105.     public static void main(String[] args) {
  106.         try {
  107.             Graph graph = new Graph("data.txt");
  108.             System.out.println(graph);
  109.         } catch (FileNotFoundException e) {
  110.             e.printStackTrace();
  111.         }
  112.     }
  113. }
  114.  
Here’s the contents of the file data.txt for the graph I posted an image of:

Expand|Select|Wrap|Line Numbers
  1. I
  2.   a
  3.   b
  4.   c
  5.   d
  6.   e
  7.  
  8. II
  9.   c
  10.   h
  11.   q
  12.   r
  13.  
  14. III
  15.   h
  16.   o
  17.   p
  18.  
  19. IV
  20.   e
  21.   f
  22.   g
  23.  
  24. V
  25.   c
  26.   g
  27.   j
  28.  
  29. VI
  30.   k
  31.   m
  32.   n
  33.   o
  34.  
As you can see if you run this code, the graph is now properly built. So,
we're ready to do some brute-force searching on our graph.


How can we search/walk through a graph using the BFS algorithm?

First of all: what is BFS anyway? BFS is a brute, and rather simple way of
traversing your graph starting at a given vertex. It divides the graph in
several levels. The starting point of a BFS is called level 0, it's direct
edges level 1, the edges of it's edges level 2, and so on. You could compare
it when you throw a stone in a pool of water: the starting vertex is where
the stone hits the water and the ripples from the impact are "discovering"
unknown territory. So, this algorithm can be used to find a shortest path
between two vertices. By a shortest path in this case I mean the path from
one vertex to another while traversing the least possible number of edges.
Note that I said "in this case", because in the case of a weighted graph, the
shortest path is not necessarily the one with the least edges: one direct road
between two vertices of a length of 10 miles, is longer than two roads with a
length of 4 miles, of course.

Let's say would like to find the shortest path between vertices g and
n. As you can see there are two possible paths between them:
Expand|Select|Wrap|Line Numbers
  1.  g -> IV -> e -> I -> c -> II -> h -> III -> o -> VI -> n 
and
Expand|Select|Wrap|Line Numbers
  1.  g -> V -> c -> II -> h -> III -> o -> VI -> n 
Needles to say, we are interested in finding the second path, which is the
shortest.

To keep things orderly, I made a separate class which creates a graph from a
text file and finds a shortest path between two vertices.
The first thing that happens is that during the BFS traversal a 'container' is
being created of newly discovered vertices where the first vertex (the starting
point) is on index 0 (level 0), it's edges at index 1 (level 1), and so on.
So, after discovering our end vertex, that 'container' will look like this:

Expand|Select|Wrap|Line Numbers
  1. level 0 = g
  2. level 1 = IV, V
  3. level 2 = e, f, c, j
  4. level 3 = I, II
  5. level 4 = a, b, d, h, q, r
  6. level 5 = III
  7. level 6 = o, p
  8. level 7 = VI
  9. level 8 = n
  10.  
After that, we back-track from the end vertex ("n") towards the start vertex
("g") using our areConnected(String, String) method from the Graph class
to see if two vertices are connected. That way we end up with the shortest
path from "g" to "n".

Expand|Select|Wrap|Line Numbers
  1. import java.io.*;
  2. import java.util.*;
  3.  
  4. public class BFSAlgorithm {
  5.  
  6.     private Graph graph;
  7.  
  8.     /**
  9.      * Constructor.
  10.      */
  11.     public BFSAlgorithm(Graph g) {
  12.         graph = g;
  13.     }
  14.  
  15.     /**
  16.      * 1 - Create a stack to store all the vertices of our path on.
  17.      * 2 - First push the 'end' vertex on our stack.
  18.      * 3 - Now loop from the highest level back to the first level and
  19.      *     a. loop through each level and
  20.      *     b. check each vertex in that level if it's connected to the
  21.      *        vertex on the top of our stack, if we find a match, push that
  22.      *        match on the stack and break out of the loop.
  23.      * 4 - Now we only need to reverse the collection (stack) before returning 
  24.      *     the path in the "correct" order (from start to finish).
  25.      * 
  26.      * Here's an example ASCII drawing of backtracking from the end vertex "n" 
  27.      * to the starting vertex "g". The arrows, <-, denote the path that was 
  28.      * found.
  29.      * 
  30.      * level:  0      1      2      3       4      5      6    7     8
  31.      *        ---------------------------------------------------------
  32.      *         g <-+  IV     e      I       a   +- III <- o <- VI <- n 
  33.      *             +- V <-+  f   +- II <-+  b   |         p
  34.      *                    +- c <-+       |  d   |
  35.      *                       j           +- h <-+
  36.      *                                      q
  37.      *                                      r
  38.      */
  39.     private List<String> backTrack(List<List<String>> container, String end) {
  40.         Stack<String> path = new Stack<String>();                     // 1
  41.         path.push(end);                                               // 2
  42.         for(int i = container.size()-1; i >= 0; i--) {                // 3
  43.             List<String> level = container.get(i);
  44.             String last = path.peek();
  45.             for(String s : level) {                                   // a
  46.                 if(graph.areConnected(last, s)) {                     // b
  47.                     path.push(s);
  48.                     break;
  49.                 }
  50.             }
  51.         }
  52.         Collections.reverse(path);                                    // 4
  53.         return path;
  54.     }
  55.  
  56.     /**
  57.      * 1 - Get the level from the 'container' which was added last.
  58.      * 2 - Create a new level to store (possible) unexplored verices in.
  59.      * 3 - Loop through each of the vertices from the last added level, and
  60.      *     a. get the neighboring vertices connected to each vertex,
  61.      *     b. loop through each connecting vertex and if the connecting vertex
  62.      *        has not yet been visited,
  63.      *     c. only then add it to the newly created level-list and mark it as 
  64.      *        visited in our set.
  65.      * 4 - We don't need to search any further if we stumble upon the 'end' 
  66.      *     vertex. In that case, just "return" from this method.
  67.      * 5 - Only make the recursive call if we have found vertices which have 
  68.      *     not been explored yet.
  69.      */
  70.     private void bfs(List<List<String>> container, 
  71.             String end, Set<String> visited) {
  72.  
  73.         List<String> lastLevel = container.get(container.size()-1);   // 1
  74.         List<String> newLevel = new ArrayList<String>();              // 2
  75.  
  76.         for(String vertex : lastLevel) {                              // 3
  77.             List<String> edges = graph.getEdges(vertex);              // a
  78.             for(String edge : edges) {                                // b
  79.                 if(!visited.contains(edge)) {                         // c
  80.                     visited.add(edge);
  81.                     newLevel.add(edge);
  82.                 }
  83.                 if(edge.equals(end)) return;                          // 4
  84.             }
  85.         }  
  86.         if(newLevel.size() > 0) {                                     // 5
  87.             container.add(newLevel);
  88.             bfs(container, end, visited);
  89.         }
  90.     }
  91.  
  92.     /**
  93.      * 1 - Create an empty 'container' to store all the levels from the 
  94.      *     'start'-vertex in. A level is also a list with one or more vertices.
  95.      * 2 - The first level only holds the 'start' vertex, which is added first,
  96.      *     this is the 'init' list.
  97.      * 3 - The 'start' vertex is also stored in a Set which keeps track of all 
  98.      *     the vertices we have encountered so that we don't traverse vertices
  99.      *     twice (or more).
  100.      * 4 - Once we initialized the steps 1-3, we can call the actual BFS-
  101.      *     algorithm.
  102.      * 5 - Once the BFS-algorithm is done, we call the backTrack(...) method 
  103.      *     to find the shortest path between 'end' and 'start' between the 
  104.      *     explored levels of the graph. 
  105.      */
  106.     public List<String> getShortestPath(String start, String end) {
  107.         List<List<String>> container = new ArrayList<List<String>>(); // 1
  108.         List<String> init = new ArrayList<String>();                  // 2
  109.         init.add(start);
  110.         container.add(init);
  111.         Set<String> visited = new HashSet<String>();                  // 3
  112.         visited.add(start);
  113.         bfs(container, end, visited);                                 // 4
  114.         return backTrack(container, end);                             // 5
  115.     }
  116.  
  117.     /**
  118.      * Main method:
  119.      *  1 - Create a Graph.
  120.      *  2 - Get a shortest path between two vertices.
  121.      *  3 - Print the shortest path.
  122.      */
  123.     public static void main(String[] args) throws FileNotFoundException {
  124.         Graph graph = new Graph("data.txt");                          // 1
  125.         BFSAlgorithm bfsAlgorithm = new BFSAlgorithm(graph);
  126.         List<String> shortestPath = 
  127.             bfsAlgorithm.getShortestPath("g", "n");                   // 2
  128.         for(int i = 0; i < shortestPath.size(); i++) {
  129.             System.out.print(shortestPath.get(i));                    // 3
  130.             System.out.print(i < shortestPath.size()-1 ? "\n -> " : "\n");
  131.         }
  132.     }
  133. }
  134.  
Ok, that was more lines of text than I had anticipated. I hope I didn't bore
you too much. For those who want to experiment some more: I made a small
selection of movies and actors (+/- 70000 vertices) from the IMDB's data files*
which you can experiment with to see how the actors Ron Jeremy and
Kevin Bacon are connected (are they even connected???). You can find that file here:
http://www.iruimte.nl/graph/imdb.txt which can be parsed in the same way as
the text file above.

Regards,
Bart (prometheuzz)

* All data files can be found here: ftp://ftp.sunet.se/pub/tv+movies/imdb/
Jun 20 '07 #1
Share this Article
Share on Google+
4 Comments


10K+
P: 13,264
Neat. A very nice article indeed Bart.
Jun 22 '07 #2

prometheuzz
Expert 100+
P: 197
Neat. A very nice article indeed Bart.
Thanks!
; )

..........
Jun 22 '07 #3

P: 1
thx for article.. this is my project..Thx again
Mar 9 '08 #4

P: 1
Great Article! Excellent comments and coding!
Mar 24 '08 #5