424,952 Members | 1,722 Online Submit an Article
Got Smarts?
Share your bits of IT knowledge by writing an article on Bytes.

Graphs and the BFS algorithm

 Expert 100+ P: 197 Hello (Java) enthusiasts, In this article Id like to tell you a little bit about graphs and how you can search a graph using the BFS (breadth first search) algorithm. Ill address, and hopefully answer, the following questions:  what is a graph?  how can a graph be represented as an ADT?  how can we search/walk through a graph using the BFS algorithm? What is a graph? A graph is a (data) structure consisting of a set of vertices and a collection of pairs of vertices from that set. These pairs denote how the vertices from the graph are connected to each other. Think of Hollywoods film industry as a graph: there are movies made which are produced by actors (for simplicity, I left out actresses, directors, producers, etc. which are also a part of a movie, of course!). The movies and actors are the vertices from that graph and the relation from an actor to a movie is called an edge. Heres an example of a small graph where the Roman numerals are movies and the lower-case alphabetic characters are the actors: Such a graph is called an unweighted, undirected graph. Undirected means there is a relation (edge) from an actor to a movie and that same relation goes back. Think of it as a two-way street, you can go both ways. And unweighted means that it doesnt "cost" anything to go from one vertex to the other. A nice example of a weighted, bidirected (both un- and directed) graph is a street network. The crossings (and dead-ends) are the vertices and the roads connecting those vertices are the edges. It is bidirected because there are two- and one-way streets, and it is weighted because theres a distance between the vertices. Well thats all very nice, I hear you thinking, but  How can a graph be represented as an ADT? As been stated in the previous paragraph: the vertices in a graph are unique. And each vertex "points to" a list of vertices which it is connected to. So, the java.util.Map interface is an ideal structure for such a thing. For those of you that don't know what it is, a java.util.Map is an ADT from the standard JDK which holds a unique key and maps a value to that key. For details, please see the API documentation: http://java.sun.com/j2se/1.5.0/docs/.../util/Map.html And to store a collection of edges, we simply use a java.util.List. This kind of representation of a graph is called an adjacency list. What else do we need in our Graph class? To begin with, we will need to build the following functionality:  we need to be able to add and connect vertices;  a way to parse a text file that holds the data of our graph;  perhaps to override the toString() method to check if the graph is properly built; and, for our BFS algorithm in the next section, we will need to be able to:  get the edges of a specific vertex;  check if a certain vertex is connected to another vertex. Heres an implementation of the Graph class so far: Expand|Select|Wrap|Line Numbers import java.io.*; import java.util.*;   public class Graph {       private Map> adjacencyList;       /**      * Instatiates the 'adjacencyList' and then parse the data file.      */     public Graph(String fileName) throws FileNotFoundException {         adjacencyList = new HashMap>();         parseDataFile(fileName);     }       /**      * This is an undirected graph, so we connect 'vertexA' to 'vertexB'       * and the other way around.      */     public void addConnection(String vertexA, String vertexB) {         connect(vertexA, vertexB);         connect(vertexB, vertexA);     }       /**      * A private helper-method to connect 'vertexA' to 'vertexB'.      * If 'vertexA' alreay exists in our 'adjacencyList', get it's       * edges-list and add 'vertexB' to it. If it doesn't exist,        * create a new ArrayList, add 'vertexB' to it and put it all       * in our 'adjacencyList'.      */     private void connect(String vertexA, String vertexB) {         List edges;         if(adjacencyList.containsKey(vertexA)) {             edges = adjacencyList.get(vertexA);             edges.add(vertexB);         } else {             edges = new ArrayList();             edges.add(vertexB);             this.adjacencyList.put(vertexA, edges);         }     }       /**      * Returns true iff 'vertexA' poits to to 'vertexB'.      * Note that since this is an undirected graph, we do not       * need to check the other way around, the case if 'vertexB'       * is points to 'vertexA'.      */     public boolean isConnectedTo(String vertexA, String vertexB) {         List edges = getEdges(vertexA);         return edges.contains(vertexB);     }       /**      * Returns all the edges of a certain vertex, or throws an       * exception if the vertex doesn't exist in this graph.      */     public List getEdges(String vertex) {         List edges = adjacencyList.get(vertex);         if(edges == null) {             throw new RuntimeException(vertex+" not present in the graph.");         }         return edges;     }       /**      * Reads a text file with the graph-data. The text file contains       * N-blocks of lines where each block starts with the movie followed      * by N-lines of text representing the actors and ending with an       * empty line.      */     private void parseDataFile(String fileName) throws FileNotFoundException {         Scanner file = new Scanner(new File(fileName));         while(file.hasNextLine()) {             String movie = file.nextLine().trim();             while(file.hasNextLine()) {                 String actor = file.nextLine().trim();                 if(actor.length() == 0) break;                 addConnection(movie, actor);             }         }     }       /**      * A Sting representation if this Graph.      */     public String toString() {         StringBuilder builder = new StringBuilder();         Iterator vertices = adjacencyList.keySet().iterator();         while(vertices.hasNext()) {             String vertex = vertices.next();             List edges = adjacencyList.get(vertex);             builder.append(vertex);             builder.append(" --> ");             builder.append(edges);             builder.append('\n');         }         return builder.toString();     }       /**      * main      */     public static void main(String[] args) {         try {             Graph graph = new Graph("data.txt");             System.out.println(graph);         } catch (FileNotFoundException e) {             e.printStackTrace();         }     } }   Heres the contents of the file data.txt for the graph I posted an image of: Expand|Select|Wrap|Line Numbers I   a   b   c   d   e   II   c   h   q   r   III   h   o   p   IV   e   f   g   V   c   g   j   VI   k   m   n   o   As you can see if you run this code, the graph is now properly built. So, we're ready to do some brute-force searching on our graph. How can we search/walk through a graph using the BFS algorithm? First of all: what is BFS anyway? BFS is a brute, and rather simple way of traversing your graph starting at a given vertex. It divides the graph in several levels. The starting point of a BFS is called level 0, it's direct edges level 1, the edges of it's edges level 2, and so on. You could compare it when you throw a stone in a pool of water: the starting vertex is where the stone hits the water and the ripples from the impact are "discovering" unknown territory. So, this algorithm can be used to find a shortest path between two vertices. By a shortest path in this case I mean the path from one vertex to another while traversing the least possible number of edges. Note that I said "in this case", because in the case of a weighted graph, the shortest path is not necessarily the one with the least edges: one direct road between two vertices of a length of 10 miles, is longer than two roads with a length of 4 miles, of course. Let's say would like to find the shortest path between vertices g and n. As you can see there are two possible paths between them: Expand|Select|Wrap|Line Numbers  g -> IV -> e -> I -> c -> II -> h -> III -> o -> VI -> n  and Expand|Select|Wrap|Line Numbers  g -> V -> c -> II -> h -> III -> o -> VI -> n  Needles to say, we are interested in finding the second path, which is the shortest. To keep things orderly, I made a separate class which creates a graph from a text file and finds a shortest path between two vertices. The first thing that happens is that during the BFS traversal a 'container' is being created of newly discovered vertices where the first vertex (the starting point) is on index 0 (level 0), it's edges at index 1 (level 1), and so on. So, after discovering our end vertex, that 'container' will look like this: Expand|Select|Wrap|Line Numbers level 0 = g level 1 = IV, V level 2 = e, f, c, j level 3 = I, II level 4 = a, b, d, h, q, r level 5 = III level 6 = o, p level 7 = VI level 8 = n   After that, we back-track from the end vertex ("n") towards the start vertex ("g") using our areConnected(String, String) method from the Graph class to see if two vertices are connected. That way we end up with the shortest path from "g" to "n". Expand|Select|Wrap|Line Numbers import java.io.*; import java.util.*;   public class BFSAlgorithm {       private Graph graph;       /**      * Constructor.      */     public BFSAlgorithm(Graph g) {         graph = g;     }       /**      * 1 - Create a stack to store all the vertices of our path on.      * 2 - First push the 'end' vertex on our stack.      * 3 - Now loop from the highest level back to the first level and      *     a. loop through each level and      *     b. check each vertex in that level if it's connected to the      *        vertex on the top of our stack, if we find a match, push that      *        match on the stack and break out of the loop.      * 4 - Now we only need to reverse the collection (stack) before returning       *     the path in the "correct" order (from start to finish).      *       * Here's an example ASCII drawing of backtracking from the end vertex "n"       * to the starting vertex "g". The arrows, <-, denote the path that was       * found.      *       * level:  0      1      2      3       4      5      6    7     8      *        ---------------------------------------------------------      *         g <-+  IV     e      I       a   +- III <- o <- VI <- n       *             +- V <-+  f   +- II <-+  b   |         p      *                    +- c <-+       |  d   |      *                       j           +- h <-+      *                                      q      *                                      r      */     private List backTrack(List> container, String end) {         Stack path = new Stack();                     // 1         path.push(end);                                               // 2         for(int i = container.size()-1; i >= 0; i--) {                // 3             List level = container.get(i);             String last = path.peek();             for(String s : level) {                                   // a                 if(graph.areConnected(last, s)) {                     // b                     path.push(s);                     break;                 }             }         }         Collections.reverse(path);                                    // 4         return path;     }       /**      * 1 - Get the level from the 'container' which was added last.      * 2 - Create a new level to store (possible) unexplored verices in.      * 3 - Loop through each of the vertices from the last added level, and      *     a. get the neighboring vertices connected to each vertex,      *     b. loop through each connecting vertex and if the connecting vertex      *        has not yet been visited,      *     c. only then add it to the newly created level-list and mark it as       *        visited in our set.      * 4 - We don't need to search any further if we stumble upon the 'end'       *     vertex. In that case, just "return" from this method.      * 5 - Only make the recursive call if we have found vertices which have       *     not been explored yet.      */     private void bfs(List> container,              String end, Set visited) {           List lastLevel = container.get(container.size()-1);   // 1         List newLevel = new ArrayList();              // 2           for(String vertex : lastLevel) {                              // 3             List edges = graph.getEdges(vertex);              // a             for(String edge : edges) {                                // b                 if(!visited.contains(edge)) {                         // c                     visited.add(edge);                     newLevel.add(edge);                 }                 if(edge.equals(end)) return;                          // 4             }         }           if(newLevel.size() > 0) {                                     // 5             container.add(newLevel);             bfs(container, end, visited);         }     }       /**      * 1 - Create an empty 'container' to store all the levels from the       *     'start'-vertex in. A level is also a list with one or more vertices.      * 2 - The first level only holds the 'start' vertex, which is added first,      *     this is the 'init' list.      * 3 - The 'start' vertex is also stored in a Set which keeps track of all       *     the vertices we have encountered so that we don't traverse vertices      *     twice (or more).      * 4 - Once we initialized the steps 1-3, we can call the actual BFS-      *     algorithm.      * 5 - Once the BFS-algorithm is done, we call the backTrack(...) method       *     to find the shortest path between 'end' and 'start' between the       *     explored levels of the graph.       */     public List getShortestPath(String start, String end) {         List> container = new ArrayList>(); // 1         List init = new ArrayList();                  // 2         init.add(start);         container.add(init);         Set visited = new HashSet();                  // 3         visited.add(start);         bfs(container, end, visited);                                 // 4         return backTrack(container, end);                             // 5     }       /**      * Main method:      *  1 - Create a Graph.      *  2 - Get a shortest path between two vertices.      *  3 - Print the shortest path.      */     public static void main(String[] args) throws FileNotFoundException {         Graph graph = new Graph("data.txt");                          // 1         BFSAlgorithm bfsAlgorithm = new BFSAlgorithm(graph);         List shortestPath =              bfsAlgorithm.getShortestPath("g", "n");                   // 2         for(int i = 0; i < shortestPath.size(); i++) {             System.out.print(shortestPath.get(i));                    // 3             System.out.print(i < shortestPath.size()-1 ? "\n -> " : "\n");         }     } }   Ok, that was more lines of text than I had anticipated. I hope I didn't bore you too much. For those who want to experiment some more: I made a small selection of movies and actors (+/- 70000 vertices) from the IMDB's data files* which you can experiment with to see how the actors Ron Jeremy and Kevin Bacon are connected (are they even connected???). You can find that file here: http://www.iruimte.nl/graph/imdb.txt which can be parsed in the same way as the text file above. Regards, Bart (prometheuzz) * All data files can be found here: ftp://ftp.sunet.se/pub/tv+movies/imdb/ Jun 20 '07 #1

 10K+ P: 13,264 Neat. A very nice article indeed Bart. Jun 22 '07 #2

 Expert 100+ P: 197 Neat. A very nice article indeed Bart. Thanks! ; ) .......... Jun 22 '07 #3

 P: 1 thx for article.. this is my project..Thx again Mar 9 '08 #4

 P: 1 Great Article! Excellent comments and coding! Mar 24 '08 #5 