CIS 3223 Lab09 – Implementing Depth-First Graph (DFG) Search for Connectivity Analysis

DUE: (before the next lab, 10:30am Friday, April 3, 2008)


 

Depth-First Approach for Discovery of Connected Graph Regions

 

In Lab 8 you implemented the DFS algorithm (Fig 3.3 in the Dasgupta textbook) that finds all reachable nodes from a given node of an undirected graph. In this lab, you will use DFS function to write a Matlab code that searches for all connected regions of the graph. The pseudo code for that function is shown in Figure 3.5 of the textbook.

 

TASKS

1. Your goal is to implement in Matlab a DFS_CONNECTED algorithm that, given a graph, finds all connected regions of an undirected graph. The function definition is:

 

function regions = dfs_connected(G)

 

You should assume:

G to be the adjacency matrix of dimension N*N, where N is the number of graph nodes.

regions to be an array of size N*1 with integer values that indicate to which region each of the N nodes belongs. For example, if all nodes are connected, regions will be an array of ones. For graph of Figure 3.2, the outcome should be regions = [1 1 1 1 1 1 1 1 1 1 2 2].

Observe that dsf function from Lab 8 should be used as a subroutine.

 

2. Evaluate the DFS_CONNECTED function. Use graph from figure 3.2 in the textbook. Its edges are

E = {(A,B), (A,C), (A,D), (B,F), (B,E), (C,F), (D,G), (D,H), (E,I), (E,J), (G,H), (I,J), (K,L)}.

 

3. Write function DSF_ANALYSIS (and provide the resulting code) that analyzes regions and outputs sizes of all unique regions in the graph. Functions unique and find will be very useful to answer this. The function definition is:

 

function summary = dfs_analysis(regions)

 

For example, for graph of Figure 3.2, the outcome should be summary = [10 2], meaning that there are 10 nodes in the first connected region and 2 in the second.

 

4. Generate graphs G of size N = {10, 50, 100, 500} and with alpha = {0.01, 0.1}  using the following code:

G = sprand(N,N,alpha);

Explain what function sprand is doing?

Run DFS_CONNECTED 10 times for each possible combination of N and alpha. For each run, save the number of disjoint connected regions, size of the largest region, average size of the region, and time needed to run the DFS.

 

5. Analyze the runtime results of Task 4 and explore if the runtime confirms the theoretical result O(|V| + |E|). Most likely, the runtime will appear to follow O(|V|2). The reason is that adjacency matrix representation requires you to loop through every element of the matrix. Take a look at your implementation and comment if this is correct.

 

6. You should implement a more efficient DSF Matlab algorithm that uses graph representation with adjacency lists. The following is the code that, starting from graph G represented as a matrix, creates GL which is Matlab’s version of the adjacency list:

 

for i=1:N

GL{i} = find(G(i,:) == 1);

end

 

Explain what does this code achieve and what is the connection between GL and the adjacency list. Argue what is the runtime complexity of converting G to GL? Write function DSF_CONNECTED_AL that uses GL graph representation,

 

function regions = dfs_connected_AL(GL)

 

7. Repeat experiments of Task 4 using DSF_CONNECTED_AL function instead of DSF_CONNECTED. Of course, you should first convert each G graph to GL graph. Compare the runtime of the two functions as a function of N and alpha.