CIS 3223 Lab09 – Implementing Depth-First Graph (DFG) Search
for Connectivity Analysis
DUE: (before the next lab, 10:30am Friday, April 3, 2008)
Depth-First Approach for
Discovery of Connected Graph Regions
In Lab 8 you implemented the DFS
algorithm (Fig 3.3 in the Dasgupta textbook) that
finds all reachable nodes from a given node of an undirected graph. In this
lab, you will use DFS function to write a Matlab code
that searches for all connected regions of the graph. The pseudo code for that
function is shown in Figure 3.5 of the textbook.
TASKS
1. Your goal is to implement in Matlab
a DFS_CONNECTED algorithm that, given a graph, finds all connected regions of
an undirected graph. The function definition is:
function regions = dfs_connected(G)
You should assume:
G to be the
adjacency matrix of dimension N*N, where N is the number of graph nodes.
regions to be an array of size N*1
with integer values that indicate to which region each of the N nodes belongs. For example, if all nodes are
connected, regions will be an array of ones. For graph of Figure 3.2, the
outcome should be regions
= [1 1 1 1 1 1 1 1 1 1 2 2].
Observe that dsf function from Lab 8 should be used as a subroutine.
2. Evaluate the DFS_CONNECTED function. Use
graph from figure 3.2 in the textbook. Its edges are
E = {(A,B), (A,C), (A,D),
(B,F), (B,E), (C,F), (D,G), (D,H), (E,I), (E,J), (G,H), (I,J), (K,L)}.
3. Write function DSF_ANALYSIS (and
provide the resulting code) that analyzes regions and outputs sizes of all unique
regions in the graph. Functions unique and find will be very useful to
answer this. The function definition is:
function summary = dfs_analysis(regions)
For example, for graph of
Figure 3.2, the outcome should be summary = [10 2],
meaning that there are 10 nodes in the first connected region and 2 in the
second.
4. Generate graphs G of size N
= {10, 50, 100, 500} and with alpha = {0.01, 0.1} using the following code:
G = sprand(N,N,alpha);
Explain what function sprand is doing?
Run
DFS_CONNECTED 10 times for each possible combination of N and alpha. For each run, save the number of disjoint connected regions,
size of the largest region, average size of the region, and time needed to run
the DFS.
5. Analyze the runtime results of Task 4 and
explore if the runtime confirms the
theoretical result O(|V| + |E|). Most likely, the
runtime will appear to follow O(|V|2). The
reason is that adjacency matrix representation requires you to loop through
every element of the matrix. Take a look at your implementation and comment if this is correct.
6. You
should implement a more efficient DSF
Matlab algorithm that uses graph representation with adjacency
lists. The following is the code that, starting from graph G represented as a matrix,
creates GL which is Matlab’s version of the adjacency
list:
for i=1:N
GL{i} = find(G(i,:) == 1);
end
Explain what
does this code achieve and what is the connection between GL and the adjacency list. Argue what is the runtime
complexity of converting G to GL? Write function DSF_CONNECTED_AL
that uses GL graph representation,
function regions = dfs_connected_AL(GL)
7. Repeat experiments of Task 4 using DSF_CONNECTED_AL function instead
of DSF_CONNECTED. Of course, you should first convert each G graph to GL graph.
Compare the runtime of the two
functions as a function of N and alpha.