473,412 Members | 2,293 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,412 software developers and data experts.

Graph algorithms - DFS, generators callbacks, and optimisation

I'm trying to check a graph (represented in the usual Python adjacency
list format) for loops. This is dead simple - you use a depth-first
search, and look out for "back edges" (edges from a vertex "back" to
one you've already seen).

I already have a DFS algorithm which generates each edge in turn. It's
great for what it does, but it ignores back edges.

def DFS(graph, start, seen = None):
if not seen:
seen = {start: 1}
for v in graph[start]:
if v not in seen:
seen[v] = 1
yield start, v
if v in graph:
for e in DFS(graph, v, seen):
yield e

I could code a variation on this, just for this one problem, but that
sort of cut and paste coding offends me (and DFS is *just* fiddly
enough that I'm not confident of getting it right each time I
reimplement it). So what I'd rather do is enhance my existing code to
be a bit more general.

I have a C++ reference which implements a highly general DFS
algorithm, using the visitor pattern (pass a visitor object to the
function, and various callback methods are called at points in the
algorithm - there are discover_vertex, discover_edge, back_edge, etc
callbacks).

I could code something like this (the pseudocode for the algorithm is
almost executable in Python!) but it doesn't feel particularly
"pythonic". Having to write a visitor class for each use case feels
unwieldy (as well as being perceptibly slower than the generator
implementation). Having the various callbacks as arguments (pass a
callable or None) isn't much faster or cleaner.

Can anybody think of a good way of making the DFS code above a little
more generic, without losing the efficient and simple (from the
caller's POV) approach of the current version?

Paul.
--
This signature intentionally left blank
Jul 18 '05 #1
3 3435
Well in this case couldn't you make seen do the detection for you.
If I understand your analysis you could make the seen dictionary a
special kind of dictionary.

ie

class DFSSeen(dict):
def __contains__(self,x):
r = super(DFSSeen,self).__contains__(x)
if r:
print 'back link to',x
return r

G = {
1: (2,3),
2: (3,5),
3: (4,),
4: (6,),
5: (2,6),
6: (1,),
}
for v in DFS(G,1,DFSSeen({1:1})):
print v

using the above I get
C:\tmp>dfs.py
(1, 2)
(2, 3)
(3, 4)
(4, 6)
back link to 1
(2, 5)
back link to 2
back link to 6
back link to 3

so cycle detection is possible without altering the original code, but
it probably isn't sufficient for your purposes in that we don't have the
start vertex.

In article <oe**********@yahoo.co.uk>, Paul Moore <pf******@yahoo.co.uk>
writes

I'm trying to check a graph (represented in the usual Python adjacency
list format) for loops. This is dead simple - you use a depth-first
search, and look out for "back edges" (edges from a vertex "back" to
one you've already seen).

I already have a DFS algorithm which generates each edge in turn. It's
great for what it does, but it ignores back edges.

def DFS(graph, start, seen = None):
if not seen:
seen = {start: 1}
for v in graph[start]:
if v not in seen:
seen[v] = 1
yield start, v
if v in graph:
for e in DFS(graph, v, seen):
yield e

I could code a variation on this, just for this one problem, but that
sort of cut and paste coding offends me (and DFS is *just* fiddly
enough that I'm not confident of getting it right each time I
reimplement it). So what I'd rather do is enhance my existing code to
be a bit more general.

I have a C++ reference which implements a highly general DFS
algorithm, using the visitor pattern (pass a visitor object to the
function, and various callback methods are called at points in the
algorithm - there are discover_vertex, discover_edge, back_edge, etc
callbacks).

I could code something like this (the pseudocode for the algorithm is
almost executable in Python!) but it doesn't feel particularly
"pythonic". Having to write a visitor class for each use case feels
unwieldy (as well as being perceptibly slower than the generator
implementation). Having the various callbacks as arguments (pass a
callable or None) isn't much faster or cleaner.

Can anybody think of a good way of making the DFS code above a little
more generic, without losing the efficient and simple (from the
caller's POV) approach of the current version?

Paul.


--
Robin Becker
Jul 18 '05 #2
Robin Becker <ro***@jessikat.fsnet.co.uk> writes:
Well in this case couldn't you make seen do the detection for you.
If I understand your analysis you could make the seen dictionary a
special kind of dictionary.


Interesting trick. I hadn't thought of this, mainly because "seen" is
only in the parameter list to allow the recursive call to work - I'd
never intended it to be passed by the user. (I have a non-recursive
version of DFS without a "seen" parameter).

More general DFS algorithms have a "colour" dictionary, where vertices
can be "White" (not seen) "Black" (seen) or "Grey" (essentially,
"still being looked at"). I'm not sure how that would interact with
this trick.

Hmm. I wish I had a "big" use case where issues like this matter. But
all of my uses are small - the sort of thing where a general library
helps a lot, because the whole project is too small to justify
(re-)implementing the algorithm. But with just small examples, you
never get the breadth of use cases to make the optimal design obvious.

Or maybe I'm just a lousy library designer :-)

Time to stop rambling and do something productive...

Paul.
--
This signature intentionally left blank
Jul 18 '05 #3
On Sat, 29 Nov 2003 10:54:49 +0000, Paul Moore <pf******@yahoo.co.uk> wrote:
I'm trying to check a graph (represented in the usual Python adjacency
list format) for loops. This is dead simple - you use a depth-first
search, and look out for "back edges" (edges from a vertex "back" to
one you've already seen).

I already have a DFS algorithm which generates each edge in turn. It's
great for what it does, but it ignores back edges.

def DFS(graph, start, seen = None):
if not seen:
seen = {start: 1}
for v in graph[start]:
if v not in seen:
seen[v] = 1
yield start, v
if v in graph:
for e in DFS(graph, v, seen):
yield e

I could code a variation on this, just for this one problem, but that
sort of cut and paste coding offends me (and DFS is *just* fiddly
enough that I'm not confident of getting it right each time I
reimplement it). So what I'd rather do is enhance my existing code to
be a bit more general.
Not very tested, but how about (borrowing Robin's example data):

====< PaulMoore.py >=============================================
def DFS2(graph, start, rlevel=0, seen = None):
if seen is None: seen = {}
seen[start] = 1 # allow for jumping into arbitrary subtrees with
# same seen dict in new generator ?
for v in graph[start]:
is_back = v in seen
seen[v] = True # ok if redundant
yield (start, v), is_back, rlevel
if not is_back:
if v in graph:
for e in DFS2(graph, v, rlevel+1, seen):
yield e

if __name__ == '__main__':
G = {
1: (2,3),
2: (3,5),
3: (4,),
4: (6,),
5: (2,6),
6: (1,),
}
for e, is_back, level in DFS2(G, 1):
print '%s%s -> %s%s' %(level*' ',e[0], e[1], ('',' (back)')[is_back])
================================================== ===============

The output is:

[15:15] C:\pywk\clp>PaulMoore.py
1 -> 2
2 -> 3
3 -> 4
4 -> 6
6 -> 1 (back)
2 -> 5
5 -> 2 (back)
5 -> 6 (back)
1 -> 3 (back)

Can anybody think of a good way of making the DFS code above a little
more generic, without losing the efficient and simple (from the
caller's POV) approach of the current version?

My .02 above ;-)

Regards,
Bengt Richter
Jul 18 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
by: Dave Benjamin | last post by:
Is there a straightforward way to create a generator from a function that takes a callback? For instance, I have a function called "process": def process(text, on_token): ... For each token...
25
by: Magnus Lie Hetland | last post by:
Is there any interest in a (hypothetical) standard graph API (with 'graph' meaning a network, consisting of nodes and edges)? Yes, we have the standard ways of implementing graphs through (e.g.)...
1
by: entropy123 | last post by:
Hey all, In an effort to solve a sticky - for me - problem I've picked up Sedgewick's 'Algorithm's in C'. I've tried working through the first few problems but am a little stumped when he refers...
8
by: Jef Driesen | last post by:
I'm working on an image segmentation algorithm. An essential part of the algorithm is a graph to keep track of the connectivity between regions. At the moment I have a working implementation, but...
4
by: enigma261 | last post by:
Hello, I am looking for graph layout algorithms. I am representing a process flow using a graph. I have the visual representation of nodes and connectors. I would like to use a graph layout...
2
by: brianlum | last post by:
Hi, I have been looking for a good way to convert python code into a control flow graph. I know of Python functions that will convert an expression into an abstract syntax tree (i.e. ast =...
10
by: diffuser78 | last post by:
Is there any library in Python which has implementation of graph theoretic algorithms and models ?
5
by: Sebastian.Pawlus | last post by:
SAX or DOM, graph processing. It is my fist application in XML or I should say it will be. Problem features. A xml file wile cover a huge graph structure. In fist sections will be defined...
10
by: andrea | last post by:
I'm studying some graphs algorithm (minumum spanning tree, breath search first topological sort etc etc...) and to understand better the theory I'm implementing them with python... I made my own...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.