473,378 Members | 1,436 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,378 software developers and data experts.

Bounding box on clusters in a 2D list

If I have

ex: x = [[1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0],
[1,1,1,0,1,1,0,0,0,0,0,1,1,1,1,0,0,0],
[1,1,1,0,1,1,1,0,1,1,0,1,1,1,0,0,0,0],
[0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0]]
what I want is a boundingbox over the region where we find clusters of
1's.So for instance in the above list first 3 roes and colums have 1's
so the area of that box is 3x3
so my final array should have an array of approx areas of clusters of
1's like
area = [ 9,4 ....]
Hope I am clear with my question.

Jul 19 '05 #1
12 2246
I hope this is what you need, sometimes understanding the question is
one of the hardest parts :-)

If you can use a graph data structure, you can create a graph, and then
you can find the lenght of all its connected components (indentations
reduced to 2 spaces):

.. def mat2graph(g, m, good=None, w=1):
.. "Function docs..."
.. g.clear()
.. if m and isinstance(m, (list, tuple)):
.. if good != None:
.. m = [[good(e) for e in row] for row in m]
.. maxy = len(m)-1
.. maxx = len(m[0])-1
.. add = g.addBiArc
.. for y, row in enumerate(m):
.. for x, e in enumerate(row):
.. if e:
.. g.addNode((x,y)) # Necessary for isolated nodes.
.. if x>0 and m[y][x-1]: add( (x,y), (x-1,y), w=w)
.. if x<maxx and m[y][x+1]: add( (x,y), (x+1,y), w=w)
.. if y>0 and m[y-1][x]: add( (x,y), (x,y-1), w=w)
.. if y<maxy and m[y+1][x]: add( (x,y), (x,y+1), w=w)
..
.. from graph import Graph
.. g = Graph()
.. m = [[1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0],
.. [1,1,1,0,1,1,0,0,0,0,0,1,1,1,1,0,0,0],
.. [1,1,1,0,1,1,1,0,1,1,0,1,1,1,0,0,0,0],
.. [0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0]]
.. mat2graph(g, m)
.. print map(len, g.connectedComponents())

Something similar to mat2graph will probably become a method of Graph.
A quite similar program can be used for other python graph libraries.

If you cannot use a graph data structure, of if you need to go faster,
you can use a flood fill algorithm tuned for this problem:
.. def connected(m, foreground=1, background=0):
.. def floodFill4(m, r,c, newCol, oldCol):
.. """Simplest 4-connected flood fill algorithm, there are much
.. faster non-recursive ones. I've modified this from:
.. http://www.student.kuleuven.ac.be/~m0216922/CG/floodfill.html"""
.. if c>=0 and c<maxc and r>=0 and r<maxr and m[r][c]==oldCol:
.. m[r][c] = newCol
.. floodFill4(m, r+1, c, newCol, oldCol)
.. floodFill4(m, r-1, c, newCol, oldCol)
.. floodFill4(m, r, c+1, newCol, oldCol)
.. floodFill4(m, r, c-1, newCol, oldCol)
..
.. maxr = len(m)
.. maxc = len(m[0])
.. newCol = 2
.. oldCol = foreground
.. for r in xrange(maxr):
.. for c in xrange(maxc):
.. if m[r][c] == oldCol:
.. floodFill4(m, r,c, newCol=newCol, oldCol=oldCol)
.. newCol += 1
.. # Frequences, this can be become a standard python command
.. freq = {}
.. for row in m:
.. for e in row:
.. if e in freq:
.. freq[e] += 1
.. else:
.. freq[e] = 1
.. del freq[background]
.. return freq.values()
..
.. m = [[1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0],
.. [1,1,1,0,1,1,0,0,0,0,0,1,1,1,1,0,0,0],
.. [1,1,1,0,1,1,1,0,1,1,0,1,1,1,0,0,0,0],
.. [0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0]]
.. print connected(m)
There are much faster ways to do flood fills:
http://www.codeproject.com/gdi/QuickFill.asp
I think Python PIL doesn't support flood fills, and I think doing it
with Numarray is slower, and it's useful only to reduce the used
memory.
This connected program is slow and raw, and it uses recursivity a lot,
so it can crash the interpreter. You can use a data structure to
implement the stack yourself, to speed this program up. But for small
matrices I think this connected is enough ^_^

Bear hugs,
Bearophile

Jul 19 '05 #2
On 23 Apr 2005 13:17:55 -0700, "su*******@gmail.com" <su*******@gmail.com> wrote:
If I have

ex: x = [[1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0],
[1,1,1,0,1,1,0,0,0,0,0,1,1,1,1,0,0,0],
[1,1,1,0,1,1,1,0,1,1,0,1,1,1,0,0,0,0],
[0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0]]
what I want is a boundingbox over the region where we find clusters of
1's.So for instance in the above list first 3 roes and colums have 1's
so the area of that box is 3x3
so my final array should have an array of approx areas of clusters of
1's like
area = [ 9,4 ....] Where does the 4 come from?
Hope I am clear with my question.

Not quite clear on if "clusters" have to be rectangular and fill their bounding boxes
or whether

[[0, 0, 0, 0, 0],
+-----+
[0,|1, 1,|0, 0],
| |
[0,|0, 1,|0, 0],
+-----+
[0, 0, 0, 0, 0]]

is a cluster with three 1's. Or indeed whether "bounding boxes" have to be rectangular.
I.e., what about diagonal clusters?
,__,
[[0, 0,|1,|0, 0],
_/ _/ ,__,
[0,/1,/0, 0,|1],
_/ _/ _/ /
[1,/0, 0,/1,/0],
_/ /
[0, 0,/1,/0, 0]]

Or should that be
+-------------+
[[|0, 0, 1, 0, 0|],
| |
[|0, 1, 0, 0, 1|],
| |
[|1, 0, 0, 1, 0|],
| |
[|0, 0, 1, 0, 0|]]
+-------------+

since two 3x3 squares around the length-3 diagonals would overlap.

Regards,
Bengt Richter
Jul 19 '05 #3
Richter,yes what I am looking for is for cluster with rectangular
bounding boxes as you explained in the first figure.

Jul 19 '05 #4
hi Bearphile! That really gives me an idea.Thanks much for that. Yes as
you said the algorithm reaches a maximium recursion depth for larger
sets i tried.I still have a question. if
m = [[0,0,0,0],[0,1,1,0,0],[0,0,1,0,0],[0,0,0,0]]

all it does is count the number of 1's and return us the number in that
cluster. But if I want the box to be considered as one cluster as
[[0, 0, 0, 0, 0],
+-----+
[0,|1, 1,|0, 0],
| |
[0,|0, 1,|0, 0],
+-----+
[0, 0, 0, 0, 0]]
because it could be that the 0 in between the cluster of ones is a part
of the group.So the area of this boundingbox is 4 rather than 3. Do you
see where I am heading

Jul 19 '05 #5
On 24 Apr 2005 09:44:49 -0700, "su*******@gmail.com" <su*******@gmail.com> wrote:
Richter,yes what I am looking for is for cluster with rectangular
bounding boxes as you explained in the first figure.

Sorry, not your fault, but I'm still not clear on diagonal connection/separation.
E.g., (removing punctuation ;-) is this 3 boxes or do the top left and bottom right
connect at their touching corners and thus become bounded with a rectangle that then
also includes the little box at the top right?

+-------+ +---+
0 | 1 1 | 0 | 1 |
| | +---+
0 | 1 1 | 0 0
+-------+?------+
0 0 0 ? 1 1 |
| |
0 0 0 | 0 1 |
+-------+

+-------+ +---+ +-------+???+---+ +---------------+
0 | 1 1 | 0 | 1 | 0 | 1 1 | 0 | 1 | 0 | 1 1 0 1 |
| | +---+ | | +---+ | |
0 | 1 1 | 0 0 0 | 1 1 | 0 0 ? 0 | 1 1 0 0 |
+-------+?------+ =>* | +-------+ => | |
0 0 0 ? 1 1 | 0 | 0 0 1 1 | 0 | 0 0 1 1 |
| | | | | |
0 0 0 | 0 1 | 0 | 0 0 0 1 | 0 | 0 0 0 1 |
+-------+ +---------------+ +---------------+

And what about nested boxes?

+-------------------+
0 | 1 1 1 1 1 |
| |
0 | 1 0 0 0 1 |
| +---+ |
0 | 1 0 | 1 | 0 1 |
| +---+ |
0 | 1 0 0 0 1 |
| |
0 | 1 1 1 1 1 |
+-------------------+

Regards,
Bengt Richter
Jul 19 '05 #6
su*******@gmail.com:
hi Bearphile! That really gives me an idea.Thanks much for that. Yes as you said the algorithm reaches a maximium recursion depth for larger
sets i tried.
You can improve the little flood filling function, avoiding the "bad"
Python recursivity.

Do you see where I am heading


Your problem still looks a bit ill defined to me (see Bengt Richter's
nested boxes case), but you can probably modify this part of my
connected():

.. freq = {}
.. for row in m:
.. for e in row:
.. if e in freq:
.. freq[e] += 1
.. else:
.. freq[e] = 1

For the *values* of the freq dictionary you can use the 4 coordinates
of the bounding box of the key-specific cluster, that is updating its
far left, far right, etc. coordinates, with something like this:

box[e] = ( max(x, box[e][0]), min(x, box[e][1]), max(y, box[e][2]),
min(y, bbox[e][3] )

Or something faster/more correct. It's not difficult, I presume.
(connected() debug: if you want to keep the original matrix m, at the
end of the connected() you can set at 1 all its values different from
0.)

Bear hugs,
Bearophile

Jul 19 '05 #7
bearphile, Is there a way I could do the floodfill rather iteratively
than recursive. It somehow breaks although I made some optimisations to
the code.

Jul 19 '05 #8
Just was wondering how to integrate the floodfill with the stack.I
guess it will get rid of recursion problem. You mean read all the
elements of the array to a stack and then push and pop based on
conditions?

Jul 19 '05 #9
In my first post you can find two URLs with better flood filling
algorithms. You can also modify the easy recursive function, using a
list-based stack intead of recursive calls.

Bye,
Bearophile

Jul 19 '05 #10
Bearophile,I have written the floodfill in python with stack. But i
think I am making something wrong I dont get what i need.Need your
opinion.the code follows

def connected(m, foreground=1, background=0):
def floodFill4(m, r,c, newCol, oldCol):
if newCol == oldCol:
print "empty stack"
return -1
while(m[r].pop(c)):
m[r][c] = newCol
if r+1 < maxr and m[r+1][c]==oldCol:
if m[r+1][c] == 0:return
if r-1 >= 0 and m[r-1][c]==oldCol:
if m[r-1][c] == 0:return
if c+1 < maxc and m[r][c+1]==oldCol:
if m[r][c+1] == 0:return
if c-1 >= 0 and m[r][c-1]==oldCol:
if m[r][c-1] == 0:return

maxr = len(m)
maxc = len(m[0])
newCol = 2
oldCol = foreground
for r in xrange(maxr):
for c in xrange(maxc):
if m[r][c] == oldCol:
floodFill4(m, r,c, newCol=newCol, oldCol=oldCol)
newCol += 1

Jul 19 '05 #11
def floodFill4(m, r, c, newCol, oldCol):
stack = [ (r, c) ]
while stack:
r, c = stack.pop()
if c >=0 and c < maxc and r >=0 and r< maxr and m[r][c]==oldCol:
m[r][c] = newCol
stack += [ (r+1, c), (r-1, c), (r, c+1), (r, c-1) ]

Jul 19 '05 #12
Then you can probably use something like this:

.. def boxesArea(m, foreground=1, background=0):
.. maxr = len(m)
.. maxc = len(m[0])
.. newCol = 2
.. oldCol = foreground
.. for r,row in enumerate(m):
.. for c,e in enumerate(row):
.. if e == oldCol:
.. # Flood fill by Lonnie Princehouse
.. stack = [ (r, c) ]
.. while stack:
.. r, c = stack.pop()
.. if c>=0 and c<maxc and r>=0 and r<maxr and
m[r][c]==oldCol:
.. m[r][c] = newCol
.. stack += [ (r+1, c), (r-1, c), (r, c+1), (r,
c-1) ]
.. newCol += 1
.. box = {}
.. for r,row in enumerate(m):
.. for c,e in enumerate(row):
.. if e in box:
.. box[e] = ( min(c, box[e][0]), min(r, box[e][1]),
.. max(c, box[e][2]), max(r, box[e][3] ) )
.. else:
.. box[e] = (c, r, c, r)
.. del box[background]
.. return [(b[2]-b[0]+1)*(b[3]-b[1]+1) for b in box.values()]
..
.. try: import psyco
.. except ImportError: pass
.. else: psyco.bind(boxesArea)

Note that this version modifies the given m still.

Hugs,
Bearophile

Jul 19 '05 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: ljb | last post by:
Using PHP-4.3.11 with GD extension, bundled GD library, and freetype-2.1.9 (with X.org 6.8.1). I can't get the TrueType Font (TTF) bounding box to come out right. Whether using imagettftext() or...
2
by: Gabriele Bartolini | last post by:
Hi guys, just a quick and probably stupid question. When I make a cluster based on an index on a table, how can I remove it later? Should I 'create' a new one by using the primary key index? ...
6
by: Mike | last post by:
We're starting a project at work moving VSAM to RDBMS. The choice is between DB2 and Oracle. It seems like the Oracle RAC is a better cluster choice with it's share everything rather than the DB2...
0
by: Morten Wennevik | last post by:
When the linklabel link gets focus, there is a dotted bounding box surrounding the link text until something else receives focus. Is it possible to remove this box? -- Using M2, Opera's...
2
by: Michelle | last post by:
Hello, I put a dropdown list in ItemTemplate Column. I can select its value from the dropdown list and store in a dataset, but when I retrieve records from dataset, it doesn't show up in the...
1
by: Dennis Gearon | last post by:
Have I read correctly, a database instance can only work with one cluster(directory)? ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your...
0
by: nor | last post by:
Hi, I’m doing a project to recognize shape of object. I had completed filtering part and now I’m starting for recognition part. First, I want to create bounding box of the object in the image so...
0
by: deluxmilkman | last post by:
I have a movie clip with a gradient mask. now I would like to have the bounding box of this movie clip reduced to the visible(masked) area, so that I can get its x and y positions. is it even...
1
by: dazzler | last post by:
I have up to 100data points over 400x400pixel area, when the user would move mouse cursor over one point, the application would detect it and show additional information about it. I have all x,y...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.