Bytes | Software Development & Data Engineering Community

A suggestion on data structures.

Hi, I'm in a bit of dilemma here.

I want to use an adjacency matrix for building edge lists in my
project.

The thing is, an adjacency matrix will have nv * nv elements, where nv
is the number of vertices. The advantage is that it makes things very
fast in my project: only one pass over the list of triangles as opposed
to two. But I'm a little skeptical about allocating memory for nv * nv
elements. nv can be as high as 10,000,000. Storing
10,000,000 * 10,000,000 elements might actually be an extremely
stupid idea which may not even be possible. Also, a lot of time may be
spent just creating the matrix:

typedef int **matrix;

matrix create_matrix(size_t cols, size_t rows)
{
    size_t i;
    matrix m;

    m = malloc(rows * sizeof(int *));
    if (m == NULL)
        return NULL;

    for (i = 0; i < rows; i++)
    {
        m[i] = malloc(cols * sizeof(int));
        if (m[i] == NULL)
            return NULL;    /* note: rows allocated so far leak here */
    }

    return m;
}

You still have to make nv passes here.

And then I want to do some initialization using memset, i.e. all
values must be zero initially. I don't know the internal workings of
memset, so once again I fear it will slow my program down even more.
Is there an easier way out of this?
Aug 6 '08 #1
pereges <Br*****@gmail.com> writes:

> I want to use an adjacency matrix for building edge lists in my
> project.
>
> The thing is an adjacency matrix will have nv * nv elements where nv
> is number of vertices. [...] Storing 10,000,000 * 10,000,000 elements
> might actually be an extremely stupid idea which may not even be
> allowed.
You would not do this, for sure. First, an adjacency matrix has only
0/1 bits in it, so there is no need for an array of ints. Secondly,
these matrices are almost always sparse. You need to read up on
representations for sparse boolean matrices. Initially, of course,
this is not a C question, but it will turn into one later!

--
Ben.
Aug 6 '08 #2
"pereges" <Br*****@gmail.com> wrote in message
news:1f**********************************@v1g2000pra.googlegroups.com...
> The thing is an adjacency matrix will have nv * nv elements where nv
> is number of vertices. [...] But I'm a little skeptical about
> allocating memory for nv * nv elements. nv can be very high up to
> 10,000,000 as well.
As a rather useless suggestion in this case (it will eat all your RAM
and then some): since all rows are going to be the same size, you don't
actually need a jagged array (unless the algorithm somehow manages to
require it); you could just malloc(rows * cols * sizeof(int)).
In this situation you wouldn't need an int, though - and as others have
pointed out, a sparse set would be much better.
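A sketch of that flat layout (the function name is mine; using calloc
rather than malloc also answers the memset worry in the original post,
since calloc hands back zero-filled memory, so no separate
initialization pass is needed):

```c
#include <stdlib.h>

/* One flat block instead of rows+1 mallocs; index with m[r * cols + c].
   calloc zero-fills, so the memset pass disappears entirely. */
int *create_flat_matrix(size_t rows, size_t cols)
{
    /* guard against size_t overflow in rows * cols */
    if (cols != 0 && rows > (size_t)-1 / cols)
        return NULL;
    return calloc(rows * cols, sizeof(int));
}
```

Usage would be `int *m = create_flat_matrix(nv, nv);` followed by
`m[i * nv + j] = 1;` - a single allocation and a single free.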

Aug 7 '08 #3

"pereges" <Br*****@gmail.com> wrote in message
news:1f**********************************@v1g2000pra.googlegroups.com...
> The thing is an adjacency matrix will have nv * nv elements where nv
> is number of vertices. [...] nv can be very high up to 10,000,000 as
> well.
It's unlikely any triangle could be adjacent to up to 10million others.
Usually up to just 3 others. Unless they are going to be connected in
unusual ways.

Suggest 3 ints per triangle, containing indices of other triangles. When
there are more than 3, have some overflow scheme (one of the 3 points to an
allocated list perhaps).

To test whether two triangles A,B are adjacent requires testing B to
see if A is in its list.

You might try also a hash table: this has to be set up first with all
adjacencies. Once set up, create a hash value from the indices or addresses
of triangles A,B, and lookup; if present in the table, they are adjacent.
(Each table entry contains A:B, but you need many unused entries too.)

--
Bartc

Aug 7 '08 #4
Hello Pereges, List

On Aug 6, 6:20 pm, pereges <Brol...@gmail.com> wrote:
I have some work with matrices of that size and bigger (snapshots of
magnetic surface spin on some surface), and databases really help me a
lot (they also make the code easier to adapt for parallel computing).
One simple SQLite db can be really fast, and you will not waste
precious time writing 1k lines to make the application run 5% faster
(I know... in 10 days, 5% reaches 12 hours). You can also hack into
the db engine core and drop a few useless things to make it run
faster. A 1 TB database isn't something outstanding, anyway (ok... it
is outstanding for an SQLite-like db - my advice - don't do it. Get a
client/server engine instead, ideally with the server running on
another machine).

Rafael
Aug 7 '08 #5
Yeah, I've tried looking into a hash table for edges. But the thing
is, every edge is represented by two vertex indices, and finding a
hash function that takes the two indices as input and results in a
unique address is quite difficult, especially with so many vertices.
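One workable sketch (the names are mine, and as the replies note, the
hash need not be unique - collisions are handled by the table): order
the two indices so that the edge (v0,v1) and (v1,v0) hash identically,
pack the pair into 64 bits, and mix:

```c
#include <stdint.h>

/* Hash an undirected edge given its two vertex indices. Ordering the
   pair first makes edge_hash(v0,v1) == edge_hash(v1,v0); the multiply
   is a golden-ratio mixer. Collisions are expected and are resolved
   by the hash table itself. */
uint64_t edge_hash(uint32_t v0, uint32_t v1)
{
    uint32_t lo = v0 < v1 ? v0 : v1;
    uint32_t hi = v0 < v1 ? v1 : v0;
    uint64_t key = ((uint64_t)hi << 32) | lo;

    key *= 0x9e3779b97f4a7c15ULL;  /* 64-bit golden-ratio constant */
    return key ^ (key >> 32);
}
```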
Aug 7 '08 #6
[Rapid detection of any of 10million items touching another]

"pereges" <Br*****@gmail.com> wrote in message
news:e1**********************************@w24g2000prd.googlegroups.com...
> yeah, I've tried looking into hash table for edges. But the thing is
> every edge is represented by two vertex indices and finding a hash
> function that takes the two indices as input and results in a unique
> address is quite difficult especially with so many vertices.
Your idea of having a 100-million-million-element set or array is not
practical to do in a simple way; using sparse arrays or databases (as soscpd
suggested) is going to slow things down.

How fast do you want this test to be?

Depending on how your triangles are stored, you might as well just examine
them without building any table. (I take it each triangle is defined as 3
edge indices, and each edge by 2 vertex indices? Or is the triangle defined
also by 3 vertices?)

A hash function does not need to give a unique result; only reasonably well
spread. However even such a table is going to need a lot of memory:

How many entries will there be? I'm guessing 3 entries per triangle (A:B,
A:C, A:D) which will include duplicating some information (so there will be
a B:A in there somewhere). So about 30 million entries. Each entry is 2 ints
(8 bytes); and with some extra capacity in the table, this will be at least
300Mbytes.

My original idea was to store 3 ints per triangle: total cost about
120Mbytes. So the options include:

Store no extra info, just analyze your existing data to see if A:B touch;
cost: 0Mbytes;
Store touching indices with each triangle; cost: 120Mbytes;
Hash table: cost: 300Mbytes
Simple bit-matrix: cost: 12.5million million bytes.

All getting a little faster, or would be if the 12,500 Gbytes was feasible.

--
Bartc

Aug 7 '08 #7

"pereges" <Br*****@gmail.com> wrote in message
news:c4**********************************@a3g2000prm.googlegroups.com...
> - Read mesh file and allocate memory for triangles, vertices etc.
> - Traverse the lists to build edge list.
> - From the list of edges, determine the list of sharp edges and
> corners in the object.
Just to clarify, your input includes all the vertices, and specifies all the
triangles in terms of those vertices?

So your real problem (the one you were considering the huge 1e7 x 1e7 array
for) is building the edge table (your struct edge) quickly?
--
Bartc

Aug 7 '08 #8
On Aug 7, 4:14 pm, "Bartc" <b...@freeuk.com> wrote:
> Just to clarify, your input includes all the vertices, and specifies
> all the triangles in terms of those vertices?
>
> So your real problem (the one you were considering the huge 1e7 x 1e7
> array for) is building the edge table (your struct edge) quickly?
Yes, vertices are given by their 3D coordinates (x, y, z).

eg:

vertex 0 5.666788 4.555677 -2.333333
vertex 1 ............................
vertex 2 ............................
........
........
vertex n-1

Triangles are represented as 3 indices into vertex list above:

triangle 0 12 0 1
triangle 1 4 5 6
..........
...........
triangle m-1

I'm just looking for an overall quicker method. The adjacency matrix
is just one idea, which I used long back when the mesh was extremely
small (at most 50-60 triangles and 25-30 vertices). But this is
useless now that I'm testing bigger meshes.


Aug 7 '08 #9

"pereges" <Br*****@gmail.com> wrote in message
news:2c**********************************@p10g2000prf.googlegroups.com...
> On Aug 7, 4:14 pm, "Bartc" <b...@freeuk.com> wrote:
>> Just to clarify, your input includes all the vertices, and specifies
>> all the triangles in terms of those vertices?
>>
>> So your real problem (the one you were considering the huge 1e7 x
>> 1e7 array for) is building the edge table (your struct edge) quickly?
>
> Yes vertices are given by their 3d coordinates (x, y, z)
> I'm just looking for an overall quicker method. Adjacency matrix is
> just one idea that I used long back when the mesh was extremely
> small (at most 50-60 triangles and 25-30 vertices). But this is
> useless now as I'm testing bigger meshes.
OK. In your second pass, for each edge of each triangle (i.e. 30
million times), you seem to be scanning the entire edge list looking
for iv0, iv1.

That's clearly inefficient. At least, the edge list can be sorted and
a binary search done. Or it can be created as a binary tree in the
first place. (Or as a hash table, as in earlier suggestions.)

(Simply sorting the edge list will already clump together the duplicate
edges, perhaps simplifying the second pass.)

I don't quite understand the half-edge business. Otherwise I would do this
lookup in the first pass: building the edge list so that it is already
sorted (by iv0:iv1), and only adding new edges if they are not already in
the list. This way there wouldn't be duplicated edges, and you might not
need that second pass.
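Sketching the sorting idea (array-based, with my own struct and
function names, rather than the linked list in the posted code): emit
one record per triangle edge with the vertex pair ordered, sort, and
then the two half-edges of every shared edge sit next to each other,
so pairing them is one linear sweep.

```c
#include <stdlib.h>

/* One record per (triangle, edge), with iv0 < iv1 enforced by the
   caller; after sorting, duplicate edges are adjacent. */
typedef struct {
    size_t iv0, iv1;   /* vertex indices, iv0 < iv1 */
    size_t tri;        /* owning triangle */
} half_edge;

static int cmp_edge(const void *pa, const void *pb)
{
    const half_edge *a = pa, *b = pb;
    if (a->iv0 != b->iv0) return a->iv0 < b->iv0 ? -1 : 1;
    if (a->iv1 != b->iv1) return a->iv1 < b->iv1 ? -1 : 1;
    return 0;
}

void sort_edges(half_edge *e, size_t n)
{
    qsort(e, n, sizeof *e, cmp_edge);
    /* now e[i] and e[i+1] with equal (iv0, iv1) are the two
       triangles sharing that edge */
}
```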

--
Bartc


Aug 7 '08 #10
Hello Pereges, List

On Aug 7, 7:35 am, pereges <Brol...@gmail.com> wrote:
> Yes vertices are given by their 3d coordinates (x, y, z)
> vertex 0 5.666788 4.555677 -2.333333
> Triangles are represented as 3 indices into vertex list above:
> triangle 0 12 0 1

Can I assume vertex 0 belongs to triangle 0? Kinda like:

typedef struct
{
    double vertex[2];
    int triangle[2];
} matrix_dot;

Should I assume triangle values to be between 0 and ...? What about
vertex values?

Do you have all inputs (vertex/triangle) already in a file? No
on-demand data at all?

Rafael
Aug 7 '08 #11
Ok, I managed to do this in another fast way.

For every vertex, now I have a shared triangle list. That is a list of
triangles that share this particular vertex. So when I need to find
out the triangles shared by an edge, all I need to do is :

1. Go to the edge end points v0 and v1.
2. Look in the shared triangle lists of v0 and v1 to find the exactly
two common triangles.
3. These two common triangles are the ones shared by the edge between
v0 and v1.

Building a shared triangle list is not at all difficult. While reading
the triangle vertex indices (v0, v1, v2), just go to these individual
vertices and store the address of the current triangle being read.

This is my vertex structure:

typedef struct
{
    size_t index;
    vector p;
    vector normal;
    int corner_flag;
    node *shared_tri;   /* linked list of triangles sharing this vertex */
} vertex;

The three vertex indices of some triangle nt are given by iv0, iv1, iv2:

iv0 = mesh_ptr->tri[nt].v[0];
iv1 = mesh_ptr->tri[nt].v[1];
iv2 = mesh_ptr->tri[nt].v[2];

Add to the list of shared triangles at each vertex:

add_to_list(&mesh_ptr->vert[iv0].shared_tri);
mesh_ptr->vert[iv0].shared_tri->data = &mesh_ptr->tri[nt];

add_to_list(&mesh_ptr->vert[iv1].shared_tri);
mesh_ptr->vert[iv1].shared_tri->data = &mesh_ptr->tri[nt];

add_to_list(&mesh_ptr->vert[iv2].shared_tri);
mesh_ptr->vert[iv2].shared_tri->data = &mesh_ptr->tri[nt];

As you can see, this hardly takes any extra time, because it is done
while the triangle vertex indices are being read from the ASCII file.

Now the algorithm for building the edge list:
void build_edge_list2(void)
{
    size_t i, j, k;
    size_t iv0, iv1, it0, it1;
    edge *e;
    node *ptr1, *ptr2;

    for (i = 0; i < mesh_ptr->ntri; i++)
    {
        for (j = 0; j < 3; j++)
        {
            iv0 = mesh_ptr->tri[i].v[j];
            iv1 = mesh_ptr->tri[i].v[(j + 1) % 3];

            if (iv0 < iv1)
            {
                add_to_list(&mesh_ptr->edge_list);
                mesh_ptr->edge_list->data = malloc(sizeof(edge));
                assert(mesh_ptr->edge_list->data != NULL);
                e = (edge *)mesh_ptr->edge_list->data;
                e->vertex_index[0] = iv0;
                e->vertex_index[1] = iv1;

                ptr1 = mesh_ptr->vert[iv0].shared_tri;
                k = 0;

                while (ptr1 != NULL)
                {
                    it0 = ((triangle *)ptr1->data)->index;
                    ptr2 = mesh_ptr->vert[iv1].shared_tri;

                    while (ptr2 != NULL)
                    {
                        it1 = ((triangle *)ptr2->data)->index;

                        if (it0 == it1)   /* common triangle found */
                        {
                            e->triangle_index[k++] = it0;
                            break;
                        }

                        ptr2 = ptr2->next;
                    }

                    if (k == 2)
                    {
                        /* no point traversing the list further once
                           two common triangles are found */
                        break;
                    }

                    ptr1 = ptr1->next;
                }
            }
        }
    }
}

This approach is incredibly fast compared to the previous one. I
clocked the time needed to construct the list: the first version took
about 16 seconds, and the second less than 1 second. The most obvious
advantage is that neither the second pass over the list of triangles
nor the second pass over the edge list is required. Just one pass, and
the edge list gets created as we pass through the list of triangles.
The only disadvantage is the extra space needed for storing the linked
lists of shared triangles. But anyway, I believe it is better to have
lists like this for other calculations, such as vertex normals, where
the vertex normal has to be calculated by taking contributions from
all surrounding triangles. If the need arises, one can delete the
linked lists after all such calculations are done, to free space in
the middle of the program.
Aug 8 '08 #12

"pereges" <Br*****@gmail.com> wrote in message
news:59**********************************@j33g2000pri.googlegroups.com...
> Ok, I managed to do this in another fast way.
>
> For every vertex, now I have a shared triangle list. That is a list
> of triangles that share this particular vertex. So when I need to
> find out the triangles shared by an edge, all I need to do is:
>
> 1. Go to the edge end points v0 and v1.
> 2. Look in the shared triangle lists of v0 and v1 to find out exactly
> two common triangles.
>
> This approach is incredibly fast as opposed to the previous one. I
> clocked the time needed to construct the list and in the first case
> it took about 16 seconds and in the second code its less than 1
> second.
I'm surprised searching the entire edge list for each triangle edge is only
about 20 times slower.

Or it's possible that looking at the local set of triangles per vertex is
not as fast as it could be; I'm guessing you're only doing a linear search
here. Apart from having to build the shared triangle list.

I still have a feeling that building the entire edge table during the first
pass, using a fast lookup method of all edges, could also be fast and
possibly faster.
> I believe it is better to have lists like this for other
> calculations like vertex normals where the vertex normal has to be
> calculated by taking contribution from all surrounding triangles.
Yes it would be difficult to create the vertex normal from just the vertex.

--
Bartc

Aug 8 '08 #14

This thread has been closed and replies have been disabled. Please start a new discussion.
