473,513 Members | 2,441 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Multidimensional sparse arrays using linked lists

7 New Member
Hi,

I am implementing an aggregation model to compute data cubes.
I have implemented the multiway array aggregation statically. That is if there are, say, 4 dimensions each of cardinality 16 and the chunk size of 4, I create chunkArray[4][4][4][4] and insert the measure at the appropriate position and aggregate the measure. The chunkArray is a sparse array and static implementation is not efficient. In order the dynamically create the chunkArray with measure stored only for the instances that occur I need the offset of each of the cell in the chunk.

Eg: Dimension 3
A[8], B[8], C[8] - each dimension with cardinality 8.
If the relational database table is fully populated then there will be 8*8*8 instances in the table.
If not, then some of the instances will be missing.
a1,b1,c1,20
a1,b1,c2,3
.......
.......
a1,b1,c8,12

This will be like the truth table with 3 variables each of which can take upto 8 values as compared to 2 values in the digital truth table.

The aggregation will be done on chunks. If I have instances a1,b1,c1 - a1,b1,c3 - a1,b1,c8 - some of the instances will be missing. What I need then is an offset of the instance from the first instance.
a1,b1,c1 offset=0
a1,b1,c3 offset=2
a1,b1,c8 offset=7

Suppose the chunk size is 2 in all dimensions, the the total number of chunks will be 8. I will bring in the first 8 chunk cells and aggregate. But the problem is, in the first 8 cells only 3 are dense and rest 5 are not there. I am stuck in finding the offset of the chunk cell. If you have any idea, please let me know.
Mar 1 '10 #1
15 5904
jkmyoung
2,057 Recognized Expert Top Contributor
Could you give us a better idea of how you will be using the chunks in the future? I'm not sure what the difference is between this and a table sorted on a then b then c.
Mar 1 '10 #2
skr13
7 New Member
I will be using the chunks to aggregate the values present in them. For eg,

chunkArray[0][0][0] = 5;
chunkArray[0][0][7] = 9;

when i aggregate keeping the first 2 indices constant, i ll get 13.

Say A represents CITY, B represent ITEM and C represents TIME.
These are the 3 dimensions for the chunk.

CITY ITEM TIME
0- X 0-i1 0-t1
1- Y 1-i2 1-t2
2- Z 2-i3 2-t3

the measure may be sales. keeing A=0 B=0 and aggregating alon C, I will get the sales in city X for item Y for all the times
Mar 1 '10 #3
Banfa
9,065 Recognized Expert Moderator Expert
Are you using C or C++? If you are using C++ and you expect the array to be quite sparse I imagine a map would work quite well.

On the other hand if the array is only slightly sparse (only a few missing values) then it might be just as easy and less complicated to use a static array.
Mar 1 '10 #4
weaknessforcats
9,208 Recognized Expert Moderator Expert
One suggestion is to use an array of struct variables. One member is your data and the other is a binary that is true of the data is present and zero otherwise. This allows you to have zombie elements.

That way your calculations to get to an element will always work out.
Mar 2 '10 #5
skr13
7 New Member
I used linked lists to create an element in the list only if the data is present and used the dimensionality to calculate the offset. Now i am storing the data and the offset in each element of the list, thereby I have saved lot of memory.
Mar 2 '10 #6
jkmyoung
2,057 Recognized Expert Top Contributor
I agree with using an array (sorted) that only contains the values as opposed to a linked list, giving a small memory savings, and negating the need for offsets.

The advantages I can see to linked lists occur if you are querying the first few elements repeatedly, or adding many elements after the initial creation.

Expand|Select|Wrap|Line Numbers
  1. struct Row{
  2. short a, b, c, value;
  3. }
  4.  
  5. Row[] Table = new Row[100];
  6.  
This does really feel like database work
Mar 2 '10 #7
skr13
7 New Member
I need to query the table. For eg, SELECT FROM TABLE where CITY=X and TIME=t1.
For this i need to pre-compute the data cubes and store them. Actually I have the table for chunks.

struct chunks
{
int value,offset;
struct chunks *link;
};

It would be easier if the used arrays to compute data cubes but the problem is I will know the array dimension only during runtime. I would appreciate any other idea to compute the data cubes(aggregation).
Mar 2 '10 #8
weaknessforcats
9,208 Recognized Expert Moderator Expert
It would be easier if the used arrays to compute data cubes but the problem is I will know the array dimension only during runtime.
You can create the array at run-time after you know the dimensions.

I have written a SQL cursor manager to capture the results of a SELECT using arrays. You really need to avoid linked lists as they are too slow. You also avoid the stack since memory there is often restricted.

BTW: You didn't answer Banfa about whether you are using C or C++. If C++ you should be using a vector.
Mar 3 '10 #9
skr13
7 New Member
I am using C++. For each attribute in the table there will be 1 dimension. If the table has 20 dimensions then I will need a 20-dimensional array. This 20 dimensional array will be a sparse array. Linked lists are slow but the memory will not be wasted. And just curious, how do you create 20D array at the runtime?
Mar 3 '10 #10
weaknessforcats
9,208 Recognized Expert Moderator Expert
C and C++ have only one-dimensional arrays. Read: http://bytes.com/topic/c/insights/77...rrays-revealed

The easiest thing to do is create an inheritance hierarchy for your attributes. The Attribute is the base class with derived classes for the varoius kinds-of attributes.

You create an object of a derived class an use it through a base class pointer.


So, create a vector<Attribute*> and add all of the attributes for a row of your array:

Expand|Select|Wrap|Line Numbers
  1. vector<Attribute*> aThing;
  2. Attribute* ptr = new Color(RED);   //Color derives from Attribute
  3. aThing. push_back(ptr);
  4. //
  5. //Add the other attrutes that apply to aThing.
  6. //
  7.  
Then add the vector<Attribute*> to your database:


Expand|Select|Wrap|Line Numbers
  1. vector< vector<Attribute*> > theDataBase;
  2.  
  3. theDataBase.push_back(aThing);


Now your array has just the attributes for each elemen of the database.

You may need to use a Visitor with your Attribute hierarchy. There is an Insight article about how to do that.
Mar 3 '10 #11
jkmyoung
2,057 Recognized Expert Top Contributor
I don't think there's an understanding of how we suggest you use the array:

Vector contents (V):

index : row(city, item, time, value)
[0] : (1, 1, 1, 20)
[1] : (1, 1, 2, 3)
[2] : (1, 1, 7, 5)
[3] : (1, 2, 2, 4)
...
etc.

So if we want to sum where CITY=X and TIME=t1, we go:
Expand|Select|Wrap|Line Numbers
  1. long Sum(int city, int item, int time)
  2. {
  3. sum = 0;
  4. for(int i = 0; i < V.length; i++){
  5.   Row r = V.ElementAt(i);
  6.   if (  ((city == 0) | | (city == r.city)) && //city restriction
  7.         ((item == 0) | | (item == r.item)) && //item restriction
  8.         ((time== 0) | | (time == r.time))) && //time restriction 
  9.       sum += r.value;
  10. }
  11. return sum;
  12. }
  13.  
  14. call this function like 
  15. Sum(X, 0, t1)
  16.  
0 means that the argument is not used.
Mar 3 '10 #12
skr13
7 New Member
I do get what you are saying. But I need to precompute the data cubes and store them and the fire the queries. If there are, say 3 dimension(attributes), I need to abstract the dimensions and compute the cuboids AB, BC, AC, A, B and C. I have done this statically. After that I use a bit map indexing scheme to handle the queries. But I can't imagine n dimensions using linked lists. I have this idea. In the linked list, each node will have n pointers 1 for each dimension. It will be basically a tree wit multiple children. When I traverse along a path, I will be aggregating that particular direction. I would appreciate any idea regarding how to go about this implementation
Mar 4 '10 #13
jkmyoung
2,057 Recognized Expert Top Contributor
Alright, basing it on your previous implementation somewhat, are you looking for something like:
Expand|Select|Wrap|Line Numbers
  1.  class Node
  2.     {
  3.         int index, value;
  4.         ArrayList links;
  5.  
  6.         Node(){
  7.             index = 0;
  8.             value = 0;
  9.             links = new ArrayList();
  10.         }
  11.  
  12.         Node(ArrayList indexes, int v)
  13.         {
  14.             int i = (int)(indexes[0]);
  15.             index = i;
  16.             value = v;
  17.             links = new ArrayList();
  18.             if (indexes.Count != 0)
  19.             {
  20.                 indexes.RemoveAt(0);
  21.                 links.Add(new Node(indexes, v));
  22.             }
  23.         }
  24.  
  25.         void AddNode(ArrayList indexes, int v)
  26.         {
  27.             int i = (int)(indexes[0]);
  28.             Node n = FindNode(i);
  29.             if (n == null) //make new node
  30.             {
  31.                 links.Add(new Node(indexes, v));
  32.             } else { // add to existing node
  33.                 indexes.RemoveAt(0);
  34.                 n.addNode(indexes, v);
  35.             }
  36.             value += v;
  37.         }
  38.  
  39.         Node FindNode(int index){
  40.             for (int i = 0; i < links.Count; i++)
  41.             {
  42.                 Node n = (Node)(links[i]);
  43.                 if (n.index == index)
  44.                     return n;
  45.             }
  46.             return null;
  47.         }
  48.  
  49.         int QueryNode(ArrayList indexes)
  50.         {
  51.             if (indexes.Count == 0)
  52.                 return value;
  53.             int i = (int)(indexes[0]);
  54.             Node n = FindNode(i);
  55.             if (n == null)
  56.                 return 0;
  57.  
  58.             indexes.RemoveAt(0);
  59.             return n.QueryNode(indexes);
  60.         }
  61. }
  62.  
  63. Start with Node root = new Node(), and add, query your nodes from root. 
  64. Eg City=X and TIME=t1
  65. create new array list q: (x, t1);
  66. return root.QueryNode(q);
  67.  
Sorry, this is in C#; don't have a C++ editor at the moment. Just refactor ArrayLists to a Vector or int[].
realized after writing it, that I should have put the indexes in reverse order, but you can do that if you want.
Mar 4 '10 #14
weaknessforcats
9,208 Recognized Expert Moderator Expert
index : row(city, item, time, value)
[0] : (1, 1, 1, 20)
[1] : (1, 1, 2, 3)
[2] : (1, 1, 7, 5)
[3] : (1, 2, 2, 4)
What I had in mind was:

1)create an array where [0] is city, [1] is item, [2] is time and [3] is value:

Expand|Select|Wrap|Line Numbers
  1.    //[0] : (1, 1, 1, 20)
  2.     //[1] : (1, 1, 2, 3)
  3.    // [2] : (1, 1, 7, 5)
  4.    // [3] : (1, 2, 2, 4)
  5.     int arr0[4] = {1,1,1,20};
  6.     vector<int> v0(arr0, arr0+4);
  7.     int arr1[4] = {1,1,2,3};
  8.     vector<int> v1(arr1, arr1+4);
  9.     int arr2[4] = {1,1,7,5};
  10.     vector<int> v2(arr2, arr2+4);
  11.     int arr3[4] = {1,2,2,4};
  12.     vector<int> v3(arr2, arr3+4);
Then add these arrays to a larger array which is the database:

Expand|Select|Wrap|Line Numbers
  1.     vector<vector<int> > DataBase;
  2.     DataBase.push_back(v0);
  3.     DataBase.push_back(v1);
  4.     DataBase.push_back(v2);
  5.     DataBase.push_back(v3);
Then create a multimap for each attribute, Here is the one for city:

Expand|Select|Wrap|Line Numbers
  1.     pair<int, vector<int> > KeyAndValue;
  2.     multimap<int, vector<int> > CityMap;
  3.  
  4.     KeyAndValue.first = 1;
  5.     KeyAndValue.second = DataBase[0];    
  6.     CityMap.insert(KeyAndValue);
  7.  
  8.     KeyAndValue.first = 1;
  9.     KeyAndValue.second = DataBase[1];    
  10.     CityMap.insert(KeyAndValue);
  11.  
  12.     KeyAndValue.first = 1;
  13.     KeyAndValue.second = DataBase[2];    
  14.     CityMap.insert(KeyAndValue);
  15.  
  16.     KeyAndValue.first = 1;
  17.     KeyAndValue.second = DataBase[3];    
  18.     CityMap.insert(KeyAndValue);
CityMap now has an entry for all city=1 in the DataBase.

To look for City = X, you just query the CityMap for all city=1. You do this with iterators. First, get the lower bound. Second, get the upper bound. Betweem these two iterators are all the entries for city = X:
Expand|Select|Wrap|Line Numbers
  1.     //Look for City = X:
  2.     int X = 1;
  3.     multimap<int, vector<int> >::iterator lower;
  4.     lower = CityMap.lower_bound(1);
  5.     multimap<int, vector<int> >::iterator upper;
  6.     upper = CityMap.upper_bound(1);
Next, for time = t1, just iterator between the lower and upper bound checking each entry for a time == t1.
Expand|Select|Wrap|Line Numbers
  1.                 int t1 = 1;
  2.  
  3.     //All the City= X are between the upper and lower iterator
  4.     multimap<int, vector<int> >::iterator  temp;
  5.     temp = lower;
  6.     while (temp != upper)
  7.     {
  8.         //check for time = t1
  9.         //temp is actually pointing at a vector<int>.
  10.         vector<int>::iterator itr = (temp->second).begin();
  11.         while (itr != (temp->second).end())
  12.         {
  13.             //[0] is city [1] is item [2] is time and [3] is value
  14.              if (itr[2] == t1)
  15.              {
  16.                        //found a city=X and time=t1
  17.              }
  18.              ++itr;
  19.         }
  20.         ++temp;
  21.     }
Mar 4 '10 #15
skr13
7 New Member
Thanks a lot jkmyoung and weaknessforcats. I am doing this using multi dimensional linked lists and will probably use multimap. I have done bit vector mapping, but multimap seems to be more easier in processing the queries. I appreciate your help
Mar 4 '10 #16

Sign in to post your reply or Sign up for a free account.

Similar topics

3
2627
by: doodle4 | last post by:
Hello all, I am trying to convert some C code into python. Since i am new to python, i would like to know how to deal with multidimensional arrays? Thanks, -Joe Here's a snippet of what i...
9
6654
by: Charles Banas | last post by:
i've got an interesting peice of code i'm maintaining, and i'd like to get some opinions and comments on it, hopefully so i can gain some sort of insight as to why this works. at the top of the...
3
1770
by: Naomi | last post by:
Hi there, Just wondering, if there is any way to have a dynamic / unspecified size of an array? It obviously needs to be updated, deleted etc. The only way I thought is by making the array...
2
1382
by: d[ - - ]b | last post by:
Hi there, Just wondering, if there is any way to have a dynamic / unspecified size of an array? It obviously needs to be updated, deleted etc. The only way I thought is by making the array...
38
5081
by: Peteroid | last post by:
I looked at the addresses in an 'array<>' during debug and noticed that the addresses were contiguous. Is this guaranteed, or just something it does if it can? PS = VS C++.NET 2005 Express...
21
4153
by: utab | last post by:
Hi there, Is there a way to convert a double value to a string. I know that there is fcvt() but I think this function is not a part of the standard library. I want sth from the standard if...
4
9577
by: chy1013m1 | last post by:
In Java, one can do the following: int array = {{1,2,1}, {1}, {0,0,0,0,0,0,0,-1}}; How can one declare such array in C++/C without using various new operators individually to each row indicies?...
7
1881
by: Mrkljus | last post by:
Hi, ppl Somebody knows why this works in MSIE6 and not in MSIE7 : artPage=new Array(); artPage=new Array(); artPage=null; artPage=new Array(); artPage=1290; artPage=1291;
5
9689
by: adam.kleinbaum | last post by:
Hi there, I'm a novice C programmer working with a series of large (30,000 x 30,000) sparse matrices on a Linux system using the GCC compiler. To represent and store these matrices, I'd like to...
9
4473
by: Slain | last post by:
I need to convert a an array to a multidimensional one. Since I need to wrok with existing code, I need to modify a declaration which looks like this In the .h file int *x; in a initialize...
0
7380
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
7535
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
1
7098
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
7523
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
1
5085
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
4745
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
0
3232
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
0
3221
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
0
1592
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.