467,168 Members | 986 Online

 Hi everyone! I have 2 matrices [XYZ]: The 1 is smaller than the 2. I would like to select in the 2nd only lines which correspond to couples [XY] of 1st... Is that possible? And how? However, I need a very optimized code, because I work with very very big matrice (250000*3), and this operation had to be quick... Thanks, Regards, Vinc' A= [1 1 7] [1 2 5] [2 4 7] [3 5 8] B= [1 1 2] [1 2 3] [1 3 4] [1 4 3] [1 5 0] [2 1 7] [2 2 2] [2 3 7] [2 4 1] [2 5 7] [3 1 1] [3 2 4] [3 3 7] [3 4 0] [3 5 0] And I want, with the function I'm looking for, this result : C=f(B)= [1 1 2] [1 2 3] [2 4 1] [3 5 0] Jul 19 '05 #1
• viewed: 1822
Share:
4 Replies
 Vinc_get31 wrote: A= [1 1 7] [1 2 5] [2 4 7] [3 5 8] B= [1 1 2] [1 2 3] [1 3 4] [1 4 3] [1 5 0] [2 1 7] [2 2 2] [2 3 7] [2 4 1] [2 5 7] [3 1 1] [3 2 4] [3 3 7] [3 4 0] [3 5 0] And I want, with the function I'm looking for, this result : C=f(B)= [1 1 2] [1 2 3] [2 4 1] [3 5 0] Well. First of all. Make sure things are sorted! Searching is always easier and faster if things are sorted. That's why phone books are sorted :-) Sort your B matrix according to the first and second column. Your posted example already has the correct sort order. I don't know if this was intended or if it is just a concidence, that's why I address this issue. There are multiple strategies you could use. A general one is this: Take a row from the A matrix -> 2 4 7 You are looking for a row in the B vector which starts with a 2. So do a binary search to locate a row that starts with 2 Eg. The binary search in the first column will bring you to: [1 1 2] [1 2 3] [1 3 4] [1 4 3] [1 5 0] [2 1 7] [2 2 2] [2 3 7] <- [2 4 1] [2 5 7] [3 1 1] [3 2 4] [3 3 7] [3 4 0] [3 5 0] that is a row [ 2 3 7 ] which starts with 2. Since you are looking for [ 2 4 7 ] and 4 is greater then 3, do a search backwards until you either find a row which starts with [ 2 4 . ] or the first column is no longer a 2 or the second row element is greater then 4 (remember: the entries are sorted !) In the second case and third case, there is no row which starts with [2 4 .] If on the other hand the binary search dropped you at an entry [ 2 5 7 ] and you are still searching for [ 2 4 7 ], then you know that the row, if it is in the matrix is before [ 2 5 7 ]. So continue search frontwards until you either find a row that fits or the second column has a value smaller then 4, or the first column has a value smaller then 2. In this cases there is no row that fits your needs. Basically that's it. Repeat the whole process for the next search row and collect the results. If you think this is complicated: not at all. You have done the very same thing hundreds of times: Whenever you search a phone number in a phone book, you do the very same thing: First locate the page which contains the last name, eg by opening the book in the middle and deciding if the searched name is before of after that page. Repeat that process until you find a page which contains the last name (that's the binary search step). If there are multiple first names with the same last name, just pick one and decide if the searched first name is in front of or after the one you search for. You see: coming up with algorithms is often a question of watching carfully what you do in real life :-) Various variations are possible. Eg. if the numbers in your matrix are limited to lets say the a range of 0 to 9, then an additional array containing the starting indices of those numbers in the first row may replace the binary search step. (That would be equivalent with having an index in the phone book: you are looking for a last name that starts with C. Don't worry searching the whole book, all names starting with C are on page 78ff) -- Karl Heinz Buchegger kb******@gascad.at Jul 19 '05 #2
 Hi Vinc, the speed of the search depends on the structure of the data. Your Vector B has always blocks of five elements and the increment between the blocks i constant (1). The second argument you showed here is always 1 to 5. Is this always like that or just in your example you posted? If it is always like that, you could just calculate the position and there is no search needed. But I assume you would store the data just in a two dimensional field. I think you have to describe your data a little bit more exact. Regards, Patrick B= [1 1 2] [1 2 3] [1 3 4] [1 4 3] [1 5 0] [2 1 7] [2 2 2] [2 3 7] [2 4 1] [2 5 7] [3 1 1] [3 2 4] [3 3 7] [3 4 0] [3 5 0] Jul 19 '05 #3
 Vinc_get31 wrote: Hi everyone! I have 2 matrices [XYZ]: The 1 is smaller than the 2. I would like to select in the 2nd only lines which correspond to couples [XY] of 1st... This is known as a "join" operation in databases. If the matrices are sorted (which they apparently are in your representation), there is a quite fast algorithm of the order O(N+M), where N and M are the size of the sparse matrices. 1) i=1; j=1; 2) while ((i<=length(A)) && (j<=length(B)) 3) compare A[i] and B[i] 4) decide: 5) A[i] is greater: increment j 6) B[j] is greater: increment i 7) they are equal : output B[j] and increment j 8) endwhile this simply steps with two indices through both matrices simultaneously, it is somewhat similiar two mergesort. -- Vale ! Christianus Auriocus Jul 19 '05 #4
 Vinc_get31 wrote: Hi everyone! I have 2 matrices [XYZ]: The 1 is smaller than the 2. I would like to select in the 2nd only lines which correspond to couples [XY] of 1st... This is known as a "join" operation in databases. If the matrices are sorted (which they apparently are in your representation), there is a quite fast algorithm of the order O(N+M), where N and M are the size of the sparse matrices. 1) i=1; j=1; 2) while ((i<=length(A)) && (j<=length(B)) 3) compare A[i] and B[i] 4) decide: 5) A[i] is greater: increment j 6) B[j] is greater: increment i 7) they are equal : output B[j] and increment j 8) endwhile this simply steps with two indices through both matrices simultaneously, it is somewhat similiar two mergesort. -- Vale ! Christianus Auriocus Jul 19 '05 #5

### This discussion thread is closed

Replies have been disabled for this discussion.