Connecting Tech Pros Worldwide Forums | Help | Site Map

Grouping results by multiple variables

Newbie
 
Join Date: Aug 2007
Posts: 5
#1: Aug 15 '07
Hello, I'm new here, but am really impressed with the positive attitude! I was wondering if someone could give me some help or point me in the right direction. I have a query that returns the following fields:

city
state
name
serial
category

using this code
[PHP]

$p = '';
$result = mysql_query("SELECT city, state, title, serial, category FROM serials ORDER BY serial, category");

while($row = mysql_fetch_assoc($result))
{
if($row['serial'] == $p)
{
echo "<br />";
$p = $row['serial'];

}else{

echo "<br />";
echo "<br />";
$p = $row['serial'];
}
echo $row["city"];
echo $row["state"];
echo $row["title"];
echo $row["serial"];
echo $row["category"];

}

?>
[/PHP]

This generates the following:
Expand|Select|Wrap|Line Numbers
  1. Toronto     Ontario     Chef     12345     ABC
  2. Detroit     Michigan     Cook      12345  DBF
  3.  
  4. Toronto     Ontario     Cook     12353     ABC
  5.  
  6. Los Angeles California     Waiter  12355  DGG
  7.  
  8. NYC         New York     Janitor 12382  DGG
  9.  
  10. Miami           Florida     Cook     12777     FGH
  11. Fort Worth     Texas         Usher      12777  QWE
  12.  
  13. Dallas            Texas       Cook       22222  FGH
  14.  
  15. NYC         New York     Chef      33333  JKL
  16. Detroit     Michigan     Janitor 33333     DBF
  17.  
I'm not too concerned about the formatting of the results right now.

What I would like to do is to group like results together. Naturally the <serialnum> values are grouped together and broken up by the double line break. What I would like to also group like 'categories' together with the <serialnum> groupings.

Ideally the data above would be represented like this:
Expand|Select|Wrap|Line Numbers
  1. Toronto     Ontario     Chef     12345     ABC
  2. Detroit     Michigan     Cook      12345  DBF
  3. Toronto     Ontario     Cook     12353     ABC
  4. Detroit     Michigan     Janitor 33333     DBF
  5. NYC         New York     Chef      33333  JKL
  6.  
  7. Los Angeles California     Waiter  12355  DGG
  8. NYC         New York     Janitor 12382     DGG
  9.  
  10. Miami           Florida     Cook     12777     FGH
  11. Fort Worth     Texas         Usher      12777  QWE
  12. Dallas            Texas       Cook       22222     FGH
  13.  
So because the <category> field on line 3 matches the <category> field of line 1, it is grouped with the first grouping. Likewise, although the <serial> field of line 4 doesn't match either of lines 1,2,3 the <category> field of line 4 matches that of line 2. Lastly, since the <serial> field of line 5 matches the <serial> field of line 4, it is grouped in the first grouping.

Does this make sense? Sorry, reading it back it doesn't read that well. Let me know if I can make any further clarification.

Most importantly, thanks in advance for any direction that you can provide!

Jake

pbmods's Avatar
Site Moderator
 
Join Date: Apr 2007
Location: Texas
Posts: 5,435
#2: Aug 15 '07

re: Grouping results by multiple variables


Heya, Jake. Welcome to TSDN!

Looking at this part of your code:
Expand|Select|Wrap|Line Numbers
  1. if($row['serial'] == $p)
  2. {
  3.     echo "<br />";        // Note: similar line in else...
  4.     $p = $row['serial'];  // Note: similar line in else...
  5. }
  6. else
  7. {
  8.     echo "<br />";
  9.     echo "<br />";       // Similar line 1
  10.     $p = $row['serial']; // Similar line 2
  11. }
  12.  
Notice that you're really only doing anything if the current value *doesn't* match the previous value. Have a look at this:
Expand|Select|Wrap|Line Numbers
  1. $lastSerial = '';
  2. $lastCat = '';
  3.  
  4. .
  5. .
  6. .
  7.     // Note that we always echo at least one <br />.
  8.     echo '<br />';
  9.  
  10.     // If the current row s/b in a new section, add a <br />.
  11.     if(! (($row['serial'] == $lastSerial) || ($row['category'] == $lastCategory)))
  12.     {
  13.         echo '<br />';
  14.     }
  15.  
  16.     $lastSerial = $row['serial'];
  17.     $lastCat = $row['cat'];
  18.  
Newbie
 
Join Date: Aug 2007
Posts: 5
#3: Aug 15 '07

re: Grouping results by multiple variables


Thanks for the warm welcome pbmods!

That definitely cleaned up the code, thank you.

The main thing I'd like it to do is like to put the results into groups based on either variable, <serial> or <category>. So in the first 'text' example above, line 16 should be placed up in the first grouping because the category 'DBF' is already present in the first grouping (line 2).

Hmmm still not too sure I made my question very clear.

Thanks for the quick response, it's great here!

Jake
Newbie
 
Join Date: Aug 2007
Posts: 5
#4: Aug 16 '07

re: Grouping results by multiple variables


Hope this is allowed, but I just wanted to bump this and see if anyone had even heard of something like this being capable?

Thanks again!

Jake
jx2 jx2 is offline
Familiar Sight
 
Join Date: Feb 2007
Location: Bristol UK
Posts: 227
#5: Aug 17 '07

re: Grouping results by multiple variables


well its exist solution for that :-)
but you actuly asked two questions pbmods answer you one of them

the other answer is here

it probably doesnt solve all problems but you get what you want to

look at "GROUP BY WITH ROLLUP"
pbmods's Avatar
Site Moderator
 
Join Date: Apr 2007
Location: Texas
Posts: 5,435
#6: Aug 17 '07

re: Grouping results by multiple variables


Heya, Jake.

Bumping is allowed on this site. NB as a result, we are extra-hard on double-posting (posting the same problem twice in the same forum).

I'm not sure I understand entirely what you're looking for. If you look at the example you provided:

Expand|Select|Wrap|Line Numbers
  1. Toronto     Ontario     Chef    12345  ABC
  2. Detroit     Michigan    Cook    12345  DBF
  3. Toronto     Ontario     Cook    12353  ABC
  4. Detroit     Michigan    Janitor 33333  DBF
  5. NYC         New York    Chef    33333  JKL
  6.  
  7. Los Angeles California  Waiter  12355  DGG
  8. NYC         New York    Janitor 12382  DGG
  9.  
  10. Miami       Florida     Cook    12777  FGH
  11. Fort Worth  Texas       Usher   12777  QWE
  12. Dallas      Texas       Cook    22222  FGH
  13.  
There's a definite pattern here; between any two rows in the same section, either the serial number is the same, or the category is the same. It seems like (and correct me if I'm wrong) MySQL is returning the results in the proper order, so the logical way to check to see if we need to start a new section would be to compare the current row with the previous one and see if both the serial and the category are different.

Would there be a situation where at least one of the criteria matches, but we should start a new section anyway? And/or are there circumstances where even though neither the serial nor the category matches, but we should still keep the two rows in the same section?
Newbie
 
Join Date: Aug 2007
Posts: 5
#7: Aug 17 '07

re: Grouping results by multiple variables


jx2 - Thanks for your reply. As I understand the GROUP BY parameter (including GROUP BY WITH ROLLUP) it will only work if there is an aggregate function in the query. Since I want all lines to be echo'd I don't want to have an aggregate function.

pbmods - Yes and No, but mostly yes. :)

Quote:
There's a definite pattern here; between any two rows in the same section, either the serial number is the same, or the category is the same.
Yes, thats exactly right. In the following example (as used previously) there are 3 groups, lets call them A, B, C. In group A line 4 doesn't have either the same serial or category as the line directly above it. It does however have the same category as another record from that that group, line 2.

Expand|Select|Wrap|Line Numbers
  1. Toronto     Ontario     Chef    12345  ABC
  2. Detroit     Michigan    Cook    12345  DBF
  3. Toronto     Ontario     Cook    12353  ABC
  4. Detroit     Michigan    Janitor 33333  DBF
  5. NYC         New York    Chef    33333  JKL
  6.  
  7. Los Angeles California  Waiter  12355  DGG
  8. NYC         New York    Janitor 12382  DGG
  9.  
  10. Miami       Florida     Cook    12777  FGH
  11. Fort Worth  Texas       Usher   12777  QWE
  12. Dallas      Texas       Cook    22222  FGH 
Similarly, in Group C, line 12 doesn't have the same serial or category as the line above it, but the category matches that of line 10.

Quote:
It seems like (and correct me if I'm wrong) MySQL is returning the results in the proper order
Not quite. The results are set to ORDER BY serial, which means that they'll all be grouped together, but the category column is not grouped together.

If I just use the following code as I currently am I could get 5000 results, and while they're sorted by serial, result/line 126 could have the same category as result 3456. Because line 126 has the same category as line 3456 they should be grouped together.

[PHP]
$result = mysql_query("SELECT city, state, title, serial, category FROM serials ORDER BY serial, category");
[/PHP]

Quote:
Would there be a situation where at least one of the criteria matches, but we should start a new section anyway?
No.

Quote:
And/or are there circumstances where even though neither the serial nor the category matches, but we should still keep the two rows in the same section?
No.

I just can't seem to wrap my head around how you'd group results based on either of two variables.

Thanks soooo much for your help so far!
pbmods's Avatar
Site Moderator
 
Join Date: Apr 2007
Location: Texas
Posts: 5,435
#8: Aug 17 '07

re: Grouping results by multiple variables


Heya Jake.

Ok. I see what you're getting at now; I was mistaking the name column for the category column.

Possibly part of the misunderstanding is that in MySQL, the phrase 'GROUP BY' has a slightly different meaning.

I think in your case, you'll need to manually parse your results and organize them that way:
Expand|Select|Wrap|Line Numbers
  1. $_groups = array();
  2. $_idx = array();
  3. while($_row = mysql_fetch_assoc($_res))
  4. {
  5.     if(isset($_idx['serial'][$_row['serial']]))
  6.     {
  7.         $_groups[$_idx['serial'][$_row['serial']]][] = $_row;
  8.         $_idx['category'][$_row['category']] = $_idx['serial'][$_row['serial']];
  9.     }
  10.     elseif(isset($_idx['category'][$_row['category']]))
  11.     {
  12.         $_groups[$_idx['category'][$_row['category']]][] = $_row;
  13.         $_idx['serial'][$_row['serial']] = $_idx['category'][$_row['category']];
  14.     }
  15.     else
  16.     {
  17.         $_groups[][] = $_row;
  18.         end($_groups);
  19.         $_idx['serial'][$_row['serial']] = key($_groups);
  20.         $_idx['category'][$_row['category']] = key($_groups);
  21.     }
  22. }
  23.  
  24. foreach($_groups as $_entries)
  25. {
  26.     foreach($_entries as $entry)
  27.     {
  28.         foreach($entry as $val)
  29.         {
  30.             echo "$val\t";
  31.         }
  32.         echo '<br />';
  33.     }
  34.     echo '<br />';
  35. }
  36.  
Phew! Ok, so what's going on here?

What we're doing is creating a big multi-dimensional array. As we go through, we check to see if we've already processed an entry with the same serial or category. If we have, we assign it to the same group as the last one. Otherwise, we create a new group and update our index.

Going back to everybody's favorite example:
Expand|Select|Wrap|Line Numbers
  1. Toronto* * *Ontario* * *Chef* * 12345* ABC
  2. Detroit* * *Michigan* * Cook* * 12345* DBF
  3. Toronto* * *Ontario* * *Cook* * 12353* ABC
  4. Detroit* * *Michigan* * Janitor 33333* DBF
  5. NYC* * * * *New York* * Chef* * 33333* JKL
  6. *
  7. Los Angeles California* Waiter* 12355* DGG
  8. NYC* * * * *New York* * Janitor 12382* DGG
  9. *
  10. Miami* * * *Florida* * *Cook* * 12777* FGH
  11. Fort Worth* Texas* * * *Usher* *12777* QWE
  12. Dallas* * * Texas* * * *Cook* * 22222* FGH
  13.  
We end up with a $_groups that looks something like this (abbreviated for space):
Expand|Select|Wrap|Line Numbers
  1. Array
  2. (
  3.     [0] => Array
  4.         (
  5.             [0] => Array
  6.                 (
  7.                     [City] => Toronto
  8.                     ...
  9.                     [Serial] => 12345
  10.                     [Cat] => ABC
  11.                 )
  12.             [1] => Array
  13.                 (
  14.                     [City] => Detroit
  15.                     ...
  16.                     [Serial] => 12345
  17.                     [Cat] => DBF
  18.                 )
  19.             .
  20.             .
  21.             .
  22.         )
  23.     [1] => Array
  24.         (
  25.             [0] => Array
  26.                 (
  27.                     [City] => Los Angeles
  28.                     ...
  29.                     [Serial] => 12355
  30.                     [Cat] => DGG
  31.                 )
  32.             [1] => Array
  33.                 (
  34.                     [City] => NYC
  35.                     ...
  36.                     [Serial] => 12382
  37.                     [Cat] => DGG
  38.                 )
  39.         )
  40.     .
  41.     .
  42.     .
  43. )
  44.  
Once we've done all that hard work, the output part is rather trivial....
Newbie
 
Join Date: Aug 2007
Posts: 5
#9: Aug 17 '07

re: Grouping results by multiple variables


Yes Yes Yes!

Thank you sooo much pbmods! I've tested this code on a big db and it worked flawlessly. I'm going to spend some time and make sure I understand exactly whats going on.

Thanks so much for all your help, not only in this thread, but also in everyones threads!

Jake
pbmods's Avatar
Site Moderator
 
Join Date: Apr 2007
Location: Texas
Posts: 5,435
#10: Aug 17 '07

re: Grouping results by multiple variables


Heya, Jake.

Glad to hear you got it working! Good luck with your project, and if you ever need anything, post back anytime :)
Reply


Similar PHP bytes