473,473 Members | 1,893 Online
Bytes | Software Development & Data Engineering Community
Create Post

Home Posts Topics Members FAQ

problem with grouping (?)

2 New Member
Hello to all,

I am quite new to sql and I don't understand why a query has a certain output.I apologise for the long post but I'd rather describe exactly what I am doing in, hoping that someone can spot the problem that I fail to.
I have created and populated two tables in my database that look like:

Table: binding_sites

id start_bindsite end_bindsite
1 10212250 10212257
2 10105334 10105368
3 101556063 101556073


Table:genome

gene_start gene_end gene_name
10212248 10312253 gene1
10212258 10322250 gene2
101556070 102556060 gene3

What I want is to find which gene from table genome is closer to each of the sites 1, 2, 3 in table binding_sites.

I have created a nested query that calculates the linear distances between the gene_start and both the start_bindsite and end_bindsite and then with the outer select I wish to choose the minimum distance. The problem is that when I group by the start_bindsite field the output is indeed for each binding site the minimum distance but the gene_name is always gene1 , whereas for the last two binding sites it is gene2 and gene3 respectively that should be presented. And when grouping by other fields or combination of other fields doesn't seem to help.

The query:

Expand|Select|Wrap|Line Numbers
  1. select   temp.start, temp.end, temp.gene_name, min(temp.distance) as minimum_distance
  2.           from (select  start_bindsite, end_bindsite, gene_name, abs(gene_start-start_bindsite) as distance
  3.                    from genome, binding_sites
  4.                    group by  gene_name, start_bindsite
  5.                    union 
  6.                   select  start_bindsite, end_bindsite, gene_name, abs(gene_start-end_bindsite) as distance
  7.                   from genome, binding_sites 
  8.                   group by  gene_name, start_bindsite) as temp
  9.         group by  temp.start
  10.  
  11.  
And the output
+-----------+-----------+-----------+------------------+
| start | end | gene_name | minimum_distance |
+------------+-----------+-----------+------------------+
| 10105334 | 10105368 | gene1 | 106880 |
| 10212250 | 10212257 | gene1 | 1 |
| 101556063 | 101556073 | gene1 | 3 |
+-----------+-----------+------------+------------------+
3 rows in set (0.00 sec)


Does anyone have any idea what am I doing wrong here? Any hint or suggestion would be hugely appreciated!!
Thanks for your time in advance,

Eva
Jul 12 '08 #1
2 1471
coolsti
310 Contributor
Without really getting into your particular query, I suspect the problem you have is similar to the problem statement of this thread:

http://bytes.com/forum/thread819577.html

and that the solution to your problem lies in my answer to this thread.

The problem is that your select query calculates the minimum distance as an output variable, and then you join your tables and do your group by. Your output before doing the group by then has the columns you have asked to be selected including this minimum distance. But when you then do the group by "id", MySQL has no way of knowing which of the rows for each "id" should be selected, and so in general will take the first row for each "id", in which case your gene name is wrong, although the minimum distance value is correct.

If I am correct in the above, there are two solutions to the problem. If you expect your output query to always only contain one row, you can do it in an easy way. Drop the "group by" clause, and just select what you wish, calculate your distance and give it some alias name, let us say "distance", and then add the clause "order by distance limit 1". This will not attempt to group by anything, but will order the rows in ascending order by the distance, and since you want the minimum distance, you limit the output to only the first row.

The above only works if your output will contain only one row. That is, you are limiting your query to handle only one "id" at a time. If you wish to have multiple rows in your output, say the row that corresponds to the minimum distance for many values of "id", then you need to use the technique I suggest in the thread I mention above.

Good luck!
Jul 13 '08 #2
evablue
2 New Member
Hello again,

Thank you so much for your reply, it was really helpgul.
I solved the problem after all by applying an inner join to the table described as temp and the table occuring from my whole previous query (only this time I had the same fields to the select and the group by ;-) ).
Thanks again.

Cheers,

Eva
Jul 13 '08 #3

Sign in to post your reply or Sign up for a free account.

Similar topics

5
by: Jerome | last post by:
Hi, I've got the following problem: I want my ASP page to display a certain number (based on a COUNT query), it works fine if the result is at least 1! If there are no records to be counted...
2
by: Paxton | last post by:
Hi, I'm trying to display the total page views per page within a given date range, but the correct SQL is seemingly beyond me. I get the correct result with a straightforward Group By and Count...
16
by: cody | last post by:
I have to write an algorithm with must ensure that objects are put in buckets (which are always 4 in size). The objects have two properties: A and B. It is not allowed that in a bucket are objects...
4
by: Mal Reeve | last post by:
Hello, I have a report that has only 2 levels of grouping. The detail section is simply 1 large block for a memo field. I am finding that on some occasions the report errors and generates...
0
by: virendra | last post by:
hi, I am working on crystal report 9 in asp.net. I created multilevel dynamic grouping in CR. if page is load firsttime. and i am giving firstlevel of grouping or whatever it's working fine. but...
6
by: Christoph | last post by:
I'm trying to come up with a stylesheet where, when the rows are displayed, duplicate game names are not shown on subsequent rows. It works but doesn't work properly. If I sort the data using...
3
by: Steven Bethard | last post by:
I'm trying to solve a constraint-satisfaction problem, and I'm having some troubles framing my problem in such a way that it can be efficiently solved. Basically, I want to build groups of two...
4
by: Chris | last post by:
I tried to retrieve the digit grouping symbol in MSAccess but unfortunately 3;0 is retrieved instead of comma which is my symbol. Function retrieves decimal symbol and list separator without any...
10
by: amitabh.mehra | last post by:
Hi I havent used MQT before. Read the online tips and tutorials but none seems to give any hint for my problem. I have a base table (base_table) as: st varchar(25) default...
3
BeemerBiker
by: BeemerBiker | last post by:
I tried the following that didnt work "c:\\new.mdb" 'c:\\new.mdb' "c:\\new\.mdb" So far, the only thing that worked is "c:\\new" but I had to rename the destination access database to get rid...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
1
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
1
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
0
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
1
muto222
php
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.