By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
425,790 Members | 1,409 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 425,790 IT Pros & Developers. It's quick & easy.

Need a SQL Query to find Last/Latest Child for Each Parent item in a table

P: 14
Hi,

I am a Excel VBA programmer, I am not so good in SQL and Access. can any one help me with the below scenario

The requirement is to get the latest serviced date for each car in the table. The table looks like this:
Expand|Select|Wrap|Line Numbers
  1. _________________________
  2. Car.No  SrvcID  Service Dt
  3. -------------------------
  4. CAB123  200901  31/01/2009
  5. OWN321  201004  24/04/2010
  6. CAB123  201001  18/01/2010
  7. OWN321  201101  23/01/2011
  8. CAB123  201203  24/03/2012
  9. OWN321  201109  14/09/2011
  10. OWN321  201209  28/04/2012
  11. -------------------------
The expected result is as below:
Expand|Select|Wrap|Line Numbers
  1. _________________________
  2. Car.No  SrvcID  Service Dt
  3. -------------------------
  4. CAB123  201203  24/03/2012
  5. OWN321  201209  28/04/2012
  6. -------------------------
I have tried different ways, by creating a sub query, creating two alias names for the same table and used it like two tables etc..., I finally gaveup and went to google. Looks like I am not good enough to google out what I want.

Any help to get the expected result would be greatly appriciated.

Thanks in advance,
Narendran S
May 8 '12 #1
Share this Question
Share on Google+
28 Replies


NeoPa
Expert Mod 15k+
P: 31,418
The meaning of the question is quite clear, but you've left out one important detail, which is what identifies an individual record uniquely in your table? It's clearly not a single field of those shown, so it's probably a number of fields taken together or a separate field not included. Please clarify in order that someone can help you.
May 8 '12 #2

P: 14
I am sorry if i have confused you with that sample data, The Unique field is the SrvcID. The logic is that, the car CAB123 has got three records and OWN321 has four in the table, however i want to fetch the one latest record written in the table for each car.

I must use SrvcID numbers to find out the latest record for a car, that is the largest SrvcID for a car would be the latest SrvcID for the same. Similarly I have to get largest SrvcIDs for each car in the table.
May 8 '12 #3

NeoPa
Expert Mod 15k+
P: 31,418
NarenKeer:
The Unique field is the SrvcID.
Are you sure. They seem to be date related so surely multiple vehicles could have the same [SrvcID] value?
May 8 '12 #4

zmbd
Expert Mod 5K+
P: 5,397
I have a nice history table that I use to track repair data just like this:

Using your sample data:
Added primary key [ID] as autonumber
ASSUMING that the [Service DT] is the date of service for each event.

Really should have a unique key in the table this is why I added the [ID] field.

So
Expand|Select|Wrap|Line Numbers
  1. Table Named: tbl_servicehistory
  2. [ID]-pk-autonumber
  3. [CarNo]-text
  4. [SrvcID]-text
  5. [ServiceDT]-date(dd/mm/yyyy)
  6.  
Then the following will group your records down to the [Car.No] and then pull the oldest record by the [Service DT]

...
Expand|Select|Wrap|Line Numbers
  1. SELECT tbl_servicehistory.ID,tbl_servicehistory.CarNo, tbl_servicehistory.SrvcID, tbl_servicehistory.ServiceDT
  2. FROM tbl_servicehistory 
  3. INNER JOIN 
  4. (SELECT Max( tbl_servicehistory.ServiceDT) AS MaxOfDate 
  5. FROM tbl_servicehistory 
  6. GROUP BY tbl_servicehistory.CarNo) AS zzz_Query1 
  7. ON tbl_servicehistory.ServiceDT = zzz_Query1.MaxOfDate;
...

This will pull records [ID] = 5 and 7 if the data is entered into tbl_servicehistroy as shown in your post.

It should be a simple matter to pull the desired information from the resulting recordset.

IMHO
It would be a best practice if you would consider not using spaces, periods, and other non-alphanumeric charactors in your table and field names. With the one exception being the underscore.
My reasoning behind this is two fold - I am on old school programer and more importaintly, spaces, periods, and other non-alphanumeric charactors can and do cause issues with running code, SQL parsers, and make queries harder to write.



z
May 8 '12 #5

NeoPa
Expert Mod 15k+
P: 31,418
@zmbd You're keeping me busy tonight :-)

As formatting is something you've put effort into and are doing quite well, I will make some comments that I hope will be helpful :
  1. Good use of the highlighter for the name.
  2. Generally, only quote directly relevant snippets from a post. Multiple quotes is fine if called for, but the responses should deal with the quoted text only.
  3. Keep quoting to a minimum. There is no need to quote the first post. The thread should all be about the first post and it's at the top of the page anyway.
  4. Your post is almost perfectly formatted anyway, but more than two new lines together is never required. It makes it hard to see all the text on a page.
  5. A code block finishes with [/code]. If this is on a new line then the post will include an empty code line below the code. Your second block is done well.

Moving on to the SQL :
Unfortunately it doesn't quite work. It may do with the small sample of data provided, but it relies on the [ServiceDT] field being unique across the groupings, which cannot be guaranteed. This is a very difficult area to get one's head around, but relies on producing an aggregating subquery which identifies a single record within the grouping, uniquely.

If one assumes that [tbl_ServiceHistory] is indeed as you've defined it (It's a shame more members don't follow your example and lay the information out as clearly for us to work on), and that the [CarNo] and [ServiceDt] fields together identify a record uniquely (which makes sense but I'm still trying to clarify with the OP), then something like the following would be required :
Expand|Select|Wrap|Line Numbers
  1. SELECT   tSH.ID
  2.        , tSH.CarNo
  3.        , tSH.SrvcID
  4.        , tSH.ServiceDT
  5. FROM     [tbl_ServiceHistory] AS [tSH]
  6.          INNER JOIN
  7.     (SELECT   [CarNo]
  8.             , Max([ServiceDT]) AS [MaxServDt]
  9.      FROM     [tbl_ServiceHistory]
  10.      GROUP BY [CarNo]) AS subQ
  11.   ON     (tSH.CarNo = subQ.CarNo)
  12.  AND     (tSH.ServiceDT = subQ.MaxServDt)
I almost forgot to mention, your IMHO paragraph is also very good advice, and well expressed. Those who use other characters invariably end up regretting it sooner or later.
May 9 '12 #6

zmbd
Expert Mod 5K+
P: 5,397
@NeoPa

Thank you for the formatting feed back. Trying to do this by hand and between tasks... some of the tests we do have quite the lag time.

ANYWAY:

I am missing the difference between the query you wrote and the one I suggested... could be just that is is late. :)

I did make a few assumptions: namely, that [CarNo] should be treated as a foreign key and that [ServiceDT] would not appear more than once for a given [CarNo].value for given date. I would suppose that [CarNo].value could have multiple [SrvceID].value for the same [ServiceDT].value... that would break the query I wrote. Certainly more information would be helpful...
May 9 '12 #7

NeoPa
Expert Mod 15k+
P: 31,418
zmbd:
I am missing the difference between the query you wrote and the one I suggested... could be just that is is late. :)
That's partly down to my getting it wrong in post #6 :-D

The important part is that the subquery needs to return, and be linked to on, both the [MaxServDt] field as well as the [CarNo] field. See lines #7 and (after the update) #11.

I will update my earlier post now that you've helped me find the error, as I don't want to be giving out any misleading information (which would not be good for anyone reading it, and nor would it be for me).
May 9 '12 #8

P: 14
It worked :D. Thanks a tonne to you guyz.

The scenario is like this, the fields

Car.No SrvcID Service Dt

are all from three different tables, and along with these fields I have like another 10 fields to populate from 4 different tables.

I have created the relationships carefully and wrote first query to Join these tables and populates all required fields

Then I have created another query(which acts like the sub query in your example) that takes MAX of SrvcID field which is Unique.

Now i created the a third querry and used Inner Join like how you mentioed in your example:

Expand|Select|Wrap|Line Numbers
  1. SELECT * 
  2.  
  3. FROM Query1 
  4.  
  5. INNER JOIN Query2 
  6.  
  7. ON Query1.SrvcID = Query2.SrvcID 
and it gave me the expected result.

Thanks again for you help.

Also I will keep in mind about the best practice and formattings next time when I put any post.
May 9 '12 #9

NeoPa
Expert Mod 15k+
P: 31,418
NarenKeer:
... that takes MAX of SrvcID field which is Unique.
You made that statement before, but when I queried it you failed to respond. I don't believe that this field can be unique. It clearly seems to be built from the date of the service. It would be a strange set of data indeed, where a service for any individual date could only reflect a single vehicle. Theoretically it's possible, but very unlikely. That's why I asked you for clarification in post #4. A request that has so far been ignored. A response to this would prove helpful, and may help you to avoid an error in your logic. Testing code with data doesn't indicate the code is good just because the results are as expected. Bad code often passes tests. That doesn't make it good code, just code that hasn't failed yet.

NarenKeer:
Also I will keep in mind about the best practice and formattings next time when I put any post.
That is never a bad thing to keep in mind, but I directed those comments towards zmbd, who is clearly a more advanced poster in this respect, and was making recognisable attempts to work in a better way. It wouldn't be a problem you taking these ideas on board, but no-one would expect you to.
May 9 '12 #10

P: 14
Hi NeoPa,
A request that has so far been ignored.
Apologies for this, I have given a bad example data for SrvcID.

That was just example data, SrvcID is a "Autogenerated" number which is unique.

Thanks,
Narendran S
May 10 '12 #11

zmbd
Expert Mod 5K+
P: 5,397
@NarenKeer
Given then that [SrvcID] being an Autogenerated number such as with "autonumber" then [SrvcID] might also serve as your table's primary key.

So Long as [SrvcID] increments with every new record, is not duplicated, and is numeric then:

Expand|Select|Wrap|Line Numbers
  1. SELECT tsH.ID
  2.      , tsH.CarNo
  3.      , tsH.SrvcID
  4.      , tsH.ServiceDT 
  5. FROM tbl_servicehistory AS [tsH] 
  6.      INNER JOIN  
  7. (SELECT Max(tbl_servicehistory.SrvcID) AS MaxOfSrvcID  
  8. FROM tbl_servicehistory  
  9. GROUP BY tbl_servicehistory.CarNo) AS subQ  
  10. ON tbl_servicehistory.SrvcID = subQ.MaxOfSrvcID;
(NOTE: I haven't proofed the above)
May 10 '12 #12

NeoPa
Expert Mod 15k+
P: 31,418
My mistake. I said "ignored" when I really should have said "overlooked". I very much doubt it was done consciously. These things are easy to miss when there are multiple responses to go through and respond to. Thank you for clarifying anyway.

It can get a little more complicated than this (SQL from post #6) if there is a requirement to identify the record by the PK, but it doesn't seem necessary for these requirements.
May 10 '12 #13

NeoPa
Expert Mod 15k+
P: 31,418
zmbd:
So Long as [SrvcID] increments with every new record, is not duplicated, and is numeric then:
That is not a 'safe' assumption. Although it appears that MS use a procedure that provides that with 'AutoNumber', it is certainly not documented as such, and relying on that behaviour is dubious to say the least. Even if the software doesn't let you down you're also assuming that all records will be entered in the same order as the services are performed. This is not a safe assumption in various circumstances, the first of which that springs to mind is that data from different service stations may be loaded at different times via some sort of batch process. With such a long delay between services this may well work reliably for 99.99% of occasions in real-life situations but the logic is poor and on the one occasion that it does go wrong will be even more difficult to identify. It is never recommended to rely on anything about an 'AutoNumber' field other than its uniqueness.
May 10 '12 #14

P: 14
Hi NeoPa and zmbd,

These tables I use are actually linked tables. They are originally from Sybase. I just use them to give some custom reports to the local team. I had a meeting with the Sybase guy as well. He confirmed that both the Car Number and Service IDs are primary keys in their respective tables.

It is really appriciatable that you guyz are so careful and didn't want one to get misguided or miss any important aspects.

Thanks,
Naren
May 10 '12 #15

zmbd
Expert Mod 5K+
P: 5,397
OK,
Lets go back to the origninal post:
...The requirement is to get the latest serviced date for each car in the table ...
...largest SrvcID for a car would be the latest SrvcID for the same. Similarly I have to get largest SrvcIDs for each car in the table...
From NarenKer's last post:
We now know that [CarNo] should be unique and PK in another table.
we now know that [SrvcID] should be unique and PK in tbl_service.

Is it possible for a given [CarNo] to have the same [ServiceDT] with different [SrvceID] for each record such as in an invoice for services... say something along the lines of refueling?

Expand|Select|Wrap|Line Numbers
  1. _________________________ 
  2. [CarNo] [SrvcID] [ServiceDt]
  3. ------------------------- 
  4. CAB123  200901  01/01/2009 
  5. CAB123  201001  01/01/2009
  6. CAB123  201203  01/01/2009 
  7. ------------------------- 
As NeoPa stated... is there the possiblity that the records are batch entered by many service stations; thus the highest [SrvcID] may not reflect the newest date?

If both are true, that the highest [SrvcID] is not related to the newest date then I suspect that you might not be able to do this reliably without some more information.

-z
May 10 '12 #16

NeoPa
Expert Mod 15k+
P: 31,418
@Naren
We don't want to mislead anyone. We are here to help, and if we mislead that is worse than simply not helping. It doesn't help anyone who reads it and it also makes us look incompetent. Speaking for myself, I'm not happy to confuse anyone, nor am I happy to look incompetent. It's a no-brainer then, to be helpful in the best way we can :-)
May 10 '12 #17

P: 14
Hi zmbd,

Expand|Select|Wrap|Line Numbers
  1. We now know that [CarNo] should be unique and PK in another table.
  2. we now know that [SrvcID] should be unique and PK in tbl_service.
Yes the above is true.

Though the [SrvcID] field is getting updated by many service stations simultaniously; for a particular car, there is no possiblity that the same car is serviced in more than one service station.

However, like you said, there are scenarios where a particular car can have more than one [SrvcID] on the same day. Still, the fact is that unless the existing [SrvcID] is closed, the users cannot create a new one. And that closure is determined by the [ServiceDt], as the [ServiceDt] is when the servicing for that particular car is completed.

Hence, The client is interested in the last/latest [SrvcID]. As they just want to see when was the last time a car is last serviced and what was done in the last service.

Hope I answered all your questions.

Thanks,
Naren
May 11 '12 #18

NeoPa
Expert Mod 15k+
P: 31,418
NarenKeer:
Hence, The client is interested in the last/latest [SrvcID]. As they just want to see when was the last time a car is last serviced and what was done in the last service.
You're assuming that [SrvcID] is always assigned ascending numbers. As I said in post #14 :
NeoPa:
It is never recommended to rely on anything about an 'AutoNumber' field other than its uniqueness.
May 11 '12 #19

P: 14
Oh ok,

Now I got your point clearly. So I should use the [ServiceDt], instead of [SrvcID]

I will change it now.
May 11 '12 #20

zmbd
Expert Mod 5K+
P: 5,397
[Highligh]@NarenKeer[/highlight]

[SeviceDT] is not reliable. In post #18 you affirm that multiple [SrvcID] can have the same [ServiceDT]

From what NeoPa has stated, [SrvcID] is also unreliable.

So trying to show how Human error can garbage up your data... let's try the following:

CAB123
Goes into SrvcStn=1 on 2002-03-02
SrvcStn_1 starts the record [SrvcID] = 101
The service is compleated the same day.
Maybe the station attendant is forgetfull...
Closes that record out on 2002-03-10

In the meantime,
CAB123
Goes into SrvcStn=2 on 2002-03-02
Now... if understand correctly, [SrvcID] = 101 is still open therefore:
SrvcStn_2 starts the record [SrvcID] = 102
The service is compleated the same day.
Maybe the station attendant is forgetfull...
Closes that record out on 2002-03-03

In the meantime,
CAB123
Goes into SrvcStn=3 on 2002-03-02
Now... if understand correctly, [SrvcID] = 103 is still open therefore:
(where'd 103 come from... see below)
SrvcStn_3 starts the record [SrvcID] = 104
The service is compleated the same day.
This attendent is on the ball and closed out the same day the service is compleated.
Closes that record out on 2002-03-02

Inattentive attendant number 1:
CAB123
Goes into SrvcStn=4 on 2002-03-01
Let's say the attendant was distracted and forgot to create the [SrvcID]
Instead he makes the entry on 2002-03-02 before service SrvcStn_3 starts their serivice
Now because the last [SrvcID] is 102 we now get [SrvcID]=103
The service is compleated the same day - 2002-03-01
ASSUME he did this entry correctly.

Inattentive attendant number 2:
CAB123
Goes into SrvcStn=5 on 2002-03-01
Let's say the attendant was distracted and forgot to create the [SrvcID]
Instead he makes the entry on 2002-03-02 after service SrvcStn_3 starts their serivice
Now... if understand correctly, [SrvcID] = 104 is still open therefore:
SrvcStn_5 starts the record [SrvcID] = 105
The service is compleated the same day - 2002-03-01
ASSUME he did this entry correctly.


To help follow things I'm going to add to fields [ID], [Station], and [StartDate] just so we can follow the above data... understanding that neither field is part of your table as given:

Expand|Select|Wrap|Line Numbers
  1. [ID][Station][CarNo][SrvcID][StartDate][ServiceDT]
  2. 1   4   CAB123   103   2002-03-01   2002-03-01
  3. 2   5   CAB123   105   2002-03-01   2002-03-01
  4. 3   3   CAB123   104   2002-03-02   2002-03-02
  5. 4   2   CAB123   102   2002-03-02   2002-03-03
  6. 5   1   CAB123   101   2002-03-02   2002-03-10
So the last station to service [CarNo]=CAB123 is [Station]=3; therefor, the record we want is [ID]=3 with [SrvcID]=104
We can't rely on the [ServiceDT] as that will pull [ID]=5
We can't rely on [SrvcID] as that will pull [ID]=2

I think you'll need more information to reliably pull the correct [SrvcID] for the last time a given [CarNo] is serviced.

-z
May 11 '12 #21

NeoPa
Expert Mod 15k+
P: 31,418
zmbd:
If both are true, that the highest [SrvcID] is not related to the newest date then I suspect that you might not be able to do this reliably without some more information.
The solution I proposed in post #6 works on the basis that no vehicle will be serviced at any service station more than once in any single day. It uses the composite of both [CarNo] and [ServiceDT] to identify a record uniquely. It seems to me that this is a viable solution to the problem, and doesn't rely on any assumptions that are not logically safe.

Considering all we really have to go on is the GROUPing, which is mandated anyway, and the aggregated field ([ServiceDT], then this will always find the (or a) record which matches the [CarNo] and the maximum value for [ServiceDT]. If that's not unique then the problem lies with the data or the question - depending on which way you choose to look at it.
May 11 '12 #22

zmbd
Expert Mod 5K+
P: 5,397
NeoPa:
the solution i proposed in post #6 works on the basis that no vehicle will be serviced at any service station more than once in any single day.
That basis may prove faulty... say the "service" is nothing more than a refueling of the vehicle; thus, it could visit 100 stations within a single day.


NarenKeer:
though the ... [srvcid] field is getting updated by many service stations simultaneously; for a particular car, ...
So, as in my example data, you may have multiple records open at the same time for the same [CarNo] at multiple stations.

.... However, like you said, there are scenarios where a particular car can have more than one [srvcid] on the same day ....
and here we return the fact that any given [SrvcID] may be a 1:M with [ServiceDT] for any given car; thus, possibly invalidating NeoPa's basis for #6 as in my example data in #21 is possible.

Now we're stuck with the logic.
As NeoPa has stated: "...the problem lies with the data or the question..."

Either NeoPa or my solutions will work for a given dataset, NeoPa's perhaps is more robust; however, in this case... if a set of data as in #21 is possible, then you're request is not possible without more information.

-z
May 11 '12 #23

NeoPa
Expert Mod 15k+
P: 31,418
zmbd:
and here we return the fact that any given [SrvcID] may be a 1:M with [ServiceDT] for any given car; thus, possibly invalidating NeoPa's basis for #6 as in my example data in #21 is possible.
You make some very good points, however :
NeoPa:
If that's not unique then the problem lies with the data or the question - depending on which way you choose to look at it.
Maybe the value in [ServiceDT] is a Date/Time value rather a date alone. It certainly should be in the situation described. Otherwise the question is at fault rather than the solution, which is as good as you can get in the circumstances available.
May 12 '12 #24

P: 14
Hi zmbd,

First I think I should make my point clear on how the [ServiceDt] is determined
There is no possiblities that the [SrvcId] for a car can be created in two service stations at the same time. As the cars can be taken out of the station only after billing is done. The Service date should be updated when the billing is done.

However, there are possiblities that the same car goes out of Station1 in the morning, and it faces some problem, so on the way the same car goes in to Station2 for checkup. when done, the billing in second station will also be done in the same day. So, the same vehicle will have two SrvcIDs for the same day.

Second thing, I had a chat with the Sybase and java guys again, the sybase guy confirmed that the [SrvcID] keeps growing as a bigger number and never the latest [SrvcID] would be lesser than the old one. However, I asked him that few years down the line, this ID may grow as big as a 10digit number or more, which would be very bad to have such kind of growing number series. I suggested him to generate numbers like below from front end[yymmdd<Six_Digit_CarNo><two_digit_growing_number>]

It will look like below,
20091012TN012301
20091012TN012302
20091012TN012303
20101124KA097701
20101124KA097702
20101011TN097701
20111104KA097701

Date - [20091012] | CarNo - [TN0123] | 2 Digit Growing Number [03]


This would make life easier for developers to identify things. They said they will get back to me after a little feasibilty study.

Hope things will work well.
May 16 '12 #25

zmbd
Expert Mod 5K+
P: 5,397
NarenKeer:

I'm not sure that changing the [SrvcID] format/information will help.
Knowing that the car is "locked" may be helpfull... Does the [ServiceDT] record the time as well as the date?
May 16 '12 #26

P: 14
zmbd,

nope, it has only date in DD/MM/YYYY format
May 16 '12 #27

zmbd
Expert Mod 5K+
P: 5,397
IMHO: Your data is still at fault.

Capturing the time along with the date in [ServicDT] would be more helpfull than changing the [SrvcID]. With the additional information, I think, you could then use either NeoPa or my queries to get the data ... the car certainly couldn't be in the same or two places at the same time (maybe?).

Also, if I read this correctly, your new [SrvcID] is simply a compounding of "[ServiceDT][CarNo]##"... duplicating the data you already have. This may make things better for the reporting; however, for the query not so much.

-z
May 16 '12 #28

NeoPa
Expert Mod 15k+
P: 31,418
Making [ServiceDt] a Date and time field might be a simpler change to make, though [SrvcID] being built to allow unique identification of a record as well as identifying both the car and the date (as you suggest) could also prove helpful.
May 17 '12 #29

Post your reply

Sign in to post your reply or Sign up for a free account.