473,386 Members | 1,706 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

Last 3 records by group

How can I retrieve the last 3 records from the following table. The table has the following fields and sample data:

ClientID TransactionID TransactionDate

1 6511216 01/02/2012
1 5332573 18/05/2011
1 9849528 11/02/2012
1 5374530 24/08/2010
1 5711675 26/04/2009
1 4001184 01/09/2011
2 9087526 15/07/2011
2 6524824 06/08/2009
2 5376892 26/07/2011
2 5327891 28/10/2008
2 6423568 11/11/2011
2 5379827 16/06/2012

the results should be (order by TransactionDate):

ClientID TransactionID TransactionDate

1 4001184 01/09/2011
1 6511216 01/02/2012
1 9849528 11/02/2012
2 5376892 26/07/2011
2 6423568 11/11/2011
2 5379827 16/06/2012


Note: I can't use 'Select Top 3' a. because of the month format b. this must work for a table where TransactionDate is a different field with text data.
Feb 11 '12
69 8466
Rabbit
12,516 Expert Mod 8TB
I didn't think that it would be a large difference. The problem is that the 150,000 records are joined to itself. So it's practically comparing 150000 ^ 2. And that's a lot of records to process. That would be the largest contributing factor to the speed of and would far outweigh the overhead of VBA.

However, this may be because of a lack of proper indexes. Perhaps an index on the client and month would improve the speed of the SQL.

Now, if we were talking about SQL Server 2005+, it has a row numbering function that would significantly speed up the processing because you don't have to join the table to itself. I'm talking seconds rather than minutes.
Feb 14 '12 #51
ADezii
8,834 Expert 8TB
@Rabbit:
I also do not think that
testing the performance on 100 rows and then multiplying the result by 1500 (total of 150000)
is a reliable Benchmark. The Testing needs to be performed on the entire Data set of 150,000 Records. Comments?
Feb 14 '12 #52
Rabbit
12,516 Expert Mod 8TB
Hard to say, because the numbers seem to be in line with what I got. I tested the SQL on 170,000 rows and it took a little under 6 minutes.

A full test couldn't hurt, if it's only 6-7 minutes it would be beneficial to just do a full test.
Feb 14 '12 #53
Rabbit
12,516 Expert Mod 8TB
I whipped this up. It's a simplified replication of the ROW_NUMBER() function in SQL Server. It should be many magnitudes faster.

Expand|Select|Wrap|Line Numbers
  1. Option Compare Database
  2. Option Explicit
  3.  
  4. Dim row As Double
  5. Dim strPartition As String
  6.  
  7. Public Function RowNum(partition As String) As Double
  8.     If partition <> strPartition Then
  9.         row = 0
  10.         strPartition = partition
  11.     End If
  12.  
  13.     row = row + 1
  14.     RowNum = row
  15. End Function
Expand|Select|Wrap|Line Numbers
  1. SELECT partitionField, 
  2.    orderField,
  3.    otherField
  4. FROM someTable
  5. WHERE RowNumber(partitionField) < 4
  6. ORDER BY partitionField,
  7.    orderField
PS: It's not as quick as the SQL Server version but it takes half the time as the previous solutions. I ran it on 170000 records and it took a little under 3 minutes.
Feb 14 '12 #54
NeoPa
32,556 Expert Mod 16PB
Moishy101:
Then I get a Cannot group on fields selected with '*' (qTH1). error.
OK. That means Jet SQL needs it spelled out long-hand. Unfortunate, as I was trying to minimise unnecessary verbage :-(

Expand|Select|Wrap|Line Numbers
  1. SELECT   qTH1.ClientID
  2.        , qTH1.TransactionNumber
  3.        , qTH1.MonthOrder
  4. FROM     [qryTransactionHebrew] AS [qTH1]
  5.          INNER JOIN
  6.          [qryTransactionHebrew] AS [qTH2]
  7.   ON     (qTH1.ClientID = qTH2.ClientID)
  8.  AND     (qTH1.MonthOrder >= qTH2.MonthOrder)
  9. GROUP BY qTH1.ClientID
  10.        , qTH1.TransactionNumber
  11.        , qTH1.MonthOrder
  12. HAVING   Count(*) < 4
  13. ORDER BY qTH1.ClientID ASC
  14.        , qTH1.MonthOrder DESC
Essentially the same as Rabbit's version but with more meaningful names (Well, you couldn't expect him to read my mind now, could you).
Feb 15 '12 #55
NeoPa
32,556 Expert Mod 16PB
On the performance issue, it would be interesting (think critical) to know whether or not [ClientID] is indexed. I would normally expect the SQL only version to be significantly faster, all else being equal. I didn't study the VBA suggested closely enough to determine if it was implementing the same logic as the SQL. Nor can I honestly say what I mean by significantly faster. A little more than was shown to be sure, but not necessarily much more.

I believe the killer issues are :
  1. The number of records involved.
  2. The fact that the data of each record needs to be run through [qryTransactionHebrew] before it's sorted.

PS. Did the OP ever comment on the problem related to months of different years, raised in post #19?
Feb 15 '12 #56
Rabbit:
I'm not sure what your pseudo code means, can you please elaborate?

NeoPa:

Thank you very much, it works well.
Feb 15 '12 #57
Rabbit
12,516 Expert Mod 8TB
It wasn't pseudo code. Well, the SQL query won't run with your table set up but it's syntactically correct. It basically numbers the rows breaking by the grouping field and then uses that to filter the rows. It should run at least twice as fast as the other methods.

NeoPa, I don't think that they did.
Feb 15 '12 #58
NeoPa:

Regarding the problem related to months of different years, raised in post #19, there is another problem, the Hebrew calendar has 6 leap years (13 months instead of 12) in a 19 year cycle, so the order will not be accurate.
but both of those issues can be solved quite simply albeit not in a smart or professional manner, in tblHebrewMonthOrder I can add the values for the coming years (expected lifetime of the mdb).
Feb 15 '12 #59
Rabbit:

What table setup do I need for your new query to work?
Feb 15 '12 #60
Is it was possible to have a separate auto-numbered field for each ClientID, if so would grabbing the last three records for each ClientID (based upon the auto-numbered field) be any more effective?

If the answer the above in yes, how can I implement it?
Feb 15 '12 #61
Rabbit
12,516 Expert Mod 8TB
In response to post 60, you don't have to change your table setup, you just have to use different sql, mine was just a syntactically correct example of how to use the code.

In response to post 61, that's what my code does.
Feb 15 '12 #62
NeoPa
32,556 Expert Mod 16PB
Moishy101:
but both of those issues can be solved quite simply albeit not in a smart or professional manner, in tblHebrewMonthOrder I can add the values for the coming years (expected lifetime of the mdb).
That makes sense actually. Sometimes you just have to have a kludgy solution when the data you're dealing with doesn't fit anywhere neatly. Clearly the [MonthOrder] values will go beyond thirteen in that case ;-)

For Rabbit's SQL you need to treat the references as :
  1. [someTable] == [qryTransactionHebrew]
  2. [partitionField] == [ClientID]
  3. [orderField] == [MonthOrder]
  4. [otherField] == [TransactionNumber]

Thus, the SQL would look something like :
Expand|Select|Wrap|Line Numbers
  1. SELECT   [ClientID]
  2.        , [TransactionNumber]
  3.        , [MonthOrder]
  4. FROM     [tblTransactionHebrew]
  5. WHERE    (RowNumber([ClientID]) < 4)
  6. ORDER BY [ClientID] ASC
  7.        , [MonthOrder] DESC
Feb 15 '12 #63
NeoPa
32,556 Expert Mod 16PB
I'd be surprised if that works exactly as intended. Hear me out on this, because I may be off-base and need correcting.

The WHERE clause is applied to the incoming data. As such, the ORDER BY clause has not had a chance to effect the order. I would expect this to reflect the original order, rather than that applied depending on the Hebrew month (which, I expect, would explain why Rabbit's test threw up no flags).

Is it possible to use a HAVING clause in a query which has no GROUP BY clause and no aggregation functions?
In case it is, try the following :
Expand|Select|Wrap|Line Numbers
  1. SELECT   [ClientID]
  2.        , [TransactionNumber]
  3.        , [MonthOrder]
  4. FROM     [tblTransactionHebrew]
  5. HAVING   (RowNumber([ClientID]) < 4)
  6. ORDER BY [ClientID] ASC
  7.        , [MonthOrder] DESC
I'm happy to be enlightened either way :-)
Feb 15 '12 #64
Rabbit
12,516 Expert Mod 8TB
Yes, I was worried about that being the case. But after creating the code and testing, it actually does take the order by into account before running the code. Otherwise, I was prepared to use a subquery to do the sorting before filtering.
Feb 15 '12 #65
NeoPa:
For Rabbit's SQL you need to treat the references as :

1. [someTable] == [qryTransactionHebrew]
2. [partitionField] == [ClientID]
3. [orderField] == [MonthOrder]
4. [otherField] == [TransactionNumber]
[MonthOrder] is in a separate table so the sql will have to be adapted a little bit.

Unfortunately it is not possible to use a HAVING clause in a query which has no GROUP BY clause and no aggregation functions, and so I get a "HAVING clause (RowNumber([ClientID])<4) without grouping or aggregation." error.

The WHERE clause is applied to the incoming data. As such, the ORDER BY clause has not had a chance to effect the order. I would expect this to reflect the original order, rather than that applied depending on the Hebrew month (which, I expect, would explain why Rabbit's test threw up no flags).
The order is fine, the problem is that it shows all records instead of the last three only.
Feb 15 '12 #66
Rabbit
12,516 Expert Mod 8TB
It looks like I was wrong. The WHERE clause is using the original order in the data source. I was just tricked because I included it as an output field and that seemed to be numbering the rows correctly. And no method of subquerying would fix the issue.

The only workaround would be to write out to a temporary table the sorted recordset before running the query. But that may be more trouble than it's worth. It should still be faster but it just becomes more complicated.
Feb 15 '12 #67
NeoPa
32,556 Expert Mod 16PB
Moishy101:
[MonthOrder] is in a separate table so the sql will have to be adapted a little bit.
No. It may be in a separate table from the transactions, but the source for the SQL ([someTable]) is [qryTransactionHebrew], which contains that data already.

Rabbit:
The only workaround would be to write out to a temporary table the sorted recordset before running the query. But that may be more trouble than it's worth. It should still be faster but it just becomes more complicated.
Unfortunate, after all that thinking and working out :-(

An alternative might be to add a field to the table (You may as well do two actually) to handle the ordering, which get updated when, and only when, the report is required. In all other circumstances, they should be considered to contain no valid data. With the relative data updated I don't expect the processing for the report would take too long. The updating might take some care to avoid making it take too long though.
Feb 15 '12 #68
Thank you all for your help, I'm busy with several other projects right now, I hope to get back to this issue in a week or so.
Mar 2 '12 #69
NeoPa
32,556 Expert Mod 16PB
No worries. I'm sure we can look at this again when you're next ready.

BTW. I missed a point from an earlier post which warrants a reply :
Moishy101:
The order is fine, the problem is that it shows all records instead of the last three only.
This shows that you haven't understood the point I was making from post #64. You may want to read it again, but it's a technical reason for why selecting the last three in the correct order doesn't work as intended. There's no point in selecting the last three of a set of records which are not already ordered in the correct sequence as that would simply result in three random records.
Mar 2 '12 #70

Sign in to post your reply or Sign up for a free account.

Similar topics

1
by: Matt | last post by:
I have a project where I need to have navigation buttons. Now I got them working but I found a problem that occurs when there are records deleted from the DB. The DB I am accessing has an ID column...
10
by: Lyn | last post by:
I have a form set to Single Form mode with which I can cycle through the records in a table via Next and Previous buttons. To avoid users pressing the Previous button on the first record and the...
5
by: booksnore | last post by:
I am reading some very large files greater than 10 GB. Some of the files (not all) contain a header and footer record identified by "***" in the first three characters of the record. I need to...
2
by: schapopa | last post by:
Hi, I want to create query where I could group records by quarters, and get the last record in each group. e.g Create Table MyTable ( Value , date )
4
by: Tomas | last post by:
I'm creating MS Access database and I need to have in query an average of 3 last records. How to do it? Maybe here is some function or sql expresion? Thanks
2
by: perryche | last post by:
I don't know where to begin search for this... I want to add a record, but still able to see the page, without bringing it to a new page and have to scroll back to see other records. I hope I...
1
by: jith87 | last post by:
hi, i am importing a text file into oracle database using sql loader.i need to ommit the first and last records of the text file while importing... can anybody help???? this is my text file... ...
4
by: PeteM | last post by:
Using SQL Server 2005 I have on a number of occasions, succesfully used the following code to create tables with the last dated results or observations for each Patient. However I am now being asked...
4
by: Greg (codepug | last post by:
I want to display the last 22 records in my continuous form. I have writen the following code and it works, but was wondering if there were any better suggestions for accomplishing this. My code...
5
by: asdasd10 | last post by:
i tired do it alone but its gives me an eror like that: "r object cant move back" here is my code : <% path = Server.mappath("../db/ServerData.mdb") set con =...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.