Given the following table: -
EMPL_ID | CO_ID | START | STOP
-
--------------------------------------------
-
12 U 01/06/1976 07/09/1980
-
12 U 03/18/1982 11/18/1988
-
12 U 03/23/1988 05/01/1992
-
12 U 01/25/1991 02/29/1996
-
12 U 03/16/2004 05/05/2004
-
12 U 05/06/2004 12/31/9999
-
12 M 11/24/1975 07/09/1980
-
12 M 03/18/1982 11/18/1988
-
12 M 03/23/1988 05/01/1992
-
12 M 01/25/1991 02/29/1996
-
12 M 03/16/2004 05/05/2004
-
12 M 05/06/2004 12/31/9999
-
16 U 11/26/1962 07/01/1963
-
16 U 07/01/1963 08/29/1963
-
16 U 10/17/1963 01/15/1965
-
16 U 04/09/1965 06/05/1972
-
Final Version: -
EMPL_ID | CO_ID | START | STOP
-
--------------------------------------------
-
12 U 01/06/1976 07/09/1980
-
12 U 03/18/1982 02/29/1996
-
12 U 03/16/2004 12/31/9999
-
12 M 11/24/1975 07/09/1980
-
12 M 03/18/1982 02/29/1996
-
12 M 03/16/2004 05/05/2004
-
12 M 05/06/2004 12/31/9999
-
16 U 11/26/1962 06/05/1972
-
I am trying to determine total service time by employee and company. If the employee leaves and returns within a 6 month window, then we can act if the employee has never left. (Employee 16, company U above).
Sometimes the stop date for the row will overlap with the start date on the next row, these rows should be merged and the greater stop date should be seen.
I've got this data in an MS Access table and/or a DB2 Z/OS table, but I am not able to find any functions or develop queries that will give me the result that I want.
I'm thinking this will have to be written in a VBA module in my MS Access database that will massage and condense the dates.
Anyone good with loops and date checks?
Thanks!
11 1697 @MahaCoder
Hey MahaCoder,
From what you are describing, it sounds like you're trying to normalize data here. There isn't going to be any query that is going to be able to do this out of the box. You are correct in the fact that you are going to have to write VBA and create a new table with the processed data. If you're using the DB2 datasource as your never changing source, you're going to have a great deal of latency issues with the processes, if you're converting on the fly all the time.
If this is a process to correct errors, it would be more than worth it to normalize the data. But as for a working model to continue with the current dataset isn't adviseable without taking a massive performance hit.
If you need help in writing the VBA and SQL statements, please let me know. Also please provide any work that you've already tested and is or isn't work currently.
Thanks,
Joe P.
Thanks Joe P for your interest in this project.
The data is actually entered with an additional key, [State], which I have removed and am trying to condense the dates up to to get a nationwide start and stop date by employee, by company. The data is maintained at the State level, so each one of those CO_ID's per EMPL_ID could be for different states. That's why the data does not look normalized, and this is not an attempt to correct data. This would be a monthly execution so that the data can be evaluated at a nationwide level.
Here is the VBA that I have started with, but with 600,000 rows to evaluate this is clearly going to be processing intensive. My idea was to load all start/stop dates by employee/company into a dictionary and then load the dictionary into an array and loop through and evaluate the rows to keep and the new start/stop dates. The corrected array could then be loaded back into the table. This is where I'm missing the logic. I think I'm making it much more difficult then I need to.
I mentioned my different platforms because I know there are functions in DB2 UDB or OLAP that can evaluate the previous and next rows, and was hoping there was something similar that I could do in a query in DB2 Z/OS. - Public Function CondenseDates()
-
-
Dim strSQL As String
-
Dim strKey1 As String
-
Dim strKey2 As String
-
Dim conConnection As New ADODB.Connection
-
Dim rstRecordset As New ADODB.Recordset
-
Dim dbsDatabase As Database
-
Dim dtDateBeg As Date
-
Dim dtDateEnd As Date
-
Dim dtDate1 As Date
-
Dim dtDate2 As Date
-
Dim varRetVal As Variant
-
Dim dblCounter As Double
-
Dim objDict As New Scripting.Dictionary
-
Dim arrKeys() As Variant
-
Dim arrKeysDelete() As Variant
-
Dim intIndex As Integer
-
Dim intIndex2 As Integer
-
Dim objDictDelete As New Scripting.Dictionary
-
Dim blnDropRows As Boolean
-
-
strSQL = ""
-
strSQL = strSQL & "SELECT * "
-
strSQL = strSQL & "FROM @CTG287.TDMR039 "
-
strSQL = strSQL & "WITH UR"
-
-
'Initialize the progress meter.
-
varRetVal = SysCmd(acSysCmdInitMeter, "Reading Data...", GetRecordCount())
-
-
conConnection.Open strConnectionString
-
rstRecordset.Open strSQL, conConnection, adOpenForwardOnly, adLockReadOnly
-
-
strKey1 = rstRecordset(0)
-
strKey2 = rstRecordset(1)
-
dtDateBeg = rstRecordset(2)
-
dtDateEnd = rstRecordset(3)
-
-
rstRecordset.MoveNext
-
dblCounter = dblCounter + 1
-
-
Do While Not rstRecordset.EOF
-
If strKey1 = rstRecordset(0) And strKey2 = rstRecordset(1) Then
-
objDict.Add dtDateBeg & ":" & dtDateEnd, ""
-
-
Do While strKey1 = rstRecordset(0) And strKey2 = rstRecordset(1)
-
objDict.Add rstRecordset(2) & ":" & rstRecordset(3), ""
-
rstRecordset.MoveNext
-
Loop
-
End If
-
-
If objDict.Count = 0 Then
-
strSQL = ""
-
strSQL = strSQL & "INSERT INTO @CTG287.TDMR038 "
-
strSQL = strSQL & "VALUES ('" & strKey1 & "','" & strKey2 & "','" & dtDateBeg & "','" & dtDateEnd & "')"
-
conConnection.Execute strSQL
-
Else
-
arrKeys = objDict.Keys
-
-
'Loop through Keys array
-
For intIndex = 0 To UBound(arrKeys)
-
dtDate1 = Split(arrKeys(intIndex), ":")(0)
-
dtDate2 = Split(arrKeys(intIndex), ":")(1)
-
For intIndex2 = intIndex + 1 To UBound(arrKeys)
-
If dtDate1 <= Split(arrKeys(intIndex2), ":")(0) And dtDate2 >= Split(arrKeys(intIndex2), ":")(1) Then
-
Call ArrayRemoveItem(arrKeys, arrKeysDelete(intIndex2))
-
Exit For
-
End If
-
Next
-
Next
-
-
strSQL = ""
-
strSQL = strSQL & "INSERT INTO @CTG287.TDMR038 "
-
strSQL = strSQL & "VALUES ('" & strKey1 & "','" & strKey2 & "','" & dtDateBeg & "','" & dtDateEnd & "')"
-
conConnection.Execute strSQL
-
-
End If
-
-
objDict.RemoveAll
-
-
strKey1 = rstRecordset(0)
-
strKey2 = rstRecordset(1)
-
dtDateBeg = rstRecordset(2)
-
dtDateEnd = rstRecordset(3)
-
-
rstRecordset.MoveNext
-
-
If rstRecordset.EOF Then
-
strSQL = ""
-
strSQL = strSQL & "INSERT INTO @CTG287.TDMR039 "
-
strSQL = strSQL & "VALUES ('" & strKey1 & "','" & strKey2 & "','" & dtDateBeg & "','" & dtDateEnd & "')"
-
conConnection.Execute strSQL
-
End If
-
-
dblCounter = dblCounter + 1
-
-
varRetVal = SysCmd(acSysCmdUpdateMeter, dblCounter)
-
-
Loop
-
-
rstRecordset.Close
-
conConnection.Close
-
-
Set rstRecordset = Nothing
-
Set conConnection = Nothing
-
-
'Remove the progress meter.
-
varRetVal = SysCmd(acSysCmdRemoveMeter)
-
-
End Function
-
-
Private Function GetRecordCount() As Double
-
-
Dim strSQL As String
-
Dim rstRecordset As New ADODB.Recordset
-
Dim conConnection As New ADODB.Connection
-
-
strSQL = ""
-
strSQL = strSQL & "SELECT COUNT(*) "
-
strSQL = strSQL & "FROM @CTG287.TDMR038 "
-
strSQL = strSQL & "WITH UR"
-
-
conConnection.Open strConnectionString
-
rstRecordset.Open strSQL, conConnection, adOpenForwardOnly, adLockReadOnly
-
-
GetRecordCount = rstRecordset(0)
-
-
rstRecordset.Close
-
conConnection.Close
-
-
End Function
-
-
Private Sub ArrayRemoveItem(ItemArray As Variant, ByVal ItemElement As Long)
-
-
'PURPOSE: Remove an item from an array, then
-
' resize the array
-
-
'PARAMETERS: ItemArray: Array, passed by reference, with
-
' item to be removed. Array must not be fixed
-
-
' ItemElement: Element to Remove
-
-
'EXAMPLE:
-
' dim iCtr as integer
-
' Dim sTest() As String
-
' ReDim sTest(2) As String
-
' sTest(0) = "Hello"
-
' sTest(1) = "World"
-
' sTest(2) = "!"
-
' ArrayRemoveItem sTest, 1
-
' for iCtr = 0 to ubound(sTest)
-
' Debug.print sTest(ictr)
-
' next
-
'
-
' Prints
-
'
-
' "Hello"
-
' "!"
-
' To the Debug Window
-
-
Dim lCtr As Long
-
Dim lTop As Long
-
Dim lBottom As Long
-
-
If Not IsArray(ItemArray) Then
-
Err.Raise 13, , "Type Mismatch"
-
Exit Sub
-
End If
-
-
lTop = UBound(ItemArray)
-
lBottom = LBound(ItemArray)
-
-
If ItemElement < lBottom Or ItemElement > lTop Then
-
Err.Raise 9, , "Subscript out of Range"
-
Exit Sub
-
End If
-
-
For lCtr = ItemElement To lTop - 1
-
ItemArray(lCtr) = ItemArray(lCtr + 1)
-
Next
-
-
On Error GoTo ErrorHandler:
-
-
ReDim Preserve ItemArray(lBottom To lTop - 1)
-
-
Exit Sub
-
ErrorHandler:
-
'An error will occur if array is fixed
-
Err.Raise Err.Number, , _
-
"You must pass a resizable array to this function"
-
End Sub
Ok I've been working on the code today and discovered several issues with my current code and believe I have a working version. Below is my updated code. I needed to add my date check for 6 months, and instead of removing elements from the array I just tracked the min and max dates on each run, and once I hit the end of a run I loaded that row into the table and continued onto the next run, resetting my min and max start and stop dates.
However, this takes way too long to run... I'm still set on trying to accomplish this with a series of DB2 statements so that processing is run on the server, or even a DB2 stored procedure might be an idea?
- Do While Not rstRecordset.EOF
-
If strKey1 = rstRecordset(0) And strKey2 = rstRecordset(1) Then
-
objDict.Add dtDateBeg & ":" & dtDateEnd, ""
-
-
Do While (strKey1 = rstRecordset(0)) And (strKey2 = rstRecordset(1))
-
objDict.Add rstRecordset(2) & ":" & rstRecordset(3), ""
-
rstRecordset.MoveNext
-
If rstRecordset.EOF Then
-
Exit Do
-
End If
-
Loop
-
End If
-
-
'If there is only one date range for this key, load it into the table as is
-
If objDict.Count = 0 Then
-
strSQL = ""
-
strSQL = strSQL & "INSERT INTO @CTG287.TDMR039 "
-
strSQL = strSQL & "VALUES ('" & strKey1 & "','" & strKey2 & "','" & dtDateBeg & "','" & dtDateEnd & "')"
-
conConnection.Execute strSQL
-
Else
-
arrKeys = objDict.Keys
-
-
'initialize date1 and date2 with the lowest record for the key
-
dtDate1 = Split(arrKeys(0), ":")(0)
-
dtDate2 = Split(arrKeys(0), ":")(1)
-
-
'initialize the min and max run using the first record for the key
-
dtDateBegMin = dtDate1
-
dtDateEndMax = dtDate2
-
-
'Loop through Keys array
-
For intIndex = 0 To UBound(arrKeys)
-
-
'if the earliest end date (plus six months) is greater than the next start date, then condense
-
If DateAdd("m", 6, dtDateEndMax) >= CDate(Split(arrKeys(intIndex + 1), ":")(0)) Then
-
dtDateBegMin = dtDateBegMin
-
dtDateEndMax = Split(arrKeys(intIndex + 1), ":")(1)
-
'if the run is over, load the run into the table
-
Else
-
strSQL = ""
-
strSQL = strSQL & "INSERT INTO @CTG287.TDMR039 "
-
strSQL = strSQL & "VALUES ('" & strKey1 & "','" & strKey2 & "','" & dtDateBegMin & "','" & dtDateEndMax & "')"
-
conConnection.Execute strSQL
-
-
'Reset the begin min and end max dates with the new run
-
dtDateBegMin = Split(arrKeys(intIndex + 1), ":")(0)
-
dtDateEndMax = Split(arrKeys(intIndex + 1), ":")(1)
-
-
'Decrement the index counter by 1 to start over with the next run
-
intIndex = intIndex - 1
-
End If
-
-
'if the last row has been evaluated, load the last run into the table
-
If intIndex + 1 = UBound(arrKeys) Then
-
strSQL = ""
-
strSQL = strSQL & "INSERT INTO @CTG287.TDMR039 "
-
strSQL = strSQL & "VALUES ('" & strKey1 & "','" & strKey2 & "','" & dtDateBegMin & "','" & dtDateEndMax & "')"
-
conConnection.Execute strSQL
-
-
Exit For
-
End If
-
-
Next
-
-
End If
-
-
objDict.RemoveAll
-
-
If Not rstRecordset.EOF Then
-
strKey1 = rstRecordset(0)
-
strKey2 = rstRecordset(1)
-
dtDateBeg = rstRecordset(2)
-
dtDateEnd = rstRecordset(3)
-
-
rstRecordset.MoveNext
-
End If
-
-
dblCounter = dblCounter + 1
-
-
varRetVal = SysCmd(acSysCmdUpdateMeter, dblCounter)
-
-
Loop
NeoPa 32,556
Expert Mod 16PB
I won't jump in now as I expect Joe will come back to you when he comes online and I don't want to confuse matters. You've already gone to some trouble to respond fully, so I'm sure he'll be happy to progress it further.
If you've had no joy by the weekend though, bump the thread and I'll take it up.
NeoPa 32,556
Expert Mod 16PB
I would use VBA Recordeset code (DAO for preference) to go through the table and where any subsequent matching records are found, update the original to reflect the extended finish date and delete the (no longer required) subsequent record.
Does this sound sensible?
Is it within your scope?
NeoPa- this would definately fall in the scope. I might try this approach to see if I gain performance improvements. I've never used the Recordset properties of MovePrevious or Delete, so I would be concerned about what kind of problems I would encounter in my loop in regards to MoveNext when the record has been deleted.
NeoPa 32,556
Expert Mod 16PB
I expect you'll need to use the Bookmark facility too. You will need to read-ahead until you know the current record is not matching (or EOF of course) and then (with all the matching information noted) return to the main record (bookmarked) and update it. The intervening (matching) records can be deleted when processed originally, and the MoveNext should skip over them happily.
Having said that, the next record at this point will be one that's already been read on the first pass, so that can be bookmarked too.
Does all this still make sense?
Yes NeoPa- that makes sense. Do you think there will be much performance gain? I think I would like to consider writing a stored procedure that executes on the mainframe rather than relying on my CPU to process the records through any type of Recordset.
I believe you could improve the speed of this process by taking recordsets for one employee at a time. You could get a recordset of distinct EMPL_IDs, then loop through those and for each open a recordset containing only records for that employee on which to use NeoPa's strategy.
NeoPa 32,556
Expert Mod 16PB @MahaCoder
I'm not sure what you would compare with what to determine any such gain o great coder. @MahaCoder
I'm not aware of another language that would allow you to manipulate it this way so easily. SQL is specifically designed for RDBMSs, and this is a process which doesn't fit into that idea well.
@Chip:
I doubt there would be any benefit in that proposal. RDBMSs specifically optimise the provision of the data. By trying to do this yourself manually, I would expect that to be stopping the RDBMS from doing it for you. You'd need to be quite clever to do it better than it can, so best leave it to the expert (It's what it was designed for after all).
Sign in to post your reply or Sign up for a free account.
Similar topics
by: Bob |
last post by:
I have been asked to modify an existing Access database so that it can
update a MySQL databse in head office from multiple MS Access databases in
branches. The idea is that when the HO database is...
|
by: muntyanu |
last post by:
Hi all,
I have problem when merging existing DataTable into new dataset.
DataSet ds = new DataSet();
while ( done )
{
// fill myCustomDataSet.MyTable with data
ds.Merge(...
|
by: UDBDBA |
last post by:
Hi:
This is a merge questions which has been posted and answered... in my
case need more clairification when target table (tableB) matched
multiple rows to be updated based on the ON condition...
|
by: Sam Durai |
last post by:
Hello, I need to merge a small table (of rows less than 100,sometimes
even 0 rows) to a big table (of rows around 4 billion). I used the PK
of the big table as merge key but merge does a table scan...
|
by: --CELKO-- |
last post by:
Can I get a little help? Given a very simple table of events, with
their durations, I want to consolidate time periods for an event that
(1) touch each other or (2) overlap each other.
CREATE...
|
by: =?Utf-8?B?UHVjY2E=?= |
last post by:
Hi, I'm using vs2005, .net 2 for windows application. I have 2 solutions
that I would like to consolidate into and have just one installation .msi for
them, is this possible? So I just just copy...
|
by: AccessHunter |
last post by:
CASE_ID BEFORE BETWEEN AFTER
53222 N N Y
53222 Y N N
53222 Y N N
53222 Y N N
53222 Y N N
53222 Y N ...
|
by: Michel Esber |
last post by:
Question About Merge
DB2 V8 LUW FP 15.
Hello,
Consider table A (ID integer not null PK, Field2 varchar(50)).
I have the existing following set of rows in the table:
|
by: Michel Esber |
last post by:
Hi all,
DB2 V8 LUW FP 15
There is a table T (ID varchar (24), ABC timestamp). ID is PK.
Our application needs to frequently update T with a new value for ABC.
update T set ABC=? where ID...
|
by: DolphinDB |
last post by:
The formulas of 101 quantitative trading alphas used by WorldQuant were presented in the paper 101 Formulaic Alphas. However, some formulas are complex, leading to challenges in calculation.
Take...
|
by: ryjfgjl |
last post by:
ExcelToDatabase: batch import excel into database automatically...
|
by: isladogs |
last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM).
In this month's session, we are pleased to welcome back...
|
by: Vimpel783 |
last post by:
Hello!
Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
|
by: jfyes |
last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
|
by: ArrayDB |
last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
|
by: CloudSolutions |
last post by:
Introduction:
For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
|
by: Defcon1945 |
last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
|
by: Faith0G |
last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
| |