473,396 Members | 2,009 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

How to remember last record read



Hi,

Can you guys see if there's a solution to this problem?

I have a database from which I have to read each record and process
that record. New records are being added all the time, so I need to go
back and check for new records and process them.

However:

-- there is no 'identity' column in the database design (so I cannot
keep track of the last record read by use of a numeric variable)

-- I am not allowed to update the database (so I cannot flag the
records I have read).

My problem is: how can I know which records I have already read and
which ones I haven't read yet? I don't want to process records twice
and don't want to miss any records.

Is there a known solution to this problem? Any ideas?

Thanks.

Jul 23 '05 #1
12 2047

Use a filed which has the datetime of type. Every time a record is
inserted update this with latest datetime. When viewing the record get
the latest of this field and check whether this different from the
record by getting the latest datetime from the table again

Madhivanan

Jul 23 '05 #2
On 2 Mar 2005 22:36:48 -0800, fr*******@hotmail.com wrote:

(snip)
My problem is: how can I know which records I have already read and
which ones I haven't read yet? I don't want to process records twice
and don't want to miss any records.


Hi francisds,

Madhivanan already pointed out the possibility to use a datetime column
with the date a row is inserted in the table. If such a column is not
available and you're not allowed to add one, here are a few other
options.

* You say that you're not allowed to update the database. Is that just
this table? If so, you can create a new table to hold the ey values of
all rows you already have processed.

* If you're not allowed to add tables to the database either, you could
consider keeping track of rows already processed in a table in another
database. This is quite a hack, though. It can be hard to keep the
databases synchronized, especially if you ever have to restore to a
backup.

* Yet another option would be to use a trigger on the table. Either do
the complete processing in that trigger (if it's short and can't fail),
or write rows with the primary keys of the inserted rows to another
table that you can use to decide which rows to process. Note that even
this trigger will slow down the insert process somewhat!

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 23 '05 #3
(fr*******@hotmail.com) writes:
Can you guys see if there's a solution to this problem?

I have a database from which I have to read each record and process
that record. New records are being added all the time, so I need to go
back and check for new records and process them.

However:

-- there is no 'identity' column in the database design (so I cannot
keep track of the last record read by use of a numeric variable)

-- I am not allowed to update the database (so I cannot flag the
records I have read).

My problem is: how can I know which records I have already read and
which ones I haven't read yet? I don't want to process records twice
and don't want to miss any records.

Is there a known solution to this problem? Any ideas?


So where does your process live? Is it a stored procedure or a client
program? If it's stored procedure maybe a temp table would do.

More information about this process and the circumstances would be
needed to give a good answer.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #4
Thank you for the suggestions. I am not sure they will work yet, but
it does give me hope of finding a solution. I still need to digest
them.

As for the Erland's question. I am writing an application which runs
on a separate PC from the SQL Server database which holds the table.

The table contains records of a business event (for example, a
payment). Each time an event occurs, then a record describing that
event is appended in the table. My application needs to process all
those records.

Again, my problem is, I am not allowed to update the table (it's a
stupid political issue, not a technical one). I checked and it has no
identity field.

Since the table has a column containing the datetime of the event that
occured, I tried using the datetime to remember the last event I
processed. Unfortunately, I found out that events can be generated
from many points and there is no guarantee that the records of that
event will be saved in the order they occured (IOW, an event that
occured at 2:30pm could be saved before an event that occured at 2:15).

-Francis

Erland Sommarskog wrote:
So where does your process live? Is it a stored procedure or a client
program? If it's stored procedure maybe a temp table would do.

More information about this process and the circumstances would be
needed to give a good answer.


Jul 23 '05 #5
> * You say that you're not allowed to update the database. Is that
just
this table? If so, you can create a new table to hold the ey values of all rows you already have processed.


Hi Hugo,

Novice question: what do you mean by the "value" of a row? How do I
get that?

I discovered that there is a column of 'uniqueidentifier' data type. I
could use this probably, although since the database generates
thousands of records a day, the list of processed keys will be fairly
large.

The trigger suggestion is intriguing. I might be able to use it for
another project, but not for this since I cannot change the database
(uhm, I haven't used triggers before, but I assume it requires
modifying the database, at least to save a trigger?)

Thanks.

-Francis

Jul 23 '05 #6
(fr*******@hotmail.com) writes:
Thank you for the suggestions. I am not sure they will work yet, but
it does give me hope of finding a solution. I still need to digest
them.

As for the Erland's question. I am writing an application which runs
on a separate PC from the SQL Server database which holds the table.

The table contains records of a business event (for example, a
payment). Each time an event occurs, then a record describing that
event is appended in the table. My application needs to process all
those records.
OK, so in theory you could just keep track in memory of which records
you have processed. But if you application crashes, you will lose
the knowledge of what you have processed. Then again, that depends on
what your processing consists of. If you have to redo the processing in
case of a crash that's the right thing.

Else your alternative appears to be to write to a file on the machine
you application runs on. When your application starts, you read the
file to see what you already have processed.

The challenge is two have a two-phase commit between the file and
the processing. If you write to the file too soon, you mail fail to
process a record in case of a restart. If you write too late, you may
reprocess a record on a restart.

Then again, if your processing involves some external mechanism, you
have the same problem, if you were to update the table.
Again, my problem is, I am not allowed to update the table (it's a
stupid political issue, not a technical one). I checked and it has no
identity field.

Since the table has a column containing the datetime of the event that
occured, I tried using the datetime to remember the last event I
processed. Unfortunately, I found out that events can be generated
from many points and there is no guarantee that the records of that
event will be saved in the order they occured (IOW, an event that
occured at 2:30pm could be saved before an event that occured at 2:15).


You mentioned in another post that there is a uniqueidentifier column.
Thus the table has a primary key. An identity column had been a little
easier to work with since it monotonic, and a uniqueidentifier is not.
But you could use a combination of datetime and uniqueidentifer. The
nice with the datetime column, is that it permits you prune the log
of processed records easily.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #7
On 4 Mar 2005 10:58:19 -0800, fr*******@hotmail.com wrote:
* You say that you're not allowed to update the database. Is thatjust
this table? If so, you can create a new table to hold the ey values

of
all rows you already have processed.


Hi Hugo,

Novice question: what do you mean by the "value" of a row? How do I
get that?


Hi Francis,

My fault. I should have written "you can create a new table to hold the
values of the primary key columns in the rows you already have
processed".

I discovered that there is a column of 'uniqueidentifier' data type. I
could use this probably, although since the database generates
thousands of records a day, the list of processed keys will be fairly
large.
You could only use this if you were allowed to add a column to the
table. And if you are allowed to do that, than there are better
solutions than uniqueidentifier.

The trigger suggestion is intriguing. I might be able to use it for
another project, but not for this since I cannot change the database
(uhm, I haven't used triggers before, but I assume it requires
modifying the database, at least to save a trigger?)


Yes, the trigger is stored in the database. But I'm not sure exactly
what you may and may not change, as in other messages you wrote that you
are not allowed to update the table. The trigger is not part of the
table, it's stored seperately in the same database.

BTW, have you already considered how to handle updates to a row that you
have already processed?

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 23 '05 #8
> You mentioned in another post that there is a uniqueidentifier
column.
Thus the table has a primary key. An identity column had been a little easier to work with since it monotonic, and a uniqueidentifier is not. But you could use a combination of datetime and uniqueidentifer. The
nice with the datetime column, is that it permits you prune the log
of processed records easily.

As I wrote earlier, I am not guaranteed about the order of arrival of
the records. How do you figure I might use the datetime in combination
with the uniqueidentifier?
-Francis

Jul 23 '05 #9
> >I discovered that there is a column of 'uniqueidentifier' data type.
I

You could only use this if you were allowed to add a column to the
table. And if you are allowed to do that, than there are better
solutions than uniqueidentifier.
Oh, I meant to say that I discovered that the table I am supposed to
read has a column of uniqueidentifiers, not that I can create one for
my use.
Yes, the trigger is stored in the database. But I'm not sure exactly
what you may and may not change, as in other messages you wrote that you are not allowed to update the table. The trigger is not part of the
table, it's stored seperately in the same database.
I can't touch the database at all. :-(
BTW, have you already considered how to handle updates to a row that you have already processed?


Good alert. It's not needed for this application (the events are
mechanically generated and no one will be modifying the data).

(If it makes things clearer, the database I am reading is a Cisco
CallManager CDR table).

http://www.cisco.com/en/US/products/...c.html#wp33440

-Francis

Jul 23 '05 #10
(fr*******@hotmail.com) writes:
As I wrote earlier, I am not guaranteed about the order of arrival of
the records. How do you figure I might use the datetime in combination
with the uniqueidentifier?


Sorry, I missed that part about the arrivals being out of order.

However, I supposed that you can assume that records arriving are not,
say, 24 hours old?

So you would write all GUIDs you processed to a file, together with
their datetime value. When you determine whether you have processed
a row or not, you first look at the datevalue, and if it's older than
24 hours (or two hours or whatever), you discard the record. If it's
newer, you look up the GUID in the file. (Probably you should have a
copy in memory for quick access. Every once in a while you would rewrite
the file and drop entires that are too old.

The main reason for this is to keep the local file down in size.

But you could also have a local SQL Server database that you write to. In
such case there would be no need for expiring old data. If the volume is
considerable so that several entries per second are generated, a database
may be the only option. (Or more precisely, you need an indexed store.)

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #11
On 4 Mar 2005 21:58:26 -0800, fr*******@hotmail.com wrote:

(snip)
I can't touch the database at all. :-(
Hi Francis,

In that case, you'll have to make do with a kludge.

I think the best way to handle this is to keep a list of the primary key
values of all rows you have already processed. You could store this in
another database, either on the same SQL Server instance or on another
instance. Depending on the environment where your process runs, using
another method might be better. Erland already made some suggestions.
(snip)(If it makes things clearer, the database I am reading is a Cisco
CallManager CDR table).

http://www.cisco.com/en/US/products/...c.html#wp33440


Based on a very quikc scan of the description, I'd say that you could
use either the pkid (based on the description, I agree that this is
probably a uniqueidentifier), or the combination of globalCallID_callID
and globalCallID_callManagerID. The latter might be the better choice,
as (if I understand the description correctly), the callID is assigned
sequentially on each server (and I assume that each server maps to a
callManagerID). You might find that you only need to store a last
processed callID for each callManagerID.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 23 '05 #12
Hi All,

We are in process of discussing these solutions. I may later post here
the final solution we adopt. If I don't, thought you might want to
know that I am considering the two options suggested:

- "You might find that you only need to store a last
processed callID for each callManagerID. "

- Remembering the last datetime of last record, querying for new
records with a later datetime (though including those that are 15
minutes earlier, just to make sure), and keeping a list of processed
pkids to prevent duplicate processing. This private list will be
purged around the time of each query, removing those that are old
enough.

Thanks much guys !

-Francis

Jul 23 '05 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

15
by: Joshua Beall | last post by:
Hi All, What is the best way to use a cookie to remember a logged in user? Would you store the username and password in two separate cookies? Should the password be plain text? Hashed? Not...
8
by: Dan Matthews | last post by:
Hey everybody, I have an 2000 Access database that stores job listings from potential employers for a school. I would like to be able to display the date on a webpage the last time the database...
1
by: Matt | last post by:
how to get the last record from database without looping? Whenever the user need to insert a new record to the database, it just increment the id field by one from the last record. I tried...
5
by: Alistair | last post by:
hi peoples and masterful gurus (a little bit of grovelling never hurt anyone...apart from the sore knees) I'm after a little bit of advice on how to do something. basically a history of the...
10
by: Alain Guichaoua | last post by:
Good evening to all Here is my problem : I have a form with a subform. They are linked. When I open the form I would like the subform to reach its last record. I tried the method...
1
by: sixsoccer | last post by:
I have built a database with a <Mainform> and a <Subform>. My problem is twofold. 1. My subform is set as a continuos form with AllowAddiotions set to NO (ie. a list of Issues to the client on...
1
by: Tom | last post by:
Hi How to read the LAST record of datareader? Here is my code ) this.EndDate = Convert.ToDateTime(dr) [/code
9
by: mazen.s.m | last post by:
I have been trying to find a way to get the Domain and UserName of the user that last modified a file. I know how to get the owner of the file via WMI or Win32, but I can't find a way to get the...
18
by: Mel | last post by:
What is the best method to achieve this (I am relatively new to vb.net)? Should I use an ini file or the registry? Is there another option available in vb.net that is the preferred way? This...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.