Easy to Say but hard to implement

Gent

I have a table named Holding_Value that has several fields in it among
which are UID, fkHolding, EffDate, Units, MarketValue, AssetPrice. UID
is an identity field and fkHolding is a foreign key to a different
table. EffDate is the the effective date while units and marketvalue
are values stored in the table.
what i'm trying to do is get all the values (fkHolding, Effdate, Units,
MarketValue) for all fkHolding for a specific date. That would be
pretty easy if there each unique fkHolding had a corresponding value
for every date. The exception is that if no date is found than you
would have to get the next date less then or equal ot the query date.
To furhter explain assume that there 100 records in the table and there
are only 10 distinct fkHolding values. My result will need to include
only 10 records. Each record will have the values of the row containing
the values less than or equal to the given date for a specific given
date. so if given date (EffDate) is 12/1/2004 and 5 of the 10 distinct
fkHolding have been priced on that date, than we get those values, the
rest 5 rows in the resultset need to be the values of of the latest
date less than the given date.

Now the second problem is that this needs to be efficient because this
is only a part of my subquery and the table does not have 100 records
but a few million records. Now what i can do is get the latest value if
i were given an fkHolding for example i would write

declare @fkHolding as integer
declare @DateValue as datetime
select @fkHolding = 2981
select @DateValue = '9/2/2004'

select Holding_Values.UID, Holding_Values.EffDate,
Holding_Values.fkHolding, Holding_Values.AssetPrice,
Holding_Values.MarketValue
from Holding_Values INNER JOIN
(select max(Holding_Values.effdate) as DatePriced from
Holding_Values INNER JOIN
(select * from Holding_values where fkHolding = @fkHolding and
Holding_Values.EffDate < @DateValue) as a
on a.UID = Holding_values.UID ) as b
on Holding_Values.EffDate = b.DatePriced and Holding_Values.fkHolding =
@fkHolding

or also would write it in the same way taking a different approach:

declare @fkHolding as integer
declare @DateValue as datetime
select @fkHolding = 2981
select @DateValue = '9/2/2004'

select Top 1 Holding_Values.UID, Holding_Values.EffDate,
Holding_Values.fkHolding,
Holding_Values.AssetPrice, Holding_Values.MarketValue from
Holding_Values INNER JOIN
(select * from Holding_values where fkHolding = @fkHolding and
Holding_Values.EffDate < @DateValue) as a
on a.UID = Holding_values.UID
Order by Holding_Values.EffDate desc

Both these queries produce a row each when ran for a specific date and
fkHolding. Now the challege is to get all the latest distinct fkHolding
values given only a date.
Thank you for your time and help.

Gent Metaj

Jul 23 '05 #1

Subscribe Post Reply

1541

Hugo Kornelis

On 15 Dec 2004 12:07:48 -0800, Gent wrote:

(snip)

what i'm trying to do is get all the values (fkHolding, Effdate, Units,
MarketValue) for all fkHolding for a specific date. That would be
pretty easy if there each unique fkHolding had a corresponding value
for every date. The exception is that if no date is found than you
would have to get the next date less then or equal ot the query date.

(snip)

Hi Gent,

Since you didn't post CREATE TABLE and INSERT statements to recreate your
situation, I couldn't test it, but something like this should do the
trick:

SELECT h.UID, h.EffDate, h.fkHolding, h.AsseetPrice, h.MarketValue
FROM Holding_Values AS h
WHERE h.EffDate <= @DateValue
AND NOT EXISTS (SELECT *
FROM Holding_Values AS h2
WHERE h2.EffDate <= @DateValue
AND h2.EffDate > h.EffDate)

Another way to do it (test them both to see which one gives you the best
performance):

SELECT h.UID, h.EffDate, h.fkHolding, h.AsseetPrice, h.MarketValue
FROM Holding_Values AS h
INNER JOIN (SELECT fkHolding, MAX(EffDate) AS EffDate
FROM Holding_Values
WHERE EffDate <= @DateValue
GROUP BY fkHolding) AS h2
ON h.fkHolding = h2.fkHolding
AND h.EffDate = h2.EffDate

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)

Jul 23 '05 #2

David Portas

It would have been a lot easier to say if you'd posted some DDL, sample data
and expected results.
http://www.aspfaq.com/etiquette.asp?id=5006

What are the key(s)? Please include them with CREATE TABLE statements. Keys
are going to make a big difference to any query. Without sample data I'm
also unclear if your required result is to include ALL rows for the latest
date for each fkholding or just ONE row for each.

Is this it?

CREATE TABLE Holding_Values (uid INTEGER NOT NULL, effdate DATETIME NOT
NULL, fkholding INTEGER NOT NULL, assetprice INTEGER NOT NULL, marketvalue
INTEGER NOT NULL /* ??? PRIMARY KEY NOT SPECIFIED */)

SELECT H.uid, H.effdate, H.fkholding, H.assetprice, H.marketvalue
FROM Holding_Values AS H,
(SELECT fkholding, MAX(effdate) AS effdate
FROM Holding_Values
WHERE effdate <= @datevalue
GROUP BY fkholding) AS D
WHERE H.fkholding = D.fkholding
AND H.effdate = D.effdate

--
David Portas
SQL Server MVP
--

Jul 23 '05 #3

--CELKO--

>> have a table named Holding_Value that has several fields [sic] in
it .. <<

Where is the DDL? And tables have columns which are completely
different from fields

.. among which are UID, fkHolding, EffDate, Units, MarketValue,

AssetPrice. UID is an identity field [sic] and fkHolding is a foreign
key to a different
table. EffDate is the the effective date while units and marketvalue
are values stored in the table. <<

INDENTITY is never a key and should not be used. There is no magical
"Univerisal Identifier"; do you also believe that God put a 17-letter
hebrew number to everything in Creation? That is how silly using
IDENTITY for a key in an RDBMS is.

A name element tells us what the entity or attribute is in terms of a
data model. You do not use affixes to tell us HOW it is used in one
occurrence in one table. The name "fkHolding" looks slightly obscene
(sorry, but it looks like "F**kHolding" to me).

"Holding_Values" is an attribute, not an entity name. This is a
hisotry, so use that in the name. If you have an asset_price, where is
the asset? I am guessing that used two names for the same entity, so
the holding is the asset.

Get the ISO-11179 Standards or any book on data modeling.

Again, without DDL and proper keys, here is my wild guess:

SELECT H1.*
FROM HoldingHistory AS H1
INNER JOIN
(SELECT asset_id, MAX(eff_date)
FROM HoldingHistory
WHERE eff_date <= @report_date
GROUP BY asset_id)
AS H2(asset_id, eff_date)
ON H1.asset_id = H2.asset_id
AND H1.eff_date = H2. eff_date;

Jul 23 '05 #4

kevin ruggles

"--CELKO--" <jc*******@earthlink.net> wrote in message >
.....

INDENTITY[sic] is never a key and should not be used. . .....
I keep hearing this, and to some extent agree, however, I also keep seeing
it in use, and when I'm running a
quick scenario, I use identity to generate a key for small data sets.
Can you expound or give links / ref to articles that go into detail on this.
The name "fkHolding" looks slightly obscene.....

fk is used as prefix to indicate Foreign Key.

Kevin Ruggles

Jul 23 '05 #5

nib

kevin ruggles wrote:

"--CELKO--" <jc*******@earthlink.net> wrote in message >
....
INDENTITY[sic] is never a key and should not be used. .

....
I keep hearing this, and to some extent agree, however, I also keep seeing
it in use, and when I'm running a
quick scenario, I use identity to generate a key for small data sets.
Can you expound or give links / ref to articles that go into detail on this.

Do a google search of this group. It is usually discussed, passionately,
every couple months.

Zach

Jul 23 '05 #6

Hugo you got it right. I guess i had a brain fart, i was not grouping
by fkHolding when i was trying to do the query. The first approach is a
drag. I let it run for over 2 minutes with no results (indexes might
have something to do wiht it too) but the second one worked like a
charm. It took less than 1 second for a 2.5 million record table. David
Portas solutions works as well.

In response to CELKO's comment about Identity and Primary key I
remember one of my database professors recommending against a while
back ago, but we seem to use that quite often here at work, and i see
identity used a lot as primary key. My prof did not elaborate too much
on why it was a stupid idea to use an identity as primary key but i
would appreciate if someone else has more info.
And fk is a convention often used to mean Foreign Key.

Thanks,

Gent

Jul 23 '05 #7

David Portas

> I use identity to generate a key for small data sets.

Can you expound or give links / ref to articles that go into detail on this

A search of the web and the microsoft.public.sqlserver.programming
hierarchy will find you many, many articles by Celko and others on this
topic.
when I'm running a
quick scenario, I use identity to generate a key for small data sets
What for? IDENTITY is part of a physical implementation not part of the
logical model of your data. If you post a CREATE TABLE statement here,
for example, just with just an IDENTITY column but don't identify any
other key then that tells us nothing about the entities involved and
we'll probably have a much harder time trying to solve your problem. In
modelling and problem-solving scenarios it is usually the natural key
of your data matters. (IDENTITY prompts other problems of its own of
course, but that's where you came in...)
fk is used as prefix to indicate Foreign Key

Reasonable people differ when it comes to naming conventions. However,
I'll bet if you take a quick poll you'll find that most SQL pros (good
ones anyway) loathe to see prefixes on column and table names. One
reason is that if you represent structure and datatype and other info
in identifiers and then that metadata changes you have to change the
identifier even though the data element itself hasn't changed. Another
reason is that they are harder to type and remember. The standards
document that Joe cited defines some naming conventions for data
elements.

--
David Portas
SQL Server MVP
--

Jul 23 '05 #8

--CELKO--

>> one of my database professors recommending against a while back ago,
but we seem to use that quite often here at work, and I see identity
used a lot as primary key. <<

For the technical reasons that have to do with portability, relational
design and data integrity, you can Google my name and IDENTITY to some
of my rants.

The *real* reasons have to do with human behavior. In the working world,
RDBMS systems come from legacy file systems and untrained legacy file
system programmers. They mimic the designs they learned in the old
technology.

There was little separation of logical and physical data. Physically
contigous fields made up *physically contigous* records in files that
were in a sequence on a magnetic tape. The tape was sorted on a key and
all the EDP depended on that sort order to locate a record -- you did
not do random access on a tape.

Then comes the RDBMS, with the concept of relational keys. This meant
you had to know your data model, you had to do research! That's work!
It is so much easier to use some proprietary exposed physical locator
like IDENTITY or a row number to mimic the physical position of record
at the end of a magnetic tape. You can write code with cursors and
completely mimic a 3GL programming language.

You get to SQL and you have to think in sets and in much more complex
logic. It is hard work if you have never had a class in set theory or
formal logic. So people avoid it with IDENTITY and often miss needed
constraints for real key. It also gives them the feeling that the table
is normalized because it has this "false key" and they leave all kinds
of flaws in the schema. You can clean the results in the front end,
just like you did with COBOL in 1968, right?

And fk is a convention often used to mean Foreign Key.<<

I know, I know, but that was too good a straight line not to use :)

Seriously, you name something for *what it is* in the logical data
model, not for *how it is used* in one particular table. Would drop the
FK- prefix if it were used as a non-key column in another table? Do you
use a PK- prefix on it in the the referenced table? Would add "<table
name>-" prefixes for every occurence of the data element? When you sign
a check, do you change your name to include the room you are in at the
time?

The rule is that a data element has one and only one name, one and only
one meaning. This why a data dictionary can work. The only time you
change a data element name is when it occurs in two or more roles in a
query. Thus, "boss_emp_id" and "worker_emp_id" are both "emp_id"
values, but play two different roles.

--CELKO--
Please post DDL in a human-readable format and not a machne-generated
one. This way people do not have to guess what the keys, constraints,
Declarative Referential Integrity, datatypes, etc. in your schema are.
Sample data is also a good idea, along with clear specifications.
*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!

Jul 23 '05 #9

Erland Sommarskog

--CELKO-- (re**************@earthlink.net) writes:

The *real* reasons have to do with human behavior. In the working world,
RDBMS systems come from legacy file systems and untrained legacy file
system programmers. They mimic the designs they learned in the old
technology.

There was little separation of logical and physical data. Physically
contigous fields made up *physically contigous* records in files that
were in a sequence on a magnetic tape. The tape was sorted on a key and
all the EDP depended on that sort order to locate a record -- you did
not do random access on a tape.
This is complete bullshit. You and few more people may actually have
programmed against tapes, but most of us haven't.

There's no need to involved tapes and other forms of arcane computer
technology to explain why the concept of an artificial key is popular.
Simpler and even older technique is more applicable, technique that is
still in use: pen and paper.
It is so much easier to use some proprietary exposed physical locator
like IDENTITY or a row number to mimic the physical position of record

IDENTITY has nothing to do with a physical location. The row may be
send around - the automatically assigned value for the row will be the
same.

You seriously need to learn how modern RDBMS work, Celko. And unlearn
what you happen to know about magtapes.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #10

--CELKO--

>> This is complete bullshit. You and few more people may actually have
programmed against tapes, but most of us haven't. <<

That does not matter. The newbies are re-discovering tape file systems
without every having worked on them. When you were taking computer
science courses, how many times did you "invent" an algorithm that other
people already knew?

Sequential files are a very natural way to look at data. It mimics
paper forms, index cards and a single processor (person). These evolved
into notched edge cards and then sorted punch cards and then sequential
mag tapes.

here's no need to involved tapes and other forms of arcane computer technology to explain why the concept of an artificial key is popular.
Simpler and even older technique is more applicable, technique that is
still in use: pen and paper. <<

Okay, I'll start telling people that they are 100 years behind instead
of only 50 years behind the technology!

But they do not always re-discover a "pen & paper" model. When they
write nested looping cursors, this is classic COBOL tape merges.
IDENTITY has nothing to do with a physical location. <<
So if I put the same row into a new database, I will aways get the same
IDENTITY value? If I put the same row with a real key into a table, I
will aways get the same IDENTITY value?

The IDENTITY column depends on the internal state of a counter in the
PHYSICAL hardware. It is HARDWARE DEPENDENT, just like a ROWID in other
products (but not as fast).
You seriously need to learn how modern RDBMS work, Celko. <<
Would you like to see how how many of the really "Modern RDBMS" products
to which I have consulted on implementation issues? While I was on the
X3H2 Committee, a lot of my consulting work was reading and explaining
how SQL was supposed to work to new RDBMS developers.

The Nucleus engine, which uses compressed bit vectors for the entire
schema was a client. WATCOM SQL (now part of Sybase) and their single
index structure that ties PK-FK together in one structure was a client
(you still have to index the FK side in SQL Server -- the tables are
seen as disjoint, not related inside a total schema). Looked at minor
stuff at Teradata, Etc. I have been all over the insides of the Modern
RDBMS.

Right now, SQL Server is still based on a file system that uses
contigous storage and is 20+ years old in its basic architecture. So is
DB2 and Oracle really stinks. If RDBMS developers did not understand
thinking in sets or viewing the schema as an integrated whole, instead
of disjoint tables (files), why would application developers understand
non-procedural algorithms?
And unlearn what you happen to know about magtapes. <<

No, I need to know when a magtape model of data processing is
appropriate. It works very nicely in ETL jobs for a data warehouse with
parallelism and scrubbing. I also need to remember it well enough to
know it when I see it and avoid it in an OLTP environment.

--CELKO--
Please post DDL in a human-readable format and not a machne-generated
one. This way people do not have to guess what the keys, constraints,
Declarative Referential Integrity, datatypes, etc. in your schema are.
Sample data is also a good idea, along with clear specifications.
*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!

Jul 23 '05 #11

Erland Sommarskog

--CELKO-- (re**************@earthlink.net) writes:

That does not matter. The newbies are re-discovering tape file systems
without every having worked on them. When you were taking computer
science courses, how many times did you "invent" an algorithm that other
people already knew?
Then probably that tells us something that this is a good model that is
easy to work with. If you want to convince people that it is not, you
should not talk about tapes, because they will not understand what you
are talking about.

An autonumber is a convenient way of numbering rows in the order they
are entered in a database. Sometimes, this is not very useful, sometimes
this is the only way to identify data. (Example: you load a file from
an external source. The file may be supposed to come with real keys, but
you cannot be sure that the file adheres to its supposed format. So the
only key when it comes to describe the file is the line number. (Then
after examining the data, you might be able to get the data into the
target table with some other keys.)
So if I put the same row into a new database, I will aways get the same
IDENTITY value? If I put the same row with a real key into a table, I
will aways get the same IDENTITY value?
Yes, once assigned the value will not change. Do you really think that
if take copy of a database, that all identity values gets replaced? Would
be quite a useless feature.
The IDENTITY column depends on the internal state of a counter in the
PHYSICAL hardware. It is HARDWARE DEPENDENT, just like a ROWID in other
products (but not as fast).

So, if the database is moved from one machine to another, and the last
inserted row before the move got an IDENTITY value of 987, the next
inserted after the move could get the value -234, 876 or 23445?

Nonsense. The IDENTITY value has nothing to do with the hardware, but
it does reflect in which order the rows where inserted. (Although, there
is now guarantee that this is he case.)

You seriously need to learn how modern RDBMS work, Celko. <<

Would you like to see how how many of the really "Modern RDBMS" products
to which I have consulted on implementation issues?

I don't have to see the list. I just see how many inaccurate statements
you make about SQL Server, and that tells me much more that any lists
you may produce.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #12

Similar topics