Encoding string as double.

For database purposes, I need an easier/faster way of comparing two
strings for equivalency. I have the idea of somehow converting the
string to a unique number, storing that number in the database,
indexing the field, and going from there. I don't know exactly how to
do this, so I was hoping for some pointers!

Another possibility is to encrypt the string using MD5, which would
tend to shorten it, but MD5 doesn't produce a number-only result.

Anyhow, thanks for any help!

--Brent

Feb 16 '06 #1
If I'm not mistaken, most databases use a binary search against data -
which is what I think you're asking about - so there isn't much that you
could optimize. Effective indexing is your best bet to speed things up.
You'd be surprised what an index on a varchar column can do.
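
To make the indexing point concrete, here is a minimal sketch using Python's built-in sqlite3 module (chosen only for brevity; the thread itself is about C# and an unnamed database). The table name `docs`, the column `body`, and the index name `idx_docs_body` are made up for illustration. It indexes the text column directly and asks the planner how it would run an equality lookup.

```python
import sqlite3

# In-memory database with a hypothetical table and an index on the text column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_docs_body ON docs (body)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [("string number %d" % i,) for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in rows]


plan = query_plan("SELECT id FROM docs WHERE body = ?", ("string number 42",))
print(plan)
```

The exact wording of the plan differs between SQLite versions, but it should name `idx_docs_body` rather than report a full table scan. Other database engines have their own equivalents of this check.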

You could try some sort of 'scoring' policy if you know what you're
looking for ahead of time. i.e.: create an extra column that has the
'score' in it (as decided by you) and then use that as an index

On the C# side, for fast, complex equivalence tests between long strings
you might want to try using Regular Expressions.

Cheers
Russ

Feb 16 '06 #2
You could also consider storing a 'hash' of the string, but be warned:
the standard String.GetHashCode() returns a different Int32 for the same
string on frameworks v1.1 and v2, if I'm not mistaken. However, I've
seen other posts mentioning the use of a hash function in the
cryptography namespace (System.Security.Cryptography).

Another thing to keep in mind is that a hash code isn't *guaranteed* to
be unique (so maybe it wouldn't be a good 'primary key' field). If this
is the way you want to go and it's important, I recommend you read up on
hashes before making the leap.
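
A small sketch of why a hash cannot be unique, written in Python for brevity rather than C#. The helper names `short_hash` and `first_collision` are invented for this example, and the hash is deliberately cut down to 16 bits so that a collision shows up after a few hundred strings. A 32-bit code such as the Int32 from GetHashCode behaves the same way, only later: by the birthday bound, roughly 77,000 distinct strings already give about an even chance of at least one clash.

```python
import hashlib


def short_hash(text, bits=16):
    """Deliberately tiny hash: the top `bits` bits of the MD5 digest."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)


def first_collision(bits=16):
    """Generate distinct strings until two of them share a short hash."""
    seen = {}
    n = 0
    while True:
        text = "string-%d" % n
        h = short_hash(text, bits)
        if h in seen:
            # Two different strings, one hash value.
            return seen[h], text, h
        seen[h] = text
        n += 1


a, b, h = first_collision()
print(repr(a), "and", repr(b), "share the 16-bit hash", h)
```

The loop must terminate: there are only 65,536 possible 16-bit values, so some pair among any 65,537 distinct strings has to collide.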

Scott
Feb 16 '06 #3
<wr********@gmail.com> wrote:
> Another possibility is to encrypt the string using MD5, which would
> tend to shorten it, but MD5 doesn't produce a number-only result.

Well, you could convert the MD5 result (or, say, 64 bits of it) *into*
a number reasonably easily. However, you need to be aware that two
different strings *could* produce the same MD5 hash. The result isn't
guaranteed to be unique - it just usually will be. So, if you find a
match on the hash, you've then got to do a normal string comparison to
check.
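
The approach described above can be sketched as follows. This is an illustration in Python with the built-in sqlite3 module, not the C# code the thread is about; the table `docs`, the column `body_key`, and the helpers `md5_key`, `insert` and `exists` are all hypothetical names. The key is the first 64 bits of the MD5 digest, read as a signed integer so it fits a signed 64-bit integer column, and every lookup compares the full string as well as the key.

```python
import hashlib
import sqlite3


def md5_key(text):
    """Return the first 64 bits of the MD5 digest as a signed integer."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, body_key INTEGER)"
)
# The index is on the numeric key, not on the (possibly long) string.
conn.execute("CREATE INDEX idx_docs_key ON docs (body_key)")


def insert(body):
    conn.execute(
        "INSERT INTO docs (body, body_key) VALUES (?, ?)",
        (body, md5_key(body)),
    )


def exists(body):
    """Narrow by key first, then confirm with a normal string comparison."""
    row = conn.execute(
        "SELECT 1 FROM docs WHERE body_key = ? AND body = ? LIMIT 1",
        (md5_key(body), body),
    ).fetchone()
    return row is not None


insert("hello world")
print(exists("hello world"), exists("hello worle"))
```

The `AND body = ?` clause is the "normal string comparison" step: it only runs against the handful of rows that share the key, so a collision costs a little time but never produces a false match.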

You could try to find an implementation of String.GetHashCode, too -
but don't just call String.GetHashCode from your code, as the result
could change between framework versions (this has already caught others
out). You could have a look at the Mono implementation though, and take
that, if the licence is friendly enough.
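
As an illustration of owning the hash implementation so that it cannot change underneath you, here is a sketch of 64-bit FNV-1a in Python. FNV-1a is a substitute picked for its simplicity; it is not the algorithm used by String.GetHashCode or by Mono, and the function name `fnv1a_64` is made up for this example. The same caution applies in Python itself, where the built-in hash() of a string is not stable from one process to the next.

```python
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(text):
    """64-bit FNV-1a over the UTF-8 bytes of `text`."""
    h = FNV64_OFFSET_BASIS
    for byte in text.encode("utf-8"):
        h ^= byte                        # mix in the next byte
        h = (h * FNV64_PRIME) & MASK_64  # multiply, keep the low 64 bits
    return h


print(hex(fnv1a_64("some string to store")))
```

Because the algorithm and the text encoding are both fixed in your own code, the stored values stay valid across runtime upgrades. The result is still not unique, so a match on the number needs the same follow-up string comparison as with MD5.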

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 16 '06 #4