Storing URLs for later lookup

Is anyone hashing or storing URLs for later lookup? I'm curious about best
practices for storing such a wide column that needs indexing, and whether
there are alternatives. We have a table that we expect to grow to millions of
rows, and we want to look up the content by URL. SQL Server restricts index
keys to 900 bytes. Would hashing be the way to go (looking up via the hash),
or should we just limit the URL to nvarchar(450) (900 bytes) and put an index
on the URL column?
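
For illustration, here is a minimal sketch of the hash-lookup option. It is not SQL Server code: it uses Python's built-in sqlite3 module as a stand-in database, and the table name `pages`, the column names, and the choice of SHA-256 are all assumptions made for the example. The idea is that the index covers only a fixed-width digest, and the full URL is still compared in the WHERE clause so that a hash collision cannot return the wrong row.

```python
import hashlib
import sqlite3


def url_hash(url: str) -> bytes:
    """Return a fixed-width 32-byte SHA-256 digest of the URL."""
    return hashlib.sha256(url.encode("utf-8")).digest()


# In-memory SQLite database used only as a stand-in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (url_hash BLOB NOT NULL, url TEXT NOT NULL, content TEXT)"
)
# The index covers only the 32-byte digest, never the wide URL column.
conn.execute("CREATE INDEX ix_pages_url_hash ON pages (url_hash)")


def add_page(url: str, content: str) -> None:
    """Store the digest next to the full, unindexed URL."""
    conn.execute(
        "INSERT INTO pages (url_hash, url, content) VALUES (?, ?, ?)",
        (url_hash(url), url, content),
    )


def find_content(url: str):
    """Seek on the digest, then compare the full URL to rule out collisions."""
    row = conn.execute(
        "SELECT content FROM pages WHERE url_hash = ? AND url = ?",
        (url_hash(url), url),
    ).fetchone()
    return row[0] if row else None


# A URL far longer than any 900-byte index key could hold.
long_url = "http://example.com/search?q=" + "x" * 3000
add_page(long_url, "long page")
add_page("http://example.com/", "home page")
```

With this layout the URL length no longer matters to the index, at the cost of storing and computing one extra column per row.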

May 25 '07 #1
I know this doesn't sound very unique, but consider storing the first 900
bytes of the URL in the indexed column and the full URL in another, wider
column that isn't indexed.
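
A rough sketch of that prefix approach, again using Python's sqlite3 module as a stand-in rather than SQL Server. The names `url_prefix`, `full_url`, and `PREFIX_CHARS` are hypothetical, and the sketch assumes URLs contain only characters that occupy one UTF-16 code unit each, so 450 characters correspond to 900 bytes. Because many URLs can share the same first 450 characters, the lookup also compares the full URL.

```python
import sqlite3

# Assumption: 450 nvarchar characters = 900 bytes, the index key limit.
PREFIX_CHARS = 450

# In-memory SQLite database used only as a stand-in for the real server.
prefix_conn = sqlite3.connect(":memory:")
prefix_conn.execute(
    "CREATE TABLE pages (url_prefix TEXT NOT NULL, full_url TEXT NOT NULL, content TEXT)"
)
# Only the truncated prefix is indexed; full_url stays unindexed.
prefix_conn.execute("CREATE INDEX ix_pages_url_prefix ON pages (url_prefix)")


def add_page_by_prefix(url: str, content: str) -> None:
    """Store the truncated prefix for indexing plus the complete URL."""
    prefix_conn.execute(
        "INSERT INTO pages (url_prefix, full_url, content) VALUES (?, ?, ?)",
        (url[:PREFIX_CHARS], url, content),
    )


def find_by_prefix(url: str):
    """Seek on the prefix, then match the full URL among rows sharing it."""
    row = prefix_conn.execute(
        "SELECT content FROM pages WHERE url_prefix = ? AND full_url = ?",
        (url[:PREFIX_CHARS], url),
    ).fetchone()
    return row[0] if row else None


# Two URLs that are identical for well over 450 characters.
base = "http://example.com/" + "a" * 500
add_page_by_prefix(base + "/one", "first")
add_page_by_prefix(base + "/two", "second")
```

The prefix index narrows the search to the rows that share a prefix, and the full-URL comparison picks the exact row among them.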
Peter

--
Site: http://www.eggheadcafe.com
UnBlog: http://petesbloggerama.blogspot.com
Short urls & more: http://ittyurl.net
May 25 '07 #2
Hi Kevin,

The HTTP specification places no requirement or limit on URL length. In
general, it is the web browser that restricts the length of a URL, e.g.

#Maximum URL length is 2,083 characters in Internet Explorer
http://support.microsoft.com/kb/208427

For your scenario, I also think Peter's suggestion of indexing the first
900 bytes (450 nvarchar characters) of the URL is reasonable, since indexing
should be performed on a meaningful URL value rather than on the hashed value.
A hashed value is good for comparing or identifying. Also, if you need to use
the URL later (for navigation), you have to store the complete URL in another
unindexed column, since a hashed value is not reversible.
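
To make the size arithmetic concrete, here is a small sketch in Python. It assumes nvarchar data is stored as UTF-16, two bytes per code unit, and the helper names are made up for the example. It shows that a 450-character URL exactly fills the 900-byte key, while a URL at the Internet Explorer maximum of 2,083 characters needs 4,166 bytes and cannot be indexed whole.

```python
INDEX_KEY_LIMIT_BYTES = 900   # SQL Server index key limit cited in the thread
IE_MAX_URL_CHARS = 2083       # Internet Explorer limit cited above


def nvarchar_bytes(text: str) -> int:
    """Bytes the text occupies as UTF-16, the encoding assumed for nvarchar."""
    return len(text.encode("utf-16-le"))


def fits_in_index_key(url: str) -> bool:
    """True if the whole URL fits inside a 900-byte index key."""
    return nvarchar_bytes(url) <= INDEX_KEY_LIMIT_BYTES


# Pad a host name out to exact character counts for the two cases.
host = "http://example.com/"
url_450 = host + "a" * (450 - len(host))
url_ie_max = host + "a" * (IE_MAX_URL_CHARS - len(host))

print(nvarchar_bytes(url_450), fits_in_index_key(url_450))
print(nvarchar_bytes(url_ie_max), fits_in_index_key(url_ie_max))
```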

Sincerely,

Steven Cheng

Microsoft MSDN Online Support Lead

This posting is provided "AS IS" with no warranties, and confers no rights.


May 28 '07 #3
Hi Kevin,

Do you have any further thoughts on this? If you have any other questions,
please feel free to post here.

Sincerely,

Steven Cheng

Microsoft MSDN Online Support Lead

May 30 '07 #4
