473,839 Members | 1,631 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

hash from long url to short url

joe
Hello anyone knows how to write a funtion to genereate a tiny url with
letters and numbers only. Something almost always unique. THanks.
Feb 22 '08 #1
11 4408

"joe" <jc*****@gmail. comwrote in message
news:af******** *************** ***********@m23 g2000hsc.google groups.com...
Hello anyone knows how to write a funtion to genereate a tiny url with
letters and numbers only. Something almost always unique. THanks.
It's inherently impossible to collapse 36^N unique URLs to 36^(N/4) unique
tiny urls.
However there's a hash function on my websites. It's in one of the free
chapters under Basic Algorithms. I recommend that you split the string into
2, and generate 2 unsigned longs. The chance of a collision is so low as to
be negligible. Then use the modulus operation to reduce to alpahanumeric.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Feb 22 '08 #2
Malcolm McLean wrote:
>
"joe" <jc*****@gmail. comwrote in message
news:af******** *************** ***********@m23 g2000hsc.google groups.com...
>Hello anyone knows how to write a funtion to genereate a tiny url
with letters and numbers only. Something almost always unique.
THanks.
It's inherently impossible to collapse 36^N unique URLs to 36^(N/4)
unique tiny urls.
However there's a hash function on my websites. It's in one of the
free chapters under Basic Algorithms. I recommend that you split the
string into 2, and generate 2 unsigned longs. The chance of a
collision is so low as to be negligible. Then use the modulus
operation to reduce to alpahanumeric.
Also a Google search seems to show up a lot of TinyURL generators though
not, AFAICS, in C. Maybe the OP can study the code and rewrite it in C.

Feb 22 '08 #3
"Malcolm McLean" <re*******@btin ternet.comwrite s:
"joe" <jc*****@gmail. comwrote in message
news:af******** *************** ***********@m23 g2000hsc.google groups.com...
>Hello anyone knows how to write a funtion to genereate a tiny url with
letters and numbers only. Something almost always unique. THanks.
It's inherently impossible to collapse 36^N unique URLs to 36^(N/4)
unique tiny urls.
However there's a hash function on my websites. It's in one of the
free chapters under Basic Algorithms. I recommend that you split the
string into 2, and generate 2 unsigned longs. The chance of a
collision is so low as to be negligible.
I have a feeling this is a bad idea[1]. The Bernstein hash function
(which is the one on your) site uses unsigned long but will work just
as well with any unsigned integer type. If the OP has access to
longer integers, it seems safer to simply use a longer integer that
than generate two hashes from two parts of the string.

[1] I have no formal argument in support of this, just the feeling
that, since URLs often have similar parts you are wasting the hash
function's mixing ability if you split the string. Anyway, even if
there is no reason to worry here, why take the risk -- unless, of
course, you don't have longer integer types.

--
Ben.
Feb 22 '08 #4

"Ben Bacarisse" <be********@bsb .me.ukwrote in message
I have a feeling this is a bad idea[1]. The Bernstein hash function
(which is the one on your) site uses unsigned long but will work just
as well with any unsigned integer type. If the OP has access to
longer integers, it seems safer to simply use a longer integer that
than generate two hashes from two parts of the string.

[1] I have no formal argument in support of this, just the feeling
that, since URLs often have similar parts you are wasting the hash
function's mixing ability if you split the string. Anyway, even if
there is no reason to worry here, why take the risk -- unless, of
course, you don't have longer integer types.
You might well be right. If two URLs share the same introduction, which is
extremely plausible, then effectively you are wasting a long.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Feb 22 '08 #5
joe wrote:
>
Hello anyone knows how to write a funtion to genereate a tiny url with
letters and numbers only. Something almost always unique. THanks.
Yes. No. You're welcome.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home .att.net>
Try the download section.

--
Posted via a free Usenet account from http://www.teranews.com

Feb 22 '08 #6
joe
May be i could get the char value of each letter multiply it by the
position in the string and come up with a number. Just an idea.
On Feb 22, 2:31 pm, CBFalconer <cbfalco...@yah oo.comwrote:
joe wrote:
Hello anyone knows how to write a funtion to genereate a tiny url with
letters and numbers only. Something almost always unique. THanks.

Yes. No. You're welcome.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home .att.net>
Try the download section.

--
Posted via a free Usenet account fromhttp://www.teranews.co m
Feb 22 '08 #7
Malcolm McLean wrote, On 22/02/08 15:23:
>
"Ben Bacarisse" <be********@bsb .me.ukwrote in message
>I have a feeling this is a bad idea[1]. The Bernstein hash function
(which is the one on your) site uses unsigned long but will work just
as well with any unsigned integer type. If the OP has access to
longer integers, it seems safer to simply use a longer integer that
than generate two hashes from two parts of the string.

[1] I have no formal argument in support of this, just the feeling
that, since URLs often have similar parts you are wasting the hash
function's mixing ability if you split the string. Anyway, even if
there is no reason to worry here, why take the risk -- unless, of
course, you don't have longer integer types.
You might well be right. If two URLs share the same introduction, which
is extremely plausible, then effectively you are wasting a long.
There is also a significant possibility of two URLs sharing a
significant tail. E.g.
http://www.somesite/area/somewhere?p...t=prettyformat
--
Flash Gordon
Feb 22 '08 #8
joe wrote:
May be i could get the char value of each letter multiply it by the
position in the string and come up with a number. Just an idea.
Please don't top-post. Your replies belong following or interspersed
with properly trimmed quotes. See the majority of other posts in the
newsgroup, or:
<http://www.caliburn.nl/topposting.html >
Feb 22 '08 #9
joe <jc*****@gmail. comwrites:
Hello anyone knows how to write a funtion to genereate a tiny url with
letters and numbers only. Something almost always unique. THanks.
May be i could get the char value of each letter multiply it by the
position in the string and come up with a number. Just an idea.
[Top posting corrected]. That is one way, but there has been a lot of
study of how to turn a string into a number and your method has no
particular advantage over others that have been found to be useful.
The advice you've had to use a known hash function is good advice.

--
Ben.
Feb 23 '08 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
14193
by: Damien Morton | last post by:
Ive been doing some investigation of python hashtables and the hash function used in Python. It turns out, that when running pystone.py, as much time is spent in string_hash() as is spent in lookdict_string(). If you look at python's hash function, shown below, you can get an idea why - a multiply is used for every character in the string. static long string_hash1(register char *p, int size)
34
14494
by: pembed2003 | last post by:
Hi All, Does C++/STL have hashtable where I can do stuff like: Hashtable h<int>; h.store("one",1); h.store("two",2); and then later retrieve them like:
2
3799
by: Bryan Olson | last post by:
The current Python standard library provides two cryptographic hash functions: MD5 and SHA-1 . The authors of MD5 originally stated: It is conjectured that it is computationally infeasible to produce two messages having the same message digest. That conjecture is false, as demonstrated by Wang, Feng, Lai and Yu in 2004 . Just recently, Wang, Yu, and Lin showed a short- cut solution for finding collisions in SHA-1 . Their result
6
8262
by: barcaroller | last post by:
I'm looking for a hash function (in C) that will convert a string of arbitrary length (but less than 1024 chars) to a reasonably-unique 16-bit short integer. Can anyone point me to such a hash function?
21
3235
by: Johan Tibell | last post by:
I would be grateful if someone had a minute or two to review my hash table implementation. It's not yet commented but hopefully it's short and idiomatic enough to be readable. Some of the code (i.e. the get_hash function) is borrowed from various snippets I found on the net. Thee free function could probably need some love. I have been thinking about having a second linked list of all entries so that the cost of freeing is in proportion to...
21
3920
by: Hallvard B Furuseth | last post by:
Is the code below valid? Generally a value must be accessed through the same type it was stored as, but there is an exception for data stored through a character type. I'm not sure if that applies in this case though: #include <limits.h> unsigned foo(void) { static const union { unsigned char str;
11
2713
by: merrittr | last post by:
Hi in the code below in the main the hash table from K&R example replaces a node when a collision occurs What I am trying to do is set it up to "chain" another structure beside the node that it collides with can you spot what I am doing wrong? zork and 122 are to 2 that collide so the "zork" node gets overwritten by 121.
1
5519
by: TheCite | last post by:
I am trying to make a function to hash passwords with. Here is the code: Option Compare Database Option Explicit 'function declarations Private Declare Function CryptAcquireContext Lib "advapi32.dll" Alias "CryptAcquireContextA" (ByRef phProv As Long, ByVal pszContainer As Any, ByVal pszProvider As Any, ByVal dwProvType As Long, ByVal dwFlags As Long) As Long
2
5836
by: joe shoemaker | last post by:
I would like to convert url into md5 hash. My question is that md5 hash will create collision at 2^64. If you do long(value,16), where value is the md5 hash string, would value returned from long(value, 16) be unique as long as md5 hashed string is unique? when you move md5 hashed string to long, where will the collision occur, at anything hash = md5.new() hash.update("some_url_") value = hash.digest() value_in_int = long(value, 16)...
0
9856
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10589
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10654
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10297
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
7833
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
7021
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5867
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
4066
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
3136
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.