473,378 Members | 1,314 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,378 software developers and data experts.

hash()

Hi,

For strings of > 1 character, what are the chances
that hash(st) and hash(st[::-1]) would return the
same value?

My goal is to uniquely identify multicharacter strings,
all of which begin with "/" and never end with "/".
Therefore, st != st[::-1].

Thanks,
John
Dec 5 '05 #1
5 1399
John Marshall wrote:
For strings of > 1 character, what are the chances
that hash(st) and hash(st[::-1]) would return the
same value?


Why not grab a dictionary and do the stats yourself?

--Scott David Daniels
sc***********@acm.org
Dec 5 '05 #2
Scott David Daniels wrote:
John Marshall wrote:
For strings of > 1 character, what are the chances
that hash(st) and hash(st[::-1]) would return the
same value?

Why not grab a dictionary and do the stats yourself?


I was actually interested in the mathematical/probability
side rather than the empirical w/r to the current
hash function in python. Although I imagine I could do
a brute force test for x-character strings.

John
Dec 5 '05 #3
John Marshall wrote:
I was actually interested in the mathematical/probability
side rather than the empirical w/r to the current
hash function in python. Although I imagine I could do
a brute force test for x-character strings.


Hah. No.

At least on the version I have handy (Py 2.2.3 on Itanium2), hash
returns a 64-bit value. Brute-forcing that in any reasonable length of
time is rather impossible.
Dec 5 '05 #4
John Marshall wrote:
Scott David Daniels wrote:
... Why not grab a dictionary and do the stats yourself?

I was actually interested in the mathematical/probability
side rather than the empirical w/r to the current
hash function in python.

Well, the probability depends on the universe you are choosing from.
That was why I was suggesting a dictionary: words may well have a
different distribution than arbitrary strings.

--Scott David Daniels
sc***********@acm.org
Dec 5 '05 #5
[John Marshall]
For strings of > 1 character, what are the chances
that hash(st) and hash(st[::-1]) would return the
same value?
Python's string hash algorithm is non-commutative, so a collision with
a reversed string is not likely. The exact answer depends on the
population of strings being hashed, but it's not hard to compute
collision statistics for a sampling of those strings:

collisions = len(sample) - len(set(hash(s) for s in sample))
FWIW, here is how Python computes string hash values:

static long
string_hash(PyStringObject *a)
{
register int len;
register unsigned char *p;
register long x;

len = a->ob_size;
p = (unsigned char *) a->ob_sval;
x = *p << 7;
while (--len >= 0)
x = (1000003*x) ^ *p++;
x ^= a->ob_size;
if (x == -1)
x = -2;
return x;
}
My goal is to uniquely identify multicharacter strings,
all of which begin with "/" and never end with "/".
Therefore, st != st[::-1].


Just use a set -- no string reversal is needed for detection of unique
multicharacter strings..
Raymond

Dec 5 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
by: Murali | last post by:
I have a requirement where I have to use two unsigned ints as a key in a STL hash map. A couple of ways to do this is 1. create a struct with two unsigned ints and use that as key (write my own...
2
by: Bryan Olson | last post by:
The current Python standard library provides two cryptographic hash functions: MD5 and SHA-1 . The authors of MD5 originally stated: It is conjectured that it is computationally infeasible to...
24
by: kdotsky | last post by:
Hello, I am using some very large dictionaries with keys that are long strings (urls). For a large dictionary these keys start to take up a significant amount of memory. I do not need access to...
12
by: Arash Partow | last post by:
Hi all, I've ported various hash functions to python if anyone is interested: def RSHash(key): a = 378551 b = 63689 hash = 0
21
by: Johan Tibell | last post by:
I would be grateful if someone had a minute or two to review my hash table implementation. It's not yet commented but hopefully it's short and idiomatic enough to be readable. Some of the code...
21
by: Hallvard B Furuseth | last post by:
Is the code below valid? Generally a value must be accessed through the same type it was stored as, but there is an exception for data stored through a character type. I'm not sure if that...
139
by: ravi | last post by:
Hi can anybody tell me that which ds will be best suited to implement a hash table in C/C++ thanx. in advanced
18
by: beginner | last post by:
Hi All. I'd like to do the following in more succint code: if k in b: a=b else: a={} b=a
5
by: Jeff | last post by:
Lets say we have what I would call a "hash": var HASH =new Array(); HASH='first'; HASH='second'; HASH='third'; I'd like to return that as JSON data. Is there a direct way to do that?
1
by: sixtyfootersdude | last post by:
Good Morning! I am a perl newbie and I think that I am struggling with references. I have an array of references to hashes which I am trying to print. This is what I have: for(my...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.