By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
440,676 Members | 2,249 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 440,676 IT Pros & Developers. It's quick & easy.

Generating unique row ID ints.

P: n/a
I'm building a web application using sqlalchemy in my db layer.

Some of the tables require single integer primary keys which might be
exposed in some parts of the web interface. If users can guess the next
key in a sequence, it might be possible for them to 'game' or
manipulate the system in unexpected ways. I want to avoid this by
generating a random key for each row ID, and have decided to use the
same approach for all my single key tables.

Are there any best practices for implementing this?

If the random module is suitable, does anyone have any good ideas on
how this could be implemented?

Some questions which came to mind are:
Would I need to save and restore the random module state when
generating id's for each table?
What would be an appropriate seed?
How many random integers can I generate before a repeat becomes
probable?

I've got my own ideas for implementing this, but am interested to see
how/if anyone else has tackled the same problem.
-Sw.

Oct 1 '06 #1
Share this Question
Share on Google+
2 Replies


P: n/a
"Simon Wittber" <si**********@gmail.comwrites:
Some of the tables require single integer primary keys which might be
exposed in some parts of the web interface. If users can guess the next
key in a sequence, it might be possible for them to 'game' or
manipulate the system in unexpected ways. I want to avoid this by
generating a random key for each row ID, and have decided to use the
same approach for all my single key tables.
Normally primary keys are sequential but only live inside the system.
Users are not supposed to enter them.
If the random module is suitable, does anyone have any good ideas on
how this could be implemented?
The random module does not aim to be secure against knowledgeable
attackers trying to guess the output (i.e. it's not cryptographic
randomness). Use os.urandom instead.
I've got my own ideas for implementing this, but am interested to see
how/if anyone else has tackled the same problem.
The simplest thing to do is generate random strings, e.g.

key = os.urandom(16)

for a 16-byte binary string. You can of course encode it as printing
characters with your favorite binascii function. 16-byte strings like
that should be unguessable and collision-free until you have an
enormous number of them (on the order of 2**64).
Oct 1 '06 #2

P: n/a
The primary key is usually there to uniquely identify each row. There
are a couple of ways to generate them:

One is to always create a new integer id (sequentially or random)
when you insert a new row. This means that if you insert the same data
again, the row will always be duplicated. Usually you don't want this.
Then the primary key integer must somehow 'represent' the whole data
row. So you would want for the key to be a function of the whole data
row, such that when the data is the same the row ID is the same and
when the data is different the row ID is different. The answer to your
problem is to use a message digest (actually a message authentication
code function).

For example if your data row is in the tuple 'row' then you can do:
hex_digest_key=md5.new("|".join(row)).hexdigest(), you would have to
import the md5 module before doing this.

What that will do is your row will be contcatenated into one string
with each field separated by "|" then the md5 hash of that string will
be taken and the result returned to you in hexadecimal form. You'll get
back something like hex_digest_key='5d41402abc4b2a76b9719d911017c592'.
Then you can turn that into an integer by doing int(hex_digest_key,16)
-- and you can use that integer as your primary key.

But since one of the problems you want to solve is the user's ability
to predict the next key, you cannot just use a simple message digest
function. If the user finally figures out that you are running an MD5
algorithm, the user can also run the same algorithm and generate the
same message digest -- If that is a problem, then use a MAC (Message
Authentication Code) function. It works almost like a message digest
except you concatenate a secret key to the input so MD5 is run on the
row_as_text+secret_key.

Unless the user knows your secret key, they could not generate a
primary key from a given row even if they know you used MD5 and even if
the know the data content of your row.

NOTE: When using a message digest (and friends) it is important to
realize that there will be some collision between the keys if the
number of all possible digests (as limited by the digest algoritm) is
smaller than the number of the possible messages. In practice if you
have large enough integers (64) you shouldn't see any collisions
occur, but it is still good to be aware of them...

Hope this helps,
-Nick Vatamaniuc

Simon Wittber wrote:
I'm building a web application using sqlalchemy in my db layer.

Some of the tables require single integer primary keys which might be
exposed in some parts of the web interface. If users can guess the next
key in a sequence, it might be possible for them to 'game' or
manipulate the system in unexpected ways. I want to avoid this by
generating a random key for each row ID, and have decided to use the
same approach for all my single key tables.

Are there any best practices for implementing this?

If the random module is suitable, does anyone have any good ideas on
how this could be implemented?

Some questions which came to mind are:
Would I need to save and restore the random module state when
generating id's for each table?
What would be an appropriate seed?
How many random integers can I generate before a repeat becomes
probable?

I've got my own ideas for implementing this, but am interested to see
how/if anyone else has tackled the same problem.
-Sw.
Oct 1 '06 #3

This discussion thread is closed

Replies have been disabled for this discussion.