472,358 Members | 1,708 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,358 software developers and data experts.

Generating unique row ID ints.

I'm building a web application using sqlalchemy in my db layer.

Some of the tables require single integer primary keys which might be
exposed in some parts of the web interface. If users can guess the next
key in a sequence, it might be possible for them to 'game' or
manipulate the system in unexpected ways. I want to avoid this by
generating a random key for each row ID, and have decided to use the
same approach for all my single key tables.

Are there any best practices for implementing this?

If the random module is suitable, does anyone have any good ideas on
how this could be implemented?

Some questions which came to mind are:
Would I need to save and restore the random module state when
generating id's for each table?
What would be an appropriate seed?
How many random integers can I generate before a repeat becomes
probable?

I've got my own ideas for implementing this, but am interested to see
how/if anyone else has tackled the same problem.
-Sw.

Oct 1 '06 #1
2 2682
"Simon Wittber" <si**********@gmail.comwrites:
Some of the tables require single integer primary keys which might be
exposed in some parts of the web interface. If users can guess the next
key in a sequence, it might be possible for them to 'game' or
manipulate the system in unexpected ways. I want to avoid this by
generating a random key for each row ID, and have decided to use the
same approach for all my single key tables.
Normally primary keys are sequential but only live inside the system.
Users are not supposed to enter them.
If the random module is suitable, does anyone have any good ideas on
how this could be implemented?
The random module does not aim to be secure against knowledgeable
attackers trying to guess the output (i.e. it's not cryptographic
randomness). Use os.urandom instead.
I've got my own ideas for implementing this, but am interested to see
how/if anyone else has tackled the same problem.
The simplest thing to do is generate random strings, e.g.

key = os.urandom(16)

for a 16-byte binary string. You can of course encode it as printing
characters with your favorite binascii function. 16-byte strings like
that should be unguessable and collision-free until you have an
enormous number of them (on the order of 2**64).
Oct 1 '06 #2
The primary key is usually there to uniquely identify each row. There
are a couple of ways to generate them:

One is to always create a new integer id (sequentially or random)
when you insert a new row. This means that if you insert the same data
again, the row will always be duplicated. Usually you don't want this.
Then the primary key integer must somehow 'represent' the whole data
row. So you would want for the key to be a function of the whole data
row, such that when the data is the same the row ID is the same and
when the data is different the row ID is different. The answer to your
problem is to use a message digest (actually a message authentication
code function).

For example if your data row is in the tuple 'row' then you can do:
hex_digest_key=md5.new("|".join(row)).hexdigest(), you would have to
import the md5 module before doing this.

What that will do is your row will be contcatenated into one string
with each field separated by "|" then the md5 hash of that string will
be taken and the result returned to you in hexadecimal form. You'll get
back something like hex_digest_key='5d41402abc4b2a76b9719d911017c592'.
Then you can turn that into an integer by doing int(hex_digest_key,16)
-- and you can use that integer as your primary key.

But since one of the problems you want to solve is the user's ability
to predict the next key, you cannot just use a simple message digest
function. If the user finally figures out that you are running an MD5
algorithm, the user can also run the same algorithm and generate the
same message digest -- If that is a problem, then use a MAC (Message
Authentication Code) function. It works almost like a message digest
except you concatenate a secret key to the input so MD5 is run on the
row_as_text+secret_key.

Unless the user knows your secret key, they could not generate a
primary key from a given row even if they know you used MD5 and even if
the know the data content of your row.

NOTE: When using a message digest (and friends) it is important to
realize that there will be some collision between the keys if the
number of all possible digests (as limited by the digest algoritm) is
smaller than the number of the possible messages. In practice if you
have large enough integers (64) you shouldn't see any collisions
occur, but it is still good to be aware of them...

Hope this helps,
-Nick Vatamaniuc

Simon Wittber wrote:
I'm building a web application using sqlalchemy in my db layer.

Some of the tables require single integer primary keys which might be
exposed in some parts of the web interface. If users can guess the next
key in a sequence, it might be possible for them to 'game' or
manipulate the system in unexpected ways. I want to avoid this by
generating a random key for each row ID, and have decided to use the
same approach for all my single key tables.

Are there any best practices for implementing this?

If the random module is suitable, does anyone have any good ideas on
how this could be implemented?

Some questions which came to mind are:
Would I need to save and restore the random module state when
generating id's for each table?
What would be an appropriate seed?
How many random integers can I generate before a repeat becomes
probable?

I've got my own ideas for implementing this, but am interested to see
how/if anyone else has tackled the same problem.
-Sw.
Oct 1 '06 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Justin Lebar | last post by:
Sorry about the huge post, but I think this is the amount of information necessary for someone to help me with a good answer. I'm writing a statistical analysis program in ASP.net and MSSQL7 that...
7
by: eric.gagnon | last post by:
In a program randomly generating 10 000 000 alphanumeric codes of 16 characters in length (Ex.: "ZAZAZAZAZAZAZ156"), what would be an efficient way to ensure that I do not generate duplicates? ...
4
by: Mark | last post by:
BEGINNER QUESTION I have a table which has a compound primary key consisting of two columns. One of these columns is a foreign key which is generated in another table by an identity. I want...
6
by: Poul Møller Hansen | last post by:
I have made a stored procedure, containing this part for generating a unique reference number. SET i = 0; REPEAT SET i = i + 1; SELECT RAND() INTO reference FROM SYSIBM.SYSDUMMY1; SET...
3
by: Kilroy Programmer | last post by:
Is there a way to store a unique numeric identifier (say, for example, an int) into a TreeNode, so that when the TreeNode is checked (since CheckBoxes is enabled) the eventhandler AfterCheck() can...
16
by: Leon | last post by:
I need a program that generate 5 non-duplicates random number between 1-10 as string values store in an array. Do anybody know of any good books or websites that explain how to generator random...
11
by: garyhoran | last post by:
Hi Guys, I have a collection that contains various attributes (stuff like strings, DateTime and Timespan) . I would like to access the collection in various orders at different points in the...
8
by: DaTurk | last post by:
Hi, I was just curious how you would go about creating a unique identifier with 3 ints.
5
by: Claire | last post by:
Hi, I can imagine this question has been brought up before but I've spent all morning trying to find what I need on google without success. My application sits on Mysql or MS sql server engines...
2
by: Kemmylinns12 | last post by:
Blockchain technology has emerged as a transformative force in the business world, offering unprecedented opportunities for innovation and efficiency. While initially associated with cryptocurrencies...
0
by: Naresh1 | last post by:
What is WebLogic Admin Training? WebLogic Admin Training is a specialized program designed to equip individuals with the skills and knowledge required to effectively administer and manage Oracle...
0
by: antdb | last post by:
Ⅰ. Advantage of AntDB: hyper-convergence + streaming processing engine In the overall architecture, a new "hyper-convergence" concept was proposed, which integrated multiple engines and...
2
by: Matthew3360 | last post by:
Hi, I have a python app that i want to be able to get variables from a php page on my webserver. My python app is on my computer. How would I make it so the python app could use a http request to get...
1
by: Matthew3360 | last post by:
Hi, I have been trying to connect to a local host using php curl. But I am finding it hard to do this. I am doing the curl get request from my web server and have made sure to enable curl. I get a...
0
by: Carina712 | last post by:
Setting background colors for Excel documents can help to improve the visual appeal of the document and make it easier to read and understand. Background colors can be used to highlight important...
0
BLUEPANDA
by: BLUEPANDA | last post by:
At BluePanda Dev, we're passionate about building high-quality software and sharing our knowledge with the community. That's why we've created a SaaS starter kit that's not only easy to use but also...
0
by: Rahul1995seven | last post by:
Introduction: In the realm of programming languages, Python has emerged as a powerhouse. With its simplicity, versatility, and robustness, Python has gained popularity among beginners and experts...
0
by: Ricardo de Mila | last post by:
Dear people, good afternoon... I have a form in msAccess with lots of controls and a specific routine must be triggered if the mouse_down event happens in any control. Than I need to discover what...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.