Drawbacks of using BYTEA for PK?

Are there any drawbacks of using BYTEA for PK compared to using a
primitive/atomic data type like INT/SERIAL (such as a significant
performance hit, peculiar FK behaviour, etc.)?

I plan to use BYTEA for GUIDs (temporarily, I hope, until
PostgreSQL officially supports a GUID data type), since it seems to be the
most convenient and compact of the data types currently
available. I use GUIDs for most PK columns.

--
dave

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 12 '05 #1
David Garamond wrote:
> Are there any drawbacks of using BYTEA for PK compared to using a
> primitive/atomic data type like INT/SERIAL (such as a significant
> performance hit, peculiar FK behaviour, etc.)?
>
> I plan to use BYTEA for GUIDs (temporarily, I hope, until
> PostgreSQL officially supports a GUID data type), since it seems to be
> the most convenient and compact of the data types currently
> available. I use GUIDs for most PK columns.

GUID? Isn't that really nothing more than an MD5 on a sequence?

SELECT MD5(NEXTVAL('my_table_seq')::text) AS my_guid;

Since 7.4 has the md5 function built in, there's your support ;-)
Now just add that to your table's trigger and you're good to go.
I think in MS products, they format the GUID with dashes in the
style 8-4-4-4-12, but it still looks to me like a 32-character hex
string or a 16-byte (128-bit) value. You can choose to store the
value however you like; I'm not sure what would be optimal, but
bits are bits, right?

Dante

Nov 12 '05 #2
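
Post #2 treats the same 128-bit value as a 32-character hex string, a 16-byte value, or a dashed 8-4-4-4-12 string. A minimal sketch of those three forms, written in Python rather than SQL; the helper name md5_of_sequence is hypothetical, and the input 1 simply stands in for a sequence value:

```python
import hashlib
import uuid


def md5_of_sequence(seq_value: int) -> str:
    """Hash the decimal text of a number, as MD5(NEXTVAL(...)::text) would."""
    return hashlib.md5(str(seq_value).encode("ascii")).hexdigest()


hex_form = md5_of_sequence(1)                # 32 hex characters
raw_form = bytes.fromhex(hex_form)           # 16 bytes, what a BYTEA column would hold
dashed_form = str(uuid.UUID(hex=hex_form))   # 8-4-4-4-12 layout, 36 characters

print(len(hex_form), len(raw_form), len(dashed_form))  # 32 16 36
```

The raw form is the smallest of the three, which is the "compact" argument for BYTEA in the original question. Note that uuid.UUID here only reformats the digits; it does not make the value a standards-conforming UUID.
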
On Sunday 11 January 2004 22:05, D. Dante Lorenso wrote:
> David Garamond wrote:
>> Are there any drawbacks of using BYTEA for PK compared to using a
>> primitive/atomic data type like INT/SERIAL (such as a significant
>> performance hit, peculiar FK behaviour, etc.)?
>>
>> I plan to use BYTEA for GUIDs (temporarily, I hope, until
>> PostgreSQL officially supports a GUID data type), since it seems to be
>> the most convenient and compact of the data types currently
>> available. I use GUIDs for most PK columns.
>
> GUID? Isn't that really nothing more than an MD5 on a sequence?
>
> SELECT MD5(NEXTVAL('my_table_seq')::text) AS my_guid;


I think the point of a GUID is it's supposed to be unique across any number of
machines without requiring those machines to coordinate their use of GUID
values.

I think the typical approach is to use something like:
  hash_fn( network_mac_address || other_hopefully_unique_constant ||
           sequence_val )
and make sure that the probability of getting collisions is acceptably low.

ISTR a long discussion a year or two back on one of the lists, for those that
are interested.
--
Richard Huxton
Archonet Ltd

Nov 12 '05 #3
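
The hash_fn( mac || constant || sequence_val ) recipe from post #3 can be sketched as follows. This is only an illustration under stated assumptions: MD5 is picked as the hash function because the thread discusses it, SITE_CONSTANT is a made-up value, and itertools.count stands in for a database sequence.

```python
import hashlib
import itertools
import uuid

# 48-bit hardware address; uuid.getnode() may fall back to a random
# 48-bit number if no address can be found.
NODE_ID = uuid.getnode()
SITE_CONSTANT = b"example-site"   # hypothetical "other hopefully unique constant"
_sequence = itertools.count(1)    # stand-in for a database sequence


def next_guid() -> bytes:
    """Return 16 bytes derived from node ID, constant, and next sequence value."""
    seq_val = next(_sequence)
    material = b"|".join([
        NODE_ID.to_bytes(6, "big"),
        SITE_CONSTANT,
        str(seq_val).encode("ascii"),
    ])
    return hashlib.md5(material).digest()


first = next_guid()
second = next_guid()
```

Any uniqueness across machines comes from the node ID and the constant; the hash only packs the inputs into a fixed 16 bytes.
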
D. Dante Lorenso wrote:
> GUID? Isn't that really nothing more than an MD5 on a sequence?
>
> SELECT MD5(NEXTVAL('my_table_seq')::text) AS my_guid;

I know there are several algorithms to generate GUIDs, but this is
certainly inadequate :-) You need to make sure that the generated GUID
will be unique throughout cyberspace (or, to be more precise, that the GUID
has a very, very small chance of colliding with other people's
GUIDs). Even OID is not a good seed at all.

Perhaps I can make a GUID by MD5( two random numbers || a timestamp || a
unique seed like MD5 of '/sbin/ifconfig' output)...

> Since 7.4 has the md5 function built in, there's your support ;-)

Well, until there's a GUID or INT128 or BIGBIGINT builtin type I doubt
many people will regard PostgreSQL as fully supporting GUID. I believe
there's the pguuid project on the GBorg site that does something like this.

--
dave

Nov 12 '05 #4
David Garamond <li***@zara.6.isreserved.com> writes:
> Perhaps I can make a GUID by MD5( two random numbers || a timestamp || a
> unique seed like MD5 of '/sbin/ifconfig' output)...


Adding an MD5 hash contributes *absolutely zero*, except waste of space,
to any attempt to make a GUID. The hash will add no uniqueness that was
not there before.

regards, tom lane

Nov 12 '05 #5
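
The claim in post #5 that a hash adds no uniqueness can be checked with a small experiment. A sketch, assuming two hypothetical machines that each hash their own sequence starting at 1:

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 is deterministic: equal inputs always give equal outputs."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Two uncoordinated machines, each hashing its own sequence 1..1000.
machine_a = [md5_hex(str(n)) for n in range(1, 1001)]
machine_b = [md5_hex(str(n)) for n in range(1, 1001)]

# Every "GUID" produced on one machine also appears on the other.
collisions = set(machine_a) & set(machine_b)

# Duplicate inputs stay duplicates after hashing.
distinct_after_hash = len({md5_hex(x) for x in ["7", "7", "8"]})
```

Here collisions contains all 1000 values, and three inputs with one duplicate still yield only two distinct hashes.
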
Tom Lane wrote:
> David Garamond <li***@zara.6.isreserved.com> writes:
>> Perhaps I can make a GUID by MD5( two random numbers || a timestamp || a
>> unique seed like MD5 of '/sbin/ifconfig' output)...
>
> Adding an MD5 hash contributes *absolutely zero*, except waste of space,
> to any attempt to make a GUID. The hash will add no uniqueness that was
> not there before.

Of course, in the above case, MD5 is used to compress the "uniqueness"
(which should be more than 128 bits, comprising: a) a [good] random
number; b) a timestamp; c) a "node ID" element, either from /sbin/ifconfig
output, which contains the MAC address, or from a hash of the hard disk
content, etc.) into a 128-bit space.

--
dave

Nov 12 '05 #6
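
Post #6 describes MD5 as a way to compress more than 128 bits of "uniqueness" into 128 bits. A minimal sketch of that idea, assuming os.urandom as the random source, a nanosecond clock as the timestamp, and uuid.getnode() as the node ID; the function name make_guid is hypothetical:

```python
import hashlib
import os
import time
import uuid


def make_guid() -> bytes:
    """Fold 240 bits of input material into a 128-bit value with MD5."""
    random_part = os.urandom(16)                     # 128 random bits
    timestamp = time.time_ns().to_bytes(8, "big")    # 64-bit timestamp
    node_id = uuid.getnode().to_bytes(6, "big")      # 48-bit node ID
    material = random_part + timestamp + node_id     # 30 bytes in total
    return hashlib.md5(material).digest()            # 16 bytes out


guid = make_guid()
```

The hash contributes nothing beyond the fixed output size, which is consistent with the objection in post #5: the uniqueness comes entirely from the inputs.
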
Tom Lane wrote:
> Adding an MD5 hash contributes *absolutely zero*, except waste of space,
> to any attempt to make a GUID. The hash will add no uniqueness that was
> not there before.

The cool thing about a 'GUID' (or in my example a hashed sequence number
[sure, toss in some entropy if you want it]) is that if you happen to
reference that value as a primary key on a table, the URL that passes the
argument cannot be guessed easily. For example, using a sequence:

http://domain.com/application/load_r...tomer_id=12345

Then users of the web will assume that you have at most 12345 customers.
And they can try to look up information on other customers by doing:

http://domain.com/application/load_r...tomer_id=12346
http://domain.com/application/load_r...tomer_id=12344

...basically walking the sequence. Sure, you will protect against this
with access rights, BUT... seeing the sequence is a risk and not something
you want to happen. NOW, if you use a GUID:

http://domain.com/application/load_r...3-1b8ce9dcccc1

Right, so now try to guess the next value in this sequence. It's a little
more protective and obfuscated (an advantage in using GUIDs).

Dante


Nov 12 '05 #7
On Mon, 12 Jan 2004, D. Dante Lorenso wrote:
> Tom Lane wrote:
>> Adding an MD5 hash contributes *absolutely zero*, except waste of space,
>> to any attempt to make a GUID. The hash will add no uniqueness that was
>> not there before.
>
> The cool thing about a 'GUID' (or in my example a hashed sequence number
> [sure, toss in some entropy if you want it]) is that if you happen to
> reference that value as a primary key on a table, the URL that passes the
> argument cannot be guessed easily. For example, using a sequence:
>
> http://domain.com/application/load_r...tomer_id=12345
>
> Then users of the web will assume that you have at most 12345 customers.
> And they can try to look up information on other customers by doing:
>
> http://domain.com/application/load_r...tomer_id=12346
> http://domain.com/application/load_r...tomer_id=12344
>
> ...basically walking the sequence. Sure, you will protect against this
> with access rights, BUT... seeing the sequence is a risk and not something
> you want to happen. NOW, if you use a GUID:


Security != obscurity.

While using GUIDs may make it harder to get hacked, it in no way actually
increases security. Real security comes from secure code, period.

Nov 12 '05 #8
"scott.marlowe" <sc***********@ihs.com> writes:
>> they can try to look up information on other customers by doing:
>>
>> http://domain.com/application/load_r...tomer_id=12346
>> http://domain.com/application/load_r...tomer_id=12344
>>
>> ...basically walking the sequence. Sure, you will protect against this
>> with access rights, BUT... seeing the sequence is a risk and not something
>> you want to happen. NOW, if you use a GUID:
>
> Security != obscurity.
>
> While using GUIDs may make it harder to get hacked, it in no way actually
> increases security. Real security comes from secure code, period.


Well, uh, you're both wrong.

On the one hand, if your GUIDs are just an MD5 of a sequence then they're just
as guessable as the sequence. The attacker can try MD5 of various numbers
until he finds the one that matches his own ID (it's probably on the web site
somewhere anyway) and then run MD5 himself on whatever number he likes.
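
A minimal sketch of that attack, assuming the IDs are MD5 of the decimal text of a sequence value and that the attacker knows one ID (his own); the function names and the search limit are hypothetical:

```python
import hashlib


def md5_id(n: int) -> str:
    """The scheme under attack: ID = MD5 of the sequence value as text."""
    return hashlib.md5(str(n).encode("ascii")).hexdigest()


def recover_sequence_value(observed_id: str, limit: int = 1_000_000):
    """Try MD5 of successive numbers until one matches the observed ID."""
    for n in range(1, limit + 1):
        if md5_id(n) == observed_id:
            return n
    return None


own_id = md5_id(12345)                     # the ID the attacker can see
n = recover_sequence_value(own_id)         # recovers 12345
neighbours = [md5_id(n - 1), md5_id(n + 1)]  # IDs of adjacent customers
```

Once the underlying number is recovered, walking the hashed sequence is no harder than walking the plain one.
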

On the other hand, it is possible to do this right. Include a secret of some
kind in the MD5 hash, something that's not publicly available. That secret
is in essence the password to the scheme. Now it's not really "obscurity" any
more than any password-based scheme is "security through obscurity".

However, even that isn't ideal, since you have to be able to change the
password periodically in case it's leaked. I believe there are techniques to
solve this, though I can't think of any off the top of my head.

But if your only threat model is people attacking based on the publicly
visible information then an MD5 of the combination of a sequence and a secret
is a perfectly reasonable approach.

In the past I happily exposed the sequence but used an MD5 of the sequence and
a secret as a protection against spoofing. I find exposing the sequence is
very convenient for programming and debugging problems. Spoofing is a serious
security hazard, but worrying about leaking information like the size of the
customer database is usually a sign of people hoping for security through
obscurity.

--
greg

Nov 12 '05 #9
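
The "expose the sequence, but protect it with a hash of the sequence and a secret" approach from post #9 can be sketched as below. This is not the poster's code: HMAC-SHA256 from Python's standard library is swapped in for the plain MD5-of-sequence-plus-secret he describes, and SECRET, sign_id, and verify are hypothetical names.

```python
import hashlib
import hmac

SECRET = b"replace-with-a-private-value"   # hypothetical server-side secret


def sign_id(customer_id: int) -> str:
    """Return a keyed token for a plainly exposed sequence value."""
    message = str(customer_id).encode("ascii")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def verify(customer_id: int, token: str) -> bool:
    """Accept a request only if the token matches the ID it came with."""
    return hmac.compare_digest(sign_id(customer_id), token)


token = sign_id(12345)
```

A URL would then carry both the readable ID and the token. Changing the ID to 12346 without knowing SECRET makes verify return False, while the sequence stays convenient for debugging.
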
Greg Stark wrote:
> On the other hand, it is possible to do this right. Include a secret of some
> kind in the MD5 hash, something that's not publicly available. That secret
> is in essence the password to the scheme. Now it's not really "obscurity" any
> more than any password-based scheme is "security through obscurity".
>
> However, even that isn't ideal, since you have to be able to change the
> password periodically in case it's leaked. I believe there are techniques to
> solve this, though I can't think of any off the top of my head.
>
> But if your only threat model is people attacking based on the publicly
> visible information then an MD5 of the combination of a sequence and a secret
> is a perfectly reasonable approach.

We were originally talking about using MD5 as a means to generate a unique
ID, right (and not to store a password hash to be checked against later)?

Then this "secret key" is unnecessary. Just get some truly random bits.
If the number of bits is 128, you can use them as they are. If the
number of bits is > 128, you can hash them using MD5 to get 128 bits. If
the number of bits is < 128, you're "screwed" anyway :-)

--
dave

Nov 12 '05 #10
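
The three cases in post #10 (exactly 128 random bits, more, or fewer) can be sketched as follows, assuming os.urandom is an acceptable source of random bits; the function name guid_from_random_bits is hypothetical:

```python
import hashlib
import os


def guid_from_random_bits(random_bits: bytes) -> bytes:
    """Turn a block of random bytes into a 16-byte (128-bit) ID."""
    if len(random_bits) < 16:
        # Fewer than 128 bits: hashing cannot add the missing randomness.
        raise ValueError("need at least 128 random bits")
    if len(random_bits) == 16:
        return random_bits                       # use the bits as they are
    return hashlib.md5(random_bits).digest()     # fold down to 128 bits


guid = guid_from_random_bits(os.urandom(16))
```

The resulting 16 bytes are what would be stored in the BYTEA primary key column from the original question.
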
