473,799 Members | 2,764 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Binary data in PostgreSQL

Hi all,

AFAIK it is possible for columns to be very large, up to about 2 GB. Are
there any hints or experiences about storing binary data (jpg-images,
pdf-documents) in PostgrreSQL with or without the complicated lo-stuff?

Of course it's in many cases a good approach to store those files simply
in the file system but there's always a risk of running out of sync
(filesystem and tables), e.g. by deleting files and not deleting the
table rows with the filenames.

Any ideas and comments welcome.

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 22 '05 #1
4 4042
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Well, you have the option to use the bytea types. It's nice to have the stuff
in the database. I wrote an application (quite a while back) where I had to
store documents to the database. At that time I switched to mysql, not
because it's better, but because it handles binary data quite easily. Since
this is a couple of years ago it's quite likely that PostgreSQL byte handling
improved a lot (back then PG was still at a pre 7 version).
Meanwhile I'm developing an application that stores huge amounts of documents.
Since this time I need the features of a real database, like triggers etc.
I'm using PG now. The amount of data I'm storing now is huge, therefor I
decided to store the files in a directory tree. I'm storing about 40000
documents around 100k each a week, thats about 12GB a month. The file storage
directory is handled by the server side of the application and is invisible
to normal users. The main reasons for using the filesystem instead of the
database were:

1) a lot of documents
2) due to 1) I'm able to spread the documents over several
harddrives/filesystems - so this solves the storage space problem (unless you
want to use LVM or similar)
3) much easier handling on the application side. I can use rsync, ftp, scp or
whatever else to transfer files. Also most programing languages are quite
good at handling normal files.
4) You never run into a file problem regarding file type or size, at least not
up to the limit of the operating system you're on.
5) You can use conventional backup methods without interrupting the database
server. It's also much faster to restore the whole system on a different
machine - dump/restore the database and copy your directory tree.

You have some drawbacks though:
1) your application has to make sure that the referencing records and the file
storage is in sync. If there's a bug in the application you can totally
scramble the storage.
2) If your server has a filesystem problem you could lose files due to fsck,
which again would "unsync" the data from the files.

Basically it all depends on the structure of your application. If you can
guarantee that only the application has access to the file storage you can
control the synchronization problem. If not I'd probably go the extra mile to
store the stuff as blobs.
There is another option you might want to consider. If you're storing
relatively small files, like the images for a website, you could even go with
a normal "text" field and base64 encode the data before storing (as well as
decoding it after retrieval).

My $0.02

On Saturday 21 February 2004 10:53 am, Holger Marzen wrote:
Hi all,

AFAIK it is possible for columns to be very large, up to about 2 GB. Are
there any hints or experiences about storing binary data (jpg-images,
pdf-documents) in PostgrreSQL with or without the complicated lo-stuff?

Of course it's in many cases a good approach to store those files simply
in the file system but there's always a risk of running out of sync
(filesystem and tables), e.g. by deleting files and not deleting the
table rows with the filenames.

Any ideas and comments welcome.

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings


- --
UC

- --
Open Source Solutions 4U, LLC 2570 Fleetwood Drive
Phone: +1 650 872 2425 San Bruno, CA 94066
Cell: +1 650 302 2405 United States
Fax: +1 650 872 2417
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFAN8g4jqG XBvRToM4RAreNAJ wMwT1Twtg9c35rp 1SgaghKU7XDiQCg 0bQM
Z7cbgYmGYtjkCFn JfSL3tm8=
=3yhL
-----END PGP SIGNATURE-----
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #2
It's not the most effective use of space, and I'm sure not the best way
to do it, but I store such data as base64 encoded text. Works very
well for my needs, in that regard.

Gavin

Holger Marzen wrote:
Hi all,

AFAIK it is possible for columns to be very large, up to about 2 GB. Are
there any hints or experiences about storing binary data (jpg-images,
pdf-documents) in PostgrreSQL with or without the complicated lo-stuff?

Of course it's in many cases a good approach to store those files simply
in the file system but there's always a risk of running out of sync
(filesystem and tables), e.g. by deleting files and not deleting the
table rows with the filenames.

Any ideas and comments welcome.

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 22 '05 #3
Hello,

I would use large objects. Easy to access, easy to use.

J
Gavin M. Roy wrote:
It's not the most effective use of space, and I'm sure not the best way
to do it, but I store such data as base64 encoded text. Works very
well for my needs, in that regard.

Gavin

Holger Marzen wrote:
Hi all,

AFAIK it is possible for columns to be very large, up to about 2 GB. Are
there any hints or experiences about storing binary data (jpg-images,
pdf-documents) in PostgrreSQL with or without the complicated lo-stuff?

Of course it's in many cases a good approach to store those files simply
in the file system but there's always a risk of running out of sync
(filesystem and tables), e.g. by deleting files and not deleting the
table rows with the filenames.

Any ideas and comments welcome.

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postg resql.org

Nov 22 '05 #4
bytea's suck for large data. I think my processes ate up 5x the size of
the large objects just trying to select them from the database. Anything
over 10 Meg it usually isn't useful for.

Jon

On Sat, 21 Feb 2004, Joshua Drake wrote:
Hello,

I would use large objects. Easy to access, easy to use.

J
Gavin M. Roy wrote:
It's not the most effective use of space, and I'm sure not the best way
to do it, but I store such data as base64 encoded text. Works very
well for my needs, in that regard.

Gavin

Holger Marzen wrote:
Hi all,

AFAIK it is possible for columns to be very large, up to about 2 GB. Are
there any hints or experiences about storing binary data (jpg-images,
pdf-documents) in PostgrreSQL with or without the complicated lo-stuff?

Of course it's in many cases a good approach to store those files simply
in the file system but there's always a risk of running out of sync
(filesystem and tables), e.g. by deleting files and not deleting the
table rows with the filenames.

Any ideas and comments welcome.

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postg resql.org

Nov 22 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
9324
by: bissatch | last post by:
Hi, I am trying to write script that is run when a form is submitted. The form contains an image input field and when submitted, the image is uploaded, resized and added as binary information to a db table. Please note, I am using a PostgreSQL database I have written all the code out below that deals with the submission processing:
2
2503
by: Gabriele Bartolini | last post by:
Hi guys, just a quick and probably stupid question. When I make a cluster based on an index on a table, how can I remove it later? Should I 'create' a new one by using the primary key index? Also, can I have more than 1 cluster on my table? Let me explain ... can I put for instance the record with, let's say, age between 10 and 20 on a specific cluster, from 20 to 30 on another one, and so on?
0
1113
by: David Garamond | last post by:
I'm making a "relocatable" Postgres binary distribution for my clients. Everything goes into postgresql-7.4.1/ directory, including libraries and binaries. This will be installed by a non-privileged user under his own home directory. The goal is that they could just extract the tarball, adjust PATH if he wants, and then just run initdb + pg_ctl start. For binaries like bin/psql or bin/pg_vacuum that needs libpq.so, I've created a C...
2
1904
by: Ron Snyder | last post by:
I'm attempting to use spamassassin 3.0 (beta) with an SQL backend, and have identified one performance gain so far that makes PostgreSQL a good (IMO) candidate for the backend. I need some advice though on another aspect-- instead of storing each token as text in the database, the token is being sha1()'d and then possibly pack()'d. That's all just background for my real question-- is there anything in the standards (or elsewhere) that...
1
2730
by: Jerry LeVan | last post by:
I am getting ready to try to add image viewing into my postgresql browser... I am probably going to start with storing a thumbnail and a URL for the actual image in the table. Is there any demo code for loading binary (actually images...) data into a bytea field. The data is going to come from a file....
1
1576
by: Igor Shevchenko | last post by:
Hi, I use binary mode for sending params and receiving data using libpq with protocol v3; PostgreSQL version is 7.4.3. Here, varchar-s are returned in binary mode; they're generated from internal structs by "array_send" function in src/backend/utils/adt/arrayfuncs.c. PQftype(..) returns 1015 for those columns. I couldn't find any #define for this OID -- does this mean it can change in future pg releases ? Also, can I safely use the...
1
1095
by: David Garamond | last post by:
When I distribute a binary-only distribution of Postgre to a client, can I exclude some parts of Postgres (to make it smaller), e.g. documentation (all of doc/), some PL's, or even psql, initdb, pg_dump, etc.? -- dave
2
5271
by: Bastian Voigt | last post by:
Hallo List! I just found this old posting on google. Now my question is how can I read an integer value from the PGresult using the binary format? Can someone plz gimme a code example? (please mail to me, because I have not subscribed to the list) Thanks a bunch! Now here's the old posting: On Monday 27 October 2003 09:15, Tomasz Myrta wrote:
4
3797
by: Dmitry Teslenko | last post by:
Hello! I have some postgresql database that stores binary data (windows/linux executables). Field type I use for this is bytea. Also I have php script that allows users download these binaries via their web browser. That stuff was running on ubuntu 7.10 with postgresql 8.2 and most recent php version. That was written like this:
0
9687
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9541
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
1
10225
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10027
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
7564
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6805
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5463
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5585
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
3
2938
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.