Filesystem vs. Postgres for images

Hello,

I am working on a web portal that hosts about 200,000 ads. Every ad
has its own directory, named by its ID, which contains 5 subdirectories
holding various sizes of 5 images.

The filesystem is too slow, but I don't know whether performance will
improve if I store these images in Postgres.

My second question is what kind of hardware I need for storing them in
the DB. Right now I have an Intel(R) Pentium(R) 4 CPU at 1.70GHz with
512MB RAM and a 120GB HDD.

Thanks for any advice...

miso

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05
On Tue, 13 Apr 2004, Christopher Petrilli wrote:
2. Retrieval time is limited not by disk bandwidth, but by I/O seek
performance. More spindles = more concurrent I/O in flight. Also, this
is where SCSI takes a massive lead with tag-command-queuing.

In our case, we ended up using a three-tier directory structure, so
that we could manage the number of files per directory, and then
because load was relatively even across the top 20 "directories", we
split them onto 5 spindle-pairs (i.e. RAID-1). This is a place where
RAID-5 is your enemy. RAID-1, when implemented with read-balancing, is
a substantial performance increase.


Please explain why RAID 5 is so bad here. I would think that on a
filesystem that is not very heavily updated, RAID-5 would be the
functional equivalent of a RAID 0 array with one fewer disk, wouldn't
it? Or is RAID 0 also a bad idea (other than its unreliability) because
it only puts the data on one spindle, unlike RAID-1, which puts it on
many?

In that case, RAID 1 setups with more than 2 drives might be a huge win.
The Linux kernel certainly supports them, and I think some RAID cards do
too.

Just wondering.
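
A minimal sketch of a three-tier layout like the one quoted above. The
thread does not say how the tiers were derived, so hashing the file name
with SHA-1 and using the first three byte pairs of the digest is purely
an assumption here, as are the function name and the /images root.

```python
import hashlib
from pathlib import PurePosixPath


def three_tier_path(name: str, root: str = "/images") -> PurePosixPath:
    """Map a file name to root/aa/bb/cc/name (hypothetical scheme).

    Each tier is two hex characters of the SHA-1 digest, so every
    directory level has at most 256 entries and files spread evenly.
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    return PurePosixPath(root) / digest[0:2] / digest[2:4] / digest[4:6] / name


if __name__ == "__main__":
    print(three_tier_path("12345_large.jpg"))
```

Because the top-level tier is evenly loaded, whole top-level directories
can then be assigned to different spindle pairs, as described in the
quoted text.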

Nov 23 '05 #11

The issue comes down to read and write strategies. If your files are
bigger than the stripe size and begin to involve multiple drives, then
the rotational latency of each drive can come into play. This is often
hidden under caching during those wonderful comparison reviews, but
when you're talking about near random distributed access of more
information than could fit in the cache, then you have to face the
rotational issues of drives. Since the spindles are not locked
together, they drift apart in location, and you often end up with
worst-case latency in the drive subsystem. Mirroring doesn't face
this, especially when you can distribute the READS across all the
drives.

For example, if you ran triplex RAID-1, meaning 3 copies of the data,
which is often done in large environments so that you can take one copy
offline for a backup while maintaining 2 copies online, then you can
basically handle 3 reads for the cost of 1, increasing the number of
read ops you can handle. This doesn't work with RAID-0 or RAID-5.

Chris
--
| Christopher Petrilli
| petrilli (at) amber.org
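
The rotational-latency argument above can be put into rough numbers. This
is only a back-of-the-envelope sketch: it assumes each drive's rotational
delay is independent and uniformly distributed over one rotation, and
that a striped read finishes only when the slowest drive involved has
delivered its part. The function name and the 7200 RPM figure are
illustrative, not from the thread.

```python
def expected_rotational_latency_ms(rpm: float, drives: int) -> float:
    """Expected wait until the slowest of `drives` unsynchronized spindles
    has rotated to its target sector.

    With independent delays uniform on [0, T], where T is one full
    rotation, the expected maximum of n delays is T * n / (n + 1).
    """
    rotation_ms = 60_000.0 / rpm
    return rotation_ms * drives / (drives + 1)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, round(expected_rotational_latency_ms(7200, n), 2))
```

Under these assumptions a read served by a single drive waits half a
rotation on average, while a read spanning several stripe members drifts
toward a full rotation, which is the "worst-case latency" effect
described above. A mirrored read touches only one drive, so it keeps the
single-drive figure.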

Nov 23 '05 #12
Hi,
is the file system approach really easier and faster? What if you need
to protect the image data, e.g. you don't want users to just download
the pictures directly from your website?

-a

Jeremiah Jahn wrote:
There has got to be some sort of standard way to do this. We have the
same problem where I work. Terabytes of images, but the question is
still sort of around "BLOBs or Files?" Our final decision was to use the
file system. We found that you didn't really gain anything by storing
the images in the DB, other than having one place to get the data from.
The file system approach is much easier to back up, because each image
can be archived separately as well as browsed by 3rd party tools.

-jj-
On Tue, 2004-04-13 at 07:40, Cott Lang wrote:

Consider breaking your directories up, i.e.:

/ads/(ID % 1000)/ID

I use that for a system with several million images, works great. I
really don't think putting them in the database will do anything
positive for you. :)
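
A minimal sketch of the /ads/(ID % 1000)/ID layout suggested above,
assuming integer ad IDs. The function name and its parameters are
illustrative only.

```python
from pathlib import PurePosixPath


def ad_directory(ad_id: int, root: str = "/ads", buckets: int = 1000) -> PurePosixPath:
    """Return root/(ID % buckets)/ID for a non-negative integer ad ID."""
    if ad_id < 0:
        raise ValueError("ad_id must be non-negative")
    return PurePosixPath(root) / str(ad_id % buckets) / str(ad_id)


if __name__ == "__main__":
    print(ad_directory(123456))  # prints /ads/456/123456
```

With about 200,000 sequential ad IDs and 1000 buckets, each bucket
directory holds roughly 200 ad directories instead of one directory
holding all 200,000.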



Nov 23 '05 #13
Your code is retrieving the file from the file system. It doesn't have
to be accessible from the web server at all. Our current design uses a
JDBC connection to the database for the metadata (digital signature,
path, name, file type, etc.) and a SOAP call to the same server (though
it doesn't have to be the same one) to retrieve/store the image data.
-jj-
--
Jeremiah Jahn <je******@cs.earlham.edu>

Nov 23 '05 #14
It can be much faster, if implemented correctly, to put the large
files directly on the filesystem. It makes it a little harder to cluster,
but it can significantly reduce DB overhead.

There's no issue with the users downloading images directly, as you
normally wouldn't mount them directly into the URL namespace. Instead
the URL would point to a script that would look up the image in the
database and check permissions. If the user is allowed to load the
image, the script will close its connection to the database and start
shoveling bytes from the filesystem to the http connection. Most
decent web application platforms have some amount of support for this
sort of thing built in.

That has a number of other advantages too - it can take a long time
for a user to download a large file, and you really don't want the
thread handling them to tie up a database connection for all that
time. If you're on a platform that supports nice things like
sendfile(2) you can even have the kernel do almost all the work.

Cheers,
Steve
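
The flow described above (look up the image and check permissions in the
database, close the connection, then stream bytes from the filesystem)
might look roughly like this. It is only a sketch: sqlite3 stands in for
Postgres so the example is self-contained, and the `images` and
`image_acl` tables, their columns, and the function names are all
hypothetical.

```python
import shutil
import sqlite3
from typing import BinaryIO, Optional

CHUNK_SIZE = 64 * 1024  # bytes copied per read


def lookup_image_path(db_path: str, image_id: int, user_id: int) -> Optional[str]:
    """Return the file path if the user may read the image, else None.

    The connection is closed before returning, so no database connection
    is held while the (possibly slow) download runs.
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT i.path FROM images AS i "
            "JOIN image_acl AS a ON a.image_id = i.id "
            "WHERE i.id = ? AND a.user_id = ?",
            (image_id, user_id),
        ).fetchone()
    finally:
        conn.close()
    return row[0] if row else None


def serve_image(db_path: str, image_id: int, user_id: int, out: BinaryIO) -> bool:
    """Copy the image bytes to `out`; return False if access is denied."""
    path = lookup_image_path(db_path, image_id, user_id)
    if path is None:
        return False
    with open(path, "rb") as src:
        shutil.copyfileobj(src, out, CHUNK_SIZE)
    return True
```

In a real web application `out` would be the HTTP response stream. When
the response is a plain socket, Python's `socket.socket.sendfile()` can
hand the copy to the kernel on platforms that support it, in the spirit
of the sendfile(2) remark above.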


Nov 23 '05 #15
Hello all,

If anybody is interested in a script using blobs with PostgreSQL and
PHP, send me a mail at ka***@erdtrabant.de.
This little script can upload files to the filesystem or directly into
the database, export files from the database to the filesystem, and
store files from the filesystem in the database.
It is tested with PHP 4.3 and Postgres 7.1.

greetings,
volker



Nov 23 '05 #16
Hi all,
maybe somewhat off-topic, but here it is:

http://www.erdtrabant.de/index.php?i=500200104

It is a little PHP script that demonstrates how to store files as blobs
in Postgres (tested with v7.1), meant as a base for testing and so on.
The script style is not very beautiful, but it is usable.
Feel free to download it, change it, and use it wherever you want.

thanks for looking
volker


Nov 23 '05 #17
