Storing files in a BLOB field via SQL

Hello Python fans,

I have been searching for an acceptable solution for many days,
without success. I want to store files in a database using BLOB
fields. The database table has an ID field (INT, auto_increment), an
ORDER field (INT, to record the sequence of the parts) and a "normal" BLOB
field. The plan is to split large files into 64k parts and sort these
parts by the ORDER field.

Here's some pseudo code showing how I wanted to implement this in my app:

    file = file_open(myFileName, read_only)
    order = 0
    data = file.read(65535)
    while (data):
        query = "INSERT INTO table (order,data) VALUES (%i,%s)", order, data
        mysql_exec(query)
        order = order + 1
        data = file.read(65535)

The main problem is the handling of the binary data. There might be
errors in the SQL syntax if some special chars (quotes etc.) appear,
or the SQL statement is incorrect because of strange chars that can't
be encoded by the current codeset. Another problem is that you can't strip
these chars, because you would change the binary data or make it bigger
than 64k.

Does anybody have an idea?
Any suggestions would be very helpful.

Additionally, I want to compress the data and store a checksum
somewhere. Any hint (links, sites, ...) is welcome...!

Thanks in advance,
Juergen
Jul 18 '05 #1
Juergen Gerner wrote:
> field. It is planned to split large files in 64k-parts and sort these
> parts by the ORDER field.

Is there a special reason why you can't store the whole file in a
single BLOB? That's what it's a BLOB for, after all... L=Large :-)

> Additionally, I want to compress the data and store a checksum
> somewhere. Any hint (links, sites, ...) is welcome...!

Compress: look at the zlib module (gzip-style compression)
Checksum: what kind? 32-bit CRC: zlib.crc32; MD5: md5.md5
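
For readers on Python 3, a minimal sketch of both suggestions might look like this. The function names pack_chunk and unpack_chunk are made up for illustration, and hashlib.md5 is used in place of the old md5 module named above, which no longer exists in Python 3.

```python
import hashlib
import zlib


def pack_chunk(data):
    """Compress one chunk and compute checksums of the uncompressed bytes."""
    compressed = zlib.compress(data)
    crc = zlib.crc32(data) & 0xFFFFFFFF   # mask keeps the value unsigned
    md5_hex = hashlib.md5(data).hexdigest()
    return compressed, crc, md5_hex


def unpack_chunk(compressed, crc):
    """Decompress a chunk and verify it against the stored CRC-32."""
    data = zlib.decompress(compressed)
    if zlib.crc32(data) & 0xFFFFFFFF != crc:
        raise ValueError("checksum mismatch")
    return data
```

The checksum is taken over the uncompressed bytes here, so it still verifies the content if the compression settings change later; storing it over the compressed bytes instead would be an equally valid choice.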

--Irmen
Jul 18 '05 #2
Hi Irmen,

first of all, thanks for your help with compression & checksums!

> Is there a special reason why you can't store the whole file in a
> single BLOB? That's what it's a BLOB for, after all... L=Large :-)

Yes, there's a special reason. After reading a lot of documentation I
think it's better to split large files into small blobs. It doesn't
matter if the SQL server is on the same machine as the application,
but if both parts are on different machines, large files have to be
transmitted over the network. During this transfer the application
isn't responding, I guess. So splitting would be much more flexible.
Additionally, I think splitting files makes the database more scalable
and makes better use of the space on the hard drive.

But the splitting isn't my main problem. It's the way I transmit the
binary data to the database via SQL syntax. Today I saw how
PhpMyAdmin handles binary data: it codes each byte in hexadecimal
values ("\0x..."). Is there any way to do the same (or something similar) with
Python, or maybe with PyQt/QString/QByteArray?

Thanks in advance!
Juergen
Jul 18 '05 #3
Juergen Gerner wrote:
>> Is there a special reason why you can't store the whole file in a
>> single BLOB? That's what it's a BLOB for, after all... L=Large :-)
>
> Yes, there's a special reason. After reading a lot of documentation I
> think it's better to split large files into small blobs. It doesn't
> matter if the SQL server is on the same machine as the application,
> but if both parts are on different machines, large files have to be
> transmitted over the network. During this transfer the application
> isn't responding, I guess. So splitting would be much more flexible.


How would splitting the file in chunks improve the responsiveness
of the application? This would only work if your app needs only
a specific chunk of the larger file to work on. If you need to read
the full file, reading 10 chunks will take even longer than reading
one big BLOB.
You may decide to do it 'in the background' using a thread, but
then again, you could just as well load the single big BLOB inside
that separate thread.
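
A rough sketch of the background-loading idea, assuming Python 3. The load_blob function is a hypothetical stand-in for whatever database read the application really performs; only the threading pattern is the point.

```python
from concurrent.futures import ThreadPoolExecutor


def load_blob(blob_id):
    """Hypothetical stand-in for the real (slow) database read."""
    return b"\x00" * 1024   # pretend this came from the BLOB column


executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(load_blob, 42)   # returns immediately
# ... the application keeps handling events here ...
data = future.result()                    # blocks only when the data is needed
executor.shutdown()
```

The same pattern works whether the file is stored as one BLOB or as many chunks, which is why chunking alone does not buy any responsiveness.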

> Additionally, I think splitting files makes the database more scalable
> and makes better use of the space on the hard drive.

In my humble opinion these kinds of assumptions are generally false.
Let the database decide what the most efficient storage method is
for your 100 MB BLOB. I don't want to make these kinds of assumptions
about the inner workings of my database server, and I certainly don't
want to wire them into my application code... what happens when you
switch platforms/DBMS? Is your code still 'the most efficient' then?
Just my €0.02

> But the splitting isn't my main problem. It's the way I transmit the
> binary data to the database via SQL syntax.

Sorry, can't help you with this. I would expect the database driver module
to do the 'right' escaping.

--Irmen
Jul 18 '05 #4
"Juergen Gerner" <J.******@GernerOnline.de> wrote in message
news:5a**************************@posting.google.com...
> Hello Python fans,
>
> The main problem is the handling of the binary data. There might be
> errors in the SQL syntax if some special chars (quotes etc.) appear,
> or the SQL statement is incorrect because of strange chars that can't
> be encoded by the current codeset. Another problem is that you can't strip
> these chars, because you would change the binary data or make it bigger
> than 64k.
>
> Does anybody have an idea?
> Any suggestions would be very helpful.

You can either use the MySQL hex literal format (x'AABBCC'...) or use the
Python DB API, which will handle the parameter conversion for you.

In the first case your query becomes something like:

    query = "INSERT INTO table (order,data) VALUES (%i,x'%s')" % (order,
        data.encode('hex'))
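
On Python 3, bytes objects have no encode('hex'); bytes.hex() does the same job. A minimal sketch of the hex-literal variant follows. The table name chunks and the column name part_order are assumptions chosen because TABLE and ORDER are reserved words in MySQL, and the function name is made up for illustration.

```python
def hex_insert_query(order, data):
    """Build an INSERT statement with the chunk as a MySQL hex literal."""
    # bytes.hex() yields only the characters 0-9 and a-f, so the literal
    # can never contain a quote or any other character that needs escaping.
    return "INSERT INTO chunks (part_order, data) VALUES (%d, x'%s')" % (
        order, data.hex())
```

Note that the hex form doubles the size of the statement sent to the server, so a 64k chunk becomes roughly 128k of SQL text.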

In the second, preferable version you use something like:

    cur = conn.cursor()
    cur.execute("INSERT INTO table (order,data) VALUES (%s,%s)", (order, data))

and the DB API database driver takes care of the rest. (MySQLdb uses %s
as its placeholder; some other drivers, such as sqlite3, use ? instead.)
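
As a rough, self-contained sketch of the parameterized approach, the following uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table name chunks, the column part_order, and the helper names are assumptions, not from the thread. sqlite3 uses ? placeholders, whereas MySQLdb expects %s.

```python
import io
import sqlite3

CHUNK_SIZE = 65535


def store_stream(conn, stream):
    """Insert a binary stream as numbered chunks; return the chunk count."""
    cur = conn.cursor()
    order = 0
    while True:
        data = stream.read(CHUNK_SIZE)
        if not data:
            break
        # The driver binds the bytes itself: no quoting or hex encoding.
        cur.execute(
            "INSERT INTO chunks (part_order, data) VALUES (?, ?)",
            (order, data))
        order += 1
    conn.commit()
    return order


def load_stream(conn):
    """Reassemble the stored chunks in their original order."""
    rows = conn.execute("SELECT data FROM chunks ORDER BY part_order")
    return b"".join(row[0] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chunks (id INTEGER PRIMARY KEY, part_order INTEGER, data BLOB)")
payload = bytes(range(256)) * 600   # includes NUL bytes and quote characters
count = store_stream(conn, io.BytesIO(payload))
restored = load_stream(conn)
```

For brevity the table holds the chunks of a single file; a real schema would presumably add a file identifier column so that several files can share the table.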

> Additionally, I want to compress the data and store a checksum
> somewhere. Any hint (links, sites, ...) is welcome...!

    compressed = data.encode('zip') # Compress the data

Mike.
Jul 18 '05 #5
Juergen Gerner wrote:

> But the splitting isn't my main problem. It's the way I transmit the
> binary data to the database via SQL syntax. Today I saw how
> PhpMyAdmin handles binary data: it codes each byte in hexadecimal
> values ("\0x..."). Is there any way to do the same (or something similar) with
> Python, or maybe with PyQt/QString/QByteArray?


I don't know if this is of help to you, but I use the following
to insert binary data into an MS-SQL database with ADO.
Suppose the variable "rawdata" contains the binary data:

    def bcd2str(bcs):
        """ converts a BCD coded string to an ASCII coded string

        Note: also works for all hex values, e.g. '\x2d' """

        out = ''
        # bytearray() yields integers on both Python 2 and Python 3
        for c in bytearray(bcs):
            out = out + '%02x' % c
        return out

    def str2hex(s):
        """ converts binary byte (hex 0x00 - 0xff)
        data in a python string into the format needed to
        insert into a binary datatype on SQL Server """

        return '0x' + bcd2str(s)

    # sid is assumed to hold the sID value; the original post omitted it
    insertstring = "insert into foo (sID, RawData) VALUES (%s, %s)" \
        % (sid, str2hex(rawdata))
    adoconn.Execute(insertstring)

regards,
Bruno
Jul 18 '05 #6
[J.******@GernerOnline.de (Juergen Gerner)]
> I want to store files in a database using BLOB [...] Does anybody have an idea?

Maybe it's a version issue. I recently grabbed the new version 1.0.0 of
MySQLdb. In the readme.html of the Win binary package you will find
this note:

"""
MySQL-Python 1.0.0 for win32 Notes:
June 28 2004
I needed to get mysql-python working for win32, so I compiled it. I
know a lot of people are looking for this, so enjoy... With 0.9.2,
BLOBs weren't working properly for me, ...
"""

Second:
Skimming over the docs I noticed that the Python API converts BLOBs to
arrays. I don't know whether this hint is relevant in your case.

Hope it helps,
Martin
Jul 18 '05 #7
