Storing files in a BLOB field via SQL

Hello Python fans,

I've been trying and searching for many days for an acceptable solution...
without success. I want to store files in a database using BLOB
fields. The database table has an ID field (INT, auto_increment) , an
ORDER field (INT, for knowing the right order) and a "normal" BLOB
field. It is planned to split large files in 64k-parts and sort these
parts by the ORDER field.

Here's some pseudo code how I wanted to implement this in my app:

file = file_open(myFileName, read_only)
order = 0
data = file.read(65535)
while (data):
    query = "INSERT INTO table (order,data) VALUES (%i,%s)", order, data
    mysql_exec(query)
    order = order + 1
    data = file.read(65535)

The main problem is the handling of the binary data. There might be
errors in the SQL syntax if some special chars (quotes etc.) appear,
or the SQL statement is incorrect because of strange chars that can't
be encoded by the current codeset. Another problem is, you can't strip
these chars because you would change the binary data or make it bigger
than 64k.

Does anybody of you have an idea?
Any suggestions would be very helpful.

Additionally, I want to compress the data and store a checksum
somewhere. Any hint (links, sites, ...) is welcome...!

Thanks in advance,
Juergen
Jul 18 '05 #1
Juergen Gerner wrote:
> field. It is planned to split large files in 64k-parts and sort these
> parts by the ORDER field.

Is there a special reason why you can't store the whole file in a
single BLOB? That's what it's a BLOB for, after all... L=Large :-)

> Additionally, I want to compress the data and store a checksum
> somewhere. Any hint (links, sites, ...) is welcome...!

Compress: look at the zlib module (gzip-style compression)
Checksum: what kind? 32-bit CRC: zlib.crc32; MD5: md5.md5
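
As a rough sketch of how those two hints might fit together in current Python (where hashlib has taken over from the old md5 module): the helper name pack_chunk is made up for illustration, and computing the checksums over the uncompressed bytes is just one possible choice.

```python
import hashlib
import zlib


def pack_chunk(data: bytes):
    """Compress one chunk and compute checksums of the uncompressed bytes.

    pack_chunk is a hypothetical helper name, not part of any library.
    """
    compressed = zlib.compress(data)
    # Mask to 32 bits so the CRC is an unsigned value on every Python version.
    crc = zlib.crc32(data) & 0xFFFFFFFF
    md5_hex = hashlib.md5(data).hexdigest()
    return compressed, crc, md5_hex


original = b"hello world" * 1000
compressed, crc, md5_hex = pack_chunk(original)

# Reading back: decompress, then compare against the stored checksum.
restored = zlib.decompress(compressed)
crc_ok = (zlib.crc32(restored) & 0xFFFFFFFF) == crc
```

The compressed bytes would go into the BLOB column, and the CRC or MD5 digest into a separate column so it can be checked after reading the data back.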

--Irmen
Jul 18 '05 #2
Hi Irmen,

first of all, thanks for your help about compression & checksum!

> Is there a special reason why you can't store the whole file in a
> single BLOB? That's what it's a BLOB for, after all... L=Large :-)

Yes, there's a special reason. After reading a lot of documentation I
think it's better splitting large files in little blobs. It doesn't
matter if the SQL server is on the same machine as the application,
but if both parts are on different machines, large files have to be
transmitted over the network. During this transfer the application
isn't responding, I guess. So splitting would be much more flexible.
Additionally I think, splitting files makes the database more scalable
and the space on the harddrive better used.

But the splitting isn't my main problem. It's the way I transmit the
binary data to the database via an SQL syntax. Today I saw how
PhpMyAdmin handles binary data: it codes each byte in hexadecimal
values ("\0x..."). Is there any way to do something (or similar) with
Python, or maybe with PyQt/QString/QByteArray?

Thanks in advance!
Juergen
Jul 18 '05 #3
Juergen Gerner wrote:
>> Is there a special reason why you can't store the whole file in a
>> single BLOB? That's what it's a BLOB for, after all... L=Large :-)
>
> Yes, there's a special reason. After reading a lot of documentation I
> think it's better splitting large files in little blobs. It doesn't
> matter if the SQL server is on the same machine as the application,
> but if both parts are on different machines, large files have to be
> transmitted over the network. During this transfer the application
> isn't responding, I guess. So splitting would be much more flexible.

How would splitting the file in chunks improve the responsiveness
of the application? This would only work if your app needs only
a specific chunk of the larger file to work on. If you need to read
the full file, reading 10 chunks will take even longer than reading
one big BLOB.
You may decide to do it 'in the background' using a thread, but
then again, you could just as well load the single big BLOB inside
that separate thread.
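
A minimal sketch of the background-thread idea, assuming a function load_blob that stands in for the real database read (it is a placeholder invented here and only returns dummy bytes):

```python
import threading


def load_blob():
    """Hypothetical stand-in for the real database read."""
    return b"\x00" * 1024


result = {}


def worker():
    # Runs in the background thread and stores the BLOB once it has arrived.
    result["blob"] = load_blob()


t = threading.Thread(target=worker)
t.start()
# ... the main thread stays free to do other work here ...
t.join()  # wait for the load to finish before using the data
blob = result["blob"]
```

The same pattern works whether the worker fetches one large BLOB or several chunks, which is the point being made above.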

> Additionally I think, splitting files makes the database more scalable
> and the space on the harddrive better used.

In my humble opinion these kinds of assumptions are generally false.
Let the database decide what the most efficient storage method is
for your 100 MB BLOB. I don't want to make these kinds of assumptions
about the inner workings of my database server, and I certainly don't
want to wire them into my application code... what happens when you
switch platforms/DBMS? Is your code still 'the most efficient' then?
Just my €0.02

> But the splitting isn't my main problem. It's the way I transmit the
> binary data to the database via an SQL syntax.

Sorry can't help you with this. I would expect the database driver module
to do the 'right' escaping.
--Irmen
Jul 18 '05 #4
"Juergen Gerner" <J.******@GernerOnline.de> wrote in message
news:5a**************************@posting.google.com...
> Hello Python fans,
>
> The main problem is the handling of the binary data. There might be
> errors in the SQL syntax if some special chars (quotes etc.) appear,
> or the SQL statement is incorrect because of strange chars that can't
> be encoded by the current codeset. Another problem is, you can't strip
> these chars because you would change the binary data or make it bigger
> than 64k.
>
> Does anybody of you have an idea?
> Any suggestions would be very helpful.

You can either use the MySQL hex literal format (x'AABBCC'...) or use the
Python DB API which will handle the parameter conversion for you...

In the first case your query becomes something like:

query = "INSERT INTO table (order,data) VALUES (%i,x'%s')" % (order,
    data.encode('hex'))
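
For reference, a small sketch of the hex-literal approach in current Python, where bytes objects have a hex() method and the 'hex' string codec is no longer available. The function name mysql_hex_literal is invented for this example.

```python
def mysql_hex_literal(data: bytes) -> str:
    """Return data as a MySQL-style hex literal such as x'00ff27'."""
    return "x'" + data.hex() + "'"


# Bytes that would break naive quoting: NUL, 0xFF, a single quote, a backslash.
chunk = bytes([0x00, 0xFF, 0x27, 0x5C])
literal = mysql_hex_literal(chunk)  # x'00ff275c'
```

The literal is twice as long as the raw data, but only the statement text grows; the value stored in the BLOB column is still the original bytes, so a 64k chunk stays a 64k chunk. Also note that ORDER and TABLE are reserved words in MySQL, so real table and column names with those spellings would need backtick quoting.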

In the second, preferable version you use something like:

cur = conn.cursor()
cur.execute("INSERT INTO table (order,data) VALUES (?,?)", (order, data))

and the DBAPI/Database driver takes care of the rest.
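
Here is a self-contained sketch of the parameter-binding approach combined with the 64k chunking from the original question. It uses the sqlite3 module from the standard library only so that it runs without a server; sqlite3 uses "?" placeholders, while MySQLdb expects "%s" (its paramstyle is "format"). The table name file_parts, the column name part_no, and the helper iter_chunks are made up for the example.

```python
import io
import sqlite3

CHUNK_SIZE = 65535  # chunk size taken from the original question


def iter_chunks(stream, size=CHUNK_SIZE):
    """Yield (index, chunk) pairs until the stream is exhausted."""
    index = 0
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield index, chunk
        index += 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_parts ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, part_no INTEGER, data BLOB)"
)

# Every byte value appears, including NUL, quotes and backslashes.
payload = bytes(range(256)) * 1000  # 256,000 bytes
stream = io.BytesIO(payload)  # stands in for open(filename, "rb")

with conn:  # commits the transaction on success
    conn.executemany(
        "INSERT INTO file_parts (part_no, data) VALUES (?, ?)",
        iter_chunks(stream),
    )

# Reassemble the file by reading the parts back in order.
rows = conn.execute("SELECT data FROM file_parts ORDER BY part_no").fetchall()
restored = b"".join(row[0] for row in rows)
```

No escaping or hex encoding appears anywhere in the application code; the driver passes the bytes through unchanged.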

> Additionally, I want to compress the data and store a checksum
> somewhere. Any hint (links, sites, ...) is welcome...!

compressed = data.encode('zip') # Compress the data (Python 2 'zip' codec; same as zlib.compress(data))

Mike.
Jul 18 '05 #5
Juergen Gerner wrote:

> But the splitting isn't my main problem. It's the way I transmit the
> binary data to the database via an SQL syntax. Today I saw how
> PhpMyAdmin handles binary data: it codes each byte in hexadecimal
> values ("\0x..."). Is there any way to do something (or similar) with
> Python, or maybe with PyQt/QString/QByteArray?

Don't know if this is of help to you, but I use the following
to insert binary data into an MS-SQL DB with ADO.
Suppose the variable "rawdata" contains the binary data:

def bcd2str(bcs):
    """ converts a BCD coded string to an ASCII coded string

    Note: also works for all hex values, i.e. '\x2d' """

    out = ''
    for c in bcs:
        # two lowercase hex digits per byte (Python 2 byte string)
        out = out + (hex(ord(c))[2:]).zfill(2)
    return out

def str2hex(s):
    """ converts binary byte (hex 0x00 - 0xff)
    data in a Python string into the format needed to
    insert into a binary datatype on SQL Server """

    return '0x' + bcd2str(s)

# sid is assumed to hold the value for the sID column
insertstring = "insert into foo (sID, RawData) VALUES (%s, %s)" \
    % (sid, str2hex(rawdata))
adoconn.Execute(insertstring)

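
In current Python the per-byte loop is probably not needed any more. A short sketch, assuming rawdata is a bytes object:

```python
import binascii

rawdata = bytes([0x00, 0x2D, 0xFF])

# bytes.hex() produces two lowercase hex digits per byte, like the loop does.
sql_literal = "0x" + rawdata.hex()  # 0x002dff

# binascii.hexlify gives the same digits, but as a bytes object.
same_digits = binascii.hexlify(rawdata).decode("ascii")
```
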
regards,
Bruno
Jul 18 '05 #6
[J.******@GernerOnline.de (Juergen Gerner)]
> I want to store files in a database using BLOB [...] Does anybody of you have an idea?

Maybe it's a version issue. I recently grabbed the new version 1.0.0 of
MySQLdb. In the readme.html of the Win binary package you will find
this note:

"""
MySQL-Python 1.0.0 for win32 Notes:
June 28 2004
I needed to get mysql-python working for win32, so I compiled it. I
know a lot of people are looking for this, so enjoy... With 0.9.2,
BLOBs weren't working properly for me, ...
"""

Second:
Skimming over the docs I noticed that the Python API converts BLOBs to
array. Don't know if this hint is of significance in your case.

Hope it helps,
Martin
Jul 18 '05 #7
