How to store an ASCII-encoded Python string?

I currently have a Python program that reads in emails from a POP3
server. As soon as a message is read in, it is fed directly into a
PostgreSQL database for storage. Later, it is broken down into its
parts and displayed to the user.

My problem is that when I try to pass "\tsome text\xa7some more text\n"
into the database, it gives me a Unicode decode error. At this point in
the program I don't know which codec the message uses (I won't know that
until I break the message apart later), and it may even be binary data, so
I just want to store it in the database with the escape characters, not as
a decoded/encoded string.

So, how do I store "\tsome text\xa7 some more text\n" as that instead
of:
" some text§ some more text
"

I don't have a problem escaping it so the above would look like
"\\tsome text\\xa7 some more text\\n", as long as I have a way to
unescape it later when I want to actually do something with the data.

Aug 28 '06 #1
mi****@gmail.com wrote:
> I currently have a Python program that reads in emails from a POP3
> server. As soon as a message is read in, it is fed directly into a
> PostgreSQL database for storage. Later, it is broken down into its
> parts and displayed to the user.
>
> My problem is that when I try to pass "\tsome text\xa7some more text\n"
> into the database, it gives me a Unicode decode error.

"\xa7" is not a valid ASCII character (ASCII only covers byte values
0x00 to 0x7F), so that's not really an "ASCII encoded" string.

It looks like your database expects Unicode strings, but you're passing in
binary data. To solve this, you can:

1) change the database table to use a "blob" field instead of a text field

or

2) configure the database interface to pass 8-bit strings through to the
database engine (if possible; see the database interface documentation
for details)

or

3) convert the data to Unicode before passing it to the database
interface, and leave it to the interface to convert it to whatever
encoding your database uses:

data = ... get encoded string from email ...
text = data.decode("iso-8859-1")
... write text to database ...
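
A minimal sketch of option 3, written for Python 3 (an assumption; the thread itself uses Python 2, where the same `decode` call applies to `str`). The sample bytes and variable names are illustrative only. ISO-8859-1 maps every byte value 0 to 255 to the code point with the same number, so the decode cannot fail and can be reversed exactly, even when the real charset turns out to be something else:

```python
# Sample raw bytes, standing in for a message read from the POP3 server.
data = b"\tsome text\xa7some more text\n"

# Decoding as ISO-8859-1 never raises: each byte becomes one code point.
text = data.decode("iso-8859-1")

# ... write text to a text column here ...

# Later, the original bytes can be recovered exactly.
restored = text.encode("iso-8859-1")

# The same trick survives multi-byte charsets: the stored text looks like
# gibberish, but re-encoding and decoding with the real charset recovers it.
chinese_bytes = "\u4e2d\u6587".encode("utf-8")
placeholder = chinese_bytes.decode("iso-8859-1")
recovered = placeholder.encode("iso-8859-1").decode("utf-8")
```

One caveat: PostgreSQL text columns cannot hold NUL characters, so this approach does not cover arbitrary binary data; that case still calls for option 1.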

</F>

Aug 28 '06 #2
In <11*********************@i42g2000cwa.googlegroups.com>, micahc wrote:
> So, how do I store "\tsome text\xa7 some more text\n" as that instead
> of:
> " some text§ some more text
> "
>
> I don't have a problem escaping it so the above would look like
> "\\tsome text\\xa7 some more text\\n", as long as I have a way to
> unescape it later when I want to actually do something with the data.

In [6]: '\tsome text\xa7some more text\n'.encode('string_escape')
Out[6]: '\\tsome text\\xa7some more text\\n'
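
The session above is Python 2, where the reverse direction is `.decode('string_escape')`. The `string_escape` codec does not exist in Python 3, so the following is a sketch of one possible equivalent there, assuming the raw message is held as `bytes`; it goes through Latin-1 and the `unicode_escape` codec and shows both directions:

```python
# Sample raw bytes, standing in for the message from the thread.
data = b"\tsome text\xa7some more text\n"

# Escape: map bytes 1:1 to code points, then write them as backslash escapes.
# The result contains only ASCII bytes, so a text column can hold it.
escaped = data.decode("latin-1").encode("unicode_escape")
safe_text = escaped.decode("ascii")

# ... store safe_text, read it back later ...

# Unescape: interpret the backslash escapes, then map code points back to bytes.
restored = safe_text.encode("ascii").decode("unicode_escape").encode("latin-1")
```

Literal backslashes in the input are doubled on the way in and collapsed on the way out, so the round trip is exact for any byte string.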

Ciao,
Marc 'BlackJack' Rintsch
Aug 28 '06 #3
Fredrik Lundh wrote:
> 3) convert the data to Unicode before passing it to the database
> interface, and leave it to the interface to convert it to whatever
> encoding your database uses:
>
> data = ... get encoded string from email ...
> text = data.decode("iso-8859-1")
> ... write text to database ...

Wouldn't that have to assume that all incoming data is in iso-8859-1?
If someone sends me an email with Chinese characters, would that still
work (I don't know the character set at data insert time)?

Marc 'BlackJack' Rintsch wrote:
> In [6]: '\tsome text\xa7some more text\n'.encode('string_escape')
> Out[6]: '\\tsome text\\xa7some more text\\n'

Thanks, I think this is what I will end up doing just for simplicity,
though I'm still curious about the above question.

Aug 28 '06 #4
mi****@gmail.com wrote:
> Fredrik Lundh wrote:
>> 3) convert the data to Unicode before passing it to the database
>> interface, and leave it to the interface to convert it to whatever
>> encoding your database uses:
>>
>> data = ... get encoded string from email ...
>> text = data.decode("iso-8859-1")
>> ... write text to database ...
>
> Wouldn't that have to assume that all incoming data is in iso-8859-1?
> If someone sends me an email with chinese characters would that still
> work (I don't know the character set at data insert time)?

If you're reading mail, chances are that you know the encoding (it's
specified in the message headers, in the charset parameter of the
Content-Type header).
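
As a rough illustration of reading the charset from the headers, here is a sketch using the standard library `email` package in Python 3 (an assumption; the thread predates it). The sample message, addresses, and the Latin-1 fallback are made up for the example:

```python
import email

# A made-up single-part message; a real one would come from the POP3 server.
raw = (
    b"From: sender@example.com\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="iso-8859-1"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"\tsome text\xa7some more text\n"
)

msg = email.message_from_bytes(raw)

# charset parameter of Content-Type, lowercased; None if the header lacks one.
charset = msg.get_content_charset() or "iso-8859-1"  # fallback is an assumption

# decode=True undoes the transfer encoding and returns the body as bytes.
payload = msg.get_payload(decode=True)
text = payload.decode(charset, errors="replace")
```

Multipart messages carry one charset per part, so real code would walk the parts (for example with `msg.walk()`) instead of reading a single payload.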

Or are you saying that you're treating the mail as binary data? If so,
why are you trying to store that in a *text* field in the database?
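
To illustrate the binary-column alternative, here is a small sketch that stores the raw bytes untouched. It uses the standard library `sqlite3` module purely as a stand-in so the example is self-contained (an assumption; the thread is about PostgreSQL, where the corresponding column type is `bytea`). Table and column names are hypothetical:

```python
import sqlite3

# Sample raw bytes, standing in for an undecoded message.
raw = b"\tsome text\xa7some more text\n"

conn = sqlite3.connect(":memory:")
# A binary column: the database stores the bytes as-is, with no charset involved.
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, raw BLOB NOT NULL)")
conn.execute("INSERT INTO messages (raw) VALUES (?)", (raw,))

# Reading the column back yields the same bytes, ready to be parsed later.
stored = conn.execute("SELECT raw FROM messages").fetchone()[0]
conn.close()
```

With a binary column there is nothing to escape or unescape; decoding happens only once the message has been parsed and its charset is known.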

</F>

Aug 28 '06 #5
