non standard path characters

A kind user reports having problems running the reportlab tests because his path
has non-ASCII characters in it, e.g.

......\Mes documents\Mes Téléchargements\Firefox\...

Somewhere in the tests we look at the path and then try to convert it to UTF-8 for
display in the PDF.

Is there a standard way to do these path string conversions?

Paths appear to come from all sorts of places and given the increasing use of
zip file packaging it doesn't seem appropriate to rely on the current platform
as a single choice for the default encoding.
--
Robin Becker

May 31 '07 #1

I think you should change the code page before running the test, with
something like:

c:\chcp 850
c:\....\python.exe ......\test.py

Look for the right code page for your system; 850, 437, 1250 or 1252
may work.

Regards


May 31 '07 #2
Robin Becker wrote:
> A kind user reports having problems running the reportlab tests because
> his path has non-ASCII characters in it, e.g.
>
> .....\Mes documents\Mes Téléchargements\Firefox\...
>
> Somewhere in the tests we look at the path and then try to convert it to
> UTF-8 for display in the PDF.
>
> Is there a standard way to do these path string conversions?
>
> Paths appear to come from all sorts of places and given the increasing use
> of zip file packaging it doesn't seem appropriate to rely on the current
> platform as a single choice for the default encoding.

Zip files contain a bit flag for the character encoding of file names
(cp437 or UTF-8); see the ZipInfo object in module zipfile and the link (on
that page) to the file format description.
But I think some zip programs just put the path in the zip file, encoded in
the local code page, in which case you have no way of knowing.
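
As a rough illustration of the flag mentioned above, here is a minimal sketch, assuming Python 3's zipfile module (the thread itself predates it). The helper name_encodings and the constant UTF8_FLAG are made up for this example; 0x800 is bit 11 of the general purpose bit flag, which the zip format description uses to mark UTF-8 names. The sketch reports only what the archive declares, not what the archiving program actually did.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # bit 11 of the general purpose bit flag (hypothetical constant name)


def name_encodings(zip_bytes):
    """Map each member name to the encoding its flag bits declare."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {
            info.filename: "utf-8" if info.flag_bits & UTF8_FLAG else "cp437"
            for info in zf.infolist()
        }


# Build a small archive in memory: one non-ASCII name, one ASCII name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Mes Téléchargements/readme.txt", "bonjour")
    zf.writestr("plain.txt", "hello")

encodings = name_encodings(buf.getvalue())
```

An archive written by a tool that stores names in the local code page without setting the flag would be reported as cp437 here, which is exactly the ambiguity described above.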

--

Regards,
Tijs
May 31 '07 #3
Tijs wrote:
> Robin Becker wrote:
> ........
> Zip files contain a bit flag for the character encoding (cp437 or utf-8),
> see the ZipInfo object in module zipfile and the link (on that page) to the
> file format description.
> But I think some zip programs just put the path in the zipfile, encoded in
> the local code page, in which case you have no way of knowing.

Thanks for that. I guess the problem is that when a path is obtained from such
an object, the code that gets the path usually has no way of knowing what the
intended use is. That makes storage as simple bytes hard. I guess the correct
way is to always convert to a standard (say UTF-8) and then always know the
required encoding when the thing is to be used.
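
One way to picture this is the sketch below, assuming Python 3 and assuming the source encoding is known at the point where the bytes arrive. The function names decode_at_boundary and encode_for_use are hypothetical, and the cp1252 input is just sample data.

```python
def decode_at_boundary(raw: bytes, source_encoding: str) -> str:
    """Convert incoming path bytes to text once, where the encoding is known."""
    return raw.decode(source_encoding)


def encode_for_use(path: str, target_encoding: str) -> bytes:
    """Encode the text path only when a consumer needs a specific encoding."""
    return path.encode(target_encoding)


# Sample input: a path fragment as a cp1252 byte string.
raw_cp1252 = "Mes Téléchargements".encode("cp1252")

path = decode_at_boundary(raw_cp1252, "cp1252")
utf8 = encode_for_use(path, "utf-8")    # e.g. for text embedded in a PDF
cp850 = encode_for_use(path, "cp850")   # e.g. for a console using code page 850
```

The same stored value serves both consumers, because the encoding decision is deferred until the intended use is known.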
--
Robin Becker

May 31 '07 #4
> thanks for that. I guess the problem is that when a path is obtained
> from such an object the code that gets the path usually has no way of
> knowing what the intended use is. That makes storage as simple bytes
> hard. I guess the correct way is to always convert to a standard (say
> utf8) and then always know the required encoding when the thing is to be
> used.

Inside the program itself, the best thing is to represent path names
as Unicode strings as early as possible; later, information about the
original encoding may be lost.

If you obtain path names from the os module, pass Unicode strings
to listdir in order to get back Unicode strings. If they come from
environment variables or command line arguments, use
locale.getpreferredencoding() to find out what the encoding should
be.
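
A minimal sketch of both suggestions, assuming Python 3, where str is the Unicode string type and bytes is the byte string type (the advice above was written for Python 2). The helper to_text is hypothetical; os.listdir, os.fsencode and locale.getpreferredencoding are standard library calls.

```python
import locale
import os
import tempfile


def to_text(raw, encoding=None):
    """Decode a byte-string path; text is passed through unchanged."""
    if isinstance(raw, str):
        return raw
    # Fall back to the locale's preferred encoding when none is given.
    return raw.decode(encoding or locale.getpreferredencoding(False))


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    text_names = os.listdir(tmp)                # text in, text out
    byte_names = os.listdir(os.fsencode(tmp))   # bytes in, bytes out

# Sample data: bytes whose encoding is known to be cp1252.
decoded = to_text("Mes Téléchargements".encode("cp1252"), "cp1252")
```

The type of the argument to listdir decides the type of the returned names, so passing text keeps the whole listing in text form.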

If they come from a zip file, Tijs already explained what the encoding
is.

Always expect encoding errors; if they occur, choose to either skip
the file name or report an error to the user. Notice that listdir
may return a byte string if decoding fails (this may only happen
on Unix).
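
The skip-or-report choice could look roughly like the following sketch, assuming Python 3 and strict decoding. The function decode_names is a hypothetical helper, and the three names are sample data.

```python
def decode_names(raw_names, encoding):
    """Split byte-string names into decoded text and undecodable leftovers."""
    decoded, rejected = [], []
    for raw in raw_names:
        try:
            decoded.append(raw.decode(encoding))  # strict: raises on bad bytes
        except UnicodeDecodeError:
            rejected.append(raw)  # caller decides: skip silently or report
    return decoded, rejected


names = [
    "Téléchargements".encode("utf-8"),
    "Téléchargements".encode("latin-1"),  # not valid UTF-8
    b"Firefox",
]
ok, bad = decode_names(names, "utf-8")
```

Keeping the rejected names as bytes leaves the caller free to report them to the user instead of guessing at an encoding.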

Regards,
Martin
May 31 '07 #5
