Does codecs.getencoder encode the entire string?


When I use an encoder function from the codecs module, the documentation
says that it encodes the input object and returns a tuple (output object,
length consumed).

>>> import codecs
>>> enc = codecs.getencoder('iso-8859-1')
>>> enc(u'asdf')
('asdf', 4)


I just don't understand why it returns the "length consumed".

Does it mean that in some cases the input string can be only partially
converted?

What is the use of the "length consumed" value?
And a last question: can I call this "enc" function from multiple
threads?

Jul 28 '05 #1

On Thu, Jul 28, 2005 at 08:42:57AM -0700, nicolas_riesch wrote:
> And a last question: can I call this "enc" function from multiple
> threads?


Yes.

Jeff
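
A minimal sketch of sharing one encoder function across threads, assuming Python 3 (where the encoder returns bytes rather than a str). The worker function `encode_word` and the sample words are hypothetical, and a passing run only illustrates the answer above; it does not prove thread safety in general.

```python
import codecs
from concurrent.futures import ThreadPoolExecutor

# One stateless encoder function, shared by every worker thread.
enc = codecs.getencoder('iso-8859-1')

def encode_word(word):
    # Hypothetical worker: each call passes its own input and
    # receives its own (output, length consumed) tuple.
    return enc(word)

words = ['asdf', 'caf\u00e9', 'na\u00efve'] * 100

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(encode_word, words))

# Each result matches what a single-threaded str.encode produces.
assert all(out == w.encode('iso-8859-1') and n == len(w)
           for w, (out, n) in zip(words, results))
```

The function returned by getencoder takes all of its input as arguments and keeps no state between calls, which is why sharing it looks unproblematic here.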


Jul 28 '05 #2
nicolas_riesch wrote:
> I just don't understand why it returns the "length consumed".
>
> Does it mean that in some cases the input string can be only partially
> converted?

For an encoder, I believe the answer is "no". For a decoder, it is
a definite yes: if the input does not end with a complete character,
you may have bytes left at the end which did not get decoded.
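
The decoder case can be observed directly. This is a sketch assuming Python 3 on CPython; it calls `codecs.utf_8_decode`, the low-level function behind the UTF-8 codec, with its third argument (`final`) set to False so that an incomplete trailing sequence is left unconsumed instead of raising an error. The sample string is made up for illustration.

```python
import codecs

data = 'caf\u00e9'.encode('utf-8')   # b'caf\xc3\xa9', 5 bytes
truncated = data[:-1]                # cut inside the 2-byte sequence

# final=False means "more input may follow", so the dangling lead
# byte is not decoded and not counted as consumed.
text, consumed = codecs.utf_8_decode(truncated, 'strict', False)

print(repr(text), consumed, len(truncated))   # 'caf' 3 4
leftover = truncated[consumed:]               # b'\xc3'
```

Note that the function returned by `codecs.getdecoder('utf-8')` treats its input as complete, so it raises UnicodeDecodeError on the same truncated bytes rather than reporting a shorter length.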

For an encoder, the same *might* happen if you want to encode
half-surrogates into, say, UTF-8; the encoder might refuse to
encode the half-surrogate, and wait for the other half. In practice,
though, the current UTF-8 encoder simply encodes the surrogate
code point as if it were a proper character.
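
For comparison, here is a sketch of how this looks in Python 3, which is an assumption on my part since the thread predates it. There the strict UTF-8 encoder rejects a lone surrogate outright rather than waiting for its other half, and the 'surrogatepass' error handler gives the "encode it as if it were a proper character" behaviour.

```python
import codecs

enc = codecs.getencoder('utf-8')
lone = '\ud800'   # a high surrogate with no low surrogate after it

# Strict mode: the encoder refuses the half-surrogate with an error.
try:
    enc(lone)
    refused = False
except UnicodeEncodeError:
    refused = True

# 'surrogatepass' encodes the surrogate code point like any other.
out, consumed = enc(lone, 'surrogatepass')

print(refused, out, consumed)   # True b'\xed\xa0\x80' 1
```

In both cases the encoder either consumes the whole input or raises; it does not report a partial length.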

If you extend the notion of "encoding", similar things may happen
all the time. E.g. a DES encoder may only support multiples of
the block size, and leave bytes at the end.

> What is the use of the "length consumed" value?


It's primarily intended for stream writers, which may need
to buffer extra characters at the end that did not get encoded,
and wait until more input is provided.
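
The buffering pattern is easiest to demonstrate on the decoding side, where partial input really occurs. Below is a sketch assuming Python 3 on CPython; `BufferingUtf8Reader` is a hypothetical helper written for this illustration, not a class from the codecs module, and it relies on `codecs.utf_8_decode` accepting a `final` argument.

```python
import codecs

class BufferingUtf8Reader:
    """Hypothetical helper: decodes UTF-8 chunks, keeping leftovers."""

    def __init__(self):
        self.pending = b''   # bytes not yet consumed by the decoder

    def feed(self, chunk):
        data = self.pending + chunk
        # final=False: an incomplete trailing sequence stays unconsumed.
        text, consumed = codecs.utf_8_decode(data, 'strict', False)
        # "Length consumed" says how much to drop from the buffer.
        self.pending = data[consumed:]
        return text

reader = BufferingUtf8Reader()
encoded = 'na\u00efve caf\u00e9'.encode('utf-8')

# 3-byte chunks; the first one ends in the middle of a character.
chunks = [encoded[i:i + 3] for i in range(0, len(encoded), 3)]
decoded = ''.join(reader.feed(c) for c in chunks)
```

For real code, `codecs.getincrementaldecoder` and the stream reader and writer classes do this bookkeeping for you.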

For all practical purposes, you can ignore the length on
encoding. If you are paranoid, assert that it equals the
length of the input.
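
The paranoid variant might look like the sketch below, assuming Python 3; `encode_fully` is a hypothetical wrapper name, not part of the codecs module.

```python
import codecs

def encode_fully(text, encoding='iso-8859-1'):
    """Hypothetical wrapper: encode text and check nothing was left over."""
    out, consumed = codecs.getencoder(encoding)(text)
    # For an encoder this should always hold; the check is a safeguard.
    assert consumed == len(text), 'encoder left input unconsumed'
    return out

print(encode_fully('asdf'))   # b'asdf'
```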

Regards,
Martin
Jul 28 '05 #3
Thank you very much !

Nicolas

Jul 29 '05 #4
