EOL created by .write or .encode

Xah Lee

Why is it that some of my files written out by
outF.write(outtext.encode('utf-8'))
have ascii 10 as EOL, while others have ascii 13 as EOL?
Both of these files' EOLs were originally all ascii 10.

If I remove the EOL after the tt below in the replacement string, then this
doesn't happen.

findreplace = [
(ur'</body>',
ur'''tt
</body>'''),
]

....

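# Read the file as raw bytes and decode it as UTF-8.
inF = open(filePath,'rb')
s=unicode(inF.read(),'utf-8')
inF.close()

# Apply each (find, replace) pair in turn.
for couple in findreplace:
    outtext=s.replace(couple[0],couple[1])
    s=outtext

# Write the result back out as UTF-8 bytes.
outF = open(tempName,'wb')
outF.write(outtext.encode('utf-8'))
outF.close()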

thanks.

Xah
xa*@xahlee.org
∑ http://xahlee.org/PageTwo_dir/more.html ☄

Jul 18 '05 #1

Fredrik Lundh

"Xah Lee" <xa*@xahlee.org> wrote:

Why is it that some of my files written out by
outF.write(outtext.encode('utf-8'))
have ascii 10 as EOL, while others have ascii 13 as EOL?

outF = open(tempName,'wb')
outF.write(outtext.encode('utf-8'))
outF.close()

UTF-8 is not a binary format. get rid of the "b" flags, and things
will work as expected.

</F>

Jul 18 '05 #2
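
A minimal sketch of what the "b" flag changes, for readers following along. It assumes Python 3 (the thread's code is Python 2), and the helper name written_bytes is hypothetical. It only illustrates that binary mode writes the encoded bytes untouched, while text mode can translate "\n" depending on the newline argument; it is not a claim about what went wrong in the original script.

```python
import os
import tempfile


def written_bytes(text, mode, **open_kwargs):
    """Write text to a temporary file and return the raw bytes found on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        if "b" in mode:
            # Binary mode: we encode ourselves and the bytes are written as-is.
            with open(path, mode) as f:
                f.write(text.encode("utf-8"))
        else:
            # Text mode: Python encodes for us and may translate "\n" on write.
            with open(path, mode, encoding="utf-8", **open_kwargs) as f:
                f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


text = "tt\n</body>"
binary_result = written_bytes(text, "wb")
text_lf_result = written_bytes(text, "w", newline="\n")
text_cr_result = written_bytes(text, "w", newline="\r")
```

With newline="\r" every "\n" written becomes byte 13; in binary mode, or in text mode with newline="\n", the EOL stays byte 10.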

Xah Lee

I found the problem now (after about an hour of debugging). Python
didn't have a problem. Emacs does.

If you open a file in emacs, it will open fine regardless of whether the
EOL is ascii 10 or 13 (unix or mac). This is a nice feature. However,
what-cursor-position, which is used to show the cursor position and the
char's ascii code, says the EOL is ascii 10 when it is in fact ascii
13. Fuck the irresponsible fuckhead who is responsible for this.

http://xahlee.org/UnixResource_dir/w...e_license.html

Xah
xa*@xahlee.org
∑ http://xahlee.org/

Jul 18 '05 #3
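
Since an editor may translate line endings on load, one way to check what is really on disk is to look at the raw bytes. The following is a rough sketch assuming Python 3; count_eols is a hypothetical helper name, and the sample byte strings stand in for real file contents.

```python
def count_eols(data):
    """Count CRLF, lone CR (byte 13) and lone LF (byte 10) line endings in bytes."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF pair
    lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    return {"crlf": crlf, "cr": cr, "lf": lf}


# Stand-ins for file contents; a real check would read a file opened with "rb".
sample_unix = "tt\n</body>\n".encode("utf-8")
sample_mac = "tt\r</body>\r".encode("utf-8")

unix_counts = count_eols(sample_unix)
mac_counts = count_eols(sample_mac)
```

For a real file, something like count_eols(open(path, "rb").read()) would report the on-disk convention regardless of what an editor displays.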

Aidan Kehoe

On the ninth day of April, Xah Lee wrote:

If you open a file in emacs, it will open fine regardless of whether the
EOL is ascii 10 or 13 (unix or mac). This is a nice feature. However,
what-cursor-position, which is used to show the cursor position and the
char's ascii code, says the EOL is ascii 10 when it is in fact ascii
13.

This _is_ the right thing to do--there’s no reason naive programs written in
Emacs Lisp should have to worry about different on-disk representations of
line-endings. If you want to open a file which uses \015 as its line
endings, and have those \015 characters appear in the buffer, open it using
a coding system ending in -unix. C-u C-x C-f /path/to/file RET
iso-8859-1-unix RET in XEmacs, something I don’t know but I’m certain exists
in GNU Emacs.

--
“I, for instance, am gung-ho about open source because my family is being
held hostage in Rob Malda’s basement. But who fact-checks me, or Enderle,
when we say something in public? No-one!” -- Danny O’Brien

Jul 18 '05 #4

Xah Lee

can any GNU person or emacs coder answer this?

specifically: why does what-cursor-position give an incorrect answer?

Xah
xa*@xahlee.org
∑ http://xahlee.org/PageTwo_dir/more.html ☄

Jul 18 '05 #5

Alan Mackenzie

In comp.emacs.xemacs Xah Lee <xa*@xahlee.org> wrote:

I found the problem now (after about an hour of debugging). Python
didn't have a problem. Emacs does.

If you open a file in emacs, it will open fine regardless of whether the
EOL is ascii 10 or 13 (unix or mac). This is a nice feature. However,
what-cursor-position, which is used to show the cursor position and the
char's ascii code, says the EOL is ascii 10 when it is in fact ascii
13.

The problem is that there are many ways (at least 3) of indicating where
one line of text ends and the next one begins. Emacs deals with this
problem by converting the file loaded from disk to an internal format,
and converting back again when the time comes to save it again. The
alternatives would have been worse: noting the line-end convention of
each file, and complicating many routines (and we're talking about more
than "at least 3") to take account of that.

The internal representation of an EOL is 0x0a. Now that you know this,
it shouldn't bother you again. Alternatively, you could write a patch
for `what-cursor-position' to fix the problem (if such it be) and submit
it to the mailing list (xe*********@xemacs.org, or something like that).
However, it might introduce more problems than it would solve. I suspect
the developers would reject it.

Fuck the irresponsible fuckhead who is responsible for this.

You having a bad day, or something? ;-) The fuckhead was probably RMS
(Richard Stallman, he of the Free Software Foundation), and he's been
fucked so many times that once more wouldn't achieve anything at all.
;-)

--
Alan Mackenzie (Munich, Germany)
Email: aa**@muuc.dee; to decode, wherever there is a repeated letter
(like "aa"), remove half of them (leaving, say, "a").

Jul 18 '05 #6
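
The load-time conversion described in the post above can be sketched roughly as follows. This is an illustration in Python 3 under stated assumptions, not Emacs's actual implementation; detect_eol, load and save are hypothetical names.

```python
def detect_eol(raw):
    """Guess the on-disk line-ending convention from the first one found."""
    for i, ch in enumerate(raw):
        if ch == "\r":
            # CR followed by LF is the DOS convention, a lone CR is old Mac.
            return "\r\n" if raw[i + 1:i + 2] == "\n" else "\r"
        if ch == "\n":
            return "\n"
    return "\n"  # no line ending found: assume the Unix convention


def load(raw):
    """Return (internal_text, eol); the internal text uses LF (0x0a) only."""
    eol = detect_eol(raw)
    internal = raw.replace("\r\n", "\n").replace("\r", "\n")
    return internal, eol


def save(internal, eol):
    """Convert the internal LF-only text back to the remembered convention."""
    return internal.replace("\n", eol)


mac_raw = "tt\r</body>\r"           # file content with byte 13 as EOL
internal, eol = load(mac_raw)
code_at_eol = ord(internal[2])      # what an editor would report in the buffer
round_trip = save(internal, eol)
```

Here code_at_eol is 10 even though the file on disk uses 13, which mirrors the behaviour discussed in this thread, and round_trip restores the original line endings.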
