Unicode line endings

After switching text editors, my code started causing mysterious PHP
errors. I narrowed the problem down to the Unicode line endings I
started using with the new text editor: when I save documents using
Unicode line endings, PHP no longer registers the line endings, meaning
that:

<?php

echo "Hello World!";

?>

registers as:

<?phpecho "Hello World!";?>

I've verified that I'm using correct Unicode line endings. PHP accepts
all files without problem when they are saved using Unix/DOS line
endings, but Unicode line endings really seem to confuse it.

Does anyone know what could be causing this? Are there any known fixes?

Thanks for any help - I'm pulling hair out over this one.

Jun 21 '06 #1
*** jdbartlett wrote (21 Jun 2006 14:00:23 -0700):
After switching text editors, my code started causing mysterious PHP
errors. I narrowed the problem down to the Unicode line endings [...]
registers as:

<?phpecho "Hello World!";?>


We'd need to know two things: what you call "Unicode line endings"
and what you mean by "register"...

Anyway, I'd guess you're editing files on a Mac and uploading them over
FTP in binary mode. Try ASCII mode instead.

--
-+ Álvaro G. Vicario - Burgos, Spain
++ http://bits.demogracia.com is my site for web programmers
+- http://www.demogracia.com is my chlorine-free humor site
--
Jun 21 '06 #2
Thanks for the response. I am using a Mac, but I'm not uploading files
at all, just saving them and then using command line PHP to execute
them.

The text editor I'm using is called TextWrangler from BareBones
software. According to the TextWrangler manual, Unicode has its own
standard for line endings (page 36, second para). In the 'line ending'
menu, TextWrangler offers 4 options (Unicode in addition to Unix,
Macintosh and WIN/DOS). I have selected "Unicode". PHP recognizes Unix,
Mac and WIN/DOS line endings just fine, but seems to have trouble
recognizing these "Unicode" line endings, even though no other apps do.


Jun 21 '06 #3
*** jdbartlett wrote (21 Jun 2006 14:48:21 -0700):
Thanks for the response. I am using a Mac, but I'm not uploading files
at all, just saving them and then using command line PHP to execute
them.

The text editor I'm using is called TextWrangler from BareBones
software. According to the TextWrangler manual, Unicode has its own
standard for line endings (page 36, second para). In the 'line ending'
menu, TextWrangler offers 4 options (Unicode in addition to Unix,
Macintosh and WIN/DOS).


Oh my... No matter how much I learn about web development, there's always
more :)

http://en.wikipedia.org/wiki/Newline

Sorry, I couldn't find any references about PHP so my best educated guess
is that it isn't supported :-?
--
-+ Álvaro G. Vicario - Burgos, Spain
++ http://bits.demogracia.com is my site for web programmers
+- http://www.demogracia.com is my chlorine-free humor site
--
Jun 21 '06 #4
I e-mailed BareBones, and they informed me they are using 0x2029 for
Unicode line endings. They also recommended against using Unicode line
endings for web content and everything else unless there is a specific
need.

With that in mind, I'm switching to UTF-8 encoding with Unix line
endings.
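
For anyone converting files that were already saved this way, a minimal PHP sketch of the switch to Unix line endings might look like the following. It assumes the text is UTF-8 and that PHP 7 or later is available (for the "\u{...}" escapes and type declarations); normalize_to_unix is a made-up helper name, not part of PHP or TextWrangler.

```php
<?php
// Hypothetical helper (not from the thread): rewrite DOS, classic Mac and
// Unicode line endings as a Unix LF. Assumes $text is UTF-8 encoded.
function normalize_to_unix(string $text): string
{
    // Order matters: CRLF has to be replaced before a lone CR,
    // otherwise every CRLF would turn into two newlines.
    $search = [
        "\r\n",      // DOS/Windows
        "\r",        // classic Mac
        "\u{0085}",  // NEL
        "\u{2028}",  // Line Separator
        "\u{2029}",  // Paragraph Separator (what the editor was writing)
    ];
    return str_replace($search, "\n", $text);
}

// Example: a snippet saved with U+2029 between statements.
$source = "echo 1;\u{2029}echo 2;\r\necho 3;\u{2028}";
$fixed  = normalize_to_unix($source);

// json_encode makes the otherwise invisible newlines visible.
echo json_encode($fixed), "\n";
```

Plain str_replace is enough here because the UTF-8 byte sequences for these characters cannot occur inside any other valid UTF-8 character. A file saved as UTF-16 would need to be converted to UTF-8 first.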

Thanks again!

Jun 21 '06 #5
jdbartlett (jd*****@gmail.com) wrote:
: I e-mailed BareBones, and they informed me they are using 0x2029 for
: Unicode line endings. They also recommended against using Unicode line
: endings for web content and everything else unless there is a specific
: need.

: With that in mind, I'm switching to UTF-8 encoding with Unix line
: endings.
Google can tell you about Unicode line endings. Basically the character
0x85 is called "NEL" (Next Line), plus there is 0x2029, called
Paragraph Separator, and 0x2028, called Line Separator (probably what
BareBones meant to tell you, not 0x2029). Unicode suggests that about
eight (?) characters be recognized as denoting new lines, including the
normal things like carriage return, plus the NEL, LS and PS things, plus
ones like form feed.
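
To see which of these characters a given file actually contains, something along these lines could be used. It is only a sketch: it assumes UTF-8 input and PHP 7 or later, and count_newline_kinds is a hypothetical name. The set of characters is the one mentioned above plus line feed and vertical tab.

```php
<?php
// Hypothetical helper: count each kind of newline found in UTF-8 text.
function count_newline_kinds(string $text): array
{
    // Map each matched newline sequence to a readable label.
    $names = [
        "\r\n"     => 'CRLF',
        "\n"       => 'LF',
        "\r"       => 'CR',
        "\u{000B}" => 'VT',
        "\u{000C}" => 'FF',
        "\u{0085}" => 'NEL',
        "\u{2028}" => 'LS',
        "\u{2029}" => 'PS',
    ];

    // CRLF is tried first so it counts as one newline, not two.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $pattern = '/\r\n|[\n\r\x{000B}\x{000C}\x{0085}\x{2028}\x{2029}]/u';

    if (preg_match_all($pattern, $text, $matches) === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    $counts = [];
    foreach ($matches[0] as $newline) {
        $name = $names[$newline];
        $counts[$name] = ($counts[$name] ?? 0) + 1;
    }
    return $counts;
}

// Example with mixed endings; in practice $text could come from
// file_get_contents() on the script that PHP refuses to parse.
$text = "one\u{2029}two\r\nthree\nfour\u{2029}";
print_r(count_newline_kinds($text));
```

A file that reports only PS or LS entries is one that older tools will see as a single long line.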

The 0x85 character in the default DOS codepage is "a grave" (à), which is
the letter "a" with an accent somewhat like \ only smaller and on top.

However 0x85 in my default windows codepage is three dots in a row, like
"..." only fitting into a single character.
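
The codepage dependence can be checked with a short PHP sketch like this one. It assumes the iconv extension is enabled and that the underlying iconv library knows the names CP437 (taken here to be the "default DOS codepage"), CP1252 (taken to be the "default Windows codepage") and ISO-8859-1. Those names are assumptions about the poster's setup and about the local iconv build.

```php
<?php
// Decode the single byte 0x85 under several legacy encodings and show
// the result as UTF-8. Charset support depends on the local iconv build.
$byte = "\x85";

$decoded = [];
foreach (['CP437', 'CP1252', 'ISO-8859-1'] as $codepage) {
    $utf8 = @iconv($codepage, 'UTF-8', $byte);
    if ($utf8 === false) {
        echo $codepage, ": conversion not supported on this system\n";
        continue;
    }
    $decoded[$codepage] = $utf8;
    echo $codepage, ': UTF-8 bytes ', bin2hex($utf8), "\n";
}
```

Where all three conversions are available, CP437 should yield à (U+00E0), CP1252 the single-character ellipsis (U+2026), and ISO-8859-1 the NEL control character (U+0085) itself, so the same byte means three different things.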

If you use UTF-8 then 0x85 requires two bytes, so it isn't even a single
"character" for any older software.

PS and LS can't be included directly as themselves at all in a byte stream
since they are bigger than a byte, so they will always undergo some kind
of (possibly mis-) interpretation. In UTF-8 I assume they take three
bytes, though I haven't checked.
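
The byte counts are easy to check. A small PHP sketch, assuming PHP 7 or later for the "\u{...}" escape syntax:

```php
<?php
// Show how many bytes each Unicode newline character occupies in UTF-8.
$separators = [
    'NEL' => "\u{0085}",
    'LS'  => "\u{2028}",
    'PS'  => "\u{2029}",
];

foreach ($separators as $name => $char) {
    // strlen() counts bytes, not characters, which is what we want here.
    printf("%s: %d bytes in UTF-8 (%s)\n", $name, strlen($char), bin2hex($char));
}
// Prints:
// NEL: 2 bytes in UTF-8 (c285)
// LS: 3 bytes in UTF-8 (e280a8)
// PS: 3 bytes in UTF-8 (e280a9)
```

So LS and PS do take three bytes each in UTF-8, and NEL takes two.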

It seems to me that the whole thing is a bit problematical, rather like
using a word processor to do your coding - it can be done but do you
really need the headaches?

The key thing is that a programmer is not writing "text" at all. These
are not English essays to be read to your friends; you are laying
out a carefully arranged set of bytes that the compiler can understand.
The compiler accepts things that look a lot like text to make it practical
for a programmer to work with, but it's not text at all; it's a
communication protocol between you and the compiler.
google: unicode line ending

gives all sorts of interesting details.
Jun 22 '06 #6
