Character encoding

I'm preparing a site for a client which includes several pages
containing Cyrillic characters. I used the UTF-8 charset, but the
Cyrillic characters appeared as question marks (and, oddly, some
Chinese characters as well.) I tried every Cyrillic charset I could
find and nothing worked.

I usually just hand-code all my PHP and HTML, but I swallowed hard and
went to Dreamweaver CS3, searched around, and found that I could set
each file's encoding to UTF-8 using the Modify > Page Properties >
Title/Encoding command.

Now it works fine, but I don't really understand what the command did.
It didn't add any code, and it didn't change the http-equiv tag. In
fact, I have to perform the command on every file that is included in
the PHP file.

So: a) what exactly did Dreamweaver do, and b) how could I have hand-
coded whatever it is?

Thank you in advance.

(Also posted in alt.html -- my apologies if I've violated etiquette.)
Jun 27 '08 #1
Mambo Bananapatch wrote:
> I'm preparing a site for a client which includes several pages
> containing Cyrillic characters. I used the UTF-8 charset, but the
> Cyrillic characters appeared as question marks (and, oddly, some
> Chinese characters as well.) I tried every Cyrillic charset I could
> find and nothing worked.
> So: a) what exactly did Dreamweaver do, and b) how could I have hand-
> coded whatever it is?
Well, it all depends on what exactly you did when you say "I used the
UTF-8 charset" or "I tried every Cyrillic charset". Have you used an
editor that supports saving as UTF-8 (or a Cyrillic charset), and did
you actually tell it to save your documents as UTF-8 (or that Cyrillic
charset)? That is all you need to do to ensure your files are
properly encoded. Then, when serving them over HTTP, you need to make
sure the server sends an HTTP Content-Type response header indicating the
charset used as a parameter, e.g.
Content-Type: text/html; charset=UTF-8
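
A minimal sketch of the two steps described above, written in Python for illustration (the thread itself concerns PHP and HTML files; the sample text and the file name page.html are made up). It suggests why both steps matter: saving decides which bytes land on disk, and the header only tells the browser how to read those bytes.

```python
import tempfile
from pathlib import Path

text = "<p>Привет, мир</p>"  # illustrative Cyrillic content

# Step 1: save the document as UTF-8. This is what an editor's
# "save as UTF-8" option does: it chooses the bytes written to disk.
path = Path(tempfile.mkdtemp()) / "page.html"  # hypothetical file name
path.write_text(text, encoding="utf-8")
raw = path.read_bytes()

# Step 2: the header declares how the receiver should decode the bytes.
header = "Content-Type: text/html; charset=UTF-8"
declared = header.split("charset=")[1].lower()
decoded = raw.decode(declared)
print(decoded == text)  # the declaration matches the bytes

# If the file had been saved in another charset while the header still
# said UTF-8, the bytes would not decode to the original text.
raw_1251 = text.encode("cp1251")
mismatched = raw_1251.decode(declared, errors="replace")
print(mismatched == text)  # the declaration does not match the bytes
```

The header changes nothing in the file itself, which is presumably why every included file had to be re-saved individually.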

--

Martin Honnen
http://JavaScript.FAQTs.com/
Jun 27 '08 #2
Scripsit Mambo Bananapatch:
> (Also posted in alt.html -- my apologies if I've violated etiquette.)
Oh, you'll just be ignored in the sequel. No problem.

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/
Jun 27 '08 #3
On Apr 26, 7:53 am, Martin Honnen <mahotr...@yahoo.de> wrote:
> Mambo Bananapatch wrote:
>> I'm preparing a site for a client which includes several pages
>> containing Cyrillic characters. I used the UTF-8 charset, but the
>> Cyrillic characters appeared as question marks (and, oddly, some
>> Chinese characters as well.) I tried every Cyrillic charset I could
>> find and nothing worked.
>> So: a) what exactly did Dreamweaver do, and b) how could I have hand-
>> coded whatever it is?
>
> Well, it all depends on what exactly you did when you say "I used the
> UTF-8 charset" or "I tried every Cyrillic charset". Have you used an
> editor that supports saving as UTF-8 (or a Cyrillic charset), and did
> you actually tell it to save your documents as UTF-8 (or that Cyrillic
> charset)? That is all you need to do to ensure your files are
> properly encoded. Then, when serving them over HTTP, you need to make
> sure the server sends an HTTP Content-Type response header indicating the
> charset used as a parameter, e.g.
> Content-Type: text/html; charset=UTF-8
>
> --
>
> Martin Honnen
> http://JavaScript.FAQTs.com/

Thanks Martin, that's exactly what I did. Dreamweaver saved the files
with the correct encoding, and I used the response header you
suggested, and all's well.

I guess my question was more about what Dreamweaver did; if I were to
hand-code a page with Cyrillic characters, and didn't have access to
Dreamweaver, how would I encode each file? And why must I encode each
file, in addition to including the UTF-8 Content-Type response
header?

I just wanted to understand what I was doing.

Thanks for your time.

MB
Jun 27 '08 #4
Hello!

You did not really answer Martin's question: what did you do _before_
you decided to use Dreamweaver?
On a non-Russian OS one can get question marks in many cases, for
example:
- typing in an editor such as Notepad and saving as "ANSI", that is, in
the character encoding of the system code page
- using copy/paste between Unicode and non-Unicode programs
- converting to UTF-8 without explicitly providing the source encoding,
so that the system code page is assumed
- etc.
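
The cases listed above can be reproduced with a small Python sketch (an illustration only; cp1252 stands in for a Western "ANSI" system code page and cp1251 for a Cyrillic one, both assumptions about the machines involved):

```python
text = "Русский текст"  # illustrative sample

# Saving as "ANSI" on a Western system: the code page has no Cyrillic
# letters, so a lossy conversion substitutes "?" for each of them.
as_ansi = text.encode("cp1252", errors="replace")
print(as_ansi)  # b'??????? ?????'

# The original letters are gone for good; decoding cannot restore them.
print(as_ansi.decode("cp1252") == text)  # False

# Converting to UTF-8 while assuming the wrong source encoding: bytes
# that are really cp1251 get read as cp1252, and the result is valid
# UTF-8 containing the wrong characters.
cyrillic_bytes = text.encode("cp1251")
mojibake = cyrillic_bytes.decode("cp1252")
as_utf8 = mojibake.encode("utf-8")
print(as_utf8.decode("utf-8") == text)  # False
```

In the first case the damage happens at save time, so no charset declaration applied afterwards can repair the page.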

You may want to read some explanations on my site:
- section "for developers: Cyrillic (Russian) in HTML"
- section "for developers: Cyrillic (Russian) in Multilingual HTML -
UTF-8"
- chapter "Copy/Paste; Word, .TXT" in the section
"Unicode and Cyrillic"

:)

--
Regards,
Paul
http://RusWin.net
Jun 27 '08 #5
On Sun, 27 Apr 2008, Mambo Bananapatch wrote:
> if I were to
> hand-code a page with Cyrillic characters, and didn't have access to
> Dreamweaver, how would I encode each file?
You do not write with a pencil, do you? You have some editor
(word-processor, etc.) on some operating system on some computer.
We don't know what they are - but you know. Your editor saves
files in some character set, such as

MacCyrillic
http://www.unics.uni-hannover.de/nhtcapri/cyrillic.mac

ISO-8859-5
http://www.unics.uni-hannover.de/nht...cyrillic.html5

Windows-1251
http://www.unics.uni-hannover.de/nhtcapri/cyrillic.win

Unicode UTF-8
http://www.unics.uni-hannover.de/nht...gual1#cyrillic
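
To make the list above concrete, here is a short Python sketch (illustrative only; the codec names are Python's own spellings for the four character sets, and the sample word is made up). It encodes the same two letters in each charset and shows that the bytes on disk differ, which is why the editor's choice when saving matters.

```python
word = "Да"  # illustrative sample

# Python codec names for MacCyrillic, ISO-8859-5, Windows-1251, UTF-8.
charsets = ("mac_cyrillic", "iso8859_5", "cp1251", "utf_8")

encoded = {name: word.encode(name) for name in charsets}
for name, data in encoded.items():
    # Same characters, different bytes in every charset.
    print(f"{name:13} {data.hex(' ')}")

# Each byte sequence decodes back to the same text only when it is
# read with the charset it was written in.
round_trips = all(data.decode(name) == word for name, data in encoded.items())
print(round_trips)
```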
> And why must I encode each
> file, in addition to including the UTF-8 Content-Type response
> header?
I don't understand what this question means.

--
Top-posting.
What's the most irritating thing on Usenet?
Jun 27 '08 #6
On 2008-05-01, David Trimboli <da***@trimboli.name> wrote:
> [...]
> Normally the browser learns what encoding to read by the server's HTTP
> headers. An http-equiv declaration in an HTML file is a way to override
> a server's content-type (encoding).
It doesn't override it: if both are present, the server header wins.
> You only use this if your server isn't serving files with the correct
> content-type.
Yes, or because you're using file:// URLs during development.
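
The precedence described above can be sketched roughly in Python (a simplification, not how any particular browser is implemented; the function name pick_charset, the 1024-byte scan window and the windows-1252 fallback are all assumptions made for illustration):

```python
import re
from typing import Optional

# Matches both <meta http-equiv=... content="...; charset=X"> and
# <meta charset="X"> in the raw bytes of a document.
META_RE = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?([\w-]+)', re.IGNORECASE)
HEADER_RE = re.compile(r'charset\s*=\s*"?([\w-]+)', re.IGNORECASE)


def pick_charset(content_type: Optional[str], body: bytes,
                 default: str = "windows-1252") -> str:
    """Return the charset to decode with: HTTP header first, then meta."""
    if content_type:
        match = HEADER_RE.search(content_type)
        if match:
            return match.group(1).lower()
    # No usable header (for example a file:// URL): fall back to the
    # declaration inside the document, if there is one.
    match = META_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return default


page = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
print(pick_charset("text/html; charset=ISO-8859-5", page))  # header wins
print(pick_charset(None, page))  # no header, so the meta element is used
```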
Jun 27 '08 #7
On Thu, 1 May 2008, David Trimboli wrote:
> An http-equiv declaration in an HTML file is a way to override
> a server's content-type (encoding).
No, it is not. See
http://www.unics.uni-hannover.de/nht...a-http-equiv.1
http://www.unics.uni-hannover.de/nht...a-http-equiv.2

--
Bugs in Internet Explorer 7
http://www.unics.uni-hannover.de/nhtcapri/ie7-bugs
Jun 27 '08 #8