Multibyte character?

When I call readfile or file_get_contents on a web page, the string I
get back is corrupted for non-ASCII characters. For instance, when I do
a readfile("http://abc/def"), "São Paulo" becomes "SÃ£o Paulo" on the
calling page, although http://abc/def itself shows "São Paulo" correctly.
Any idea how to fix this problem?

Let me try to explain it more. I have two pages, http://abc/def and
http://abc/ghi.php, and I am trying to read the contents of http://abc/def
from http://abc/ghi.php.
Jun 2 '08 #1
What you get is exactly right: readfile hands back the bytes unchanged.
From your example, it appears that the text you read is utf-8 encoded and
that the second page (the one doing the reading) is (probably) latin-1
encoded. A "readfile" that does not take the encodings into account is
not enough to display "human" text.

If you use curl, you can catch the headers that contain the encoding
used and use mbstring to convert the text. Or, if it is always the same
page you read, you know the encoding beforehand.
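
The curl-plus-mbstring approach could look roughly like the sketch below. It is only an illustration, not a definitive implementation: the function names (charset_from_content_type, convert_body, fetch_converted) are hypothetical, the fallback to UTF-8 when the server sends no charset is an assumption, and ISO-8859-1 as the calling page's encoding is taken from the guess above. It needs the curl and mbstring extensions.

```php
<?php
// Hypothetical helper: extract the charset from a Content-Type header value,
// e.g. "text/html; charset=utf-8". Falls back to $default (an assumption)
// when the header is missing or names no charset.
function charset_from_content_type(?string $contentType, string $default = 'UTF-8'): string
{
    if ($contentType !== null
        && preg_match('/charset\s*=\s*"?([\w.:-]+)"?/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return $default;
}

// Hypothetical helper: convert the fetched bytes from the remote page's
// encoding to the encoding the calling page is served in.
function convert_body(string $body, string $from, string $to): string
{
    // mb_convert_encoding takes the target encoding first, then the source.
    return mb_convert_encoding($body, $to, $from);
}

// Hypothetical wrapper: fetch $url with curl, read the Content-Type the
// server reported, and return the body converted to $pageEncoding.
function fetch_converted(string $url, string $pageEncoding = 'ISO-8859-1'): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl failed: ' . curl_error($ch));
    }
    // Null (or false on older PHP versions) when the server sent no Content-Type.
    $contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    $from = charset_from_content_type($contentType ?: null);
    return convert_body($body, $from, $pageEncoding);
}
```

In ghi.php, the readfile call would then become something like echo fetch_converted("http://abc/def");. The conversion is kept separate from the fetch so that, if the page you read is always the same, you can skip the header lookup and call convert_body with the known encoding directly.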

Best regards.
--
Willem Bogaerts

Application smith
Kratz B.V.
http://www.kratz.nl/
Jun 2 '08 #2
