Problem with length of Multibyte String

Hi all,
I want to write some UTF-8 Chinese characters to a file with the following
PHP code:

<code>
.......
$fp = fopen($filepath,'wb');
fwrite($fp,$utf8string,strlen($utf8string));
fclose($fp);
........
</code>

The problem is with the function "strlen". A UTF-8 string consists of
multibyte characters, each of which takes more than one byte. But to
"strlen", every character seems to be just one byte.
Take the Chinese character "我" for example: its UTF-8 encoding is "0xE6
0x88 0x91", obviously 3 bytes, but strlen returns 1. As a result,
"fwrite" writes just 1 byte to the file.
So I wonder if there is any way to get the actual length of a multibyte
string in PHP?
Thank you for any suggestions!
Jul 17 '05 #1
*** lian wrote (Sun, 05 Sep 2004 17:27:28 +0800):
Problem happened on function "strlen". utf-8 string consists of
multibye characters. Every characters have more than one byte. But to
function "strlen", every character is just one byte.


The PHP manual has a chapter titled "Multi-Byte String Functions", where
you can find information about mb_strlen().

It's an extension, so if it isn't installed on your server you'll probably
have to write your own function. There are some user comments about this on
the strlen() manual page.

In any case, please note that the length parameter of fwrite() is optional.
If it is not set, the whole string is written.
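
A minimal sketch of the difference between the two counts, with a fallback for servers where the mbstring extension is missing. The helper name utf8_char_count() is hypothetical (it is not a PHP built-in), and it assumes the input is valid UTF-8.

```php
<?php
// "我" written as explicit UTF-8 bytes, so the example does not depend on
// the encoding this file is saved in.
$utf8string = "\xE6\x88\x91";

// Hypothetical fallback for servers without mbstring: count every byte that
// is not a UTF-8 continuation byte (bit pattern 10xxxxxx).
// Assumes valid UTF-8 input.
function utf8_char_count($s)
{
    $count = 0;
    $bytes = strlen($s);
    for ($i = 0; $i < $bytes; $i++) {
        if ((ord($s[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

// Byte count (assuming strlen() has not been overloaded by mbstring).
$byteLength = strlen($utf8string);

// Character count: use mb_strlen() when the extension is loaded,
// otherwise fall back to the hand-written counter.
$charLength = function_exists('mb_strlen')
    ? mb_strlen($utf8string, 'UTF-8')
    : utf8_char_count($utf8string);

echo "bytes: $byteLength, characters: $charLength\n";
```

For writing to a file it is the byte count that matters, which is why leaving out the length argument of fwrite() is the simplest option.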

--
--
-+ Álvaro G. Vicario - Burgos, Spain - ICQ 46788716
+- http://www.demogracia.com (the humor website for people over 100)
++ «Smile, we're going to take your picture for the obituary»
--
Jul 17 '05 #2
lian wrote:
Problem happened on function "strlen". utf-8 string consists of
multibye characters. Every characters have more than one byte. But to
function "strlen", every character is just one byte.


Is mbstring.func_overload on? If so, it is overriding the normal
strlen function with mb_strlen, which returns the number of characters
instead of bytes. Try turning it off.

See http://www.php.net/mb_string
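
A rough way to check this from a script, assuming the setting can be read with ini_get(). Bit value 2 of mbstring.func_overload is the one that covers the string functions, and on PHP 8 the setting no longer exists, so the check simply reports "no" there. The variable names are only illustrative.

```php
<?php
// ini_get() returns false when the setting does not exist (for example on
// PHP 8, where mbstring.func_overload was removed); the cast turns that into 0.
$overload = (int) ini_get('mbstring.func_overload');

// Bit value 2 is the flag that overloads the string functions, strlen() included.
$strlenOverloaded = ($overload & 2) === 2;

$utf8string = "\xE6\x88\x91"; // "我" as raw UTF-8 bytes

// With the '8bit' encoding mb_strlen() counts bytes whether or not
// overloading is on. Without mbstring, strlen() cannot be overloaded,
// so it is safe to use directly.
$byteLength = function_exists('mb_strlen')
    ? mb_strlen($utf8string, '8bit')
    : strlen($utf8string);

echo 'strlen() overloaded: ' . ($strlenOverloaded ? 'yes' : 'no') . "\n";
echo "byte length: $byteLength\n";
```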

-- brion vibber (brion @ pobox.com)
Jul 17 '05 #3
"lian" <li**@fed.com> wrote in message news:2q************@uni-berlin.de...
Hi all,
I want to write some UTF-8 Chinese characters to file with following
php codes:

<code>
.......
$fp = fopen($filepath,'wb');
fwrite($fp,$utf8string,strlen($utf8string));
fclose($fp);
........
</code>

Problem happened on function "strlen". utf-8 string consists of
multibye characters. Every characters have more than one byte. But to
function "strlen", every character is just one byte.
Take Chinese character "我" for example, its utf-8 code is "0xE6
0x88 0x91",obviously 3 bytes, but strlen return 1 byte. And then
function "fwrite" just write 1 byte to the file.
So I wonder if there are any way to get actural length of multibyte
string in PHP?


First of all, you don't need to pass the length to fwrite() if you want the
whole string written. Just fwrite($fp, $utf8string) will do.

Second, your description of strlen() is wrong. It returns the byte count,
never the Unicode character count.
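
Both points can be checked with a short script. This is only a sketch: the temporary file path is a stand-in for the $filepath in the original post, and it assumes mbstring.func_overload is not active.

```php
<?php
$utf8string = "\xE6\x88\x91"; // "我" as raw UTF-8 bytes

// Hypothetical temporary path, used only to keep the example self-contained.
$filepath = tempnam(sys_get_temp_dir(), 'utf8');

$fp = fopen($filepath, 'wb');
if ($fp === false) {
    exit("cannot open $filepath\n");
}

// No length argument: fwrite() writes the whole string and returns the
// number of bytes written.
$written = fwrite($fp, $utf8string);
fclose($fp);

// Compare against the size on disk, then clean up.
clearstatcache();
$sizeOnDisk = filesize($filepath);
unlink($filepath);

echo "written: $written, on disk: $sizeOnDisk, strlen: " . strlen($utf8string) . "\n";
```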
Jul 17 '05 #4