strlen issues - no mb_strlen available

I'm not sure whether this is an encoding issue, but for some reason strlen is not giving me the correct length of a string stored in a MySQL database.

I have two strings, one containing letters and spaces, and another containing letters, spaces and underscores.

Even though both strings are exactly the same length, strlen reports different lengths for them.

For testing I used these two strings:

This is a long Phrase

and blanked out some of the letters with underscores to get:
    ___s __ _ ____ ______

As far as I can tell, they are the same length. However, strlen reports the first phrase as having a length of 21 and the second as having a length of 25.

Now, there are 4 spaces in the second phrase, and this is coincidentally how far off the length is...

Any thoughts?

Ray
Jun 8 '09 #1
Dormilich
8,658 Expert Mod 8TB
Well, the string you posted has 21 characters (including the spaces).
Jun 8 '09 #2
Atli
5,058 Expert 4TB
Hi.

So the second one, the one with all the underscores, which by my count is exactly 21 characters, is being reported as 25 characters long?

Eliminating the obvious first:
Are you sure there is no leading or trailing whitespace in the string?
Have you tried trim?

Moving on to the less obvious:
Is the string being stored in a Unicode field?
I'm no expert on the internals of Unicode strings, but according to what I know (or at least think I know), a character encoded as UTF-8 can take between 1 and 4 bytes.
PHP 5 strings are plain sequences of bytes; PHP treats every byte as one character.

That suggests to me that if a Unicode string containing 21 characters, four of which need 2 bytes each (or some other mix that adds up to 25 bytes), were read into a PHP string, then PHP's strlen, which actually just counts bytes, would report it as 25 characters long.

Sounds plausible, right?
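
To make the byte-versus-character idea concrete, here is a minimal sketch. It is only an illustration, not a diagnosis of Ray's data: the non-breaking space (bytes C2 A0 in UTF-8) is a hypothetical stand-in for "a character that looks like a space but takes two bytes", and the variable names are made up for the example.

```php
<?php
// The masked phrase from the first post, typed with ordinary ASCII spaces.
$masked = "___s __ _ ____ ______";

// Hypothetical variant: every space replaced by a UTF-8 non-breaking
// space, which is stored as the two bytes 0xC2 0xA0.
$withNbsp = str_replace(' ', "\xC2\xA0", $masked);

// strlen counts bytes, not characters.
$plainLength = strlen($masked);   // 17 letters/underscores + 4 spaces
$nbspLength  = strlen($withNbsp); // 17 + 4 * 2 bytes

echo $plainLength, "\n"; // prints 21
echo $nbspLength, "\n";  // prints 25
```

Both strings look the same on screen, yet the second is four bytes longer, one extra byte per "space".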

If that is the case, you could try:
    strlen(utf8_decode($unicodeString))
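
If mb_strlen is not available, and bearing in mind that utf8_decode is deprecated as of PHP 8.2, another way to count characters is to count every byte that is not a UTF-8 continuation byte. The sketch below assumes the input is valid UTF-8; the function name utf8_char_count is made up for this example.

```php
<?php
/**
 * Count the characters (code points) in a UTF-8 string without mbstring.
 * Assumes $bytes is valid UTF-8; malformed input gives a meaningless count.
 */
function utf8_char_count(string $bytes): int
{
    $count = 0;
    $length = strlen($bytes); // number of bytes
    for ($i = 0; $i < $length; $i++) {
        // Continuation bytes look like 10xxxxxx; every other byte
        // starts a new character.
        if ((ord($bytes[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

echo utf8_char_count("caf\xC3\xA9"), "\n"; // "café": 5 bytes, prints 4
```

For a pure ASCII string the result is the same as strlen, so it is safe to use on strings that may or may not contain multibyte characters.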
Jun 8 '09 #3
RayDube
15
Thanks for the thoughtful replies.

Yes, trim, ltrim and rtrim were all tried without success.

utf8_decode was also tried, again without success...

And, as a matter of fact, after rebooting the machine with the mb_ functions available, I still got the same result (so not a multibyte thing).

It's still got me puzzled, but in the meantime I've switched to using dashes "-" instead of underscores "_" to get a similar effect.

For whatever reason, dashes are counted correctly, but with underscores the spaces seem to be counted twice... odd behaviour to my mind, anyway.

So my hangman game is working fine now, excellent exercise for the brain, but now it hurts... :)

I should also note that these were stored in a MySQL table as VARCHAR(250), if that has any impact... Frankly, I like the look with underscores better than with dashes, so if anyone has a solution that will help me fix this, I'd appreciate it.

Ray
Jun 9 '09 #4
Atli
5,058 Expert 4TB
That's odd.
I tried making a UTF-8 table on my test server and fetching the data via PHP. It always counts both dashes and underscores correctly in my tests.

Could you post the exact structure of your table?
The output of the SHOW CREATE TABLE command would be best. (It includes all the charset info.)

And perhaps the code that fetches the data from the database?
Most importantly, the extension you use to connect to MySQL and the method you use to query the data.
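
In the meantime, dumping the raw bytes of the fetched string would show exactly what strlen is counting. This is just a debugging sketch; the helper name hex_bytes is made up, and the sample string is a stand-in for whatever comes back from the query.

```php
<?php
/**
 * Return the bytes of a string as space-separated hex pairs,
 * so characters that look alike on screen can be told apart.
 */
function hex_bytes(string $value): string
{
    return implode(' ', str_split(bin2hex($value), 2));
}

// An ordinary space shows up as 20 and an underscore as 5f.
echo hex_bytes("_ _"), "\n"; // prints 5f 20 5f
```

If any position that looks like a space shows something other than a single 20, that would account for the extra bytes.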
Jun 9 '09 #5
