# Unicode characters converting to minus-numbers??

Hi again all, here's something I'm stuck on...

I'm making a function to convert a Unicode character into the kind of code you need to put on a UTF-8 encoded web page (ampersand, hash, digits, semicolon). The program decides whether it's a Unicode character by checking whether AscW(character) > 255. It works fine most of the time, e.g. it converts this:

    original: 泳ぐ
    converted: &# 27891;&# 12368;

(I put spaces in so that they didn't get converted on the forum; the function doesn't produce the spaces.)

But for some characters, I found that AscW(character) gives back minus numbers! I can't find the relation between the minus-number and what the number should be, either, so I can't make a formula to convert the minus-number to the appropriate number. Here are a few it messes up on, with the hex version in square brackets in case that helps:

    Character: こ - Number should be: 12371 [3053] - AscW() gives: -28740
    Character: い - Number should be: 12356 [3044] - AscW() gives: -30644
    Character: み - Number should be: 12415 [307F] - AscW() gives: -30325

Sorry if it seems confusing, but it confuses me too. >_<

I also tried to work out the number 'manually' by looking at the 2 bytes, multiplying the leftmost one by 255 and adding the rightmost one, using MidB(character, 1, 1) for the first byte and MidB(character, 2, 1) for the second byte, but something weird happens here too. MidB(character, 1, 2) gives you back the character, as I expected, because the character is made up of 2 bytes. However, if I try to get just 1 byte (length 1), it seems to fail to get anything, because

    Len(MidB(character, 1, 1))

is 0. Also, Asc(MidB(character, 1, 1)) causes an error saying an argument is invalid, which also makes me think MidB() is giving back an empty string.

Agh... my head hurts... does anyone know why on earth AscW() doesn't give the correct number like it does with most other characters?
Apr 22 '07 #1
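
On the relation between the minus-number and the expected number: as far as I know, AscW() in VB6 returns a signed 16-bit Integer, so any code from 32768 (&H8000) upward wraps around to a negative value, and adding 65536 (or masking with &HFFFF&) gets the unsigned value back. Below is a minimal sketch under that assumption. `UnsignedAscW` and `ToNumericEntities` are hypothetical names made up for illustration, not built-in functions, and characters that need two 16-bit units (surrogate pairs) are not handled.

```vb
'Hypothetical helper: AscW() result as an unsigned value (0 to 65535).
Public Function UnsignedAscW(ByVal Character As String) As Long
    'AscW() returns an Integer (-32768 to 32767). Masking with the Long
    'literal &HFFFF& widens it to Long and clears the sign extension,
    'which is the same as adding 65536 to a negative result.
    UnsignedAscW = AscW(Character) And &HFFFF&
End Function

'Hypothetical helper: replace every character above 255 with its
'numeric character reference (ampersand, hash, digits, semicolon).
'Assumption: no surrogate pairs in the input.
Public Function ToNumericEntities(ByVal Text As String) As String
    Dim i As Long
    Dim Code As Long
    Dim Result As String

    For i = 1 To Len(Text)
        Code = UnsignedAscW(Mid$(Text, i, 1))
        If Code > 255 Then
            Result = Result & "&#" & CStr(Code) & ";"
        Else
            Result = Result & Mid$(Text, i, 1)
        End If
    Next i

    ToNumericEntities = Result
End Function
```

Applied to the values in the question, -28740 + 65536 = 36796 [8FBC], -30644 + 65536 = 34892 [884C] and -30325 + 65536 = 35211 [898B]. Those look to me like the code points of the kanji 込, 行 and 見 rather than of the hiragana listed, so the strings being tested may have contained kanji at those positions.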
6 Replies

I have finally fixed it! I have no idea what's up with VB's AscW() function, but I've made my own alternative to it which works even when AscW() doesn't! So I'm posting it here to benefit anyone who may have the same problem...

It only works with a Unicode character, so only use it if AscW() gives back >255. Actually, you should probably check whether AscW() gives some minus-number instead of the correct number before resorting to my function at all, because AscW() is a little faster, I think.

    Public Function UnicodeAsc(ByVal RealUnicode As String) As Long
        'This function returns what AscW() should return, but which it sometimes doesn't.
        'I think it's slightly slower than AscW(), but at least it doesn't ever fail.
        'RealUnicode - String containing the unicode character to return the code of.
        'If it contains more than one character, only the first one is used.

        Dim UnicodeByte1 As Long
        Dim UnicodeByte2 As Long
        Dim UnicodeRecreate As Long

        'The first byte is the low byte and the second is the high byte.
        UnicodeByte1 = AscB(MidB(RealUnicode, 1, 1))
        UnicodeByte2 = AscB(MidB(RealUnicode, 2, 1))
        UnicodeRecreate = (UnicodeByte2 * 256&) + UnicodeByte1

        UnicodeAsc = UnicodeRecreate
    End Function

As you can see, I might be able to reduce the number of variables used in the function and their size (Double -> Long -> Integer -> Byte), but I was getting lots of overflow errors, so I increased their size to Long. Also, as Killer42 mentioned in this thread, at the expense of a little more memory, Long is faster than Integer on 32-bit processors (as most are nowadays), because it's their 'native' format - no converting internally!

Apr 24 '07 #2
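
A guess at why Len(MidB(character, 1, 1)) came out as 0 in the first post while AscB(MidB(...)) works in the function above: as far as I know, Len() counts whole 2-byte characters, so a 1-byte string has a length of 0 characters, whereas LenB() counts bytes and AscB() reads a single byte. Here is a small sketch of that, assuming VB6 string behaviour. `ShowBytes` is a hypothetical demo routine, and the expected values are in the comments.

```vb
'Hypothetical demo routine: call it from the Immediate window.
Public Sub ShowBytes()
    Dim Character As String
    Dim LowByte As Long
    Dim HighByte As Long

    Character = ChrW$(&H3053)                 'U+3053, decimal 12371

    Debug.Print Len(MidB$(Character, 1, 1))   '0: Len() counts 2-byte characters
    Debug.Print LenB(MidB$(Character, 1, 1))  '1: LenB() counts bytes

    LowByte = AscB(MidB$(Character, 1, 1))    '&H53: the low byte is stored first
    HighByte = AscB(MidB$(Character, 2, 1))   '&H30: the high byte comes second

    'The high byte is multiplied by 256 (not 255) to rebuild the code.
    Debug.Print HighByte * 256& + LowByte     '12371

    'Masking AscW()'s signed Integer result gives the same number.
    Debug.Print (AscW(Character) And &HFFFF&) = HighByte * 256& + LowByte  'True
End Sub
```

So rebuilding the code from bytes and masking the result of AscW() should agree for any single 16-bit character; the byte route just takes more steps to get there.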

Glad to see you've got it sorted. :)

I was puzzling over this one for a while, but couldn't think of anything helpful. It would still be nice to find an answer rather than just working around the problem, but as long as you've achieved what you want to do, I guess that's the important thing.

By the way, I feel I should mention that they are called "negative numbers", not "minus-numbers".

Apr 24 '07 #3