Xah's edu corner: the Journey of Foreign Characters thru Internet

the Journey of Foreign Characters thru Internet

Xah Lee, 20051101

There is a lot of confusion about the display of non-ASCII characters
such as the bullet "•". The confusion is justifiable, because the
underlying computing technologies are, in a layman's sense, extremely
complex.

To type the bullet char, post it to a newsgroup, retrieve the posted
material, and have that bullet displayed as the bullet it was intended
to be truly involves several technologies: on the sender's computer, on
the receiver's computer, on the network that received the posting, and
on the network from which the post was retrieved, as well as the
configuration of the sender's and receiver's computers. And, cross your
fingers, all things should go well, but
unfortunately, because the fucking asses criminals such as Larry Wall
in the computing industry, most likely things will not go well.

[Disclaimer: all mention of real persons are opinion only.]

Here's a quick rundown:

• there needs to be an agreed-upon character set (that is, the set of
symbols to be used on computers). Many such character sets include the
bullet symbol.

• there needs to be a code map that maps the letters (and any other
symbols) to numbers.

There are various standards bodies that standardize these character
sets and code maps. (usually, but not always, the two come together as
one)
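
The idea of a code map can be sketched in a few lines of Python. This is only an illustration, not part of the original post: `code_number` is a hypothetical helper, and the three code maps compared (Unicode, Windows cp1252, Mac Roman) are picked as examples of maps that happen to include the bullet.

```python
from typing import Optional

# Hypothetical helper (an assumption for illustration): look up the number
# that a single-byte code map assigns to a character.
def code_number(char: str, code_map: str) -> Optional[int]:
    """Return the number for char in code_map, or None if the map lacks it."""
    try:
        encoded = char.encode(code_map)
    except UnicodeEncodeError:
        return None  # this character set has no such symbol
    return encoded[0] if len(encoded) == 1 else None

BULLET = "\u2022"

unicode_number = ord(BULLET)                    # Unicode's number for the bullet
windows_number = code_number(BULLET, "cp1252")  # Windows Western European map
mac_number = code_number(BULLET, "mac_roman")   # classic Mac OS map
ascii_number = code_number(BULLET, "ascii")     # ascii has no bullet at all

print(hex(unicode_number), hex(windows_number), hex(mac_number), ascii_number)
```

The same symbol gets a different number in each map, and no number at all in ascii, which is why both ends have to agree on the map.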

• now, more technically, once each character has an associated number,
this number needs to be turned into an actual binary number. This is
the encoding part. There are various standards for encoding a character
set, that is, for turning a sequence of numbers into binary. (the issue
involves not just turning integers into binary, but also, for example,
marking combining characters such as the umlaut, or starting and ending
runs of right-to-left writing) Usually but not always, the encoding
business is intertwined with the character set/code map specification,
even though they are entirely separate concepts.

• now on your computer, say you are using Windows and Outlook Express,
there's a menu or option somewhere that says text encoding or character
set. That's where you tell the computer which of these standardized
character set/encoding schemes to use to actually represent what you
type on the keyboard. (in the case of Chinese, for example, you can't
type directly; you need another technology, Input Methods, to type
stuff.)

• one of these standards is called Unicode, which has a character set
that encompasses practically all the written symbols of the world's
languages, including all Chinese (and includes Japanese phonetics and
Korean alphabets), as well as Arabic alphabets. (i.e. those hateful
Islamic twists the WASPs see)

• once you have typed your letter and sent it thru a particular
encoding in your email/newsreader software, the message goes to the
network “news” servers. For a ride around the internet, there need to
be more protocols. That is, a way to tell, from a string of binary
digits, where your subject actually starts, where the From is, where
the To address starts, where your message content is, ... among other
things.

• now we are getting really complex... because in the history of
software and the internet, in the beginning there was really no support
for any character set or all that complex stuff except ascii (among
others), that is to say, only the characters you can see on the
keyboard. There wasn't much in the way of standards. Things basically
went on on an as-long-as-it-works basis. Later on these protocols were
improved in a patch-wise way, to allow one to actually use non-ascii
characters or foreign languages, or include pictures or other files
such as sound & video as attachments.

• remember that we are bypassing the whole technology of the internet
transport protocols themselves, i.e. IP addresses, various layers...
down to the physics of wiring, copper, optical fiber, etc.

• OK, now that the news server has received your message, it
distributes it to other news servers like spam.

• When you wake up, you open your newsreader, hungrily anticipating
news. What happens is that your newsreader software (called the client)
contacts the particular server and downloads the message. (all thru
decoding the many protocols involved)

• in order for the bullet character to display on your screen, you
assume: (1) your computer supports the whole charset/encoding scheme
the sender used. (2) your computer has the proper font to display it.
(suppose I write Chinese to you using Unicode: although your computer
supports (understands) Unicode, and so theoretically understands
everything, it can't display the characters because you don't have
Chinese fonts) (3) and most importantly, nothing has been screwed up in
the message's journey on the net.

• Chances are, things did fuck up somewhere. That is why you see
"=E2=80=A2" (the quoted-printable form of the bullet's UTF-8 bytes,
left undecoded because something fucked up around the news servers) or
a bunch of gibberish (because you don't have the right font, or your
software didn't use the right charset/decoding)
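
Both failure modes in the last item can be reproduced with Python's standard library. This is a minimal sketch, assuming the sender used UTF-8 with the quoted-printable transfer encoding (which is what "=E2=80=A2" suggests) and that the mistaken receiver guesses Windows cp1252; the variable names are illustrative only.

```python
import quopri

BULLET = "\u2022"

# Encoding step: the Unicode number U+2022 becomes three bytes in UTF-8.
utf8_bytes = BULLET.encode("utf-8")

# Transport step: quoted-printable rewrites each non-ascii byte as =XX
# so the message survives ascii-only software along the way.
wire_form = quopri.encodestring(utf8_bytes)

# A client that undoes both steps correctly gets the bullet back.
decoded = quopri.decodestring(wire_form).decode("utf-8")

# A client that undoes the transport step but guesses the wrong charset
# shows gibberish instead.
mojibake = utf8_bytes.decode("cp1252")

print(utf8_bytes)  # the raw UTF-8 bytes
print(wire_form)   # what a reader sees if nothing decodes it
print(decoded)     # the bullet, as intended
print(mojibake)    # three wrong characters
```

Seeing the literal "=E2=80=A2" means the transport step was never undone; seeing three odd characters means it was undone but the bytes were read with the wrong charset.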

Now, many of you are actually using Google to post/read. Here, the
Google website acts as your newsreader software. Google is pretty good
on the whole. It won't fuck up the encoding. However, your computer
still needs to support Unicode and have the font to show the bullet. If
you have Windows XP and are using Internet Explorer, then you are all
fine. If you have the latest Mac OS X, you are all fine too. If you
have older Windows/Mac, or Linux, Solaris or other unixes, you are
quite fucked and nobody can help you. Try to see if there's an
encoding/charset/languages menu and if it has an item called unicode or
utf8. Use that.
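
What a well-behaved newsreader does with the charset can be sketched with Python's `email` package. This is an assumption-laden illustration, not a description of any particular client: the addresses and subject are made-up placeholders, and the sender is assumed to declare UTF-8 with quoted-printable in the message headers.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Sender side: build a message and declare the charset and transfer
# encoding in its headers. Addresses and subject are placeholders.
msg = EmailMessage()
msg["From"] = "sender@example.org"
msg["To"] = "reader@example.org"
msg["Subject"] = "bullet test"
msg.set_content("\u2022 first item\n", charset="utf-8", cte="quoted-printable")

raw = msg.as_bytes()  # what actually travels between servers

# Receiver side: parse the headers, then let them drive the decoding.
received = message_from_bytes(raw, policy=policy.default)
declared_charset = received.get_content_charset()
transfer_encoding = str(received["Content-Transfer-Encoding"])
body = received.get_content()

print(raw.decode("ascii"))
print(declared_charset, transfer_encoding)
print(body)
```

The raw form carries the bullet as "=E2=80=A2"; it only becomes a bullet again because the receiver reads the declared charset and transfer encoding from the headers instead of guessing.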

Hactar wrote:

«And "=E2=80=A2" is a good example of why using a bullet is a bad
idea, especially when you can't control the charset (or whatever)
used.»

Now, with all the trouble, why would someone use a bullet • that
requires some “advanced” technology rather than resort to the simple
asterisk * ?

It basically comes down to choice. If you really want massive
compatibility, you go with the universally available asterisk. If you
truly care, you really should write on paper with pen instead.
Remember, folks, not everyone on earth has a computer. But if you have
the advanced formality perfection obsession that the moronic grammarian
idiots want, then perhaps asterisks must be done away with in favor of
explicit itemization embedded in your writings.

“O brave new worlds, That have such people in them!”

Enjoy my unicode rhapsody: http://xahlee.org/Periodic_dosage_dir/t1/
see how your computer does.

--------------
This post is archived at:
http://xahlee.org/Periodic_dosage_di...i_journey.html

Xah
xa*@xahlee.org
http://xahlee.org/

Nov 1 '05
I should not respond here, but just to show what the consequence would
be: if you had sent me a private mail instead of posting here, there
would be two fewer posts in this thread, and if I hadn't posted, there
would be three fewer.

That is, responding only privately to people who respond to a troll
would reduce the length of the thread, much more effectively than what
I have seen.

This will be my last post on this subject.

Fredrik Lundh wrote:
bo****@gmail.com wrote:
And if anyone wants to take the responsibility to warn the general
public that it is troll so innocent readers may not be tempted into
one, at least do it privately


huh?

</F>


Nov 2 '05 #11
Jarek Zgoda wrote:
And if anyone wants to take the responsibility to warn the general
public that it is troll so innocent readers may not be tempted into
one, at least do it privately


huh?


Don't feed the troll, don't give him any public audience. Easy.


by sending private mail to anyone who might have seen his posts?

</F>

Nov 2 '05 #12
Fredrik Lundh wrote:
And if anyone wants to take the responsibility to warn the general
public that it is troll so innocent readers may not be tempted into
one, at least do it privately

huh?


Don't feed the troll, don't give him any public audience. Easy.


by sending private mail to anyone who might have seen his posts?


If you absolutely, positively must do something about that... But anyway
it's a stupid idea. Private feeding is not much better than public.

--
Jarek Zgoda
http://jpa.berlios.de/
Nov 2 '05 #13
So just stop talking. It's funny that you guys are having a
conversation about not responding to a guy's post. First of all,
freedom of speech, blah blah, who cares, just leave him alone. But
certainly don't go on his post and reply, telling people not to reply.
That's like saying EVEN THOUGH I'M doing this, YOU should not do it.
JUST STOP ALREADY :-).

There is, of course, the option... instead of starving the troll... FEED
HIM TILL HE BURSTS!
Nov 3 '05 #14
