"Sathyaish" <Sa*******@Yahoo.com> wrote
When you say char in C, it internally means "an unsigned small integer
occupying 1 byte of memory", right? More importantly, the internal
representation of char does not mean "int" as in
"machine-dependent/register-size-dependent integer, which is normally
four bytes on 32-bit processors", right?
In English, we use glyphs to represent characters. So capital A is an
upwards-pointing triangle with a raised lower edge, capital B is a straight
line with two semicircles, and so on.
This is a good system for pencil and paper, but trying to store such shapes
directly on a computer would be very wasteful. So instead we use a code - 10
means A, 11 means B, 12 means C, and so on.
Usually this code will be ASCII, and usually characters will occupy 8 bits.
However you normally don't have to worry about this. C abstracts the
representation and handles it for you. If you want an A, you just write

    char ch = 'A';
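You can see that the character is just a number underneath. A minimal
sketch (assuming an ASCII system, where 'A' happens to be code 65):

    #include <stdio.h>

    int main(void)
    {
        char ch = 'A';
        printf("%c is stored as %d\n", ch, ch);  /* on ASCII: A is stored as 65 */
        return 0;
    }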
Unfortunately the designers of C made a mistake. On their machine, bytes,
the smallest addressable unit of memory, happened to be 8 bits, which was
also a perfect fit for the ASCII code. So they decided to use the same word
for a character and a byte, "char". This causes huge problems when we try to
move to non-Latin languages, but we have to live with it.
The result is that you will often see "unsigned char" or, more occasionally,
"signed char" used as a small integer. You are not guaranteed 8 bits, though
that is by far the most common value. The macro CHAR_BIT, defined in
<limits.h>, gives you the number of bits in a char.
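For example, a quick check (output shown is what you'd see on a typical
system with 8-bit chars; yours may differ):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        printf("char is %d bits wide\n", CHAR_BIT);   /* typically 8 */
        printf("char ranges from %d to %d\n", CHAR_MIN, CHAR_MAX);
        /* CHAR_MIN is 0 if plain char is unsigned on your machine,
           negative (usually -128) if it is signed */
        return 0;
    }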