StupidScript wrote:
Bill:
> "Somewhere else, the mapping from binary values
> to the four strings is stored, but it is stored
> only once."
But where is it stored only once? And why doesn't that data retention
add to the storage requirements for a set of variable string data, like
an ENUM row?
Somewhere the name is associated with the number, and the length of
that name's characters would seem to take up additional storage space
... somewhere.
They're stored as part of the table definition. Just as the definition
of a VARCHAR(200) column stores the value 200 as that column's maximum
length, the definition of an ENUM column stores all of its possible
values (strings).
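To put rough numbers on that, here is a small Python sketch (not MySQL code). It assumes one byte per row for the ENUM index, which is what MySQL documents for an ENUM with up to 255 values, and it ignores per-row overhead and any length prefix a string column would carry. The row count and variable names are made up for illustration.

```python
# Illustrative arithmetic only; the names and the row count are hypothetical.
labels = ["prestidigitation", "sleight of hand", "disappearances", "card tricks"]
row_count = 1_000_000

# Stored once, in the table definition: the label text itself.
definition_bytes = sum(len(s.encode("ascii")) for s in labels)

# Stored per row: a 1-byte index (assumed; holds for up to 255 ENUM values).
enum_bytes = definition_bytes + row_count * 1

# Alternative: every row repeats the full string (average label length used).
avg_label_bytes = definition_bytes / len(labels)
string_bytes = row_count * avg_label_bytes

print(definition_bytes, enum_bytes, int(string_bytes))
```

The label text does take up space, but only once per table, so its cost stops mattering as soon as the table has more than a handful of rows.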
In the case of MySQL, it goes in the .frm file for a given table, which
is where MySQL stores all the information about that table's columns.
Try it:
CREATE TABLE test.test_enum (
  col1 ENUM('prestidigitation', 'sleight of hand',
            'disappearances', 'card tricks')
);
EXIT;
Now go look at "<datadir>/test/test_enum.frm":
$ cd <datadir>/test
$ od -c test_enum.frm
. . .
0020520 004  \0 005   c   o   l   1  \0 004 005 020 020  \0 002  \0  \0
0020540  \b 201 020  \0 001 367  \b  \0  \0 377   c   o   l   1 377  \0
0020560 377   p   r   e   s   t   i   d   i   g   i   t   a   t   i   o
0020600   n 377   s   l   e   i   g   h   t       o   f       h   a   n
0020620   d 377   d   i   s   a   p   p   e   a   r   a   n   c   e   s
0020640 377   c   a   r   d       t   r   i   c   k   s 377  \0
. . .
"od" is the UNIX "octal dump" command. With -c it prints a binary file
byte by byte, showing unprintable bytes as their base-eight values.
Notice how the values are separated by a byte with octal value 377
(0xFF). This is apparently how MySQL knows where each enum value ends.
Then, after the last value, there's a 0 byte. I assume that MySQL uses
the order of these strings to determine the number it stores in each
row of data.
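If that guess is right, recovering the list is a simple split on the separator byte. Here is a Python sketch under that assumption. The byte string is retyped from the dump above, the function name is made up, and the file layout is only inferred from this one dump, so treat it as an illustration rather than a parser. The 1-based numbering follows how MySQL documents ENUM indexes (0 is reserved for the empty-string error value).

```python
SEP = b"\xff"  # octal 377, the separator byte seen in the dump

# Bytes retyped from the od output: 377 value 377 value ... 377 \0
raw = (b"\xffprestidigitation\xffsleight of hand"
       b"\xffdisappearances\xffcard tricks\xff\x00")

def enum_values(blob):
    """Split a 0xFF-delimited, NUL-terminated list into strings (assumed layout)."""
    body = blob.split(b"\x00", 1)[0]  # stop at the terminating 0 byte
    return [part.decode("ascii") for part in body.split(SEP) if part]

values = enum_values(raw)

# Position in the list becomes the number stored in each row (1-based).
index_of = {name: i for i, name in enumerate(values, start=1)}
print(index_of)
```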
I don't know the purpose of all those other bytes, but one can assume
they say other things about the column, like the fact that it is an
ENUM, the position of the column within the table, whether it is
declared NOT NULL, etc.
I'm delighted with the avenues these comments are making into my brain,
and thank you both very much. I'm just wondering where my understanding
of 'storage' should break with the laws of physics?
I think you are making this harder than it needs to be! :-)
A similar technique is used by GIF image files. A GIF file can have up
to 256 distinct colors in its palette, but each palette entry can be any
24-bit color. The palette, which maps the values in the image data to
24-bit RGB values, is stored near the beginning of the file, before the
image data. Therefore a GIF image needs only 1 byte per pixel, even
though each color is a 3-byte RGB value. There's a small amount of
overhead (256 x 3 = 768 bytes) near the beginning of the file. Any
software that reads GIF images knows to use the image data as an index
into that palette, and therefore knows how to display the right color.
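The palette lookup can be sketched in a few lines of Python. This is not a GIF decoder: the four-color palette, the pixel values and the 640x480 image size are invented for illustration, and the size comparison ignores GIF's headers and its compression of the image data.

```python
# A tiny hypothetical palette: index -> 24-bit (R, G, B).
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]

# Image data: one palette index per pixel instead of three RGB bytes.
pixels = bytes([0, 1, 1, 2, 3, 3, 0, 2])

# Decoding is just a lookup of each index in the palette.
rgb = [palette[i] for i in pixels]

# Size comparison for a full 256-entry palette and a 640x480 image.
width, height = 640, 480
indexed_bytes = 256 * 3 + width * height * 1
truecolor_bytes = width * height * 3
print(rgb[3], indexed_bytes, truecolor_bytes)
```

The ENUM case works the same way: the list of strings plays the role of the palette, and the number stored in each row plays the role of the pixel value.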
Regards,
Bill K.