The length is encoded in 1, 2, 3, 4 or 5 bytes as follows:
* The int (32 bits) is split into 7-bit chunks.
* The 8th bit of each byte indicates whether the reader should read further (bit
set) or stop (bit clear).
So, if len <= 0x7F, it is encoded on one byte as b0 = len.
If len <= 0x3FFF, it is encoded on 2 bytes as b0 = (len & 0x7F) | 0x80,
b1 = len >> 7.
If len <= 0x1FFFFF, it is encoded on 3 bytes as b0 = (len & 0x7F) | 0x80,
b1 = ((len >> 7) & 0x7F) | 0x80, b2 = len >> 14.
etc.
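In case a concrete sketch helps, here is a small C# program that builds that
prefix by hand (the Encode7BitLength name is mine, not part of the framework;
BinaryWriter does the equivalent internally when it writes a string):

using System;
using System.Collections.Generic;

class LengthPrefixDemo
{
    // Encode a length as a 7-bit encoded int: 7 payload bits per byte,
    // high bit set on every byte except the last one.
    static byte[] Encode7BitLength(int len)
    {
        var bytes = new List<byte>();
        uint v = (uint)len;                        // unsigned, so >> shifts in zeros
        while (v >= 0x80)
        {
            bytes.Add((byte)((v & 0x7F) | 0x80));  // low 7 bits + continuation bit
            v >>= 7;
        }
        bytes.Add((byte)v);                        // final chunk, high bit clear
        return bytes.ToArray();
    }

    static void Main()
    {
        Console.WriteLine(BitConverter.ToString(Encode7BitLength(0x7F))); // 7F
        Console.WriteLine(BitConverter.ToString(Encode7BitLength(0x80))); // 80-01
        Console.WriteLine(BitConverter.ToString(Encode7BitLength(300)));  // AC-02
    }
}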
len is the length in bytes of the UTF8 encoding of the string, and the length
prefix is followed by those UTF8 bytes.
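To see the whole layout end to end, here is a small round-trip sketch
(Read7BitLength is just my name for the manual decode; BinaryReader.ReadString
does the same job for you):

using System;
using System.IO;
using System.Text;

class StringFormatDemo
{
    static void Main()
    {
        var ms = new MemoryStream();
        var w = new BinaryWriter(ms, Encoding.UTF8);
        w.Write("héllo");                  // writes the length prefix, then the UTF8 bytes
        w.Flush();

        ms.Position = 0;
        int len = Read7BitLength(ms);      // decode the variable-length prefix by hand
        var utf8 = new byte[len];
        ms.Read(utf8, 0, len);
        Console.WriteLine("prefix = {0}, text = {1}", len, Encoding.UTF8.GetString(utf8));
    }

    // Read the 7-bit encoded length: keep taking 7-bit chunks while the high bit is set.
    static int Read7BitLength(Stream s)
    {
        int result = 0, shift = 0, b;
        do
        {
            b = s.ReadByte();
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }
}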
If you want the source code, I suggest that you download the Shared Source
Common Language Infrastructure from the Microsoft site (just search for
"Shared Source Common Language Infrastructure" on www.microsoft.com). The
source files for the binary reader/writer are in the sscli\clr\src\system\io
directory.
Bruno.
"John Aldrin" <Jo********@msn.com> a écrit dans le message de
news:kh********************************@4ax.com...
Hi,
I'm looking for info that explains the format of a string data type
when written to a stream using a BinaryWriter. I've looked all over
MSDN and the Internet and I cannot seem to find it.
I did some simple testing and it seems that the string data is
prefixed w/a variable number of bytes that indicate the length.
1 byte if length <= 127
2 bytes if length > 127 - 1st byte has high order bit set, remaining
bits indicate number of bytes + (2nd byte * 127).
I haven't explored any further.
Thanx
jra