Own filters(?) to streams


Hi, I wrote a few filters that work on streams of bytes, for example
encryption, UTF-8 decoding and such. Now I wonder how I can turn them
into classes derived from std::stream(?), or otherwise use them
with code that works on std::stream, std::ifstream and such.

The keyword codecvt is probably related.

As for "connecting to std:: streams/strings", so far I have succeeded
with std::string: I wrote my own class UnicodeChar (stored as a full 32
bits), made a class UnicodeString derived from
std::basic_string<UnicodeChar>, and added an operator<< for ostream that
encodes to UTF-8, and so on...

But how do I do a similar thing for streams and stringstreams? Should I
only add an operator<< for ostringstream and such? Is that a good way
to do it?

Or does std allow a better way?
Feb 16 '06 #1
Rafał Maj Raf256 wrote:
[character encoding/decoding by fiddling with streams]
Or does std allow a better way?


Certainly! The first thing to note is that character encoding and
decoding moves between different things: encoding turns characters
into bytes and decoding turns bytes into characters. It is important
to distinguish between bytes and characters: characters do not care
about their encoding and you can investigate characters to determine
their semantics. Bytes on the other hand are encoded characters (in
this context; bytes might represent other stuff, too) which are
useless when taken out of context. The only reasonable thing to do
to them, apart from passing the whole sequence around, of course, is
to decode them and use the resulting characters.
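
To make the byte/character distinction concrete, here is a minimal sketch of a UTF-8 decoder that turns a sequence of bytes into 32-bit code points. The function name decode_utf8 is made up for illustration, and the validation is deliberately simplified: it rejects bad lead bytes, bad continuation bytes and truncated input, but does not check for overlong forms or surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 bytes into Unicode code points.
// Simplified sketch: overlong encodings and surrogates are not rejected.
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string chars;
    for (std::size_t i = 0; i < bytes.size();) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        char32_t cp = 0;        // code point being assembled
        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= bytes.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cont & 0x3F);  // append the 6 payload bits
        }
        chars.push_back(cp);
        i += extra + 1;
    }
    return chars;
}
```

Three bytes such as 'a', 0xC3, 0xA9 come out as two characters, U+0061 and U+00E9; the bytes 0xC3 and 0xA9 mean nothing on their own.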

OK, this sets the stage for the 'std::codecvt' facets: these turn
bytes into characters or vice versa. Each file stream, well, actually
the 'std::filebuf' stream buffer, internally uses the code conversion
facets to blockwise encode characters or decode bytes. Unfortunately,
the same mechanism is not readily available for other streams although
Dinkumware's standard library ships with a class which can be used to
do the conversions. It isn't too hard to implement a simple filtering
stream buffer which converts bytes into characters using an
appropriate 'std::codecvt' facet (the aspect which makes the code
conversion stuff pretty complex, e.g. for 'std::basic_filebuf', is
support for positioning, which is rarely necessary on streams
representing characters). This would be the way to go: create a
filtering stream buffer which internally uses code conversion facets,
e.g. the ones you provide to implement the Unicode encodings. This
filtering stream buffer is then used with the stream classes to
actually apply the encoding.
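
A rough sketch of such a filtering stream buffer, for input only and without seeking, might look like the following. The names decoding_buf and decode_all are invented for this illustration. The buffer pulls bytes from a narrow stream buffer one at a time and hands them to the 'std::codecvt<wchar_t, char, std::mbstate_t>' facet of a locale until a character comes out. It assumes that retrying the conversion from the last character boundary is acceptable, which keeps the state handling simple at the cost of efficiency.

```cpp
#include <cassert>
#include <cwchar>
#include <iterator>
#include <locale>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical class: a read-only wide stream buffer that decodes bytes
// from a narrow stream buffer with the codecvt facet of the given locale.
class decoding_buf : public std::wstreambuf {
    using codecvt_type = std::codecvt<wchar_t, char, std::mbstate_t>;

public:
    explicit decoding_buf(std::streambuf* source,
                          const std::locale& loc = std::locale::classic())
        : source_(source), loc_(loc),
          cvt_(std::use_facet<codecvt_type>(loc_)) {}

protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        std::string pending;  // bytes of the character being decoded
        for (;;) {
            const int byte = source_->sbumpc();
            if (byte == std::char_traits<char>::eof())
                return traits_type::eof();  // an incomplete tail is dropped
            pending.push_back(std::char_traits<char>::to_char_type(byte));

            std::mbstate_t state = state_;  // retry from the last boundary
            const char* from_next = nullptr;
            wchar_t* to_next = nullptr;
            const std::codecvt_base::result result =
                cvt_.in(state, pending.data(), pending.data() + pending.size(),
                        from_next, &ch_, &ch_ + 1, to_next);
            if (result == std::codecvt_base::error)
                return traits_type::eof();  // simplification: stop on error
            if (result == std::codecvt_base::noconv) {
                ch_ = static_cast<unsigned char>(pending[0]);
                to_next = &ch_ + 1;
            }
            if (to_next != &ch_) {  // exactly one character was produced
                state_ = state;
                setg(&ch_, &ch_, &ch_ + 1);
                return traits_type::to_int_type(ch_);
            }
        }
    }

private:
    std::streambuf* source_;
    std::locale loc_;               // keeps the facet alive
    const codecvt_type& cvt_;
    std::mbstate_t state_ = std::mbstate_t();
    wchar_t ch_ = 0;                // one-character get area
};

// Hypothetical helper: decodes a whole byte string through decoding_buf.
std::wstring decode_all(const std::string& bytes) {
    std::istringstream source(bytes);
    decoding_buf buf(source.rdbuf());
    std::istreambuf_iterator<wchar_t> first(&buf), last;
    return std::wstring(first, last);
}
```

With the default classic locale only plain ASCII should be expected to pass through unchanged. To decode something like UTF-8, a locale carrying a suitable facet (for example a user-written one) would have to be passed as the second constructor argument. The buffer can also be attached to a 'std::wistream' to use the normal extraction operators.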
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Feb 16 '06 #2
Dietmar Kuehl wrote:
[bytes versus characters; encoding versus decoding]
Thank you, yes, I was a bit familiar with that (char_traits<> is needed,
AFAIR).
[a filtering stream buffer which internally uses 'std::codecvt' facets]


Hmm, yes, I more or less understand the theory, but I can't find any good
examples or documentation... Could you perhaps write a tiny example of a
filter, like one that just reads two bytes A and B into a single char
(discards A and keeps B), and on write first writes 'x' and then the given
character? Or any other example.

Like: "xaxbxc" is read into "abc" and vice-versa.





Feb 16 '06 #3
Rafał Maj Raf256 wrote:
Hmm, yes, I more or less understand the theory, but I can't find any good
examples or documentation...
Concerning examples, this is indeed a little bit tricky. In my standard
library implementation I have a converting buffer implemented (you can
get it following the CXXRT link on my homepage). I'm not sure whether
it is complete, however, and I know that it does not support seeking.
You might also want to have a look at STLport's and/or libstdc++'s
implementations of 'std::basic_filebuf' (I haven't looked at them
myself, though).

With respect to documentation, "The C++ Standard Library" (N.Josuttis;
Addison-Wesley) has some documentation on all facets and "Standard
C++ IOStreams and Locales" (A.Langer, K.Kreft; Addison-Wesley) should
also document them.
Could you perhaps write a tiny example of a
filter, like one that just reads two bytes A and B into a single char
(discards A and keeps B), and on write first writes 'x' and then the given
character? Or any other example.

Like: "xaxbxc" is read into "abc" and vice-versa.


I thought I could do it on the fly but it turns out to be, well, a
little bit more involved than I remembered...
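
For the specific "xaxbxc" case, a simplified sketch that leaves out the 'std::codecvt' machinery and filters directly in a pair of stream buffers could look like this. All names (x_prefix_outbuf, x_prefix_inbuf, encode_x, decode_x) are invented for the illustration, and the buffers work one character at a time with no seeking support, so this shows the mechanism and is not tuned for speed.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical class: on output, writes 'x' before every character.
class x_prefix_outbuf : public std::streambuf {
public:
    explicit x_prefix_outbuf(std::streambuf* sink) : sink_(sink) {}

protected:
    // No put area is set up, so every written character ends up here.
    int_type overflow(int_type c) override {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);
        if (traits_type::eq_int_type(sink_->sputc('x'), traits_type::eof()) ||
            traits_type::eq_int_type(
                sink_->sputc(traits_type::to_char_type(c)), traits_type::eof()))
            return traits_type::eof();
        return c;
    }
    int sync() override { return sink_->pubsync(); }

private:
    std::streambuf* sink_;
};

// Hypothetical class: on input, discards the first byte of each pair.
class x_prefix_inbuf : public std::streambuf {
public:
    explicit x_prefix_inbuf(std::streambuf* source) : source_(source) {}

protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        // Byte A is read and thrown away.
        if (traits_type::eq_int_type(source_->sbumpc(), traits_type::eof()))
            return traits_type::eof();
        // Byte B becomes the character handed to the stream.
        const int_type b = source_->sbumpc();
        if (traits_type::eq_int_type(b, traits_type::eof()))
            return traits_type::eof();
        ch_ = traits_type::to_char_type(b);
        setg(&ch_, &ch_, &ch_ + 1);
        return b;
    }

private:
    std::streambuf* source_;
    char ch_ = 0;  // one-character get area
};

// Hypothetical helper: writes text through the output filter.
std::string encode_x(const std::string& text) {
    std::ostringstream sink;
    x_prefix_outbuf buf(sink.rdbuf());
    std::ostream out(&buf);
    out << text;
    return sink.str();
}

// Hypothetical helper: reads everything back through the input filter.
std::string decode_x(const std::string& bytes) {
    std::istringstream source(bytes);
    x_prefix_inbuf buf(source.rdbuf());
    std::istreambuf_iterator<char> first(&buf), last;
    return std::string(first, last);
}
```

Because the filters are ordinary stream buffers, they can be handed to any 'std::istream' or 'std::ostream', and the underlying buffer can just as well belong to a 'std::ifstream' or 'std::ofstream' as to a string stream.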
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Feb 17 '06 #4
Check out Boost's Iostreams library.

Feb 17 '06 #5
