<ga***********@yahoo.com> wrote in message
news:11**********************@g14g2000cwa.googlegroups.com...
: We need to process a very large amount of delimited variable length
: ASCII data in files as large as 3-4 gigs. We need a high performance
: parser for this and, as always, we have no money to buy one. We are OK
: with building one as long as that can be done quickly enough, and I was
: wondering if Boost has a panacea for us. Can anyone help with their
: ideas / experience?
:
: I am also very open to any suggestions outside Boost. Any outline on
: how to build such a parser would be very welcome. If some comparative
: performance figures can be mentioned, it would be of tremendous help.
: Any fast C++ library would be of help.
The parser itself may not be the performance-limiting factor as much
as the technique you use for I/O.
In similar circumstances, I usually use memory mapping (mmap on
POSIX systems, or MapViewOfFile on Windows) to bring the file (or
large segments of it) into memory. The OS's page caching is typically
much more efficient than any file I/O API.
For parsing, I tend to rely on the tried-and-true flex tool
(http://www.gnu.org/software/flex/). Flex-generated code is very
likely to be faster than boost::spirit (but I have no data).
A hand-coded parser might be fastest if the structure of the
records is simple enough.
Maybe you can just delimit lines and use sscanf?
: We develop a market analytics tool on HP-UX and Linux on 32/64 bits.
Wishing you success - Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form