By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
445,798 Members | 1,766 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 445,798 IT Pros & Developers. It's quick & easy.

rope vs. string

P: n/a
Hey all,

I have a program that runs the Manber-Myers suffix array search algorithm
with longest common prefix values. I need an efficient way to represent a
sequential array of characters to be passed into the algorithm. Here are
the features I need for the representation:

* Must be randomly accessible in constant time
* Must be capable of efficiently storing strings of more than 10 million
characters

Here are some plusses:

* Dereferencing an iterator should be fast, not just constant time, but
quick constant time, since it will be happening often (at least (size of
pattern) + log(size of text) times)
* Incrementing an iterator should be similarly fast

It would also be useful if this representation provided an efficient method
of extracting all the characters for the problem from the standard input
(again, no less than 5 million characters, randomly distributed among A, C,
G, and T).

Now, here's the question: which is better for this problem, a string or a
rope. If a rope would be best, can you offer any guidance as to ways to
make the use of the rope as fast and painless as possible (if that is the
preferable representation).

- JFA1
Jul 23 '05 #1
Share this Question
Share on Google+
2 Replies


P: n/a
"James Aguilar" <jf**@cec.wustl.edu> wrote in message
news:d2**********@newsreader.wustl.edu...
Hi James,
I have a program that runs the Manber-Myers suffix array search algorithm
with longest common prefix values. I need an efficient way to represent a
sequential array of characters to be passed into the algorithm. Here are
the features I need for the representation:

* Must be randomly accessible in constant time
* Must be capable of efficiently storing strings of more than 10 million
characters

Here are some plusses:

* Dereferencing an iterator should be fast, not just constant time, but
quick constant time, since it will be happening often (at least (size of
pattern) + log(size of text) times)
* Incrementing an iterator should be similarly fast

It would also be useful if this representation provided an efficient
method of extracting all the characters for the problem from the standard
input (again, no less than 5 million characters, randomly distributed
among A, C, G, and T).

Now, here's the question: which is better for this problem, a string or a
rope. If a rope would be best, can you offer any guidance as to ways to
make the use of the rope as fast and painless as possible (if that is the
preferable representation).


The implementation of rope::iterator is somewhat more complex than for
string (where it is basically a pointer), and will introduce an overhead.

The only true advantage of a rope (i.e. the non-standard rope class from
the SGI STL) is that it supports efficient *editing* of a small portion
of the string (not just replacing, but also inserting/deleting/splicing
subsequences within the original string). Unless this is relevant to
what you need to do, there is no point in using rope instead of string.

The caveat with std::string, depending on how the platform you use
implements the standard library, is to try to avoid unnecessary copies
of the data structure (be careful to pass the strings as references
whenever possible, to avoid object copies in case your implementation
does not use reference-counting).

In case top performance is critical, passing the input as a file
and mapping it into memory (with platform-specific mmap/MapViewOfFile),
will most likely be the fastest approach at run-time.

Note also that your algorithm could be templated on the iterator type,
allowing it to be used on any type of source data range.
I hope this helps,
Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
Brainbench MVP for C++ <> http://www.brainbench.com
Jul 23 '05 #2

P: n/a
Oh! Thanks for the help. I didn't realize the rope was not in -the- STL.
I got the sense from the site that I was reading that it was. If it's not
standard, I'm not going to mess with it, as it is only for a homework
assignment. Thank you so much for your advice nonetheless.

- JFA1
Jul 23 '05 #3

This discussion thread is closed

Replies have been disabled for this discussion.