Faster than STL string class?

Hi,
I learned C++ recently and I made a string class.
A code example is this:

class CString
{
public:
    inline CString(const char *rhs)
    {
        m_size = strlen(rhs);
        m_capacity = m_size * 2 + 1;
        m_str = new char[m_capacity];
        memmove(m_str, rhs, m_size+1);
    }

    inline CString& __cdecl operator+= (const CString& rhs)
    {
        unsigned __int32 tmp_size = m_size + rhs.m_size;

        if (m_capacity <= tmp_size)
        {
            m_capacity = tmp_size * 2 + 1;
            char *tmp_char = new char[m_capacity];
            memmove(tmp_char, m_str, m_size);
            delete m_str;
            m_str = tmp_char;
        }

        memmove(m_str+m_size, rhs.m_str, rhs.m_size);
        m_size = tmp_size;
        return *this;
    }
private:
    char *m_str;
    unsigned __int32 m_capacity;
    unsigned __int32 m_size;
};

//stl string class
string tmp1 = "123123";
string tmp2 = "999999";
for(int i=0;i<50000;++i)
    tmp1 += tmp2;
//1936033.000000 microsecond

//the "CString" class
CString tmp1 = "123123";
CString tmp2 = "999999";
for(int i=0;i<50000;++i)
    tmp1 += tmp2;
//21106.333333 microsecond

Is there any bug or harmful code in the "CString" class?
Why is the "CString" class faster than the STL string class?
Many books say "You should not develop a class that has already been developed".
In this case, which is better?
Who can help me make a choice?

Regards,
YinTat
Jul 22 '05 #1
YinTat wrote:

My results are different:

> //stl string class
> string tmp1 = "123123";
> string tmp2 = "999999";
> for(int i=0;i<50000;++i)
>     tmp1 += tmp2;
> //1936033.000000 microsecond

9780 microseconds

> //the "CString" class
> CString tmp1 = "123123";
> CString tmp2 = "999999";
> for(int i=0;i<50000;++i)
>     tmp1 += tmp2;
> //21106.333333 microsecond

4147 microseconds

I wonder why string handling is so horribly slow on your system, or was
that on some embedded system or very old PC?
Still, yours is quite a bit faster than the standard string, but the
question is how closely your "benchmark" resembles the typical use of
strings.
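
How closely a benchmark resembles typical use is hard to judge from two numbers, so it helps to make the measurement itself reproducible. Below is a minimal sketch of how the append loop could be timed with std::chrono. That header arrived with C++11, after this thread was written, and the helper name time_appends is hypothetical. The result depends heavily on the compiler, the optimization flags, and the library version.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical helper: appends `piece` to `start` `count` times and
// returns the elapsed time in microseconds. The final length is written
// to `final_size` so that the result of the loop is observable.
long long time_appends(std::string start, const std::string& piece,
                       int count, std::size_t& final_size)
{
    assert(count >= 0);
    const auto begin = std::chrono::steady_clock::now();
    for (int i = 0; i < count; ++i)
        start += piece;
    const auto end = std::chrono::steady_clock::now();
    final_size = start.size();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count();
}
```

Calling time_appends("123123", "999999", 50000, n) repeats the loop from the original post; averaging several runs gives a steadier figure than a single one.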

> Is there any bug or harmful code in the "CString" class?

It is not thread safe.

Jul 22 '05 #2
YinTat wrote:
> Hi,
> I learned C++ recently and I made a string class.
> A code example is this:
>
> class CString
> { [snip] };
[snip]

By the way, the Microsoft Compiler already has a class called
CString.

Should the name be changed to CppString?
A "C" string generally refers to a null terminated array
of characters; which is a string in the C language sense.

See:
http://www.jelovic.com/articles/stupid_naming.htm

> Regards,
> YinTat

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jul 22 '05 #3
Rolf Magnus wrote:
> YinTat wrote:
>
> My results are different:
>
>> //stl string class
>> string tmp1 = "123123";
>> string tmp2 = "999999";
>> for(int i=0;i<50000;++i)
>>     tmp1 += tmp2;
>> //1936033.000000 microsecond
>
> 9780 microseconds
>
>> //the "CString" class
>> CString tmp1 = "123123";
>> CString tmp2 = "999999";
>> for(int i=0;i<50000;++i)
>>     tmp1 += tmp2;
>> //21106.333333 microsecond
>
> 4147 microseconds
>
> I wonder why string handling is so horribly slow on your system, or was
> that on some embedded system or very old PC?
> Still, yours is quite a bit faster than the standard string, but the
> question is how closely your "benchmark" resembles the typical use of
> strings.
>
>> Is there any bug or harmful code in the "CString" class?
>
> It is not thread safe.

It appears that the OP's std::string implementation suffers from an O(n^2)
flaw of the kind that hurt, e.g., some older versions of the MFC CString
implementation. This has been fixed, though.

On my machine the handcoded CString class beats std::string but not by a
factor of 2. However, it should be pointed out, that there is an obvious
parameter in the speed-space tradeoff: CString doubles the allocated
memory. It would surely waste less memory at the cost of more allocations
if that factor was decreased. Similarly, one could still speed it up by
increasing that factor.
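
The speed-space tradeoff can be made concrete without allocating anything. The following is a rough sketch, assuming the growth rule from the posted operator+= (new capacity = needed size * factor + 1). The names simulate_growth and GrowthStats are hypothetical, and the sketch only counts reallocations; it does not measure time.

```cpp
#include <cassert>
#include <cstddef>

struct GrowthStats {
    std::size_t reallocations;   // how many times the buffer had to grow
    std::size_t final_capacity;  // characters reserved at the end
};

// Hypothetical simulation of the growth policy in the posted operator+=:
// when the buffer is too small, the new capacity becomes
// needed_size * factor + 1. No memory is actually allocated.
GrowthStats simulate_growth(std::size_t initial_size, std::size_t piece,
                            std::size_t appends, std::size_t factor)
{
    assert(factor >= 1);
    std::size_t size = initial_size;
    std::size_t capacity = size * factor + 1;
    GrowthStats stats = {0, capacity};
    for (std::size_t i = 0; i < appends; ++i) {
        const std::size_t needed = size + piece;
        if (capacity <= needed) {
            capacity = needed * factor + 1;
            ++stats.reallocations;
        }
        size = needed;
    }
    stats.final_capacity = capacity;
    return stats;
}
```

For the benchmark's 50000 appends of six characters, a factor of 1 reallocates on every append, while larger factors reserve more memory in exchange for fewer reallocations.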

Moreover, on many implementations, std::string is trying to be smart about
assignments. In reference-count-based implementations, an assignment of
strings will often result in little more than a pointer copy.
Thus, which implementation is better for a given project is very hard to
predict. In any case, I would doubt that the gain is worth the effort of
actually coding a complete string class. Well, unless of course std::string
completely sucks as appears to be the case with the OP.

Best

Kai-Uwe
Jul 22 '05 #4
> It is not thread safe.

Silly question: why? I don't get all this thread safe, multithreading
things. Can anyone explain it briefly?
-Gernot
Jul 22 '05 #5
YinTat wrote:
> I learned C++ recently and I made a string class.
> A code example is this:
>
> class CString
> {
> public:
>     inline CString(const char *rhs)
>     {
>         m_size = strlen(rhs);
>         m_capacity = m_size * 2 + 1;
>         m_str = new char[m_capacity];
>         memmove(m_str, rhs, m_size+1);
>     }
>
>     inline CString& __cdecl operator+= (const CString& rhs)
>     {
>         unsigned __int32 tmp_size = m_size + rhs.m_size;
>
>         if (m_capacity <= tmp_size)
>         {
>             m_capacity = tmp_size * 2 + 1;
>             char *tmp_char = new char[m_capacity];
>             memmove(tmp_char, m_str, m_size);
>             delete m_str;

delete[] m_str;

>             m_str = tmp_char;
>         }
>
>         memmove(m_str+m_size, rhs.m_str, rhs.m_size);
>         m_size = tmp_size;
>         return *this;
>     }
> private:
>     char *m_str;
>     unsigned __int32 m_capacity;
>     unsigned __int32 m_size;
> };
> //stl string class
> string tmp1 = "123123";
> string tmp2 = "999999";
> for(int i=0;i<50000;++i)
>     tmp1 += tmp2;
> //1936033.000000 microsecond
>
> //the "CString" class
> CString tmp1 = "123123";
> CString tmp2 = "999999";
> for(int i=0;i<50000;++i)
>     tmp1 += tmp2;
> //21106.333333 microsecond
>
> Is there any bug or harmful code in the "CString" class?

Well, it's not supposed to compile with a conforming compiler because
__cdecl and __int32 are not defined. I presume they are something
like
#define __cdecl
and
typedef long __int32;
in reality.

As to the bugs, yes, most definitely. First, see above. You use
'delete' where you are supposed to use 'delete[]'. Second, your class
has a memory leak: the allocated memory is never released if the
string doesn't have to grow, and even when it does, the first array is
freed when the second is allocated, but there is always one array that is
never released at the end. It may be acceptable to you, but it's
definitely unacceptable in a standard library implementation.

> Why is the "CString" class faster than the STL string class?

Because it has very little functionality and is buggy. Analogy:
a race car can go faster than a production car or go over a rougher
terrain, but it can never compare with the production car in comfort,
universality, gas mileage, etc.

Remove all the glass from your car, remove the spare tire, drop the
back seat, the passenger seat, reduce the gas tank to one gallon,
replace the hood, the quarter-panels, door panes, with plastic, and
you will have something that drives faster than it did before all
the modifications. Is it better? Depends on how you look at it.

> Many books say "You should not develop a class that has already been developed".
> In this case, which is better?

Better is the one that is bug-free. Once you make your class bug-
free, your class can still be better in some applications (i.e. for
doing what you need it to do), but it can never be as suitable for
being in a library as the one in the library.

> Who can help me make a choice?


Wise people that write books can. If you don't listen to them, or
to those who give you advice elsewhere, how can they help you make
your choice?

Some folks I've known have developed their own custom string types
for very specific purposes. Those string classes are fast and slim
but can only be used in a very limited set of conditions. The string
class in the Standard Library is designed with a different purpose.
It's generic, it conforms to the requirements for standard containers,
it's stable, and it's portable. If you don't like it, or if you have
other requirements (fewer, less strict, whatever), and you have enough
time to spare, do design your own, by all means. Just please do not
put such a class in an application running on a life support machine to
which I'm going to be hooked up after a major surgery.

V
Jul 22 '05 #6
Gernot Frisch wrote:
>> It is not thread safe.
>
> Silly question: why? I don't get all this thread safe, multithreading
> things. Can anyone explain it briefly?
> -Gernot


In the context of the C++ language, it doesn't matter and is actually
off-topic.

Unless you are using operating-system-specific calls, a C++ program has no
notion of threads, multi-threadedness, etc., and these don't need to be
considered when writing code.

If you are using threads, then you need to refer to the specifics of that
implementation as to how it affects the C++ libraries.

In short, you don't need to worry about it unless you specifically use threads.

However, in answer to your question: thread-safe describes some construct,
class, etc. that is specifically protected against corruption if two (or more)
threads try to access that object at the same time. For example, suppose you
have one string 'str' and two threads and perform the following:

//thread 1
str = "hello from Earth";

// thread 2
str = "greetings to all inhabitants of Mars";

If both were executed simultaneously, what would be the outcome? If str isn't
thread-safe, corruption is possible. If str is thread-safe, then
typically the first thread to write would be granted exclusive write access
until the assignment is completed, and the second thread would block until the
first is done and would then gain exclusive write access.

Books are devoted to the subject; this is a trivial example, but it should serve
to answer your question.
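
The exclusive-access idea can be sketched with the threading facilities added to the language in C++11, which did not exist when this thread was written. This is only an illustration, assuming a wrapper that takes a mutex around every access; the names GuardedString and race_two_writers are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical wrapper: every access to the string takes the mutex,
// so two threads can never modify it at the same time.
class GuardedString {
public:
    void assign(const std::string& value)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_value = value;
    }

    std::string get() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_value;
    }

private:
    mutable std::mutex m_mutex;
    std::string m_value;
};

// Runs the two assignments from the example on separate threads and
// returns whichever value was written last.
std::string race_two_writers()
{
    GuardedString str;
    std::thread t1([&str] { str.assign("hello from Earth"); });
    std::thread t2([&str] { str.assign("greetings to all inhabitants of Mars"); });
    t1.join();
    t2.join();
    const std::string result = str.get();
    assert(!result.empty());
    return result;
}
```

Which of the two strings survives is not determined, but the result is always one of them in full, never a mixture of both.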
Jul 22 '05 #7
pr******@hotmail.com (YinTat) wrote in message news:<4c******* *************** ****@posting.google.com>...
> class CString
> {
> public:
>     inline CString(const char *rhs)
>     {
>         m_size = strlen(rhs);
>         m_capacity = m_size * 2 + 1;
>         m_str = new char[m_capacity];
>         memmove(m_str, rhs, m_size+1);
>     }
>
>     inline CString& __cdecl operator+= (const CString& rhs)
>     {
>         unsigned __int32 tmp_size = m_size + rhs.m_size;
>
>         if (m_capacity <= tmp_size)
>         {
>             m_capacity = tmp_size * 2 + 1;
>             char *tmp_char = new char[m_capacity];
>             memmove(tmp_char, m_str, m_size);
>             delete m_str;
>             m_str = tmp_char;
>         }
>
>         memmove(m_str+m_size, rhs.m_str, rhs.m_size);
>         m_size = tmp_size;
>         return *this;
>     }
> private:
>     char *m_str;
>     unsigned __int32 m_capacity;
>     unsigned __int32 m_size;
> };
>
> Is there any bug or harmful code in the "CString" class?


I see two problems with this code:

1) No destructor.
2) Not exception safe. If new throws in operator+=, m_capacity will have
a larger value than the actual capacity. If you call operator+= after
catching an exception, the code can write to memory it does not own.
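
One common way to address the second point is to allocate first and update the members only afterwards. The following is a minimal sketch of that ordering, not a complete string class; the class name Buffer and its interface are hypothetical, and copying is disabled (a C++11 feature) to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical minimal buffer, just enough to show the ordering fix.
class Buffer {
public:
    explicit Buffer(const char* text)
    {
        assert(text != nullptr);
        m_size = std::strlen(text);
        m_capacity = m_size * 2 + 1;
        m_str = new char[m_capacity];
        std::memcpy(m_str, text, m_size + 1);
    }

    ~Buffer() { delete[] m_str; }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    Buffer& append(const char* text, std::size_t length)
    {
        const std::size_t new_size = m_size + length;
        if (m_capacity <= new_size) {
            // Allocate first. If new throws, no member has been touched.
            const std::size_t new_capacity = new_size * 2 + 1;
            char* new_str = new char[new_capacity];
            std::memcpy(new_str, m_str, m_size);
            // Copy the new text before freeing: it may point into m_str.
            std::memcpy(new_str + m_size, text, length);
            delete[] m_str;
            // Commit only after the allocation has succeeded.
            m_str = new_str;
            m_capacity = new_capacity;
        } else {
            std::memmove(m_str + m_size, text, length);
        }
        m_size = new_size;
        m_str[m_size] = '\0';
        assert(m_size < m_capacity);
        return *this;
    }

    const char* c_str() const { return m_str; }
    std::size_t size() const { return m_size; }
    std::size_t capacity() const { return m_capacity; }

private:
    std::size_t m_size;
    std::size_t m_capacity;
    char* m_str;
};
```

If the allocation fails, the object still describes its old buffer correctly, so a later append cannot run past memory it owns.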

samuel
Jul 22 '05 #8
"YinTat" <pr******@hotmail.com> wrote in message
news:4c******** *************** ***@posting.google.com...
> Is there any bug or harmful code in the "CString" class?
> Why is the "CString" class faster than the STL string class?


Besides the problems already mentioned, you are missing a destructor, copy
constructor and copy assignment.
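
For illustration, here is a minimal sketch of those three members on a class that owns a character array. The class name OwnedString is hypothetical, and capacity handling is left out so that only the ownership logic remains.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class: the three missing members, plus the minimum
// needed to exercise them.
class OwnedString {
public:
    explicit OwnedString(const char* text)
    {
        assert(text != nullptr);
        m_size = std::strlen(text);
        m_str = new char[m_size + 1];
        std::memcpy(m_str, text, m_size + 1);
    }

    // Copy constructor: give the new object its own buffer.
    OwnedString(const OwnedString& other)
        : m_size(other.m_size), m_str(new char[other.m_size + 1])
    {
        std::memcpy(m_str, other.m_str, m_size + 1);
    }

    // Copy assignment: make the copy first, then release the old buffer.
    // This order also keeps self-assignment safe.
    OwnedString& operator=(const OwnedString& other)
    {
        char* copy = new char[other.m_size + 1];
        std::memcpy(copy, other.m_str, other.m_size + 1);
        delete[] m_str;
        m_str = copy;
        m_size = other.m_size;
        return *this;
    }

    // Destructor: memory from new[] must be released with delete[].
    ~OwnedString() { delete[] m_str; }

    const char* c_str() const { return m_str; }
    std::size_t size() const { return m_size; }

private:
    std::size_t m_size;
    char* m_str;
};
```

Without the two copy operations, the compiler-generated versions copy only the pointer, so two objects end up sharing and eventually double-freeing one array.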

Regards,
Martin
Jul 22 '05 #9
In article <4c************ **************@posting.google.com>,
pr******@hotmail.com (YinTat) wrote:

> Hi,
> I learned C++ recently and I made a string class.

I would respectfully suggest that someone who recently learned C++ may
not be the best person to decide that they can write a better string
class than multi-year professionals.

> A code example is this:
>
> class CString
> {
> public:
>     inline CString(const char *rhs)
>     {
>         m_size = strlen(rhs);
>         m_capacity = m_size * 2 + 1;

Why are you adding one here?

>         m_str = new char[m_capacity];
>         memmove(m_str, rhs, m_size+1);

I see that you are copying the null into m_str here...

>     }

You need a copy c_tor and d_tor, or every time you are done using a
CString, you will be leaking memory, and every time you copy one CString
to another, you will be both leaking memory and making it impossible for
the two CString objects to work properly.

>     inline CString& __cdecl operator+= (const CString& rhs)
>     {
>         unsigned __int32 tmp_size = m_size + rhs.m_size;
>
>         if (m_capacity <= tmp_size)
>         {
>             m_capacity = tmp_size * 2 + 1;

Again, you're adding one. Why?

>             char *tmp_char = new char[m_capacity];

If new throws, m_capacity will equal the wrong value.

>             memmove(tmp_char, m_str, m_size);
>             delete m_str;

That's delete [] m_str;

>             m_str = tmp_char;
>         }
>
>         memmove(m_str+m_size, rhs.m_str, rhs.m_size);

Here you don't copy the null into m_str like you did above. Why?

>         m_size = tmp_size;
>         return *this;
>     }
> private:
>     char *m_str;
>     unsigned __int32 m_capacity;
>     unsigned __int32 m_size;
> };
>
> Is there any bug or harmful code in the "CString" class?

CString s1( "hello" );
CString s2( "world" );
s1 = s2;
s1 += s1;
// now look at the contents of s2...

while (true) {
    CString s1( "abcdefghijklmnopqrstuvwxyz" );
}

While the above is running, track your program's memory usage...

> Why is the "CString" class faster than the STL string class?

Because your CString class doesn't do everything std::string does. Your
CString class is designed more like a std::vector<char>. Compare your
CString to that and see what happens. (Make sure you are compiling in
debug mode when you do this.)
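
As a starting point for that comparison, here is a minimal sketch of the same append loop written against std::vector<char>. The helper name append_with_vector is hypothetical; it returns the number of characters stored so the result can be checked, and timing is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: the benchmark's append loop on a std::vector<char>.
// Returns the number of characters stored after `count` appends.
std::size_t append_with_vector(const std::string& start,
                               const std::string& piece, int count)
{
    assert(count >= 0);
    std::vector<char> buffer(start.begin(), start.end());
    for (int i = 0; i < count; ++i)
        buffer.insert(buffer.end(), piece.begin(), piece.end());
    return buffer.size();
}
```

Like the posted class, the vector keeps a size and a capacity and grows geometrically, which is why it makes a fairer baseline here than std::string.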

> Many books say "You should not develop a class that has already been developed".

Why do you think they say that?

> In this case, which is better?


(1) The one that works.
(2) If (and only if) they both work, then use the one that is more
efficient for the task at hand. In some cases a class like yours (that
routinely holds twice as much memory as needed) would be a very bad
choice; in some cases any class that requires the memory to be contiguous
would be a bad choice. Use the one that best solves your particular
problem.

You may very well be able to implement a better string class than the
one that comes with your library. If you can, then more power to you!
Implementing std::string is one of the many worthy projects for a
beginner learning the language (though implementing std::vector would be
easier.)
Jul 22 '05 #10
