size_t / long - .NET Framework

songie D

If a size_t is cast to a long, and size_t is the length of a unicode string, does the resulting long need to be divided by sizeof(_TCHAR) in order to get the actual length in _TCHARs?

Nov 17 '05 #1

Carl Daniel [VC++ MVP]

songie D wrote:

> If a size_t is cast to a long, and size_t is the length of a unicode
> string, does the resulting long need to be divided by sizeof(_TCHAR)
> in order to get the actual length in _TCHARs?

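If the size_t contains the size of a TCHAR string in bytes, you should always
divide by sizeof(TCHAR) to get the length of the string in characters. If it
is already a character count, such as the value returned by _tcslen, no
division is needed.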
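
A minimal sketch of the distinction, using standard wchar_t as a stand-in for _TCHAR in a Unicode build (that substitution is an assumption, as are the helper names length_in_chars and bytes_to_chars):

```cpp
#include <cstddef>
#include <cwchar>

// Character count of a wide string, excluding the terminator.
// std::wcslen already counts characters, so nothing is divided here.
inline long length_in_chars(const wchar_t* s) {
    return static_cast<long>(std::wcslen(s));
}

// Converts a size expressed in bytes (e.g. the result of sizeof on a
// wchar_t array) into a count of characters.
inline long bytes_to_chars(std::size_t byte_count) {
    return static_cast<long>(byte_count / sizeof(wchar_t));
}
```

For an array such as wchar_t buf[] = L"quick", bytes_to_chars(sizeof(buf)) counts the terminator as well, while length_in_chars(buf) does not.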

-cd

Nov 17 '05 #2

Carl Daniel [VC++ MVP]

songie D wrote:

> ok Carl.
> maybe you can help me:
> I've got a string (of _TCHARs), for instance
> _TCHAR* mystring = "quick brown fox jumps over lazy dog";
>
> I then use this:
> wordlen = (long)_tcsspn(mystring, "abcdefgh...wxyz");
> (where the second argument is the whole alphabet but without space)
>
> which of course returns 6. This is what I would expect, as 6 is the
> position of the first character that ISN'T an alphabetic character,
> i.e. a space.

It returns 5...

> I now want to extract the word "quick" and copy it into a string of
> its own.
> For this I'm allocating a dynamic array of _TCHARs on the heap (I
> don't know how long the word might be).
> using:
> _TCHAR* word = new _TCHAR[wordlen];
> _tcsncpy(word, mystring, wordlen);
> ...<do some operations on the 'word' variable>
> delete[] word;

You NEED to allocate space for the terminating NULL, so do new
TCHAR[wordlen+1], and store the terminator yourself in word[wordlen], because
_tcsncpy does not append one when it copies exactly wordlen characters. As is,
you're getting an un-terminated word in your array and seeing garbage that's
past the end of your allocation. Note that in general, new will allocate more
memory than you ask for due to alignment/granularity requirements.
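
The fix can be sketched as follows, with standard wchar_t, std::wcsspn and std::wcsncpy standing in for _TCHAR, _tcsspn and _tcsncpy (an assumption made so the sketch compiles without tchar.h; copy_leading_word is a hypothetical helper name):

```cpp
#include <cstddef>
#include <cwchar>

// Copies the leading run of characters from `text` that all appear in
// `allowed` into a freshly allocated, null-terminated array.
// The caller owns the result and must release it with delete[].
inline wchar_t* copy_leading_word(const wchar_t* text, const wchar_t* allowed) {
    const std::size_t wordlen = std::wcsspn(text, allowed);
    wchar_t* word = new wchar_t[wordlen + 1];   // +1 leaves room for the terminator
    std::wcsncpy(word, text, wordlen);          // copies wordlen chars, writes no terminator
    word[wordlen] = L'\0';                      // terminate explicitly
    return word;
}
```

The caller would use the result and then call delete[] on exactly the pointer that was returned.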

btw, all of this is much easier and less error-prone if you use std::string
instead of dealing with low-level details yourself:

typedef std::basic_string<TCHAR> tstring;

tstring mystring(_T("quick brown fox jumps over lazy dog"));
tstring alphabet(_T("abcdefgh...wxyz"));
tstring word = mystring.substr(0, mystring.find_first_not_of(alphabet, 0));
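
A self-contained variant of the same idea, assuming std::wstring in place of the tstring typedef and a spelled-out lowercase alphabet (both are illustrative choices, and first_word is a hypothetical name):

```cpp
#include <string>

// Returns the leading run of lowercase letters in `text`.
inline std::wstring first_word(const std::wstring& text) {
    static const std::wstring alphabet = L"abcdefghijklmnopqrstuvwxyz";
    // find_first_not_of returns npos when every character is in the set;
    // substr(0, npos) then yields the whole string.
    return text.substr(0, text.find_first_not_of(alphabet));
}
```

No manual allocation or terminator handling is involved; the string object manages its own storage.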

-cd

Nov 17 '05 #3

songie D

> It returns 5...

yes, sorry 5. I meant 5.

> You NEED to allocate space for the terminating NULL, so do new
> TCHAR[wordlen+1]. As is, you're getting an un-terminated word in your
> array and seeing garbage that's past the end of your allocation. Note
> that in general, new will allocate more memory than you ask for due to
> alignment/granularity requirements.

OK so supposing I do, then I'll get even more memory. But I see the point,
it *might* fill it. I'll put it another way...
no I won't, I'll just repeat the question. How can I get the variable to
contain JUST the string, with terminating null if that's what it entails,
and then still successfully delete[] it?

> btw, all of this is much easier and less error-prone if you use
> std::string instead of dealing with low-level details yourself:

Nah, that'd defeat the point of writing this part of the program in
unmanaged C++. I might as well write it in C#, which the rest of the
program's written in.
This is a routine that's going to be called probably every time
the user types a key, possibly many times per keystroke.

> typedef std::basic_string<TCHAR> tstring;
>
> tstring mystring("quick brown fox jumps over lazy dog");
> tstring alphabet("abcdefgh...wxyz");
> tstring word = mystring.substr(0,mystring.find_first_not_of(alphabet,0));

mmm. wonder where find_first_not_of() comes from, the tooth fairy?

Nov 17 '05 #4

Carl Daniel [VC++ MVP]

songie D wrote:

>> It returns 5...
>
> yes, sorry 5. I meant 5.
>
>> You NEED to allocate space for the terminating NULL, so do new
>> TCHAR[wordlen+1]. As is, you're getting an un-terminated word in
>> your array and seeing garbage that's past the end of your
>> allocation. Note that in general, new will allocate more memory
>> than you ask for due to alignment/granularity requirements.
>
> OK so supposing I do, then I'll get even more memory. But I see the
> point, it *might* fill it. I'll put it another way...
> no I won't, I'll just repeat the question. How can I get the variable
> to contain JUST the string, with terminating null if that's what it
> entails, and then still successfully delete[] it?

You can: allocate wordlen+1. There's no way you can force new[] to allocate
exactly as much space as you request - it's free to allocate more. That
said, any attempt by you to access beyond the size you requested is
undefined behavior. It's free to allocate exactly the amount you request
one time, and 10X the amount you request the next time - you simply cannot
assume anything beyond:

1. the allocation was at least as large as you requested.
2. you can safely access all of the elements that you requested (i.e. for
new T[n], you can access indexes 0..n-1).
3. assuming you haven't violated #2, that you can pass the same pointer
returned by new[] to delete[].
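
The three guarantees can be sketched like this, assuming a C++11 or later compiler for std::unique_ptr (which postdates this thread) and wchar_t in place of TCHAR; make_filled is a hypothetical helper:

```cpp
#include <cstddef>
#include <memory>

// Allocates room for n characters plus a terminator and fills the
// n characters with `fill`.
inline std::unique_ptr<wchar_t[]> make_filled(std::size_t n, wchar_t fill) {
    std::unique_ptr<wchar_t[]> buf(new wchar_t[n + 1]); // requested n + 1, so valid indexes are 0..n
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = fill;                                  // stays inside the requested range
    }
    buf[n] = L'\0';                                     // last requested element holds the terminator
    return buf;  // unique_ptr<T[]> later calls delete[] on the pointer new[] returned
}
```

Nothing here relies on whatever extra space the allocator may have reserved; only the requested elements are touched.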

>> btw, all of this is much easier and less error-prone if you use
>> std::string instead of dealing with low-level details yourself:
>
> Nah, that'd defeat the point of writing this part of the program in
> unmanaged C++. I might as well write it in C#, which the rest of the
> program's written in.

Then there's likely no valid reason to write it in unmanaged C++. You're
apparently operating under the fallacious assumption that managed code is
slow, or that this function is going to be a bottleneck in your program
(have you profiled it to find out?).

> This is a routine that's going to be called probably every time
> the user types a key, possibly many times per keystroke.

So? Users typing keys are monumentally slow - you can run 10's (maybe
100's) of millions of CPU instructions between keystrokes on a modern CPU.
Besides, it's likely that the std::string solution, if properly written,
will be the same speed as your fragile hand-crafted solution.
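
One way to measure rather than guess is a small timing harness. The sketch below assumes std::chrono (C++11) and std::wstring; TimingResult and time_first_word are hypothetical names, and any numbers it produces depend entirely on the machine and compiler settings:

```cpp
#include <chrono>
#include <cstddef>
#include <string>

struct TimingResult {
    std::size_t total_chars;  // sum of extracted word lengths, keeps the loop's work observable
    double microseconds;      // elapsed wall-clock time for all iterations
};

// Extracts the leading lowercase word from `text` repeatedly and times it.
inline TimingResult time_first_word(const std::wstring& text, std::size_t iterations) {
    static const std::wstring alphabet = L"abcdefghijklmnopqrstuvwxyz";
    std::size_t total = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        total += text.substr(0, text.find_first_not_of(alphabet)).size();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return {total, elapsed.count()};
}
```

Dividing microseconds by iterations gives a per-call cost that can be compared against the interval between keystrokes.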

>> typedef std::basic_string<TCHAR> tstring;
>>
>> tstring mystring("quick brown fox jumps over lazy dog");
>> tstring alphabet("abcdefgh...wxyz");
>> tstring word =
>> mystring.substr(0,mystring.find_first_not_of(alphabet,0));
>
> mmm. wonder where find_first_not_of() comes from, the tooth fairy?

find_first_not_of is a member function of std::basic_string<CharT> - note
the . between mystring and find_first_not_of in the above sample. It comes
not from the tooth fairy, nor from Microsoft, but from the ISO C++ standard.

-cd

Nov 17 '05 #5

songie D

> You can: allocate wordlen+1. There's no way you can force new[] to
> allocate exactly as much space as you request - it's free to allocate
> more. That said, any attempt by you to access beyond the size you
> requested is undefined behavior. It's free to allocate exactly the
> amount you request one time, and 10X the amount you request the next
> time - you simply cannot assume anything beyond:

ok. I get the picture. I'll try it.

> 1. the allocation was at least as large as you requested.
> 2. you can safely access all of the elements that you requested (i.e. for
> new T[n], you can access indexes 0..n-1).
> 3. assuming you haven't violated #2, that you can pass the same pointer
> returned by new[] to delete[].

Presumably you can also assume that the memory will be contiguous?
(Thus, &p[n] == p + n for a pointer p)
I think I was just under the false impression that even though I'd
discovered that I wasn't allocating enough space, the fact that I happened
to have got more meant that this couldn't be the problem. I'm getting the
image that it would be undefined behaviour anyway.

> Then there's likely no valid reason to write it in unmanaged C++.

If you mean write the whole lot in unmanaged C++, no - I don't want
to do that. The reason simply being that it would take me far too long.
The algorithms are what should be taking my programming time, not
spending hours writing code to display a user interface.

> You're apparently operating under the fallacious assumption that managed
> code is slow, or that this function is going to be a bottleneck in your
> program (have you profiled it to find out?).

No, I'm not saying managed code is slow. I'm just saying it's a known fact
that it's slightly slowER than unmanaged C++ code.
If this program was being developed commercially, then some bod with
a degree in design architecture and who never actually has to write any code
would make a decision about which bits are going to be written in which
language. I figured that since I'm more of a "just do it" programmer
(both in profession and in hobby) I should take this step myself, rather
than simply writing it all in the same language.

> So? Users typing keys are monumentally slow - you can run 10's (maybe
> 100's) of millions of CPU instructions between keystrokes on a modern CPU.

Not the speed I type at (50 - 70 wpm). Which yes, ~1Hz is slow in
electronics terms, you're right. But I don't think that I want to be
cutting any slack nevertheless.

> Besides, it's likely that the std::string solution, if properly written,
> will be the same speed as your fragile hand-crafted solution.

I doubt it'll be fragile, since it'll be tested with all possible inputs
(and checked for memory leaks if I'm feeling pedantic). I wouldn't have
thought that writing with a class library that I have no knowledge of
and that is generic enough to handle many different scenarios, would be as
fast as writing with native functions that I do have knowledge of and that
are specifically only programmed to do the task I have in mind.

> find_first_not_of is a member function of std::basic_string<CharT> - note
> the . between mystring and find_first_not_of in the above sample. It comes
> not from the tooth fairy, nor from Microsoft, but from the ISO C++ standard.

oh ok, I stand corrected then. Although the STL is made by Hewlett-Packard,
you realise. But under the hood it's still probably a similar algorithm to
_tcsspn, and is just another header file of mainly unnecessary
bumph compiled into the application.

Nov 17 '05 #6

Carl Daniel [VC++ MVP]

songie D wrote:

> oh ok, I stand corrected then. Although the STL is made by
> Hewlett-Packard, you realise. But under the hood it's still probably
> a similar algorithm to _tcsspn, and is just another header file of
> mainly unnecessary bumph compiled into the application.

The STL was a proposal by Alex Stepanov (et al) of Hewlett Packard to the
C++ standards committee. The C++ standard incorporates the components
originally included in STL. Note that std::string was not part of the STL
proposal, but rather was created by the C++ committee based on other
proposals. In fact, the string class was in the proposed standard before
STL was proposed, with a number of changes being made to the string class
after STL was introduced to make the string more "STL-like".

And yes, find_first_not_of, under the covers, is no doubt a very similar
algorithm to _tcsspn, but it's standard and portable.
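
The correspondence can be sketched side by side, assuming std::wcsspn as the standard wide-character counterpart of the span function used earlier in the thread (span_c and span_cpp are hypothetical names):

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// C-style: length of the leading run of characters drawn from `allowed`.
inline std::size_t span_c(const wchar_t* text, const wchar_t* allowed) {
    return std::wcsspn(text, allowed);
}

// std::basic_string style: same answer, except that find_first_not_of
// reports npos when the whole string matches, so that case is mapped
// back to the string length here.
inline std::size_t span_cpp(const std::wstring& text, const std::wstring& allowed) {
    const std::size_t pos = text.find_first_not_of(allowed);
    return pos == std::wstring::npos ? text.size() : pos;
}
```

The only behavioural difference visible to the caller is the npos result when no non-matching character exists.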

-cd

Nov 17 '05 #7
