473,548 Members | 2,636 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

size_t problems

I am trying to compile as much code in 64 bit mode as
possible to test the 64 bit version of lcc-win.

The problem appears now that size_t is now 64 bits.

Fine. It has to be since there are objects that are more than 4GB
long.

The problem is, when you have in thousands of places

int s;

// ...
s = strlen(str) ;

Since strlen returns a size_t, we have a 64 bit result being
assigned to a 32 bit int.

This can be correct, and in 99.999999999999 9999999999999%
of the cases the string will be smaller than 2GB...

Now the problem:

Since I warn each time a narrowing conversion is done (since
that could loose data) I end up with hundreds of warnings each time
a construct like int a = strlen(...) appears. This clutters
everything, and important warnings go lost.
I do not know how to get out of this problem. Maybe any of you has
a good idea? How do you solve this when porting to 64 bits?

jacob
Aug 29 '07
409 10804
In article <11************ **********@22g2 000hsm.googlegr oups.com>,
user923005 <dc*****@connx. comwrote:
>I doubt that the chance a string is longer than 2GB is always
negligible.
"Always negligible" is irrelevant. Of course it's not negligible in
programs chosen to demonstrate the problem.
>Consider the characters 'C', 'T', 'A', 'G' in various combinations in
a long sequence of (say) 3 billion.
That's the human genome.
The chance of a given program being one that stores the complete human
genome in a string is negligible. People with such programs can set the
option I suggested.

-- Richard
--
"Considerat ion shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Aug 29 '07 #21

"Richard Tobin" <ri*****@cogsci .ed.ac.ukwrote in message
news:fb******** ***@pc-news.cogsci.ed. ac.uk...
The chance of a given program being one that stores the complete human
genome in a string is negligible. People with such programs can set the
option I suggested.
I work in that field.
Whilst generally you'd want a "rope" type-structure to handle such a long
sequence, there might well be reasons for storing the whole genome as a flat
string. Certainly if I had a 64-bit machine with enough memory installed, I
would expect to have the option, and I'd expect to be able to write the
program in regular C.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
Aug 29 '07 #22
jacob navia <ja***@jacob.re mcomp.frwrites:
Keith Thompson wrote:
>Why didn't you get the same warnings in 32-bit mode? If int and
size_t are both 32 bits, INT_MAX < SIZE_MAX, and there are values of
size_t that cannot be stored in an int. If the "narrowing conversion"
warning is based on the sizes of the type rather than the ranges, I'd
say you've just discovered a compiler bug.

2GB strings are the most you can get under the windows schema in 32 bits.
Ok. Does your compiler know that?

Assigning an arbitrary size_t value to an object of type int, if both
types are 32 bits, could potentially overflow. Your compiler
apparently doesn't issue a warning in that case. Is it because it
knows that the value returned by strlen() can't exceed INT_MAX (if so,
well done, especially since it seems to be smart enough not to make
that assumption on a 64-bit system), or is it because it doesn't issue
a warning when both types are the same size?

For example:

size_t s = func(-1);
/* Assume func() takes a size_t argument and returns it.
Assume func() is defined in another translation unit,
so the compiler can't analyze its definition. In other
words, 's' is initialized to SIZE_MAX, but the compiler
can't make any assumptions about its value. */

signed char c = s;
/* Presumably this produces a warning. */

int i = s;
/* This is a potential overflow. Does this produce
a warning? Should it? */

If your compiler warns about the initialization of 'c' but not about
the initialization of 'i', then IMHO it's being inconsistent. This
doesn't address your original question, but it's related.

[...]
There isn't any string longer than a few K in this program!
Of course is a potential bug, but it is practically impossible!
You know that, and I know that, but what matters is what the compiler
knows.

Is it conceivable that a bug in the program and/or some unexpected
input could cause it to create a string longer than 2GB?

You asked how to suppress the bogus warnings without losing any valid
warnings. To do that, your compiler, or some other tool, has to be
able to tell the difference. Telling me that none of the strings are
longer than 2GB doesn't address that concern, unless you can convey
that knowledge to the compiler.

--
Keith Thompson (The_Other_Keit h) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 29 '07 #23
Richard Tobin wrote:
In article <11************ **********@22g2 000hsm.googlegr oups.com>,
user923005 <dc*****@connx. comwrote:
>I doubt that the chance a string is longer than 2GB is always
negligible.

"Always negligible" is irrelevant. Of course it's not negligible in
programs chosen to demonstrate the problem.
>Consider the characters 'C', 'T', 'A', 'G' in various combinations in
a long sequence of (say) 3 billion.
That's the human genome.

The chance of a given program being one that stores the complete human
genome in a string is negligible. People with such programs can set the
option I suggested.

-- Richard
The program has strings of at most a few K. It is an IDE (Integrated
development environment, debugger, etc)

An int can hold string lengths of more than 2 billion... MORE than
enough for this environment. This program has been running under 32 bit
windows where all user space is at most 2GB.
Aug 29 '07 #24
"Malcolm McLean" <re*******@btin ternet.comwrite s:
"jacob navia" <ja***@jacob.re mcomp.frwrote in message
news:46******** *************** @news.orange.fr ...
>Malcolm McLean wrote:
>>There's a very obvious answer to that one. As a compiler-writer,
youa re in a position to do it.

???

(Please excuse my stupidity by I do not see it...)
The campaign for 64 bit ints T-shirts obviously didn't generate enough
publicity. I still have a few left. XXL, one size fits all.
One *shirt* fits all (unless somebody other than you actually wants
one).
There are some good reasons for not making int 64 bits on a 64 bit
machine, which as a compiler-writer you will be well aware of. However
typical computers are going to have 64 bits of main address space for
a very long time to come, so it makes sense to get the language right
now, and keep it that way for the forseeable future, and not allow
decisions to be dominated by the need to maintain compatibility with
legacy 32 bit libraries.
lcc-win32 (and presumably lcc-win64, if that's what it's called) is a
Windows compiler. jacob does not have the option of changing the
Windows API, and a compiler that's incompatible with the underlying
operating system isn't going to be very useful.

--
Keith Thompson (The_Other_Keit h) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 29 '07 #25
Malcolm McLean wrote:
>
"Richard Tobin" <ri*****@cogsci .ed.ac.ukwrote in message
news:fb******** ***@pc-news.cogsci.ed. ac.uk...
>The chance of a given program being one that stores the complete human
genome in a string is negligible. People with such programs can set the
option I suggested.
I work in that field.
Whilst generally you'd want a "rope" type-structure to handle such a
long sequence, there might well be reasons for storing the whole genome
as a flat string. Certainly if I had a 64-bit machine with enough memory
installed, I would expect to have the option, and I'd expect to be able
to write the program in regular C.
YES SIR!

With my new lcc-win32 YOU WILL BE ABLE TO DO IT!

But I am not speaking of that program. I am speaking about
other programs I am PORTING from 32 bit, whose strings are never
bigger than a few Kbytes at most!

Aug 29 '07 #26
Keith Thompson wrote:
lcc-win32 (and presumably lcc-win64, if that's what it's called) is a
Windows compiler. jacob does not have the option of changing the
Windows API, and a compiler that's incompatible with the underlying
operating system isn't going to be very useful.
Yes. Mr Gates decided that

sizeof(int) == sizeof(long) == 4.

Only long long is 64 bits. PLease address alll flames to him.

NOT TO ME!!!

:-)
Aug 29 '07 #27

jacob navia:
The problem is, when you have in thousands of places

int s;

// ...
s = strlen(str) ;

Since strlen returns a size_t, we have a 64 bit result being
assigned to a 32 bit int.
<snip>
I do not know how to get out of this problem. Maybe any of you has
a good idea? How do you solve this when porting to 64 bits?
Assuming that you've a shred of intelligence, I'm led to believe that
you suffer from "int syndrome".

"int syndrome" reminds me of old drivers, the kind of people who
always drive the canonical route somewhere. Even during rush-hour,
even at night when the streets are clear, they always take the same
route. I don't know if you'd call it stubbornness or stupidity. They
lack dynamic-ity.

These drivers remind me of the programmers who are "int" people. The
solution to your boggle is so blatantly oblivious that I'm not even
gonna mention what the solution is.

The real problem is why you feel so indoctrinated into using int,
especially places where you shouldn't be using it.

If you want advice though, I'd say use the appropriate types where
appropriate, and to edit any code that uses types wrongly.

Martin

Aug 30 '07 #28
jacob navia wrote:
>
I am trying to compile as much code in 64 bit mode as
possible to test the 64 bit version of lcc-win.

The problem appears now that size_t is now 64 bits.

Fine. It has to be since there are objects that are more than
4GB long. The problem is, when you have in thousands of places

int s;

// ...
s = strlen(str) ;

Since strlen returns a size_t, we have a 64 bit result being
assigned to a 32 bit int.

This can be correct, and in 99.999999999999 9999999999999%
of the cases the string will be smaller than 2GB...
Simply define s as a long long or (better) as a size_t.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 30 '07 #29
jacob navia wrote:
>
.... snip ...
>
int s = strlen(str) is NOT broken.
Yes it is. How can you guarantee that strlen never returns a value
that exceeds the capacity of an int? However:

size_t s = strlen(str);

is NOT broken, assuming suitable #includes and definitions.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 30 '07 #30

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

18
23536
by: Steffen Fiksdal | last post by:
Can somebody please give me some rules of thumb about when I should be using size_t instead of for example int ? Is size_t *always* typedef'd as the largest unsigned integral type on all systems ? I know that the strlen() functions and such returns a size_t, and it should be received that way. I am at the time creating some sort of...
17
2779
by: G Patel | last post by:
E. Robert Tisdale wrote: > > int main(int argc, char* argv) { > quad_t m = {0, 1, 2, 3}; > int r; > fprintf(stdout, "m = ("); > for (size_t j = 0; j < 4; ++j) Why did you declare j as type size_t ?
17
134407
by: candy_init | last post by:
I sometimes comes across statements which invloves the use of size_t.But I dont know exactly that what is the meaning of size_t.What I know about it is that it is used to hide the platform details.I tried to find its meaning in the header files but did'nt got a good answer.So can somebody please tell me that what is the meaning of size_t and...
5
3153
by: edware | last post by:
Hello, I have some questions about the size_t type. First, what do we know about size_t? From what I have read I believe that it is an unsigned integer, but not necessarily an int. Am I correct? Does this mean that if I need to compare two variables, one of the size_t type, should the other also be a size_t variable? There could be...
12
10683
by: Alex Vinokur | last post by:
Why was the size_t type defined in compilers in addition to unsigned int/long? When/why should one use size_t? Alex Vinokur email: alex DOT vinokur AT gmail DOT com http://mathforum.org/library/view/10978.html http://sourceforge.net/users/alexvn
23
4853
by: bwaichu | last post by:
To avoid padding in structures, where is the best place to put size_t variables? According the faq question 2.12 (http://c-faq.com/struct/padding.html), it says: "If you're worried about wasted space, you can minimize the effects of padding by ordering the members of a structure based on their base types, from largest to smallest."
318
12783
by: jacob navia | last post by:
Rcently I posted code in this group, to help a user that asked to know how he could find out the size of a block allocated with malloc. As always when I post something, the same group of people started to try to find possible errors, a harmless passtime they seem to enjoy. One of their remarks was that I used "int" instead of "size_t"...
73
7360
by: Yevgen Muntyan | last post by:
Hey, I was reading C99 Rationale, and it has the following two QUIET CHANGE paragraphs: 6.5.3.4: "With the introduction of the long long and extended integer types, the sizeof operator may yield a value that exceeds the range of an unsigned long." 6.5.6: "With the introduction of the long long and extended integer
89
5657
by: Tubular Technician | last post by:
Hello, World! Reading this group for some time I came to the conclusion that people here are split into several fractions regarding size_t, including, but not limited to, * size_t is the right thing to use for every var that holds the number of or size in bytes of things. * size_t should only be used when dealing with library functions.
0
7512
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main...
0
7707
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. ...
1
7466
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
0
7803
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
0
6036
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
1
5362
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...
0
5082
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3495
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
1
1926
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.