By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
429,214 Members | 2,038 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 429,214 IT Pros & Developers. It's quick & easy.

count and count_if algorithms return type seem to insufficient forlarge values

P: n/a
Below is my understanding about count algorithms.

Return type of count and count_if algorithms is
iterator_traits<InputIterator>::difference_type.

If the container contains more than 'difference_type' elements
satisfying the condition, then count and count_if algorithm cannot
return a value greater than 'difference_type'.

As an example, suppose maximum value of 'difference_type' is INT_MAX.
Then the corresponding maximum value of 'size_type' is UINT_MAX. Now
consider the declaration
vector<charv;
where 'v' contains UINT_MAX elements each of which is 'a'.

Now if I do
count(v.begin(), v.end(), 'a')
then the actual count would be UINT_MAX but it cannot be represented
in 'difference_type' which is the return type of count algorithm.

So' won't we get erroneous results from count algorithms ?

I understand that difference_type is same as ptrdiff_t for
pointers(belonging to the same array) and size_type if same as size_t.
This means that the count algorithms can never return values from
ptrdiff_t + 1 upto size_t.

Is this understanding correct ?

If what I have mentioned is really a problem with count algorithms,
how should I count the elements in a container satisfying the
condition for large number of elements(ie greater than
difference_type) ?

Kindly clarify.

Thanks
V.Subramanian
Jun 27 '08 #1
Share this Question
Share on Google+
12 Replies


P: n/a
su**************@yahoo.com, India wrote:
Below is my understanding about count algorithms.

Return type of count and count_if algorithms is
iterator_traits<InputIterator>::difference_type.

If the container contains more than 'difference_type' elements
satisfying the condition, then count and count_if algorithm cannot
return a value greater than 'difference_type'.

As an example, suppose maximum value of 'difference_type' is INT_MAX.
Then the corresponding maximum value of 'size_type' is UINT_MAX. Now
consider the declaration
vector<charv;
where 'v' contains UINT_MAX elements each of which is 'a'.

Now if I do
count(v.begin(), v.end(), 'a')
then the actual count would be UINT_MAX but it cannot be represented
in 'difference_type' which is the return type of count algorithm.

So' won't we get erroneous results from count algorithms ?

I understand that difference_type is same as ptrdiff_t for
pointers(belonging to the same array) and size_type if same as size_t.
This means that the count algorithms can never return values from
ptrdiff_t + 1 upto size_t.

Is this understanding correct ?

If what I have mentioned is really a problem with count algorithms,
how should I count the elements in a container satisfying the
condition for large number of elements(ie greater than
difference_type) ?

Kindly clarify.

Thanks
V.Subramanian
In theory, you may be right, but in practice that would be a vector that
contains 4 billion entries, with all the overhead that a vector has per
entry.

Pragmatically speaking, you won't have to worry about overflow problems.

--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
Jun 27 '08 #2

P: n/a
Daniel Pitts wrote:
su**************@yahoo.com, India wrote:
>Below is my understanding about count algorithms.

Return type of count and count_if algorithms is
iterator_traits<InputIterator>::difference_type .

If the container contains more than 'difference_type' elements
satisfying the condition, then count and count_if algorithm cannot
return a value greater than 'difference_type'.

As an example, suppose maximum value of 'difference_type' is INT_MAX.
Then the corresponding maximum value of 'size_type' is UINT_MAX. Now
consider the declaration
vector<charv;
where 'v' contains UINT_MAX elements each of which is 'a'.

Now if I do
count(v.begin(), v.end(), 'a')
then the actual count would be UINT_MAX but it cannot be represented
in 'difference_type' which is the return type of count algorithm.

So' won't we get erroneous results from count algorithms ?

I understand that difference_type is same as ptrdiff_t for
pointers(belonging to the same array) and size_type if same as size_t.
This means that the count algorithms can never return values from
ptrdiff_t + 1 upto size_t.

Is this understanding correct ?

If what I have mentioned is really a problem with count algorithms,
how should I count the elements in a container satisfying the
condition for large number of elements(ie greater than
difference_type) ?

Kindly clarify.

Thanks
V.Subramanian
In theory, you may be right, but in practice that would be a vector that
contains 4 billion entries, with all the overhead that a vector has per
entry.

Actually, in all implementations I know, a vector has no overhead per entry.
It has a small constant overhead since it stores its size and the capacity.

Pragmatically speaking, you won't have to worry about overflow problems.
Agreed.
Best

Kai-Uwe Bux
Jun 27 '08 #3

P: n/a
On 2008-04-26 22:35:48 -0400, "su**************@yahoo.com, India"
<su**************@yahoo.comsaid:
Below is my understanding about count algorithms.

Return type of count and count_if algorithms is
iterator_traits<InputIterator>::difference_type.

If the container contains more than 'difference_type' elements
satisfying the condition, then count and count_if algorithm cannot
return a value greater than 'difference_type'.

As an example, suppose maximum value of 'difference_type' is INT_MAX.
Then the corresponding maximum value of 'size_type' is UINT_MAX. Now
consider the declaration
vector<charv;
where 'v' contains UINT_MAX elements each of which is 'a'.
You can always create unsigned integer values that have no
corresponding signed value. Similarly, you can always create signed
integer values that have no corresponding unsigned value. That's a
well-known and unsolvable limitation of size_t and ptrdiff_t, and
anything else that tries to talk about sizes and differences.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Jun 27 '08 #4

P: n/a
Pete Becker wrote:
On 2008-04-26 22:35:48 -0400, "su**************@yahoo.com, India"
<su**************@yahoo.comsaid:
>Below is my understanding about count algorithms.

Return type of count and count_if algorithms is
iterator_traits<InputIterator>::difference_type .

If the container contains more than 'difference_type' elements
satisfying the condition, then count and count_if algorithm cannot
return a value greater than 'difference_type'.

As an example, suppose maximum value of 'difference_type' is INT_MAX.
Then the corresponding maximum value of 'size_type' is UINT_MAX. Now
consider the declaration
vector<charv;
where 'v' contains UINT_MAX elements each of which is 'a'.

You can always create unsigned integer values that have no
corresponding signed value. Similarly, you can always create signed
integer values that have no corresponding unsigned value. That's a
well-known and unsolvable limitation of size_t and ptrdiff_t, and
anything else that tries to talk about sizes and differences.
In the special case of differences between iterators and sizes of
containers, the problem seems solvable: the standard could require (a) the
existence of a signed type sufficiently big to hold all differences between
iterators that can occur on the given implementation and (b) require that
ptrdiff_t and the various difference_type are typedefs for such signed
type.

On a related note: it is even guaranteed that push_back() on a vector cannot
make its size overflow to 0?
Best

Kai-Uwe Bux
Jun 27 '08 #5

P: n/a
On 2008-04-27 08:05:14 -0400, Kai-Uwe Bux <jk********@gmx.netsaid:
Pete Becker wrote:
>On 2008-04-26 22:35:48 -0400, "su**************@yahoo.com, India"
<su**************@yahoo.comsaid:
>>Below is my understanding about count algorithms.

Return type of count and count_if algorithms is
iterator_traits<InputIterator>::difference_typ e.

If the container contains more than 'difference_type' elements
satisfying the condition, then count and count_if algorithm cannot
return a value greater than 'difference_type'.

As an example, suppose maximum value of 'difference_type' is INT_MAX.
Then the corresponding maximum value of 'size_type' is UINT_MAX. Now
consider the declaration
vector<charv;
where 'v' contains UINT_MAX elements each of which is 'a'.

You can always create unsigned integer values that have no
corresponding signed value. Similarly, you can always create signed
integer values that have no corresponding unsigned value. That's a
well-known and unsolvable limitation of size_t and ptrdiff_t, and
anything else that tries to talk about sizes and differences.

In the special case of differences between iterators and sizes of
containers, the problem seems solvable: the standard could require (a) the
existence of a signed type sufficiently big to hold all differences between
iterators that can occur on the given implementation and (b) require that
ptrdiff_t and the various difference_type are typedefs for such signed
type.
What that amounts to is artificially limiting the size of a container
to the non-negative values of a signed type. If you do that, surely
someone will complain that they want their containers to be larger, and
don't care about difference_type.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Jun 27 '08 #6

P: n/a

"Pete Becker" <pe**@versatilecoding.comwrote in message
news:2008042710290416807-pete@versatilecodingcom...
On 2008-04-27 08:05:14 -0400, Kai-Uwe Bux <jk********@gmx.netsaid:
>Pete Becker wrote:
>>On 2008-04-26 22:35:48 -0400, "su**************@yahoo.com, India"
<su**************@yahoo.comsaid:

Below is my understanding about count algorithms.

Return type of count and count_if algorithms is
iterator_traits<InputIterator>::difference_type .

If the container contains more than 'difference_type' elements
satisfying the condition, then count and count_if algorithm cannot
return a value greater than 'difference_type'.

As an example, suppose maximum value of 'difference_type' is INT_MAX.
Then the corresponding maximum value of 'size_type' is UINT_MAX. Now
consider the declaration
vector<charv;
where 'v' contains UINT_MAX elements each of which is 'a'.

You can always create unsigned integer values that have no
corresponding signed value. Similarly, you can always create signed
integer values that have no corresponding unsigned value. That's a
well-known and unsolvable limitation of size_t and ptrdiff_t, and
anything else that tries to talk about sizes and differences.

In the special case of differences between iterators and sizes of
containers, the problem seems solvable: the standard could require (a)
the
existence of a signed type sufficiently big to hold all differences
between
iterators that can occur on the given implementation and (b) require that
ptrdiff_t and the various difference_type are typedefs for such signed
type.

What that amounts to is artificially limiting the size of a container to
the non-negative values of a signed type. If you do that, surely someone
will complain that they want their containers to be larger, and don't care
about difference_type.
The real problem is that containers have excessive requirements. arrays and
lookups dont in practise need to be iterable at all, but you can still write
generic algorithms. Remove the iterator paradigm and do for_each (container)
in parallel for example. STL was a major innovator for generic programming
but way way far from perfect.... Too many blooming iterators.

regards
Andy Little
Jun 27 '08 #7

P: n/a
On 2008-04-27 11:43:09 -0400, "kwikius" <an**@servocomm.freeserve.co.uksaid:
>
The real problem is that containers have excessive requirements. arrays and
lookups dont in practise need to be iterable at all, but you can still write
generic algorithms. Remove the iterator paradigm and do for_each (container)
in parallel for example. STL was a major innovator for generic programming
but way way far from perfect.... Too many blooming iterators.
Without iterators STL is something entirely different. Container-based
algorithms are nowhere near as flexible as iterator-based ones, because
the sequences that iterators refer to don't need to come from
containers.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Jun 27 '08 #8

P: n/a
Kai-Uwe Bux wrote:
Pete Becker wrote:
>On 2008-04-26 22:35:48 -0400, "su**************@yahoo.com, India"
<su**************@yahoo.comsaid:
>>Below is my understanding about count algorithms.

Return type of count and count_if algorithms is
iterator_traits<InputIterator>::difference_typ e.

If the container contains more than 'difference_type' elements
satisfying the condition, then count and count_if algorithm cannot
return a value greater than 'difference_type'.

As an example, suppose maximum value of 'difference_type' is
INT_MAX. Then the corresponding maximum value of 'size_type' is
UINT_MAX. Now consider the declaration
vector<charv;
where 'v' contains UINT_MAX elements each of which is 'a'.

You can always create unsigned integer values that have no
corresponding signed value. Similarly, you can always create signed
integer values that have no corresponding unsigned value. That's a
well-known and unsolvable limitation of size_t and ptrdiff_t, and
anything else that tries to talk about sizes and differences.

In the special case of differences between iterators and sizes of
containers, the problem seems solvable: the standard could require
(a) the existence of a signed type sufficiently big to hold all
differences between iterators that can occur on the given
implementation and (b) require that ptrdiff_t and the various
difference_type are typedefs for such signed type.
Isn't that what happens in practice?

On the most popular 32 bit systems, you would have to try extremely
hard to allocate a vector<charthat uses more than half the available
address space. For any other types, sizes and ptrdiff values are much
smaller.
Bo Persson


Jun 27 '08 #9

P: n/a

"Pete Becker" <pe**@versatilecoding.comwrote in message
news:2008042711481916807-pete@versatilecodingcom...
On 2008-04-27 11:43:09 -0400, "kwikius" <an**@servocomm.freeserve.co.uk>
said:
>>
The real problem is that containers have excessive requirements. arrays
and
lookups dont in practise need to be iterable at all, but you can still
write
generic algorithms. Remove the iterator paradigm and do for_each
(container)
in parallel for example. STL was a major innovator for generic
programming
but way way far from perfect.... Too many blooming iterators.

Without iterators STL is something entirely different. Container-based
algorithms are nowhere near as flexible as iterator-based ones, because
the sequences that iterators refer to don't need to come from containers.
Most containers arent sequences... except in STL "every container must be a
list". This is a viewpoint imposed by the iterator paradigm.
Custom conforming containers are very complicated to create as need to
provide full custom iterator shebang.

Rather than creating iterators it is just as possible to create views of
arbitrary entities as containers. You only need to create one container view
( as opposed to 2 iterators to create 2 ends of a container view ).

Containers can have simpler requirements... easier to fulfill requirements
and create custom containers

It is also possible to view one container type as another.
increases flexibility for the user and also enables the user to create
custom
containers more easily because each container type has fewer requirements.

But the most important point is it removes the necessity of exposing the
blooming iterators in every algorithm, which makes source code less
expressive :

fun(my_container, your_container , f) ; //overloaded or internally adapted

as against

fun(my_container.begin(), my_container.end(), my_container.begin(),
your_container.end(), f) ; //error?

The loser may be the library implementor of course, but library author to
user is one to many ratio so total code written reduced.
OTOH the library implementor has more scope to optimise because he doesnt
need necessarily to conform internal to the function to the iterator
paradigm.

regards
Andy Little



Jun 27 '08 #10

P: n/a
On 2008-04-27 17:59:37 -0400, "kwikius" <an**@servocomm.freeserve.co.uksaid:
>
"Pete Becker" <pe**@versatilecoding.comwrote in message
news:2008042711481916807-pete@versatilecodingcom...
>On 2008-04-27 11:43:09 -0400, "kwikius" <an**@servocomm.freeserve.co.uk>
said:
>>>
The real problem is that containers have excessive requirements. arrays
and
lookups dont in practise need to be iterable at all, but you can still
write
generic algorithms. Remove the iterator paradigm and do for_each
(container)
in parallel for example. STL was a major innovator for generic
programming
but way way far from perfect.... Too many blooming iterators.

Without iterators STL is something entirely different. Container-based
algorithms are nowhere near as flexible as iterator-based ones, because
the sequences that iterators refer to don't need to come from containers.

Most containers arent sequences... except in STL "every container must be a
list". This is a viewpoint imposed by the iterator paradigm.
This is a viewpoint imposed by the need to operate every element in a
container.
Custom conforming containers are very complicated to create as need to
provide full custom iterator shebang.

Rather than creating iterators it is just as possible to create views of
arbitrary entities as containers. You only need to create one container view
( as opposed to 2 iterators to create 2 ends of a container view ).

If you need to operate on every element in a container, you end up
treating the container as a sequence. Creating one container view
versus two iterators doesn't change that.

>
Containers can have simpler requirements... easier to fulfill requirements
and create custom containers
Since there is no description here of how this paradigm works, I can't
assess this assertion.
>
It is also possible to view one container type as another.
increases flexibility for the user and also enables the user to create
custom
containers more easily because each container type has fewer requirements.

But the most important point is it removes the necessity of exposing the
blooming iterators in every algorithm, which makes source code less
expressive :

fun(my_container, your_container , f) ; //overloaded or internally adapted

as against

fun(my_container.begin(), my_container.end(), my_container.begin(),
your_container.end(), f) ; //error?
At the cost of having to pretend that the keyboard, for example, is a
container rather than the source of a sequence of characters.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Jun 27 '08 #11

P: n/a

"Pete Becker" <pe**@versatilecoding.comwrote in message
news:2008042807104016807-pete@versatilecodingcom...
On 2008-04-27 17:59:37 -0400, "kwikius" <an**@servocomm.freeserve.co.uk>
said:
>>
"Pete Becker" <pe**@versatilecoding.comwrote in message
news:2008042711481916807-pete@versatilecodingcom...
>>On 2008-04-27 11:43:09 -0400, "kwikius" <an**@servocomm.freeserve.co.uk>
<...>
>>Without iterators STL is something entirely different. Container-based
algorithms are nowhere near as flexible as iterator-based ones, because
the sequences that iterators refer to don't need to come from
containers.

Most containers arent sequences... except in STL "every container must be
a
list". This is a viewpoint imposed by the iterator paradigm.

This is a viewpoint imposed by the need to operate every element in a
container.
Not all containers need to have every element operated on. lookup is an
example. You don't need to iterate a lookup for its main functionality . Now
when you need to iterate over a lookup you can create a listview, and not
just one but many, sorted by key , or key range sorted by value or value
range, arbitrary sort function, and even layer views. You can apply
functionality for the particular purpose when you need it and then discard
it.
> Custom conforming containers are very complicated to create as need to
provide full custom iterator shebang.

Rather than creating iterators it is just as possible to create views of
arbitrary entities as containers. You only need to create one container
view
( as opposed to 2 iterators to create 2 ends of a container view ).


If you need to operate on every element in a container, you end up
treating the container as a sequence. Creating one container view versus
two iterators doesn't change that.

>>
Containers can have simpler requirements... easier to fulfill
requirements
and create custom containers

Since there is no description here of how this paradigm works, I can't
assess this assertion.
Its about concepts dealing with one single functionality of a type in
expression rather than monolithic concepts covering large amounts of
functionality ( e.g HasPlus preferred to Arithmetic ) think of eg
ForEachableContainer CountIfAbleContainer etc, like the modern move from
"traits blob" to metafunctions.

<...>
>But the most important point is it removes the necessity of exposing the
blooming iterators in every algorithm, which makes source code less
expressive :

fun(my_container, your_container , f) ; //overloaded or internally
adapted

as against

fun(my_container.begin(), my_container.end(), my_container.begin(),
your_container.end(), f) ; //error?

At the cost of having to pretend that the keyboard, for example, is a
container rather than the source of a sequence of characters.
Oh don't get me started on istream.

Simple console input is a lazy list with customisable terminator... eg for
continuous multiline input, disc istream is array or tuple, database istream
a lookup etc...

regards
Andy Little
Jun 27 '08 #12

P: n/a
On 2008-04-28 13:31:34 -0400, "kwikius" <an**@servocomm.freeserve.co.uksaid:
>
Not all containers need to have every element operated on.
In which case they don't have to have iterators.
lookup is an
example. You don't need to iterate a lookup for its main functionality .
Sigh. You have to be able to look at every element, i.e. "operate on"
every element.
>>
Since there is no description here of how this paradigm works, I can't
assess this assertion.

Its about concepts dealing with one single functionality of a type in
expression rather than monolithic concepts covering large amounts of
functionality ( e.g HasPlus preferred to Arithmetic ) think of eg
ForEachableContainer CountIfAbleContainer etc, like the modern move from
"traits blob" to metafunctions.
I'm sorry, I have no idea what that means.
>
<...>
>>But the most important point is it removes the necessity of exposing the
blooming iterators in every algorithm, which makes source code less
expressive :

fun(my_container, your_container , f) ; //overloaded or internally
adapted

as against

fun(my_container.begin(), my_container.end(), my_container.begin(),
your_container.end(), f) ; //error?

At the cost of having to pretend that the keyboard, for example, is a
container rather than the source of a sequence of characters.

Oh don't get me started on istream.
Who said anything about istream? Regardless, you have to deal with the
issue of pretending that a keyboard is a container.
>
Simple console input is a lazy list with customisable terminator... eg for
continuous multiline input, disc istream is array or tuple, database istream
a lookup etc...
That's what I said: you have to pretend that the keyboard is a container.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Jun 27 '08 #13

This discussion thread is closed

Replies have been disabled for this discussion.