Framework for iterating over product types?

Background: I have some structs containing std::strings and std::vectors of
other structs containing std::strings and std::vectors of .... I'd like to
make a std::vector of these. Unfortunately the overhead of the useless
copies made each time the vector is resized is too large for me to ignore. I
know that rvalue references fix this problem, but I don't think they'll be
widely available for years and I need something that works now. There's no
sensible value I can pass to reserve(). vector<shared_ptr> is faster, but
the per-item allocation overhead is still significant, and the interface is
different. I could wrap it with the interface I wanted, but that might be
error-prone since it would violate std::vector's memory layout guarantee.

I've seriously thought about writing a linear_vector which moves instead of
copying its elements, perhaps calling

relocate(T* dst, T& src)

which could be overloaded for each type. But writing relocate() for every
type I care about is a hassle; worse, there's no good place to put the
definitions. Putting them in the header with the struct itself feels like an
abstraction violation (why should classes need to anticipate that they might
be put in a linear_vector?), and putting it anywhere else is out of the
question since it will inevitably get out of sync with the struct, leading
to nasty bugs.
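
A minimal sketch of what such a relocate() overload set could look like, assuming pre-C++11 code with no rvalue references. The generic fallback simply copy-constructs into the destination and destroys the source; the std::string overload is an illustration (not something the standard library provides) that avoids the deep copy by default-constructing and swapping. The signature follows the post; `relocate_demo` is a hypothetical helper added only to show usage.

```cpp
#include <cassert>
#include <new>
#include <string>

// Generic fallback: copy-construct into raw memory at dst, then destroy src.
// After the call, src is dead storage and must not be destroyed again.
template<class T>
void relocate(T* dst, T& src) {
    new (static_cast<void*>(dst)) T(src);
    src.~T();
}

// Overload for std::string: take over the buffer with swap() instead of copying it.
inline void relocate(std::string* dst, std::string& src) {
    typedef std::string string_type;
    new (static_cast<void*>(dst)) string_type();
    dst->swap(src);
    src.~string_type();
}

// Hypothetical usage: relocate a string from one raw buffer to another.
inline std::string relocate_demo() {
    typedef std::string string_type;
    void* raw_src = ::operator new(sizeof(string_type));
    void* raw_dst = ::operator new(sizeof(string_type));
    string_type* src = new (raw_src) string_type("hello, relocation");
    string_type* dst = static_cast<string_type*>(raw_dst);

    relocate(dst, *src);          // overload resolution picks the std::string version

    string_type result(*dst);
    dst->~string_type();
    ::operator delete(raw_dst);
    ::operator delete(raw_src);   // the source object was already destroyed by relocate()
    return result;
}
```

The non-template overload wins over the template for std::string, which is the per-type customization the post has in mind; the open question of where such overloads should live is unchanged by this sketch.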

I would feel happier about writing boilerplate in each class if it were more
widely applicable. I thought about piggybacking on Boost's serialize()
functions, which often look like this:

struct Foo {
    int a, b, c;
    std::vector< std::map<std::string,int> > d;

    // ...

    template<class Archive>
    void serialize(Archive& ar, unsigned version) {
        ar & a & b & c & d;
    }
};

but since you can only call serialize on one instance at once, you'd have to
make separate passes over the source struct and the destination memory. This
is unpleasant at the least, and I'm not sure it could be made to work at
all. But a static method supplying pointers-to-members avoids this problem:

struct Foo {
    // ...

    template<class T>
    static void enum_children(T& has) {
        has (&Foo::a) (&Foo::b) (&Foo::c) (&Foo::d);
    }
};

Then I could write something like:

template<class T, class X>
void relocate_member(T* dst, T& src, X T::* mbr) {
    // Parentheses are required: unary & binds tighter than ->*.
    relocate(&(dst->*mbr), src.*mbr);
}

// SFINAE magic omitted
template<class T>
void relocate(T* dst, T& src) {
    T::enum_children(bind(&relocate_member, dst, src));
}

and furthermore I could write generic implementations of a lot of other
useful functions, like

* a better default swap() than the standard library's

* componentwise operator==() and lexicographical operator<()

* Boost serialize()

* iostream inserters and extractors using (e.g.) a notation
resembling initializer lists

* the usual (deep) visitor pattern

all of which are pretty commonly needed and annoying to write by hand. A
similar idea is implemented in Haskell and described in the "scrap your
boilerplate" papers [1], which have additional motivating examples.

[1] http://research.microsoft.com/~simonpj/papers/hmap/
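
A minimal sketch of how two of the generic functions listed above (a componentwise operator== and an iostream inserter) could be driven by enum_children. It assumes that each visitor's operator() returns a reference to the visitor so that the chained `has (...) (...)` syntax compiles; the post does not spell this out. `Point3`, `equal_member`, `print_member`, `generic_equal` and `generic_print` are hypothetical names.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

struct Point3 {                       // hypothetical example type
    int x, y, z;
    std::string label;

    template<class Visitor>
    static void enum_children(Visitor& has) {
        has (&Point3::x) (&Point3::y) (&Point3::z) (&Point3::label);
    }
};

// Visitor that compares two objects member by member.
template<class T>
class equal_member {
public:
    equal_member(const T& a, const T& b) : a_(a), b_(b), equal_(true) {}

    template<class X>
    equal_member& operator()(X T::* p) {
        if (equal_ && !(a_.*p == b_.*p)) equal_ = false;
        return *this;                 // returning *this enables chaining
    }
    bool result() const { return equal_; }

private:
    const T& a_; const T& b_; bool equal_;
};

template<class T>
bool generic_equal(const T& a, const T& b) {
    equal_member<T> visitor(a, b);    // named object: enum_children takes Visitor&
    T::enum_children(visitor);
    return visitor.result();
}

// Visitor that writes the members as a comma-separated list.
template<class T>
class print_member {
public:
    print_member(std::ostream& os, const T& obj) : os_(os), obj_(obj), first_(true) {}

    template<class X>
    print_member& operator()(X T::* p) {
        if (!first_) os_ << ", ";
        os_ << obj_.*p;
        first_ = false;
        return *this;
    }

private:
    std::ostream& os_; const T& obj_; bool first_;
};

template<class T>
void generic_print(std::ostream& os, const T& obj) {
    os << "{ ";
    print_member<T> visitor(os, obj);
    T::enum_children(visitor);
    os << " }";
}
```

The per-class cost is the single enum_children definition; every further generic operation is one visitor class written once.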

Of course, I don't want to invent my own private version of this technique;
I want to use a standardized library. I can't be the first person to propose
this for C++, but I can't find anything like it in Boost, to my surprise. Am
I missing something?

Addendum: you probably noticed (it took me annoyingly long to notice) that
my relocate() interface is not a very good one, since there is, to my
knowledge, no guarantee in the standard that any non-POD type can be
correctly relocated by relocating its members. I don't think the proposed
C++0x changes fix this. Couldn't there be a guarantee that a struct like

struct X { A a; B b; };

can be placement-constructed by placement-constructing its members,
regardless of the POD-ness of those members? This is a kind of "shallow POD"
as distinguished from the usual "deep" POD. Has this been discussed?

-- Ben
Nov 27 '07 #1
On Tue, 27 Nov 2007 20:30:06 +0100, "Alf P. Steinbach" wrote:
>* Ben Rudiak-Gould:
>> Background: I have some structs containing std::strings and std::vectors
>> of other structs containing std::strings and std::vectors of .... I'd
>> like to make a std::vector of these. Unfortunately the overhead of the
>> useless copies made each time the vector is resized is too large for me
>> to ignore.
>
> Indirection.

Have you really understood what he means? I haven't. Or have you
merely read the first paragraph?
--
Roland Pibinger
"The best software is simple, elegant, and full of drama" - Grady Booch
Nov 27 '07 #2
Ben Rudiak-Gould wrote:
> Background: I have some structs containing std::strings and std::vectors
> of other structs containing std::strings and std::vectors of .... I'd like
> to make a std::vector of these. Unfortunately the overhead of the useless
> copies made each time the vector is resized is too large for me to ignore.
> I know that rvalue references fix this problem, but I don't think they'll
> be widely available for years and I need something that works now. There's
> no sensible value I can pass to reserve(). vector<shared_ptr> is faster,
> but the per-item allocation overhead is still significant, and the
> interface is different. I could wrap it with the interface I wanted, but
> that might be error-prone since it would violate std::vector's memory
> layout guarantee.
>
> I've seriously thought about writing a linear_vector which moves instead
> of copying its elements, perhaps calling
>
> relocate(T* dst, T& src)
>
> which could be overloaded for each type. But writing relocate() for every
> type I care about is a hassle; worse, there's no good place to put the
> definitions. Putting them in the header with the struct itself feels like
> an abstraction violation (why should classes need to anticipate that they
> might be put in a linear_vector?),

Because that is a conceptual requirement of linear_vector. It makes perfect
sense to put the relocate function with the class so that it is found
through ADL. This is just the same as all other conceptual requirements
like CopyConstructible or Swappable (in the current draft for C++0X).

> and putting it anywhere else is out of
> the question since it will inevitably get out of sync with the struct,
> leading to nasty bugs.

Agreed.
[snip]
Best

Kai-Uwe Bux
Nov 27 '07 #3
Roland Pibinger wrote:
> On Tue, 27 Nov 2007 20:30:06 +0100, "Alf P. Steinbach" wrote:
>>* Ben Rudiak-Gould:
>>> Background: I have some structs containing std::strings and std::vectors
>>> of other structs containing std::strings and std::vectors of .... I'd
>>> like to make a std::vector of these. Unfortunately the overhead of the
>>> useless copies made each time the vector is resized is too large for me
>>> to ignore.
>>
>> Indirection.
>
> Have you really understood what he means? I haven't. Or have you
> merely read the first paragraph?

I read further, but the OP lost me somewhere around the lines

struct Foo {
    // ...

    template<class T>
    static void enum_children(T& has) {
        has (&Foo::a) (&Foo::b) (&Foo::c) (&Foo::d);
    }
};

Best

Kai-Uwe Bux

Nov 27 '07 #4
The whole boilerplate idea looks suspiciously like runtime reflection to
me, and while it may be an interesting challenge to build such a system
within C++ alone, in practice it turns out to be annoying (at least to me)
and error-prone (again, for me).

However, people have had success mining the required information from
the source code itself. Generating code or data structures from the output
of gccxml is a very popular choice, for example in the SEAL-REFLEX
project: <http://seal-reflex.web.cern.ch/seal-reflex/index.html>

The boost::python binding library also enumerates the members (to be
used by Python) and can also utilize gccxml to generate the enumeration
code. You might want to have a look at it.

Regarding the vector-resize problem: I would simply build a custom
vector type that holds a vector of pointers to small arrays of
constant size (say 16 or 256). Resizing such a vector of N elements
only requires copying N/16 pointers. Access by an integer index is a bit
more expensive, of course (two indirections, by [i/16] and [i%16]), but
not by much.
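
A minimal sketch of the chunked container described above, assuming a chunk size of 16 and supporting only push_back and indexing. `chunked_vector` and `chunked_vector_demo` are hypothetical names; copying, erasing and exception safety are left out to keep the idea visible.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Elements live in fixed-size chunks that never move; only the table of
// chunk pointers is reallocated as the container grows.
template<class T, std::size_t ChunkSize = 16>
class chunked_vector {
public:
    chunked_vector() : size_(0) {}

    ~chunked_vector() {
        for (std::size_t i = 0; i < size_; ++i) slot(i)->~T();
        for (std::size_t c = 0; c < chunks_.size(); ++c) ::operator delete(chunks_[c]);
    }

    void push_back(const T& value) {
        if (size_ == chunks_.size() * ChunkSize) {
            // All chunks are full: allocate one more. Existing elements stay put.
            void* raw = ::operator new(ChunkSize * sizeof(T));
            chunks_.push_back(static_cast<T*>(raw));  // raw leaks if this throws (sketch only)
        }
        new (static_cast<void*>(slot(size_))) T(value);
        ++size_;
    }

    // Two indirections: chunk table lookup, then offset within the chunk.
    T& operator[](std::size_t i) { return *slot(i); }
    const T& operator[](std::size_t i) const { return *slot(i); }
    std::size_t size() const { return size_; }

private:
    chunked_vector(const chunked_vector&);            // copying omitted from the sketch
    chunked_vector& operator=(const chunked_vector&);

    T* slot(std::size_t i) const { return chunks_[i / ChunkSize] + i % ChunkSize; }

    std::vector<T*> chunks_;
    std::size_t size_;
};

// Hypothetical check: the first element keeps its address while the container grows.
inline bool chunked_vector_demo() {
    chunked_vector<std::string> v;
    v.push_back("first");
    const std::string* before = &v[0];
    for (int i = 0; i < 1000; ++i) v.push_back("x");
    return &v[0] == before && v.size() == 1001 && v[0] == "first" && v[1000] == "x";
}
```

The price is the loss of contiguity across chunk boundaries, the same trade-off std::deque makes.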
Nov 27 '07 #5
Ben Rudiak-Gould wrote:
> Background: I have some structs containing std::strings and std::vectors
> of other structs containing std::strings and std::vectors of .... I'd like
> to make a std::vector of these. Unfortunately the overhead of the useless
> copies made each time the vector is resized is too large for me to ignore.

Another thought: if you don't need the contiguity guarantee of std::vector,
you could use std::deque instead.
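
A small sketch that makes the difference measurable, assuming an element type whose copy constructor counts its calls. `Counted` and `copies_made_by` are hypothetical names. Because `Counted` declares a copy constructor and no move constructor, a reallocating std::vector has to copy the existing elements even under C++11 and later.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// Appends n elements to an empty Container and reports the copies made.
template<class Container>
int copies_made_by(int n) {
    Counted::copies = 0;
    Container c;
    Counted item;
    for (int i = 0; i < n; ++i) c.push_back(item);
    return Counted::copies;
}
```

With std::deque, appending n elements costs exactly n copies, since growth never relocates existing elements; with std::vector the count is higher because each reallocation copies everything stored so far.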
Best

Kai-Uwe Bux
Nov 27 '07 #6
Alf P. Steinbach wrote:
> Neither have I, the article is not a great example of clarity.

I seriously regret writing it in the bottom-up way that I did. The initial
motivating example was almost irrelevant. Here's a shorter version.

There are many classes for which swap() can be defined like this:

swap(this->a, other.a);
swap(this->b, other.b);
swap(this->c, other.c);
// ...

and serialize() can be defined like this:

ar & a;
ar & b;
ar & c;
// ...

and operator==() can be defined like this:

if (!(this->a == other.a)) return false;
if (!(this->b == other.b)) return false;
if (!(this->c == other.c)) return false;
// ...

and operator<() can be defined like this:

if (this->a < other.a) return true;
if (other.a < this->a) return false;
if (this->b < other.b) return true;
if (other.b < this->b) return false;
if (this->c < other.c) return true;
if (other.c < this->c) return false;
// ...

and so on. It's a hassle to write this boilerplate for a large number of
user-defined classes. All of the above boilerplate functions can be
auto-generated from a single per-class boilerplate method looking something
like this:

template<class Visitor>
static void enum_children(Visitor& has) {
    has (&Foo::a) (&Foo::b) (&Foo::c) /* ... */ ;
}

for example, here's how to do it for swap():

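template<class T>
class swap_member {
public:
    swap_member(T& a, T& b) : a(a), b(b) {}

    // Returns *this so that calls can be chained: has (&Foo::a) (&Foo::b) ...
    template<class X>
    swap_member& operator()(X T::* p) {
        using std::swap;   // fall back to std::swap, still allow ADL overloads
        swap(a.*p, b.*p);
        return *this;
    }

private:
    T& a; T& b;
};

template<class T>
void generic_swap(T& a, T& b) {
    // A named visitor is needed: a temporary cannot bind to Visitor&.
    swap_member<T> visitor(a, b);
    T::enum_children(visitor);
}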

For a system with a lot of boilerplate methods of this kind, this technique
looks like it could save a lot of time and avoid a lot of bugs. I was
surprised not to find support for it in Boost. I can write my own
implementation, but I'd rather use someone else's, or at least use a
standardized name for the method (instead of "enum_children") for future
compatibility. So, has anyone implemented this before or chosen a naming
convention for it?
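
For comparison, a minimal sketch of the lexicographical operator<() pattern from above expressed as a visitor over the same kind of enum_children method. It assumes the visitor's operator() returns a reference to itself so the chained call syntax works; `Employee`, `less_member` and `generic_less` are hypothetical names.

```cpp
#include <cassert>
#include <string>

struct Employee {                     // hypothetical example type
    std::string name;
    int age;

    template<class Visitor>
    static void enum_children(Visitor& has) {
        has (&Employee::name) (&Employee::age);
    }
};

// Visitor implementing lexicographical comparison: the first member that
// differs decides the result, later members are ignored.
template<class T>
class less_member {
public:
    less_member(const T& a, const T& b)
        : a_(a), b_(b), decided_(false), less_(false) {}

    template<class X>
    less_member& operator()(X T::* p) {
        if (!decided_) {
            if (a_.*p < b_.*p)      { decided_ = true; less_ = true; }
            else if (b_.*p < a_.*p) { decided_ = true; less_ = false; }
        }
        return *this;
    }
    bool result() const { return less_; }

private:
    const T& a_; const T& b_;
    bool decided_; bool less_;
};

template<class T>
bool generic_less(const T& a, const T& b) {
    less_member<T> visitor(a, b);
    T::enum_children(visitor);
    return visitor.result();
}
```

The order in which enum_children lists the members is the comparison order, so the method doubles as the definition of the sort key.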

-- Ben
Nov 28 '07 #7
Ben Rudiak-Gould wrote:
[..]
> For a system with a lot of boilerplate methods of this kind, this
> technique looks like it could save a lot of time and avoid a lot of
> bugs. I was surprised not to find support for it in Boost. I can
> write my own implementation, but I'd rather use someone else's, or at
> least use a standardized name for the method (instead of
> "enum_children") for future compatibility. So, has anyone implemented
> this before or chosen a naming convention for it?

In all of my programming career the need for a struct/class like that
has not arisen even once. Forgive my bluntness, but it looks like
an exemplary representation of a pure numerical (and rather table-
oriented) data chunk. For such data it is advised NOT to write
operator<, operator==, etc., as member functions. Use stand-alone
functions instead.

I honestly doubt that you have "a large number of user-defined classes"
that all look like the one you presented here. Two, three, maybe, but
"a large number"?

As to whether it's been done before, I am not sure, but if it hasn't,
write it (since you claim you can) and submit it to Boost. That's
how Boost grows. You will undoubtedly find those who also need the
functionality; they might even be interested in helping you write it,
and if not, no big deal, they'll tell you what's wrong with yours.
If you don't find people who need something like that, well, you may
be alone, but then you're stuck implementing it for yourself anyway.

And to simplify the task I'd probably put those 'a', 'b', 'c', etc.
members in arrays of the respective types (hopefully they are all
pretty much the same, aren't they?), and to access them I'd have
member functions that would simply return the array elements (by
ref or by value)...

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Nov 28 '07 #8
Victor Bazarov wrote:
> In all of my programming career the need for a struct/class like that
> has not arisen even once.

What kind of struct/class do you mean? Obviously I don't have a class with
members named 'a', 'b', 'c', etc. and all of those functions defined as
members. The letters of the alphabet were just placeholders for actual
member names. All that's relevant for the purposes of this idiom is the
pattern of code that repeats once for each data member of a class. It
doesn't matter if we're talking about member operator< or nonmember
operator< or member "less" or nonmember "less". I'm also not saying that I
have a bunch of classes defining every single one of the above methods, just
that I often find myself writing function bodies that look like this.

> As to whether it's been done before, I am not sure, but if it hasn't,
> write it (since you claim you can) and submit it to Boost.

I don't think I can write it well. Writing generic code that seems to work
most of the time is a lot easier than writing generic code that works. Also,
I just have a hard time believing that no one has formalized this pattern in
decades of C++ use. But if it really is new, I might indeed submit it to
Boost and hope their code review catches my mistakes. It would be a good
learning experience.

-- Ben
Nov 28 '07 #9
On Nov 28, 12:59 pm, Ben Rudiak-Gould <br276delet...@cam.ac.uk> wrote:
[..]
> and so on. It's a hassle to write this boilerplate for a large number of
> user-defined classes. All of the above boilerplate functions can be
> auto-generated from a single per-class boilerplate method looking something
> like this:
>
> template<class Visitor>
> static void enum_children(Visitor& has) {
>     has (&Foo::a) (&Foo::b) (&Foo::c) /* ... */ ;
> }

Thanks for reworking the post... I guess by the above
you mean a function will be applied to each member?

That would probably work in some cases, but what if you
want to swap all the members but not marshal them all?
I think that comes up sometimes.

In my efforts to address similar problems
(www.webebenezer.net), I want to avoid users having to
duplicate the member list even once. In my opinion, the
language/compilers have to change to better support
the automation of functions like you mention.
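
One way the "swap everything, marshal only some" case could be handled within the enum_children idiom is to select the member subset with a tag argument. This is only a sketch under that assumption; the tag types, `Session`, `count_members` and `member_count` are all hypothetical, and the approach does repeat part of the member list, which is exactly the duplication objected to above.

```cpp
#include <cassert>
#include <string>

// Hypothetical tags naming which subset of members is being enumerated.
struct all_members {};
struct persistent_members {};

struct Session {                      // hypothetical example type
    std::string user;
    int cache_hits;                   // swapped, but never written to an archive

    template<class Visitor>
    static void enum_children(Visitor& has, all_members) {
        has (&Session::user) (&Session::cache_hits);
    }
    template<class Visitor>
    static void enum_children(Visitor& has, persistent_members) {
        has (&Session::user);
    }
};

// Visitor that only counts the members it is shown.
struct count_members {
    int count;
    count_members() : count(0) {}

    template<class T, class X>
    count_members& operator()(X T::*) { ++count; return *this; }
};

// Number of members of T enumerated under the given tag.
template<class T, class Tag>
int member_count(Tag tag) {
    count_members visitor;
    T::enum_children(visitor, tag);
    return visitor.count;
}
```

A generic swap would pass `all_members()` and a generic serialize would pass `persistent_members()`, so the two operations see different member lists of the same class.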

Brian Wood
Ebenezer Enterprises
Nov 29 '07 #10
