I am writing a program which needs to include a large amount of data.
Basically, the data are p values for different possible outcomes from
trials with different numbers of observations (the p values are
necessarily based on slow simulations rather than on a standard
function, so I estimated them once and want the program to include
this information). Currently, I have this stored as a vector of
vectors of varying sizes (the first vector is indexed by the number of
observations for the trial; for each number of observations, there is
a vector containing a p value for each possible number of successes, with
these vectors getting longer as the number of observations (and
therefore possible successes) increases). I created a class containing
this vector of vectors; my program, on starting, creates an object of
this class. However, the file containing just this class is ~50,000
lines long and 10 MB in size, and takes a great deal of time to
compile, especially with optimization turned on. Is there a better way
of building large amounts of data into C++ programs? I could just
include a separate data file and have the program load it on
starting, but then the program would need to know where the file is,
even when I distribute it. In case this helps, I am already using the
GNU Scientific Library in the program, so using any of its functions
is an easy option. My apologies if this question has an obvious,
standard solution I should already know about.
Excerpt from the class file (CDFvectorholder) containing the vector of
vectors:
vector<vector<double> > CDFvectorholder::Initialize() {
    vector<vector<double> > CDFvectorcontents;
    vector<double> contentsofrow;
    contentsofrow.push_back(0.33298);
    contentsofrow.push_back(1);
    CDFvectorcontents.push_back(contentsofrow); // comparison where ntax=3
    contentsofrow.clear();
    contentsofrow.push_back(0.07352);
    contentsofrow.push_back(0.14733);
    contentsofrow.push_back(0.33393);
    contentsofrow.push_back(0.78019);
    contentsofrow.push_back(1);
    CDFvectorcontents.push_back(contentsofrow); // comparison where ntax=4
    contentsofrow.clear();
    contentsofrow.push_back(0.01209);
    contentsofrow.push_back(0.03292);
    contentsofrow.push_back(0.04202);
    contentsofrow.push_back(0.0767);
    contentsofrow.push_back(0.13314);
    contentsofrow.push_back(0.23417);
    contentsofrow.push_back(0.40921);
    contentsofrow.push_back(0.58934);
    contentsofrow.push_back(0.82239);
    contentsofrow.push_back(0.98537);
    contentsofrow.push_back(1);
    CDFvectorcontents.push_back(contentsofrow); // comparison where ntax=5
    // ETC
    return CDFvectorcontents;
}
and the main program file, initializing the vector of vectors:

    vector<vector<double> > CDFvector;
    CDFvectorholder bob;
    CDFvector = bob.Initialize();

and using it:

    double cdfundermodel = CDFvector[integerB][integerA];
Thank you,
Brian O'Meara
"bc******@ucdav is.edu" <om**********@g mail.comwrote in message
news:11******** *************@l 77g2000hsb.goog legroups.com...
> [original question and code quoted in full, snipped]
Data does not belong in code. The data should go in a separate file.
Normally this data file would be in the same directory as the executable.
If you really think the user will lose the data file, you can do the trick
of adding it to the end of the executable (if your OS allows it).
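For the append-to-the-executable trick, a minimal sketch (helper names
are made up for illustration; assumes the OS loader tolerates trailing
bytes after the program image, which ELF and PE executables generally do,
and that the program can open its own file, e.g. via argv[0]):

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Append the data after the program image, followed by an 8-byte length
// footer so the reader can locate the payload by seeking from the end.
void append_payload(const std::string& exe_path, const std::string& payload) {
    std::ofstream out(exe_path.c_str(), std::ios::binary | std::ios::app);
    out.write(payload.data(), payload.size());
    unsigned long long n = payload.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof n);  // footer
}

// Read the length footer, then seek back over the payload and read it.
std::string read_payload(const std::string& exe_path) {
    std::ifstream in(exe_path.c_str(), std::ios::binary);
    unsigned long long n = 0;
    in.seekg(-static_cast<std::streamoff>(sizeof n), std::ios::end);
    in.read(reinterpret_cast<char*>(&n), sizeof n);
    in.seekg(-static_cast<std::streamoff>(n + sizeof n), std::ios::end);
    std::string payload(static_cast<std::size_t>(n), '\0');
    in.read(&payload[0], payload.size());
    return payload;
}
```

A build step would run append_payload once on the distributed binary; the
program itself only ever calls read_payload at start-up.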
bc******@ucdavis.edu wrote:
> I am writing a program which needs to include a large amount of data.
> Basically, the data are p values for different possible outcomes from
> trials with different number of observations (the p values are
> necessarily based on slow simulations rather than on a standard
> function, so I estimated them once and want the program to include
> this information).
I sincerely hope that the data reside in a separate, include-able
source file, which is generated by some other program somehow, instead
of being typed in by a human reading some other print-out or protocol
of some experiment...
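A tiny generator along those lines (file names and input format are
hypothetical: one row of simulated p values per line) might look like:

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Read one row of p values per line from the simulation output and emit
// a C++ source file with one statically initialized array per row.
void generate(const std::string& in_path, const std::string& out_path) {
    std::ifstream in(in_path.c_str());
    std::ofstream out(out_path.c_str());
    out << "namespace DATA {\n";
    std::string line;
    int row = 0;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        std::vector<double> vals;
        double v;
        while (iss >> v)
            vals.push_back(v);
        if (vals.empty())
            continue;  // skip blank lines
        out << "double data_" << row << "[" << vals.size() << "] = { ";
        for (std::size_t i = 0; i < vals.size(); ++i)
            out << (i ? ", " : "") << vals[i];
        out << " };\n";
        ++row;
    }
    out << "} // namespace DATA\n";
}
```

Rerunning the generator whenever the simulations are redone keeps the
compiled-in table in sync with the data, with no hand-typing.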
> Currently, I have this stored as a vector of vectors of varying sizes
> [...] the file containing just this class is ~50,000 lines long and
> 10 MB in size, and takes a great deal of time to compile, especially
> with optimization turned on. Is there a better way of building large
> amounts of data into C++ programs?
Something like

------------------- experiments.cpp (generated)
namespace DATA {
double data_000[5] = { 0.0, 1., 2.2, 3.33, 4.444 };
double data_001[7] = { 0.0, 1.1, 2.222, 3.3333, 4.44444, 5.55, 6.66 };
....
double data_042[3] = { 1.1, 2.22, 3.333 };
std::vector<double> data[] = {
    std::vector<double>(data_000,
        data_000 + sizeof(data_000) / sizeof(double)),
    std::vector<double>(data_001,
        data_001 + sizeof(data_001) / sizeof(double)),
    ...
    std::vector<double>(data_042,
        data_042 + sizeof(data_042) / sizeof(double)),
};
} // namespace DATA
------------------- my_vectors.cpp
#include <experiments.cpp>
std::vector<std::vector<double> >
    CDFvectorcontents(DATA::data,
        DATA::data + sizeof(DATA::data) / sizeof(DATA::data[0]));
-----------------------------------
?
> [remainder of the original question, code excerpt, and signature snipped]
V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

bc******@ucdavis.edu wrote:
> [original question and code quoted in full, snipped]
If it is truly a "large" amount of data (say >4meg compiled) then you
can think of using a container that can be statically initialized.
e.g.

// in header
struct datatype
{
    double coeffs1[5];
    double coeffs2[100];
    double coeffs3[20];
};
extern datatype data;

// in data file
datatype data = {
    { 0.2, 0.4, 0.6 },
    { 1.1, 1.2 },
    { 0.1, 0.2 }
};
You could write a wrapper class that "looks" like a const std::vector
that wraps either a std::vector or a regular array so that you don't
need to make copies of the data you have.
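A minimal sketch of such a wrapper (the class name is made up): a
read-only, vector-like view over a statically initialized C array, so the
compiled-in data is never copied at start-up.

```cpp
#include <cstddef>

// Read-only view over a C array, exposing the parts of the std::vector
// interface that lookup code typically needs (iteration, size, indexing).
class ConstDoubleSpan {
public:
    ConstDoubleSpan(const double* p, std::size_t n) : p_(p), n_(n) {}
    const double* begin() const { return p_; }
    const double* end()   const { return p_ + n_; }
    std::size_t size() const { return n_; }
    bool empty() const { return n_ == 0; }
    double operator[](std::size_t i) const { return p_[i]; }
private:
    const double* p_;
    std::size_t n_;
};
```

Each row of the table would become one such view over the corresponding
static array, so indexing works exactly as with the vector of vectors.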
On Apr 9, 11:17 pm, "bcome...@ucdavis.edu" <omeara.br...@gmail.com>
wrote:
> [...] However, the file containing just this class is ~50,000 lines
> long and 10 MB in size, and takes a great deal of time to compile,
> especially with optimization turned on.
If it's just data, optimization should make no difference.
> Is there a better way of building large amounts of data into C++
> programs? [...]
> [excerpt of CDFvectorholder::Initialize(), building the vector of
> vectors with repeated push_back calls, snipped]
And this is called at program start-up? Start-up isn't going to
be very fast.
> [initialization and indexing code snipped]
I'd say that this is one case I'd use C style arrays, and static
initialization. It will still take some time to compile it, but
nowhere near as much as if you call a function on a templated
class for each element. And start-up time will be effectively
zero.
If you do need some of the additional features of std::vector,
then you can still use the static, C-style array to initialize
it, e.g.
    std::vector<double>( startAddress, endAddress );
(Whatever code generates the C-style array can also be used to
generate the startAddress and endAddress variables.)
--
James Kanze (GABI Software)            email: ja*********@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34