Darsant wrote:
wk****@yahoo.com wrote: I think there's a function like "seek" or "fseek"
to seek to the end of the file. Either the seek function returns
a byte offset, or you call a "position"-style function (tellg() for
C++ streams, ftell() for C stdio) to get the byte offset, which is
effectively the size of the file in bytes. You then resize the vector to
sum_of_file_sizes / sizeof(float). You should then get reasonable
performance reading in the floats one at a time using buffered
I/O.
That's what I'm doing right now.
For each file, I read the header information into a temporary header
structure. The header tells me the amount of data in each channel. I
then reserve size()+newsamples for the vector, then push_back the new
readings from the temporary read buffer.
In a typical implementation of the "vector" template, an instance
of vector<float> holds a private pointer to float. This private
pointer points to an array in dynamically allocated storage whose
number of (float) elements -- the capacity -- is greater than or
equal to the value returned by vector<float>::size().
What happens when you call push_back, and the number of floats
already stored is exactly equal to the capacity? push_back will:
1) dynamically allocate another array of floats whose dimension is
larger by a growth factor (typical implementations multiply the old
capacity by 1.5 or 2; geometric growth is what makes push_back
amortized constant time, as the Standard requires).
2) copy the old array of floats into the newly allocated array of
floats.
3) copy the float being pushed back to the entry at offset size()
in the newly allocated array, and increment size().
4) free the old array of floats, and set the private pointer to the
new array of floats.
The resize() member function does something similar, except that
the caller controls the new size of the private dynamically allocated
array (reserve() is the variant that grows only the capacity, leaving
size() unchanged). So that is why it helps a lot to get the sizes of
all the files, add them up, resize the vector to the total size, then
copy the floats in one at a time. You avoid a lot of repeated
unnecessary heap allocation (which can be slow) and copying of the
private array.
To put it in high-level terms, the Standard requires push_back to be
amortized O(1), but any single push_back that triggers a reallocation
is O(n), since it copies the whole array. One resize up front avoids
every one of those reallocate-and-copy passes, so it makes sense to
do one resize rather than n push_backs.