Hi,
I've been wrestling with a problem for some time that ought to be fairly simple, but turns out to be very difficult for me to solve. Maybe someone here knows the answer.
What I am trying to do is sort the records in a plain-text index file based on certain columns. The index file consists of records, each made up of fields. Fields are separated by semicolons and records by newlines. The index file is loaded into memory, and the only thing I actually have is a char* to the beginning of it.
It is possible to access the individual fields of a record through (say) record.getPos() and record.getLen() methods, which return the position and the length of a field in that record. A record is simply a row of the index file and is loosely defined in a separate class. I am also able to retrieve the length of a complete record (which is fixed for every index file generated), so I could define new records if I want to.
Now what I would like to do is sort this index file using STL sort() and a vector containing some representation of the records in the index file. Because the index file is very large, I think pointers should be used in one way or another. Does anyone have an idea of how I can write the comparison function? I've been trying to do this, but without success. The problem is that I want the comparison function to loop through a vector of sort fields, which define the columns to sort on.
My idea now is to break up the complete index file that is in memory before sorting it. All we have is a char* to the beginning of the file, so we have to step through the index file manually by adding the record length every time. That means breaking the file up into either separate char*, record* or record objects BEFORE sorting it. I guess pointers would be fastest.
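To make that splitting step concrete, here is a minimal sketch of what I have in mind. It assumes the record length includes the trailing newline and that the total size of the buffer is known; splitRecords, totalLen and recordLen are names I made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Collect a pointer to the start of every fixed-length record in the buffer.
// Assumption: 'recordLen' counts the trailing newline of each record.
std::vector<const char*> splitRecords(const char* data, std::size_t totalLen,
                                      std::size_t recordLen)
{
    assert(recordLen > 0);  // a zero record length would never advance

    std::vector<const char*> records;
    records.reserve(totalLen / recordLen);

    // Step through the buffer one record at a time; a trailing partial
    // record (if any) is ignored.
    for (std::size_t off = 0; off + recordLen <= totalLen; off += recordLen)
        records.push_back(data + off);

    return records;
}
```

Only the pointers would be moved around during sorting; the index file itself stays where it is in memory.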
The index file looks approximately like this (fields can be empty, don't worry about the semantics of the file, there are more fields but I left them out for space reasons):
1; 55000; 55000; 193400080001; 1934000800010001; ; ;1
1; 55100; 55000; 193400230001; 1934002300010001; ; ;1
1; 55120; 55000; 193400440001; 1934004400010001; ; ;1
But now: how do I define the comparison function for the STL sort? Suppose we break the file into record*'s; then we get something like sort(vector.begin(), vector.end(), vector_comparison())
with vector_comparison being a sort of functor like
(pseudocode)
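class vector_comparison
{
public:
    bool operator() (const record* A, const record* B) const
    {
        // the first column that differs decides the order
        if (columnA of recordA < columnA of recordB)
            return true;
        if (columnA of recordA > columnA of recordB)
            return false;
        if (columnB of recordA < columnB of recordB)
            return true;
        if (columnB of recordA > columnB of recordB)
            return false;
        ...
        return false; // the records are equal
    }
};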
The biggest problem here is that this comparison function needs access to external resources: a list iterator over the 'columns to be sorted on', and the vector representing the index file (record*, char* or records). Both are only created at runtime and are not known at compile time, so I don't see how to write the comparison function. Maybe I misunderstand something about functors, and I don't even know whether this problem can be solved this way. I am just trying; it is my last effort before I give up on this problem :)
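For what it's worth, this is roughly the shape I imagine the functor would need, assuming the runtime data can simply be handed to its constructor and kept as a member (I am not sure this is the intended way to use a functor). SortField and RecordLess are hypothetical names, the pos/len pair stands in for what getPos() and getLen() return, and comparing fields byte by byte with memcmp is an assumption that only holds if fields have a fixed width and consistent padding.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical description of one sort column: byte offset of the field
// inside a record, and the length of that field.
struct SortField
{
    std::size_t pos;
    std::size_t len;
};

// Functor that carries the runtime list of sort columns as a member.
class RecordLess
{
public:
    explicit RecordLess(const std::vector<SortField>& fields) : fields_(fields) {}

    // Returns true if record a sorts strictly before record b.
    bool operator()(const char* a, const char* b) const
    {
        for (std::size_t i = 0; i < fields_.size(); ++i)
        {
            const SortField& f = fields_[i];
            // Assumption: a bytewise comparison gives the desired order.
            const int c = std::memcmp(a + f.pos, b + f.pos, f.len);
            if (c != 0)
                return c < 0;   // the first differing column decides
        }
        return false;           // equal on every sort column
    }

private:
    std::vector<SortField> fields_;  // copied once per functor copy
};
```

It would then be called as sort(records.begin(), records.end(), RecordLess(fields)), with fields filled in at runtime and records being a vector of const char* pointing at the start of each record.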
I hope my question is clear. If you need more information, please let me know! I'll check back tomorrow.