Connecting Tech Pros Worldwide Forums | Help | Site Map

What is the data structure & algorithm that is fastest searching & sorting.

yee young han
Guest
 
Posts: n/a
#1: Jul 23 '05
I need a fast data structure and algorithm like below condition.

(1) this data structure contain only 10,000 data entry.
(2) data structure's one entry is like below

typedef struct _DataEntry_
{
char szInput[4];
char szOutput[4];
int iSum;
} DataEntry;

(3) input data length is 100,000,000.
(4) if there is a same szInput & szOutput data in data structure when input
data, only sum iSum into found data.
(5) this data structure have only 10,000 data entry that has large number of
iSum.

if I insert one data to data structure, code like below is executed.

select data from list where szInput, szOutput;
if exist
update iSum of szInput, szOutput
sort
else
insert DataEntry
sort
end if

What is the data structure & algorithm that is fastest searching & sorting.

please~ help me~





Uenal Mutlu
Guest
 
Posts: n/a
#2: Jul 23 '05

re: What is the data structure & algorithm that is fastest searching & sorting.


"yee young han" wrote[color=blue]
> I need a fast data structure and algorithm like below condition.
>
> (1) this data structure contain only 10,000 data entry.
> (2) data structure's one entry is like below
>
> typedef struct _DataEntry_
> {
> char szInput[4];
> char szOutput[4];
> int iSum;
> } DataEntry;
>
> (3) input data length is 100,000,000.
> (4) if there is a same szInput & szOutput data in data structure when input
> data, only sum iSum into found data.
> (5) this data structure have only 10,000 data entry that has large number of
> iSum.
>
> if I insert one data to data structure, code like below is executed.
>
> select data from list where szInput, szOutput;
> if exist
> update iSum of szInput, szOutput
> sort
> else
> insert DataEntry
> sort
> end if
>
> What is the data structure & algorithm that is fastest searching & sorting.[/color]

Your current method above is indeed very inefficient.
The slowest part therein is how often you make use of sort.
I would do the sorting only once after reading all items, or
add (or update) the items directly into a binary tree (for example
an STL unique sorted associative container like a set), in both
cases after checking/handling the duplicates.




Jerry Coffin
Guest
 
Posts: n/a
#3: Jul 23 '05

re: What is the data structure & algorithm that is fastest searching & sorting.


yee young han wrote:[color=blue]
> I need a fast data structure and algorithm like below condition.
>
> (1) this data structure contain only 10,000 data entry.
> (2) data structure's one entry is like below
>
> typedef struct _DataEntry_[/color]

As an aside, names starting with an underscore, followed by a capital
letter are reserved for the implementation, so you really want to use a
different name. In C++ there's also no reason to use a typedef for this
situation anyway -- you can just use 'struct X' and be done with it.
[color=blue]
> {
> char szInput[4];
> char szOutput[4];
> int iSum;
> } DataEntry;
>
> (3) input data length is 100,000,000.
> (4) if there is a same szInput & szOutput data in data structure when
> input data, only sum iSum into found data.
> (5) this data structure have only 10,000 data entry that has large
> number of iSum.[/color]

So far, what you're really describing is what you've done so far to
attempt to solve the problem -- but you haven't really told us what
problem you're trying to solve.

As such, I'm only taking a rather wild guess at the problem, but if I'm
right, I'd structure things a bit differently:

struct key {
std::string input,
output;

key(std::string in1, std::string in2) : input(in1), output(in2) {}
bool operator<(key const &other) {
if (input >= other.input)
return false;
return output < other.output;
}

friend std::ostream &operator<<(std::ostream &os, key const &k) {
os << k.input << " " << k.output;
}
};


std::map<key, int> values;

Now, let's assume (for example) that you have a file full of records,
one record per line, formatted like:

input output number

We can then process the data like:

std::string in1, in2;
int value;

while (stream >> in1) {
stream >> in2;
stream >> value;

values[key(in1, in2)] += value;
}

Since you describe keeping things sorted, it sounds like you probably
want to be able to print out the sums in order:

typedef std::pair<key, int> Vmap;

std::ostream &operator<<(std::ostream &os, Vmap const &v) {
return os << v.first << " " << v.second;
}

std::copy(values.begin(), values.end(),
std::ostream_iterator<Vmap>(std::cout, "\n"));

--
Later,
Jerry.

The universe is a figment of its own imagination.

Closed Thread